Machine Learning Interpretation of Optical Spectroscopy Using Peak-Sensitive Logistic Regression

A PyTorch implementation of PSE-LR (logistic regression with peak-sensitive elastic-net regularization).

Wang, Ziyang and Ranasinghe, Jeewan C and Wu, Wenjing and Chan, Dennis CY and Gomm, Ashley and Tanzi, Rudolph E and Zhang, Can and Zhang, Nanyin and Allen, Genevera I and Huang, Shengxi

Abstract: Optical spectroscopy, a noninvasive molecular sensing technique, offers valuable insights into material characterization, molecule identification, and biosample analysis. Despite the informativeness of high-dimensional optical spectra, their interpretation remains a challenge. Machine learning methods have gained prominence in spectral analyses, efficiently unveiling analyte compositions. However, these methods still face challenges in interpretability, particularly in generating clear feature importance maps that highlight the spectral features specific to each class of data. These limitations arise from feature noise, model complexity, and the lack of optimization for spectroscopy. In this work, we introduce a machine learning algorithm, logistic regression with peak-sensitive elastic-net regularization (PSE-LR), tailored for spectral analysis. PSE-LR enables classification and interpretability by producing a peak-sensitive feature importance map, achieving an F1-score of 0.93 and a feature sensitivity of 1.0. Its performance is compared with other methods, including k-nearest neighbors (KNN), elastic-net logistic regression (E-LR), support vector machine (SVM), principal component analysis followed by linear discriminant analysis (PCA-LDA), XGBoost, and neural network (NN). Applying PSE-LR to Raman and photoluminescence (PL) spectra, we detected the receptor-binding domain (RBD) of SARS-CoV-2 spike protein in ultralow concentrations, identified neuroprotective solution (NPS) in brain samples, recognized WS2 monolayer and WSe2/WS2 heterobilayer, analyzed Alzheimer’s disease (AD) brains, and suggested potential disease biomarkers. Our findings demonstrate PSE-LR’s utility in detecting subtle spectral features and generating interpretable feature importance maps. It is beneficial for the spectral characterization of materials, molecules, and biosamples and applicable to other spectroscopic methods. This work also facilitates the development of nanodevices such as nanosensors and miniaturized spectrometers based on nanomaterials.

Architecture

See PSE_LR.jpg in this repository for the architecture diagram.

Installation

Install dependencies

  pip install -r requirements.txt

Paper Reference

If you use this implementation, please cite our paper:

@article{wang2025machine,
  title={Machine Learning Interpretation of Optical Spectroscopy Using Peak-Sensitive Logistic Regression},
  author={Wang, Ziyang and Ranasinghe, Jeewan C and Wu, Wenjing and Chan, Dennis CY and Gomm, Ashley and Tanzi, Rudolph E and Zhang, Can and Zhang, Nanyin and Allen, Genevera I and Huang, Shengxi},
  journal={ACS nano},
  year={2025},
  publisher={ACS Publications},
  doi={10.1021/acsnano.4c16037}
}

Features

Peak-Sensitive Regularization: Specialized penalty term that preserves and enhances spectral peaks while maintaining smoothness
Elastic Net Regularization: Combines L1 and L2 penalties for feature selection and smoothness
Automatic Differentiation: Uses PyTorch for efficient gradient computation
scikit-learn Compatible: Follows scikit-learn's API conventions for easy integration
Optical Spectroscopy Focus: Optimized for analyzing optical spectra and peak detection
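To illustrate how automatic differentiation and elastic-net regularization fit together, here is a minimal PyTorch sketch. It is not the code in PSE_LR.py: the function name `elastic_net_logistic_loss`, the scaling of the penalty by `1 / (C * n_samples)`, and the synthetic data are assumptions made for illustration, and the peak-sensitive term is deliberately left out because its exact form is defined in the paper and in PSE_LR.py.

```python
import torch

def elastic_net_logistic_loss(w, b, X, y, C=1.0, l1_ratio=0.5):
    """Mean logistic loss plus an elastic-net penalty (illustrative scaling)."""
    logits = X @ w + b
    data_loss = torch.nn.functional.binary_cross_entropy_with_logits(logits, y)
    l1 = w.abs().sum()              # encourages sparse weights
    l2 = 0.5 * (w ** 2).sum()       # shrinks large weights
    penalty = l1_ratio * l1 + (1.0 - l1_ratio) * l2
    return data_loss + penalty / (C * X.shape[0])

# Synthetic data: only feature 5 carries class information.
torch.manual_seed(0)
X = torch.randn(64, 20)
true_w = torch.zeros(20)
true_w[5] = 3.0
y = (X @ true_w > 0).float()

w = torch.zeros(20, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
optimizer = torch.optim.Adam([w, b], lr=1e-2)

losses = []
for _ in range(200):
    optimizer.zero_grad()
    loss = elastic_net_logistic_loss(w, b, X, y)
    loss.backward()                 # gradients come from autograd, not hand-derived formulas
    optimizer.step()
    losses.append(loss.item())

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

Because the whole objective is written as differentiable tensor operations, adding another penalty term only requires adding it to the returned loss.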

Installation

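Download PSE_LR.py and place it in your project directory (or anywhere on your Python path) so that it can be imported.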

Dependencies

Python >= 3.6
NumPy >= 1.19.0
scikit-learn >= 0.24.0
PyTorch >= 1.7.0
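A quick way to check an environment against the minimum versions listed above is to compare version strings numerically. The helper names `version_tuple` and `meets_minimum` below are hypothetical and not part of this repository; the sketch uses only the standard library.

```python
import re

# Minimum versions copied from the Dependencies list.
MINIMUMS = {
    "python": "3.6",
    "numpy": "1.19.0",
    "scikit-learn": "0.24.0",
    "torch": "1.7.0",
}

def version_tuple(version):
    """Turn a string such as '2.1.0+cu118' into (2, 1, 0), ignoring suffixes."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)

def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum version."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))    # pad so '3.6' compares like '3.6.0'
    b += (0,) * (width - len(b))
    return a >= b

print(meets_minimum("1.19.5", MINIMUMS["numpy"]))
print(meets_minimum("0.23.2", MINIMUMS["scikit-learn"]))
```

In practice the installed strings would come from `platform.python_version()`, `numpy.__version__`, `sklearn.__version__`, and `torch.__version__`.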

Usage

Basic Usage for Optical Spectroscopy

from PSE_LR import PSE_LR
import numpy as np

# Load your optical spectroscopy data
# X should be your spectral data matrix (n_samples, n_wavelengths)
# y should be your class labels
X = np.load('spectra.npy')  # shape: (n_samples, n_wavelengths)
y = np.load('labels.npy')   # shape: (n_samples,)

# Initialize the model with recommended parameters for optical spectroscopy
model = PSE_LR(C=1.0, l1_ratio=0.5, d=10.0)

# Fit the model
model.fit(X, y)

# Make predictions
predictions = model.predict(X)

# Get accuracy (measured here on the training data, so it is an optimistic estimate)
accuracy = model.score(X, y)
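Scoring on the data used for fitting tends to overstate performance, so a held-out split is usually more informative. The sketch below shows one way to do this with scikit-learn's `train_test_split`. To keep it runnable without PSE_LR.py, it uses synthetic spectra and scikit-learn's elastic-net `LogisticRegression` as a stand-in estimator; these are assumptions for illustration, and a `PSE_LR` instance could be swapped in wherever `model` is created, provided it exposes `fit` and `score` as in the usage above.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic spectra: class 1 has an extra Gaussian peak centered at index 40.
rng = np.random.default_rng(0)
n_samples, n_wavelengths = 200, 100
axis = np.arange(n_wavelengths)
peak = np.exp(-0.5 * ((axis - 40) / 3.0) ** 2)
y = rng.integers(0, 2, size=n_samples)
X = rng.normal(scale=0.2, size=(n_samples, n_wavelengths)) + np.outer(y, peak)

# Hold out 25% of the spectra, keeping the class balance in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Stand-in estimator (elastic-net logistic regression), not PSE_LR itself.
model = LogisticRegression(
    penalty="elasticnet", solver="saga", C=1.0, l1_ratio=0.5, max_iter=5000
)
model.fit(X_train, y_train)

train_acc = model.score(X_train, y_train)
test_acc = model.score(X_test, y_test)
print(f"train accuracy: {train_acc:.3f}, test accuracy: {test_acc:.3f}")
```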

Advanced Usage for Spectral Analysis

# Custom initialization with domain-specific parameters
model = PSE_LR(C=0.1, l1_ratio=0.7, d=5.0)

# Fit with custom parameters
model.fit(X, y, epochs=500, learning_rate=1e-3)

# Access training history
losses = model.losses
accuracies = model.train_accuracies

# Get model parameters
params = model.get_params()

Parameters

C: float, default=1.0
- Inverse of regularization strength. Smaller values specify stronger regularization.
- Recommended range for optical spectroscopy: 0.1 to 1.0
l1_ratio: float, default=0.5
- The ElasticNet mixing parameter, with 0 <= l1_ratio <= 1.
- l1_ratio=0 corresponds to L2 penalty, l1_ratio=1 to L1.
- Recommended range for optical spectroscopy: 0.3 to 0.7
d: float, default=10.0
- Strength of the peak-sensitive penalty for smoothness.
- Higher values emphasize peak preservation.
- Recommended range for optical spectroscopy: 5.0 to 15.0
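The recommended ranges above can be turned into a small set of candidate settings to compare. The sketch below builds such a grid with the standard library; the helper name `parameter_grid` is hypothetical, and the specific values chosen inside each range (for example 0.3 for `C`) are illustrative picks rather than recommendations from the authors.

```python
from itertools import product

# Candidate values inside the recommended ranges; the interior points are arbitrary picks.
recommended = {
    "C": [0.1, 0.3, 1.0],
    "l1_ratio": [0.3, 0.5, 0.7],
    "d": [5.0, 10.0, 15.0],
}

def parameter_grid(ranges):
    """Return one dict per combination of the candidate values."""
    names = list(ranges)
    combos = product(*(ranges[name] for name in names))
    return [dict(zip(names, values)) for values in combos]

grid = parameter_grid(recommended)
print(len(grid), "candidate settings, for example:", grid[0])
```

Each dictionary can then be passed as keyword arguments, for example `PSE_LR(**params)`, and the resulting models compared on held-out spectra.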

Applications in Optical Spectroscopy

PSE_LR is specifically designed for:

Optical spectrum classification
Peak detection and analysis
Spectral feature selection
Interpretation of spectroscopic data
Wavelength importance analysis

Example: Spectral Analysis and Visualization

import matplotlib.pyplot as plt
import numpy as np
from PSE_LR import PSE_LR

# Assuming X contains optical spectra
model = PSE_LR(C=0.1, l1_ratio=0.3, d=15.0)
model.fit(X, y)

# Plot the learned weights (feature importance)
wavelengths = np.linspace(400, 800, X.shape[1])  # Example wavelength range
plt.figure(figsize=(12, 6))
plt.plot(wavelengths, model.weights.detach().numpy())
plt.title('Spectral Feature Importance')
plt.xlabel('Wavelength')
plt.ylabel('Weight Value')
plt.grid(True)
plt.show()

# Plot training history
plt.figure(figsize=(12, 4))
plt.subplot(1, 2, 1)
plt.plot(model.losses)
plt.title('Training Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')

plt.subplot(1, 2, 2)
plt.plot(model.train_accuracies)
plt.title('Training Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.tight_layout()
plt.show()
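Beyond plotting, it can help to list the spectral positions with the largest absolute weights. The helper below is a hypothetical addition, not part of PSE_LR.py; it assumes the weights can be converted to a 1-D NumPy array, as in the plotting code above, and uses a tiny hand-made weight vector for demonstration.

```python
import numpy as np

def top_weighted_positions(weights, axis_values, k=5):
    """Return (axis value, weight) pairs for the k largest absolute weights."""
    weights = np.asarray(weights, dtype=float).ravel()
    axis_values = np.asarray(axis_values, dtype=float)
    order = np.argsort(np.abs(weights))[::-1][:k]   # indices sorted by |weight|, descending
    return [(float(axis_values[i]), float(weights[i])) for i in order]

# Toy example: five spectral positions spanning 400 to 800.
demo_weights = np.array([0.0, 0.1, -2.0, 0.3, 1.5])
demo_axis = np.linspace(400, 800, 5)
top = top_weighted_positions(demo_weights, demo_axis, k=2)
print(top)  # [(600.0, -2.0), (800.0, 1.5)]
```

With a fitted model, the same call would be `top_weighted_positions(model.weights.detach().numpy(), wavelengths)`. The sign of each weight indicates which class the spectral position pushes toward.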

Contributing

We welcome contributions to improve this implementation. Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contact

For questions about the implementation or the paper, please contact the authors or open an issue on GitHub.
