A new review published in Artificial Intelligence & Environment examines how machine learning is transforming non-targeted analysis workflows for detecting environmental pollutants, helping laboratories address persistent analytical limitations.
Environmental pollutants are highly diverse and include pharmaceuticals, pesticides, industrial additives, and their transformation products. Many lack commercially available reference standards, complicating identification and quantification using traditional analytical methods.
Non-targeted analysis based on liquid chromatography coupled with high-resolution mass spectrometry can detect thousands of chemical features in a single environmental sample. However, only a small fraction of these signals can typically be identified with confidence using existing spectral libraries.
“Less than a few percent of environmentally relevant compounds can currently be confidently identified using traditional workflows,” the authors explain. This data-interpretation bottleneck has kept high-resolution mass spectrometry from reaching its full potential in environmental science.
Machine learning offers a way forward. By applying predictive models to spectral data, researchers can expand identification capabilities beyond the constraints of conventional rule-based approaches.
Expanding high-resolution mass spectrometry with predictive modeling
Machine learning models can predict tandem mass spectra from known molecular structures, effectively expanding spectral libraries in silico and strengthening non-targeted analysis capabilities.
These tools can infer molecular formulas, structural fragments, and molecular fingerprints directly from experimental spectra, narrowing candidate structures and improving identification confidence.
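As a rough illustration of how a predicted molecular fingerprint can narrow candidate structures, the sketch below ranks candidates by Tanimoto similarity to the fingerprint inferred from a spectrum. The fingerprints are represented as sets of "on" bit positions, and the bit values, candidate names, and the `rank_candidates` helper are hypothetical; the review does not prescribe this particular scoring scheme.

```python
def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_candidates(predicted_bits: set[int],
                    candidates: dict[str, set[int]]) -> list[tuple[str, float]]:
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, tanimoto(predicted_bits, bits))
              for name, bits in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical fingerprint predicted from an experimental MS/MS spectrum.
predicted = {1, 4, 7, 9}

# Hypothetical candidate structures retrieved by molecular formula.
candidates = {
    "candidate_A": {1, 4, 7, 9, 12},
    "candidate_B": {2, 4, 8},
    "candidate_C": {1, 7},
}

ranking = rank_candidates(predicted, candidates)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In practice the fingerprint would come from a trained model and the candidates from a structure database; the ranking step itself stays this simple.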
The review also highlights generative modeling approaches that propose plausible chemical structures even when compounds are absent from existing databases. This capability is particularly important for emerging environmental pollutants and transformation products that have not been formally cataloged.
Orthogonal parameters, such as retention time and collision cross-section, further enhance structural confirmation. Neural network models can predict these properties across chromatographic and ion mobility platforms, reducing false positives and improving reliability in high-resolution mass spectrometry workflows.
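The filtering idea can be sketched in a few lines: a candidate is kept only if its predicted retention time and collision cross-section both fall within a tolerance of the observed values. The tolerances (0.5 min, 3%), the `Candidate` class, and all numeric values below are illustrative assumptions rather than figures from the review.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    predicted_rt_min: float   # model-predicted retention time, minutes
    predicted_ccs_a2: float   # model-predicted collision cross-section, square angstroms


def passes_orthogonal_checks(candidate: Candidate,
                             observed_rt_min: float,
                             observed_ccs_a2: float,
                             rt_tol_min: float = 0.5,
                             ccs_tol_pct: float = 3.0) -> bool:
    """Keep a candidate only if both predicted properties match the observation."""
    rt_ok = abs(candidate.predicted_rt_min - observed_rt_min) <= rt_tol_min
    ccs_dev_pct = abs(candidate.predicted_ccs_a2 - observed_ccs_a2) / observed_ccs_a2 * 100
    return rt_ok and ccs_dev_pct <= ccs_tol_pct


# Hypothetical observed feature and candidate list.
observed_rt, observed_ccs = 6.2, 150.0
candidate_list = [
    Candidate("X", 6.4, 152.0),   # both properties within tolerance
    Candidate("Y", 9.1, 151.0),   # retention time too far off
    Candidate("Z", 6.0, 160.0),   # collision cross-section too far off
]

kept = [c.name for c in candidate_list
        if passes_orthogonal_checks(c, observed_rt, observed_ccs)]
print(kept)
```

Appropriate tolerances depend on the chromatographic method, the ion mobility platform, and the accuracy of the prediction model, so they would need to be tuned per laboratory.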
Addressing quantification challenges in non-targeted analysis
Quantification presents an additional challenge in non-targeted analysis, particularly when authentic standards are unavailable. The review describes machine learning approaches that predict ionization efficiency and response factors from molecular structure and experimental conditions, enabling semi-quantitative analysis of environmental pollutants without requiring standards for every detected compound.
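A simplified sketch of the arithmetic behind standard-free quantification follows. It assumes a single calibrant with known concentration and that the instrument response factor scales with the predicted ionization efficiency on a log10 scale; real workflows typically fit this relationship across many calibrants. The function name, parameter names, and numbers are hypothetical.

```python
def estimate_concentration(peak_area: float,
                           predicted_log_ie: float,
                           calibrant_area: float,
                           calibrant_conc: float,
                           calibrant_log_ie: float) -> float:
    """Estimate concentration from peak area and a predicted log10 ionization efficiency.

    Assumption: response factor (area per unit concentration) is proportional
    to ionization efficiency, anchored by one calibrant of known concentration.
    """
    calibrant_rf = calibrant_area / calibrant_conc
    analyte_rf = calibrant_rf * 10 ** (predicted_log_ie - calibrant_log_ie)
    return peak_area / analyte_rf


# Hypothetical values: calibrant at 10 ng/mL, analyte predicted to ionize 10x less efficiently.
estimate = estimate_concentration(
    peak_area=5.0e5,
    predicted_log_ie=2.0,
    calibrant_area=2.0e6,
    calibrant_conc=10.0,
    calibrant_log_ie=3.0,
)
print(f"Estimated concentration: {estimate:.1f} ng/mL")
```

Because the result inherits the prediction error of the ionization efficiency model, estimates of this kind are semi-quantitative and are best reported with an uncertainty range.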
Reliable quantification remains essential for exposure assessment and environmental risk evaluation. The authors note that machine-learning–based prediction of ionization behavior offers a pathway to more scalable, standard-free quantification in large-scale screening programs.
Implications for environmental laboratories
Despite rapid progress, challenges remain, including model transferability across instruments, limited representation of environmental pollutants in training datasets, and the need for improved interpretability. The authors call for multimodal learning strategies that integrate molecular features with experimental parameters, as well as for expanded databases that more accurately reflect environmental chemical space.
Looking ahead, researchers envision integrated machine-learning–driven screening platforms that combine compound identification, property prediction, and quantification within unified non-targeted analysis workflows.
For laboratories conducting environmental monitoring, regulatory screening, or exposure assessment, advances in non-targeted analysis supported by high-resolution mass spectrometry and machine learning may improve scalability, reduce manual data interpretation, and enhance confidence in pollutant detection.
This article was created with the assistance of Generative AI and has undergone editorial review before publishing.