Analytical chemistry techniques enable scientists and researchers to characterize the composition of matter, both qualitatively and quantitatively. These techniques are widely used in clinical, environmental, food science, and numerous other applications. The evolution of high-resolution instruments and hyphenated methods such as gas or liquid chromatography-mass spectrometry (GC/LC-MS) and liquid chromatography-nuclear magnetic resonance spectroscopy (LC-NMR), among others, has led to even larger quantities of analytical data being generated. In the era of “big data,” organizations are now looking to leverage both historical and current data for data science purposes, including in analytical chemistry. Analytical instruments produce enormous, complex data sets containing a wealth of information with which analytical chemists can characterize the composition of matter. Despite this, data analysis capabilities and algorithms lag behind, limiting our ability to explore and use this data to its full extent. Artificial intelligence (AI) and machine learning are being implemented to address this challenge and to accelerate numerous other applications within analytical chemistry.
AI in the context of analytical chemistry
The discipline of chemometrics was introduced in the 1970s. It is, in part, the use of mathematical and statistical methods to analyze chemical data and obtain maximum chemical information. As personal computers became more widely available in the 1980s, these complex mathematical methods became more feasible, and commercial statistical software emerged.
Chemometric techniques are ideal for analyzing chemical structures and spectra. For example, statistical methods of pattern recognition enable the extraction of specific structures, or the comparison of an unknown spectrum to a spectral library for identification. With the combination of high-resolution instrumentation and hyphenated techniques that produce enormous, complex data sets, there is a need for more sophisticated chemometric tools as many existing algorithms are limited in their ability to analyze this data.
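To make the library comparison concrete, the following is a minimal sketch of one common way to score an unknown spectrum against reference spectra: cosine similarity between intensity vectors. It assumes all spectra have already been binned onto the same axis. The function names, compound names, and intensity values are hypothetical and chosen only for illustration; they do not come from any particular software package or data set.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length intensity vectors."""
    if len(a) != len(b):
        raise ValueError("spectra must share the same axis and length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty spectrum matches nothing
    return dot / (norm_a * norm_b)


def best_library_match(unknown, library):
    """Return (name, score) of the library spectrum most similar to `unknown`."""
    scores = {name: cosine_similarity(unknown, ref) for name, ref in library.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


# Hypothetical, made-up intensities on a shared five-point axis.
library = {
    "compound_A": [0.0, 1.0, 0.2, 0.0, 0.5],
    "compound_B": [0.9, 0.1, 0.0, 0.7, 0.0],
}
unknown = [0.05, 0.95, 0.25, 0.0, 0.45]

name, score = best_library_match(unknown, library)
print(f"Best match: {name} (similarity {score:.3f})")
```

A scheme like this scales poorly to very large, high-resolution data sets, which is part of the motivation for the more sophisticated tools discussed here.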
Richard Lee, director, core technology at ACD/Labs, envisions AI being used in analytical chemistry to address the challenge of analyzing such data sets. “The primary use of analytical data is in sample characterization and ultimately identification, whether for a single component or mixture. A widely used application of AI across industries is image recognition. This method can also be applied for feature recognition in analytical chemistry for chromatograms and spectra,” explains Lee. “Mixture compositions are an example of pattern recognition where the system would be able to discern chemical composition based on retention times (or relative retention times) under certain chromatographic conditions.” In addition to chromatograms, feature and pattern recognition may also be applied to NMR, mass, or other spectra.
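As a rough illustration of identifying mixture components from relative retention times, the sketch below divides each observed retention time by that of a reference peak and assigns the nearest known component within a tolerance. This is a simple rule-based lookup, not a trained model; it only shows the kind of pattern an AI system would be expected to learn and generalize. The component names, retention values, and the 0.02 tolerance are hypothetical.

```python
def relative_retention_times(retention_times, reference_time):
    """Divide each retention time by the reference peak's retention time."""
    if reference_time <= 0:
        raise ValueError("reference_time must be positive")
    return [t / reference_time for t in retention_times]


def assign_components(observed_rrts, known_rrts, tolerance=0.02):
    """Label each observed RRT with the nearest known component, or None if too far."""
    assignments = []
    for rrt in observed_rrts:
        name, ref = min(known_rrts.items(), key=lambda item: abs(item[1] - rrt))
        assignments.append(name if abs(ref - rrt) <= tolerance else None)
    return assignments


# Hypothetical reference table of relative retention times for one method.
known_rrts = {"impurity_1": 0.62, "main_component": 1.00, "impurity_2": 1.35}

# Hypothetical observed retention times in minutes; the main peak elutes at 5.00.
observed = [3.10, 5.00, 6.80, 8.20]
rrts = relative_retention_times(observed, reference_time=5.00)
labels = assign_components(rrts, known_rrts)
print(list(zip(observed, labels)))
```

The last peak falls outside the tolerance of every known component and is left unassigned, which is the case where further spectral evidence would be needed.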
Key benefits for pharmaceutical and consumer product industries
According to Lee, one of the main reasons analytical chemistry laboratories are adopting AI is to “[gain] any advantage they can in a competitive landscape, and [adapt] to a faster development cycle, whether in pharmaceutical or consumer product industries.”
AI can address specific challenges in synthetic chemistry and ingredients applications by helping to answer some important questions. “Evaluation of the success of chemical reactions is done by collecting analytical data to help the scientist decide ‘was my reaction successful and if so, how successful was it?’” explains Lee, adding, “from the perspective of consumer products, what is the best combination of ingredients/components/formulations that will best produce the product with desired characteristics?”
One of the challenges facing synthetic chemistry groups is deciding where to concentrate their efforts: “how they can reduce the scope of chemical space and ultimately accelerate efforts for target generation based on analytical data,” says Lee. With well-annotated data related to experimental design, AI can provide valuable insights into synthetic chemistry strategy. “The degree of successful reactions based on analytical results are critical and can lead chemistry toward a defined direction, but failed reactions can also be leveraged as part of the training set to guide away from specific chemistries,” explains Lee.
With access to historical data, AI can also aid consumer products manufacturers in achieving specific textures, consistencies, flavors, or fragrances. “While this is not currently implemented on a wide scale, research and development groups are working toward future implementations,” he notes.
Tackling complex mixtures
Certain applications in analytical chemistry may reap significant benefits from the integration of AI. One such example is the process of identifying components within a complex mixture. The process can become time-consuming and involves numerous analytical techniques, including chromatographic separation (and potentially additional separation with ion mobility) and detection via ultraviolet, MS, NMR, or infrared methods. The experimental spectra obtained must then be compared to known spectra in large databases to identify known components.
AI offers a much more efficient solution, and according to Lee, “using data from retention times to understand physiochemical properties (pKa and logD) based on reverse phase chromatography and mass spectrometrical features in a spectrum allows us to categorize unique chemical entities into sub-classes of compounds, identify substructure, and ultimately identify chemical structure.”
AI can also support chromatographic method development, which is an essential process to ensure the method is suitable for its intended use (for example, for the development and manufacturing of pharmaceuticals). It encompasses the selection of a chromatography mode, detector, stationary phase, mobile phase, and numerous other factors. Method development is becoming a more frequent task in pharmaceutical laboratories, as new target molecules continue to emerge.
“Chromatographic method development based on mixture composition is an area of active research,” says Lee. “Physiochemical properties (pKa and logD) along with historical separation data (mobile phase, column, additives) can accelerate method development. This can be leveraged at various stages of chemistry; for example, in pharmaceutical development where there are specific requirements for separation between components in the final product to ensure sufficient resolution between possible impurities/degradants.” It is important to note that such applications of AI require a defined data model and well-annotated data.
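The resolution requirement mentioned above can be made concrete with the standard expression for two adjacent peaks, Rs = 2(t2 - t1) / (w1 + w2), where t is retention time and w is baseline peak width in the same units. The sketch below finds the worst-resolved adjacent pair in a chromatogram, which is the kind of quantity a method-development workflow might try to maximize. The peak names and values are hypothetical, and the code is an illustrative sketch rather than any vendor's implementation.

```python
def resolution(t1, w1, t2, w2):
    """Resolution between two peaks from retention times and baseline widths."""
    if w1 <= 0 or w2 <= 0:
        raise ValueError("peak widths must be positive")
    return 2.0 * abs(t2 - t1) / (w1 + w2)


def critical_pair(peaks):
    """Return (resolution, name_a, name_b) for the worst-resolved adjacent pair.

    `peaks` is a list of (name, retention_time, baseline_width) tuples.
    """
    if len(peaks) < 2:
        raise ValueError("need at least two peaks")
    ordered = sorted(peaks, key=lambda p: p[1])  # order by retention time
    pairs = [
        (resolution(a[1], a[2], b[1], b[2]), a[0], b[0])
        for a, b in zip(ordered, ordered[1:])
    ]
    return min(pairs)


# Hypothetical peaks: (name, retention time in min, baseline width in min).
peaks = [
    ("API", 5.0, 0.4),
    ("degradant", 5.5, 0.4),
    ("impurity", 3.0, 0.3),
]

rs, first, second = critical_pair(peaks)
print(f"Critical pair: {first}/{second}, Rs = {rs:.2f}")
```

A value of about 1.5 or higher is commonly taken to indicate baseline separation, so a pair scoring below that would typically prompt a change in mobile phase, column, or gradient.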
Overcoming challenges, and the future of AI in analytical chemistry
A major obstacle to the widespread implementation of AI in analytical chemistry is data heterogeneity. There are numerous instrument manufacturers, each having developed their own proprietary data formats, which must be addressed through data normalization. “Finding a universal format or a consistent data model will be key for AI applications that leverage analytical chemistry data,” explains Lee.
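One way to picture data normalization is a small adapter layer that maps each vendor's export into a single shared record type. The sketch below is purely illustrative: the two "vendor" layouts, their field names, and the `Peak` model are all hypothetical and do not correspond to any real instrument format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    """Hypothetical common data model: one chromatographic peak."""
    retention_time_min: float
    area: float


def from_vendor_a(record):
    """Hypothetical vendor A: retention time in seconds under 'rt_s'."""
    return Peak(retention_time_min=record["rt_s"] / 60.0, area=float(record["area"]))


def from_vendor_b(record):
    """Hypothetical vendor B: string fields 'RT(min)' and 'PeakArea'."""
    return Peak(
        retention_time_min=float(record["RT(min)"]),
        area=float(record["PeakArea"]),
    )


# The same peak as exported by two hypothetical instruments.
peak_a = from_vendor_a({"rt_s": 300, "area": 1200})
peak_b = from_vendor_b({"RT(min)": "5.0", "PeakArea": "1200"})
print(peak_a == peak_b)
```

Once every source is expressed in the same model, units and field names no longer vary between instruments, and downstream training code only has to handle one format.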
Another challenge is ensuring accurate data for algorithm training. “Having complete and correctly annotated data will be equally critical,” says Lee. “This does not preclude failed experiments (which can be just as valuable as successful experiments) but rather, they must be properly annotated, for inclusion in AI training sets.”
AI also faces specific challenges within the pharmaceutical industry. Notably, it has been suggested that a shift from the current expert-driven scientific method to a data-driven partnership between scientists and AI must occur. Others posit that successful drug design using AI must provide solutions to several questions, encompassed in five challenges: obtaining appropriate data sets, generating new hypotheses, optimizing in a multi-objective manner, reducing cycle times, and changing the research culture.
Lee remains optimistic regarding the future of AI in analytical chemistry. “Once AI matures and is more easily accessible, it has the ability to be a transformative technology. AI has the opportunity to guide and direct the scientist, reducing the scope of chemistry they have to evaluate, and can potentially affect all chemistry research and development industries. Although AI can provide guidance and direction, ultimately, decision-making will always rely on the chemist and their knowledge.”