High throughput experimentation (HTE) is a process of scientific exploration involving lab automation, effective experimental design, and rapid parallel or serial experiments. HTE often requires robotics, rigs, semiautomated kits, multichannel pipettors, solid dispensers, and liquid handlers. Well-designed HTE experiments produce a wealth of experimental data that creates the foundation for better technical decisions. Effective HTE work requires an appropriate IT and informatics infrastructure to fully capture all of the data in a way that complies with the FAIR (findable, accessible, interoperable, reusable) principles. In addition to capturing raw data, results, and conclusions, the process can be significantly enhanced by capturing ideation and other design and experimental learnings. Including ideation provides the contextual and foundational information that inspired the experimental work. This improvement in knowledge management optimizes experimentation, and an additional benefit is better organization of intellectual property.
The tools necessary for a working HTE program are high throughput equipment for fast and parallel experiments, computational methods to design experiments, and a FAIR data environment with a well-designed data repository and query system to retrieve the data and reuse it in future ideation and enhanced designs. There are multiple advantages to implementing HTE, including greater reproducibility, innovation, and efficiency. HTE is ideal for accomplishing more with less, and doing it faster. For labs to realize the full benefits of HTE, careful investment in strategy, hardware, and software is required. Unfortunately, the software platforms for HTE projects are often underfunded or neglected, resulting in lost value and opportunity.
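To make the idea of a data repository and query system concrete, here is a minimal Python sketch of how one HTE run might be captured together with its context. Everything in it is an illustrative assumption rather than a description of any particular product: the `ExperimentRecord` fields, the in-memory `Repository` class, and the example factor names are all hypothetical.

```python
from dataclasses import dataclass, field, asdict
import json
import uuid


@dataclass
class ExperimentRecord:
    """One HTE run, with its context captured when the record is created."""
    project: str
    hypothesis: str   # ideation context: why the experiment was run
    conditions: dict  # design factors, e.g. {"temperature_C": 80}
    results: dict     # measured endpoints, e.g. {"yield_pct": 72.5}
    units: dict = field(default_factory=dict)
    # A unique identifier makes each record findable later.
    record_id: str = field(default_factory=lambda: str(uuid.uuid4()))


class Repository:
    """In-memory stand-in for a data repository with a simple query system."""

    def __init__(self):
        self._records = {}

    def add(self, record):
        self._records[record.record_id] = record
        return record.record_id

    def get(self, record_id):
        return self._records[record_id]

    def query(self, **conditions):
        """Return records whose conditions match every keyword given."""
        return [
            r for r in self._records.values()
            if all(r.conditions.get(k) == v for k, v in conditions.items())
        ]

    def export_json(self):
        """Serialize to an open format so other systems can reuse the data."""
        return json.dumps([asdict(r) for r in self._records.values()], indent=2)


# Hypothetical usage: store two runs, then retrieve them by design factor.
repo = Repository()
repo.add(ExperimentRecord(
    project="demo",
    hypothesis="Higher temperature improves yield",
    conditions={"temperature_C": 80, "solvent": "THF"},
    results={"yield_pct": 72.5},
))
repo.add(ExperimentRecord(
    project="demo",
    hypothesis="Higher temperature improves yield",
    conditions={"temperature_C": 100, "solvent": "THF"},
    results={"yield_pct": 65.0},
))
print(len(repo.query(solvent="THF")))
```

A production system would sit on a real database and use controlled vocabularies. The sketch only illustrates that the hypothesis and conditions travel with the results, so later queries can retrieve data along with the reasoning behind it.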
HTE in biological science labs
High throughput screening and high content screening have matured to enable researchers to routinely and rapidly screen thousands to millions of molecules in a biochemical or cellular context. This process uses a variety of assay and imaging techniques to determine biological activity, genetic markers, apoptosis, toxicity, binding, and other biochemical, cellular, and tissue endpoints. It has become a staple of the drug discovery process and has provided many lessons and valuable decision-making knowledge for an industry whose foundation relies on data. While lab information systems have provided tremendous value by enabling lab automation equipment and robots to run high throughput screens, the real value is derived from the efficient capture of data that allows scientists to create actionable knowledge and insights. A key lesson from this process is that targeted screening using design of experiments (DOE) and critical scientific and mathematical thinking yields greater success. The hardware from vendors like Tecan, Hamilton, and Molecular Devices has been transformational for this process. The next big advancement in biological HTE is likely to come in the area of artificial intelligence (AI) and machine learning (ML). Labs that embrace these advancements will likely emerge as the industry leaders.
HTE in chemical science labs
The advent of high throughput synthesis created the early market for lab automation and informatics for sample tracking and FAIR-compliant data capture. An early issue facing chemical discovery labs was the belief that all makeable chemical diversity was equally valuable. Unfortunately, just because you could synthesize a molecule did not mean it would be a successful product. Now, this valuable learning is leveraged by computational scientists to improve predictive models that inform the ideation and design components of the cycle. Research leaders in lab automation continue to make significant advances in synthesis equipment and automation platforms by introducing modular workstations with few synthetic restrictions. The next significant advance will come with integration of the workstations into the organizational analytical and IT systems.
The analytical chemistry space has also embraced HTE technology. The analytical platforms closely mimic those used to run high throughput screening. Analytical HTE uses instruments capable of completing analyses in a fast serial manner, providing rapid turnaround of critical decision-making data. As was the case in the discovery chemistry space, automation exploiting plate-based analyses was initially well integrated with the existing IT systems. Integration of these analytical systems and their output with new instruments or new IT systems is often done via intermediary databases, which can slow down processes and critical decision making.
HTE in materials science labs
Materials science, and in particular catalysis, is characterized by a scarcity of data compared to other technical domains. The key reason is that, in this technology area, the parallelization and miniaturization typically used by HTE are inversely correlated with ease of scale-up. Early HTE reaction screening was highly parallelized and miniaturized, but struggled to be relevant enough to drive optimization at larger scales. Using a combinatorial approach quickly becomes intractable for materials science. Today, efforts focus on larger-scale equipment with relatively limited reactor parallelization (four to 16) that uses conditions allowing easier scale-up.
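A short calculation shows why an exhaustive combinatorial design becomes intractable at this level of parallelization. The Python sketch below uses purely illustrative numbers: the choice of six factors at five levels each is an assumption made for the example, and only the reactor counts of four and 16 come from the text above.

```python
import math


def full_factorial_size(levels_per_factor):
    """Number of runs needed to test every combination of factor levels."""
    return math.prod(levels_per_factor)


def batches_needed(n_experiments, reactors_per_batch):
    """Number of batches when each batch runs a fixed number of reactors."""
    return math.ceil(n_experiments / reactors_per_batch)


# Assumed example: six factors (say composition, temperature, pressure,
# and so on), each explored at five levels.
levels = [5] * 6
runs = full_factorial_size(levels)

print(runs)                      # 15625 combinations
print(batches_needed(runs, 16))  # 977 batches on a 16-reactor system
print(batches_needed(runs, 4))   # 3907 batches on a four-reactor system
```

Even at one batch per day, the 16-reactor case would take more than two and a half years, which is why designs that select a small, informative subset of experiments matter so much in this domain.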
The next advance of HTE in materials science will likely come from a computational technique called active learning (AL). AL integrates data collection, DOE, and data mining, and can better exploit the high volumes of acquired data. The learner is not a passive recipient of the data to be processed. The researcher has control of data acquisition and must select the samples that will yield the greatest benefit in future data treatments. AL is crucial when each data point is costly and the ability to predict outcomes is imperfect. AL will also enable more selective experiments to be run to optimize libraries based on the learning efficiency of the technique.
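The selection loop at the heart of AL can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a recommended implementation: `run_experiment` is a hypothetical stand-in for a costly lab measurement, and the nearest-neighbor surrogate with distance-based uncertainty is a deliberately crude substitute for the statistical models (Gaussian processes, for example) a real program would more likely use.

```python
import math


def run_experiment(x):
    """Hypothetical stand-in for a costly measurement (response peaks at 0.7)."""
    return math.exp(-((x - 0.7) ** 2) / 0.02)


def predict(x, measured):
    """Nearest-neighbor surrogate: returns (predicted value, uncertainty).

    The uncertainty is simply the distance to the closest measured point.
    """
    nearest_x, nearest_y = min(measured, key=lambda xy: abs(xy[0] - x))
    return nearest_y, abs(nearest_x - x)


def select_next(candidates, measured, kappa=2.0):
    """Pick the candidate with the highest value-plus-uncertainty score."""
    def score(x):
        mean, uncertainty = predict(x, measured)
        return mean + kappa * uncertainty
    return max(candidates, key=score)


def active_learning(candidates, initial, budget):
    """Measure `initial`, then choose `budget` further experiments one by one."""
    measured = [(x, run_experiment(x)) for x in initial]
    remaining = [x for x in candidates if x not in initial]
    for _ in range(budget):
        x_next = select_next(remaining, measured)
        remaining.remove(x_next)
        measured.append((x_next, run_experiment(x_next)))
    return measured


# 21 candidate conditions, but a budget of only 8 measurements in total.
grid = [i / 20 for i in range(21)]
data = active_learning(grid, initial=[0.0, 1.0], budget=6)
best_x, best_y = max(data, key=lambda xy: xy[1])
print(f"measured {len(data)} of {len(grid)} candidates; best x = {best_x}")
```

The `kappa` parameter is an assumed tuning knob that trades exploration of unmeasured regions against exploitation of promising ones. The point of the sketch is that each new experiment is chosen using everything measured so far, rather than fixed in advance.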
Data analysis challenges in HTE
HTE has helped synthesize new materials faster, test samples faster, and characterize materials faster. The huge amounts of data generated by these processes are impossible for humans alone to process and turn into optimal decisions. Deciphering data at this volume often requires adapted algorithms and high-performance computing. Scientists still need to provide a clear understanding of the fundamental science and use effective IT systems to get the most from the data produced by HTE experiments.
The best success often comes from effectively combining the electronic lab notebook (ELN) and lab information management system (LIMS) environments to provide the request, sample, experiment, test, analysis, and reporting workflows that allow HTE to run efficiently. The Sapio Sciences Exemplar scientific platform is an example of a system that combines these modules with analysis and knowledge extraction. An alternative is a solution architecture that takes a components approach, stitching together DOE, ELN, sample management, analysis/visualization, and reporting from different sources. The challenge with a components approach is truly integrating the different pieces so that the whole works efficiently without significant intervention from the scientists.
High potential, yet challenges remain
The promise of high throughput experimentation to revolutionize the life and materials science industries is yet to be fully realized. While HTE does not yet identify drug molecules directly from high throughput screens, it does find lead molecules from which to develop potential drugs. HTE optimization is routinely used in the life and materials sciences to improve technical routes on measures such as yield, cost, and sustainability. One important future direction for HTE methods is the creation of greener, more environmentally friendly chemical processes. These green chemistry approaches are already underway in many R&D organizations.
Despite significant improvements in HTE, managing the data, processes, standards, contextualization, metadata, integration, and DOE remains challenging. HTE provides scientists with the ability to test multiple hypotheses in parallel and has produced an exponential increase in data generation. However, there is still significant room for improvement in key metrics, such as discovery cycle times and cost. It seems that the ability to generate data, through the continued evolution and improvement of automation platforms, has outpaced our ability to optimally leverage that data for improved decision making. Closing the gap will require the development of reliable predictive models and improved integration with AI and ML. Labs can generate maximum value from their data once it is properly curated and contextualized at the time of capture. Then the power of AI and ML, and the implementation of an in silico-first or model-first strategy, can be realized. Labs that successfully marry big data with AI and ML by solving the communication, compatibility, and integration problems will differentiate themselves from their competitors. Establishing a FAIR data environment will dramatically reduce the effort spent on data wrangling and allow scientists to focus on ideation and design, leading to better experiments and better outcomes.