Advances in cellular assays and imaging technologies over recent years have increased throughput, allowing researchers to generate high-quality data more quickly. However, evaluating these large datasets promptly remains a challenge, especially in drug discovery where timely, actionable data is crucial. Artificial intelligence (AI), particularly machine learning (ML) and deep learning (DL), is being widely adopted to address this issue in drug discovery workflows.
Reduce the time and costs for drug development
Drug discovery is widely recognized as slow and expensive. Using ML/DL in the high-content screening of thousands of potential drug candidates reduces the time, personnel, and costs needed to complete the screening and data analysis. One example is cell painting, which uses up to six different dyes to stain cellular compartments [1]. Cellular imaging is then used to extract features from the “painted” cells. Hundreds, or even thousands, of such features are extracted from individual images or cells to produce a valid dataset. Done manually, this type of image analysis would require days to weeks of multiple scientists’ time (and the associated costs). ML/DL completes image analysis and data extraction in minutes, sometimes even seconds, for a field or well, with one scientist overseeing the process.
This use of ML/DL enables the rapid identification of drugs with high efficacy that should continue to next steps, and concurrently identifies those that should be dropped from further consideration. In addition to saving time and costs for assay completion, it also saves future time and costs that would be incurred should ineffective drugs continue to the next steps of evaluation.
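To make the idea of per-cell feature extraction concrete, here is a minimal sketch in plain Python. It assumes each cell has already been segmented into a binary mask and that each stain is available as a separate intensity channel. The function name `extract_features`, the channel names, and the pixel values are all hypothetical, and a production pipeline would compute far more features with dedicated image-analysis software.

```python
from statistics import mean, pstdev

def extract_features(channels, mask):
    """Return a flat feature dict for one cell.

    channels: dict mapping a channel name to a 2D list of pixel intensities.
    mask: 2D list of 0/1 flags marking the pixels that belong to the cell.
    Assumes the mask contains at least one cell pixel.
    """
    coords = [(r, c) for r, row in enumerate(mask) for c, flag in enumerate(row) if flag]
    features = {"area_px": len(coords)}
    for name, image in channels.items():
        values = [image[r][c] for r, c in coords]
        features[f"{name}_mean"] = mean(values)
        features[f"{name}_std"] = pstdev(values)
        features[f"{name}_max"] = max(values)
    return features

# Hypothetical 3x3 crop with two stain channels (values are made up).
mask = [[0, 1, 0],
        [1, 1, 1],
        [0, 1, 0]]
channels = {
    "dna":   [[0, 10, 0], [10, 30, 10], [0, 10, 0]],
    "actin": [[5, 5, 5], [5, 5, 5], [5, 5, 5]],
}
cell_features = extract_features(channels, mask)
```

Each stain contributes its own block of features, so adding channels or measurements widens the profile quickly. This is one way the feature count can reach the hundreds or thousands mentioned above.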
GLOSSARY
Artificial intelligence (AI): AI is the science of developing computers that make human-like decisions using data.
Machine learning (ML): ML is a branch of AI focused on how machines learn from data. It automates model building through iterative analytical computations using human-designed algorithms and training data.
Deep learning (DL): DL is a subset of ML that uses deep neural networks (DNN) to perform tasks such as speech and image recognition. DL involves multiple algorithms that interpret data in various ways, forming a DNN that can evaluate large datasets and identify deep patterns.
Gain novel insights early
Insights gained from earlier studies, which inform the design of downstream studies, must be robust and accurate. DL is being used to uncover deep insights from cellular imaging that are not obtainable by traditional image analysis.
For instance, complex image analysis tasks such as three-dimensional segmentation (identifying areas of interest in a 3D image) and quantification (measuring meaningful features in an image) are needed to fully evaluate complex cellular and tissue models such as spheroids and organoids. These tasks are challenging for traditional image analysis. DL methods, built on DNNs, can learn from gathered data to automatically optimize segmentation and classification in these complex models.
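To illustrate what 3D segmentation and quantification mean in practice, the sketch below labels connected bright regions in a small image stack and counts the voxels in each. It is a deliberately simple threshold-and-flood-fill baseline, not a deep learning method. The function name `segment_3d`, the threshold, and the sample stack are hypothetical.

```python
from collections import deque

def segment_3d(volume, threshold):
    """Label connected bright regions in a 3D volume (z, y, x nested lists).

    Returns (labels, volumes): labels has the same shape as the input with 0 for
    background and 1..n for objects; volumes maps each label to its voxel count.
    """
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    labels = [[[0] * width for _ in range(height)] for _ in range(depth)]
    volumes = {}
    # 6-connectivity: voxels are neighbors only if they share a face.
    neighbors = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    next_label = 0
    for z in range(depth):
        for y in range(height):
            for x in range(width):
                if volume[z][y][x] <= threshold or labels[z][y][x]:
                    continue
                # Found an unlabeled bright voxel: flood-fill its whole object.
                next_label += 1
                labels[z][y][x] = next_label
                queue = deque([(z, y, x)])
                count = 0
                while queue:
                    cz, cy, cx = queue.popleft()
                    count += 1
                    for dz, dy, dx in neighbors:
                        nz, ny, nx = cz + dz, cy + dy, cx + dx
                        if (0 <= nz < depth and 0 <= ny < height and 0 <= nx < width
                                and volume[nz][ny][nx] > threshold
                                and not labels[nz][ny][nx]):
                            labels[nz][ny][nx] = next_label
                            queue.append((nz, ny, nx))
                volumes[next_label] = count
    return labels, volumes

# Hypothetical 2-slice stack containing two separate bright objects.
stack = [
    [[9, 9, 0, 0],
     [0, 0, 0, 7]],
    [[9, 0, 0, 0],
     [0, 0, 0, 7]],
]
labels, volumes = segment_3d(stack, threshold=5)
```

A single global threshold like this tends to break down when objects touch, overlap, or vary in brightness with depth, as they often do in spheroids and organoids. That limitation is the motivation for learned segmentation.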
One study explored the application of DL in an image-based profiling framework to identify unexpected phenotypes without relying on prior knowledge of expected phenotypes [2]. The researchers applied the framework to several large datasets of nuclear and mitotic cell morphologies, identifying and segmenting several rare phenotypes that would not have been identified by conventional image analysis or by ML trained on data for expected phenotypes.
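The general idea of novelty detection can be sketched without any deep learning: describe the known population, then flag cells that sit far from it. The example below is a simplified stand-in, not the framework from the cited study. It scores each cell by its root-mean-square z-score against a reference set, and the function names, feature values, and threshold of 3.0 are all assumptions chosen for illustration.

```python
from math import sqrt
from statistics import mean, pstdev

def fit_reference(reference):
    """Per-feature mean and standard deviation of the reference cells."""
    columns = list(zip(*reference))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) or 1.0 for col in columns]  # avoid dividing by zero
    return means, stds

def novelty_score(cell, means, stds):
    """Root-mean-square z-score of one cell against the reference."""
    z = [(v - m) / s for v, m, s in zip(cell, means, stds)]
    return sqrt(sum(x * x for x in z) / len(z))

def flag_novel(cells, reference, threshold=3.0):
    """Return True for each cell whose score exceeds the threshold."""
    means, stds = fit_reference(reference)
    return [novelty_score(cell, means, stds) > threshold for cell in cells]

# Hypothetical two-feature profiles (for example nuclear area, mean intensity).
reference = [[100, 0.50], [110, 0.55], [90, 0.45], [105, 0.50], [95, 0.50]]
candidates = [[102, 0.52], [240, 0.10]]
flags = flag_novel(candidates, reference)
```

Here the first candidate resembles the reference cells and is not flagged, while the second is. No labeled examples of the unusual phenotype are needed, which is the property the passage above highlights.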
Increase productivity
By reducing the time required to conduct assays and data analysis, along with automating certain processes, AI frees up more time for scientists to focus on the science that requires a uniquely human touch.
For example, one study used DL to eliminate the need for traditional feature selection and reduction in image analysis [3]. The researchers developed a multi-scale convolutional neural network (M-CNN), a type of network that can perform feature extraction on images of varying sizes [5], to classify cellular images without the time-consuming steps of loading existing data and manual customization. In one step, the application classified cellular images into phenotypes using the images’ pixel intensity values. The researchers evaluated classification performance on eight benchmark datasets, showing greater classification accuracy than other standard procedures and DNN architectures, while demonstrating the ability to quantitatively describe phenotypes.
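As a rough illustration of the multi-scale idea only, the sketch below applies one small filter to an image at two resolutions and concatenates the pooled responses into a single feature vector. This is a toy, not the M-CNN architecture from the cited study: the kernel is hand-picked rather than learned, there is a single filter and no classifier, and every function name and value is hypothetical.

```python
def downsample(image):
    """Halve resolution by averaging non-overlapping 2x2 blocks."""
    return [
        [(image[r][c] + image[r][c + 1] + image[r + 1][c] + image[r + 1][c + 1]) / 4
         for c in range(0, len(image[0]) - 1, 2)]
        for r in range(0, len(image) - 1, 2)
    ]

def convolve_valid(image, kernel):
    """Slide the kernel over the image (no padding, no kernel flip, as in CNNs)."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
         for c in range(len(image[0]) - kw + 1)]
        for r in range(len(image) - kh + 1)
    ]

def multi_scale_features(image, kernel, n_scales=2):
    """Apply one kernel at several resolutions and concatenate pooled responses."""
    features = []
    current = image
    for _ in range(n_scales):
        response = convolve_valid(current, kernel)
        flat = [max(v, 0.0) for row in response for v in row]  # ReLU
        features.append(max(flat))              # global max pooling
        features.append(sum(flat) / len(flat))  # global average pooling
        current = downsample(current)
    return features

# Hypothetical 4x4 image with a vertical edge; pixel values are made up.
image = [[0, 0, 8, 8]] * 4
edge_kernel = [[-1, 1]]
features = multi_scale_features(image, edge_kernel)
```

Because the pooled outputs have a fixed length regardless of input resolution, features from fine and coarse scales can be combined directly. A trained network would learn many such filters from pixel intensities instead of relying on a hand-written one.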
Future outlook
As drug discovery labs increasingly adopt new ML/DL approaches, they are uncovering more ways these approaches can help, including:
- Improving image quality, analysis time, and computing power
- Using precise data reduction to eliminate non-target data
- Successfully integrating and analyzing terabytes of data
- Focusing on targeting biologically relevant cell types such as primary and stem cells
There are challenges that must be met to continue the growth of AI-enabled innovation in drug discovery. For example, data scientists and organizations must find innovative ways to increase access to, and sharing of, databases to provide the enormous amount of data needed for deep learning. There must be an increase in the number of skilled data scientists and software engineers to design and operate AI-based platforms. Strategic and educational dialogue must continue to help pharmaceutical companies recognize and appreciate the potential of AI in drug development.
Technology providers are using ML/DL to provide advanced data analytics with their imaging software. This is especially important for complex yet more biologically relevant models, such as organoids and spheroids, for which it can be difficult to obtain accurate and thorough data on morphology, texture, and phenotypic classification.
Drug discovery researchers are making strides toward more efficient and timely workflows by pairing high-throughput imaging technology with advanced ML/DL applications for data analysis to enable faster access to better therapies.
References
1. Bray, M.A. et al. 2016. “Cell painting, a high-content image-based assay for morphological profiling using multiplexed fluorescent dyes.” Nat Protoc. 11(9):1757-74. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5223290/
2. Sommer, C. et al. 2017. “A deep learning and novelty detection framework for rapid phenotyping in high content screening.” Molecular Biology of the Cell, Vol. 28, No. 23. https://www.molbiolcell.org/doi/full/10.1091/mbc.e17-05-0333
3. Godinez, W. et al. 2017. “A multi-scale convolutional neural network for phenotyping high-content cellular images.” Bioinformatics, Volume 33, Issue 13, Pages 2010–2019. https://academic.oup.com/bioinformatics/article/33/13/2010/2997285
4. Paul, D. et al. 2020. “Artificial intelligence in drug discovery and development.” Drug Discovery Today, October 21. https://doi.org/10.1016/j.drudis.2020.10.010
5. Alhichri, H. et al. 2018. “Multi-Scale Convolutional Neural Network for Remote Sensing Scene Classification.” IEEE Xplore. https://ieeexplore.ieee.org/document/8500107