Tim Hohm, MBA, PhD, is director of commercial strategy and business development at Optibrium, a company providing software solutions for small molecule design and optimization that accelerate and increase the success rate of drug discovery from early hit to clinical candidate nomination. Here, he discusses some of the main challenges on the operations and process side of biotechnology laboratories, as well as some key solutions. Hohm obtained his PhD in computational biology from ETH Zurich and holds an MBA from Copenhagen Business School. He previously held positions in academia and large pharma, and joined Optibrium in 2020 from Novo Nordisk.
Q: What are some of the key challenges when it comes to the operations/process side of the biotech lab?
A: The ability to generate new data and efficiently process, interpret, and share large volumes of complex information across an organization is a key enabler for drug discovery and biotech in general. Research has shifted from being process-driven to data-driven. Biotech companies that are maximizing the knowledge gained from each discovery cycle and effectively informing decision-making for the next cycle have the highest chance of project success. Computational tools, particularly AI [artificial intelligence], increasingly support these insights and impact decision-making. However, challenges relating to the nature of the data being gathered in drug discovery remain, thus limiting the potential of many computational approaches, including AI methods.
Generating data is hugely resource intensive. While big pharma possesses much greater internal resources with dedicated departments for screening, assaying, and data collection, biotech labs are more resource-constrained, often outsourcing experimental work, and need to be more selective in data gathering to be cost-efficient. This inevitably leads to incomplete (or sparse) data being collected. Compounding the issue, early biotech companies will typically only have access to data from a few projects, or even only one, whereas pharma has access to large corporate data repositories that can serve as training sets for model development.
Beyond the problem of the quantity of data being gathered, early discovery data is typically noisy, with measurements subject to significant experimental variability and error. The inability to effectively manage and account for this can lead biotechs to waste resources on dead ends or discard promising candidates due to false negative results.
Q: What are some of the solutions to these challenges?
A: AI approaches, particularly deep learning methods, have been transformative for many industries outside of drug discovery. For example, leveraging large, high-quality datasets (big data) has enabled image recognition technology to advance to unprecedented levels. In contrast, datasets available to biotech companies tend to be orders of magnitude smaller, incomplete (compounds are not measured in all relevant assays), and noisy (due to experimental variability and error). AI-based approaches specifically designed to handle challenging discovery data are now starting to show robust results. They enable biotech labs to make more informed decisions on what assays and measurements may add the most value, as well as on what compounds should be prioritized in the synthetic queue. This enables more effective use of resources and increases overall success.
In the drug discovery space, three primary domains of AI integration have begun to dominate the field: a) generative chemistry, which creates new compound ideas; b) property prediction, which learns from existing data; and c) reaction planning, which identifies synthetic routes for optimizing compound production.
For example, unique deep learning imputation methods for compound property prediction go beyond what conventional modeling offers. Traditional quantitative structure-activity relationship (QSAR) approaches rely on compound fingerprints and descriptors to establish a predictive model, but they are often not applicable to critical downstream endpoints, or those endpoints prove intractable. Deep learning imputation overcomes these limitations by incorporating all available data across endpoints, in addition to the traditional descriptors. It then uses the relationships between endpoints to make more accurate predictions, including those for previously intractable endpoints. Furthermore, it enables translational predictions, using early-stage assay data to inform downstream assays, and it highlights the additional measurements that would add the most information while flagging potential outliers and false negatives.
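To make the idea of borrowing information across endpoints concrete, here is a minimal sketch in Python. It is not the deep learning imputation method described above. It swaps in a much simpler stand-in, iterative low-rank matrix completion, purely to show how correlations between endpoints can fill gaps in a sparse compound-by-endpoint table. The function name impute_low_rank, its parameters, and the toy numbers are all hypothetical, and the toy data is deliberately noise-free, unlike real discovery data.

```python
import numpy as np


def impute_low_rank(data, rank=1, n_iter=300):
    """Fill NaN entries of a compounds x endpoints matrix.

    Hypothetical helper for illustration only: repeatedly approximates the
    matrix with a truncated SVD and copies the approximation into the
    missing cells, leaving measured values untouched.
    """
    data = np.asarray(data, dtype=float)
    missing = np.isnan(data)
    # Initial guess: the mean of the observed values in each endpoint column.
    col_means = np.nanmean(data, axis=0)
    filled = np.where(missing, col_means, data)
    for _ in range(n_iter):
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        approx = (u[:, :rank] * s[:rank]) @ vt[:rank]
        # Only the missing cells are updated; measurements are kept as-is.
        filled = np.where(missing, approx, data)
    return filled


# Toy assumption: three endpoints that all scale with one hidden compound
# property, so the complete table has rank 1.
compound_effect = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
endpoint_scale = np.array([1.0, 0.5, 0.8])
truth = np.outer(compound_effect, endpoint_scale)

# Sparse data: some compounds were never measured in some assays.
observed = truth.copy()
observed[0, 2] = np.nan
observed[4, 2] = np.nan
observed[2, 1] = np.nan

completed = impute_low_rank(observed, rank=1)
print(np.round(completed, 3))
```

In this toy setting the missing cells can be recovered because the endpoints are perfectly correlated. With real, noisy data the estimates would carry uncertainty, which is one reason methods designed specifically for discovery data also report confidence in their predictions.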
Q: What should lab managers look out for when choosing the right solution for their lab (whether that means new software or other technology, or the right people to collaborate with) to ensure they make the right choice?
A: When identifying suitable technologies, several dimensions should be considered: Does one have appropriate amounts of data of suitable quality? Is a technology proven and ready for deployment? Finally, how will the technology be accessible to users and support existing workflows?
The question of how a new technology integrates with and supports drug discovery workflows is critical. The technology must be readily accessible to a broad user base, including the experimentalists. Many currently deployed AI applications are still in a concept phase, requiring AI specialists to install, operate, and maintain them. For many biotech companies, it is impracticable to have a dedicated AI team, making it critical that any adopted technology is ready for deployment and comes with maintenance services. That way, the cost of ownership stays manageable while the company still gains access to leading technology capabilities.
Furthermore, technologies have to be intuitive and complement the users' capabilities, experience, and know-how to facilitate adoption. This idea of combining a scientist's expertise with an AI system is known as “augmented intelligence,” and it has already been shown to outperform either the expert or the AI alone. It engenders trust because it involves users and their experience instead of relegating them to the role of bystander.
Q: What are some of the major trends you're seeing with your biotech customers now? How do you think those trends will impact biotech labs going forward in terms of the operations/process side of things?
A: In previous years, seamlessly integrated discovery infrastructure was mainly a domain of large biotech and pharma companies. Technology platforms have since matured, and today, even small biotech companies and startups are increasingly gaining access to fully integrated solutions that support capturing information in a data-centered research paradigm.
Furthermore, computational approaches have become more prevalent, covering a more comprehensive range of subject areas, and are increasingly becoming part of the day-to-day of experimentalists in the biotech lab.
This requires focusing on solutions with open interfaces that allow technologies from multiple vendors to be connected with one another and with in-house tools. If done correctly, this integration will add value to the data stream and become a critical resource for streamlining discovery work, improving decision-making, and increasing productivity.