Lab Manager | Run Your Lab Like a Business


Drug Discovery and Development Go Virtual

In silico approaches streamline drug discovery

Angelo DePalma, PhD

Angelo DePalma is a freelance writer living in Newton, New Jersey.

Drug companies spend about 15 years and $800 million to bring a promising chemical compound to pharmacy shelves. Drug sponsors have adopted numerous technologies and strategies with the goal of compressing this timeframe and reducing costs. The industry’s ominous-sounding “fail early, fail fast” philosophy seeks to identify likely approval failures early in their development, at a stage where relatively few resources have been devoted to a project. Hence the concentration of strategies targeted to the discovery phase, particularly compound selection, pharmacokinetics, and ADME (absorption, distribution, metabolism, and excretion).

Before computational methods were widespread, discovery scientists applied Lipinski’s rules, which inform on a compound’s rough “druggability” before the structure is even synthesized. More precise in silico computational methods have emerged that take Lipinski’s rules to the next level, to predict a drug compound’s activity toward specific biological targets as well as its pharmacokinetics (PK) or movement through the body as measured by the individual components of ADME.
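As an illustration of how simple such a pre-synthesis filter can be, here is a minimal Python sketch of Lipinski's rule of five. It assumes the four descriptors have already been computed by some cheminformatics tool; the `Descriptors` container, the function names, and the example values are hypothetical and chosen only for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container for this sketch)."""
    mol_weight: float       # molecular weight in daltons
    log_p: float            # octanol-water partition coefficient (log P)
    h_bond_donors: int      # count of hydrogen-bond donors
    h_bond_acceptors: int   # count of hydrogen-bond acceptors


def lipinski_violations(d: Descriptors) -> list[str]:
    """Return a description of each rule-of-five criterion the compound breaks."""
    violations = []
    if d.mol_weight > 500:
        violations.append("molecular weight > 500 Da")
    if d.log_p > 5:
        violations.append("log P > 5")
    if d.h_bond_donors > 5:
        violations.append("more than 5 hydrogen-bond donors")
    if d.h_bond_acceptors > 10:
        violations.append("more than 10 hydrogen-bond acceptors")
    return violations


def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """Lipinski's rule tolerates at most one violation."""
    return len(lipinski_violations(d)) <= max_violations


# Illustrative descriptor values only; they are not measurements.
small_polar = Descriptors(mol_weight=180.2, log_p=1.2, h_bond_donors=1, h_bond_acceptors=4)
large_greasy = Descriptors(mol_weight=720.0, log_p=6.3, h_bond_donors=2, h_bond_acceptors=12)

print(passes_rule_of_five(small_polar))    # no criteria broken
print(lipinski_violations(large_greasy))   # lists the broken criteria
```

The rule is a coarse screen for oral drug-likeness, which is why the in silico methods described here go further and model activity and ADME directly.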



Chemical similarity has been the basis for much of drug discovery. Good examples are benzodiazepine tranquilizers and the cholesterol-lowering statin drugs.

“Behind the significance of chemical similarity is the notion that similar chemical structures exhibit similar biological activity,” notes David Malatinszky, PhD, applications scientist at ChemAxon. “Structural similarity of two molecules may be based on 2-D or 3-D information translated into molecular descriptors, such as chemical or pharmacophore fingerprints. The actual similarity is calculated from these fingerprints and may be expressed by various metrics.”
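One commonly used metric of this kind is the Tanimoto coefficient, shown below alongside the Dice coefficient in a minimal Python sketch. It assumes each fingerprint is represented as the set of its "on" bit positions; the bit positions in the example are invented, and the article does not say which metrics any particular product uses.

```python
def tanimoto(fp_a: frozenset[int], fp_b: frozenset[int]) -> float:
    """Shared on-bits divided by the on-bits present in either fingerprint."""
    shared = len(fp_a & fp_b)
    total = len(fp_a) + len(fp_b) - shared
    # Convention assumed here: two empty fingerprints have similarity 0.
    return shared / total if total else 0.0


def dice(fp_a: frozenset[int], fp_b: frozenset[int]) -> float:
    """Twice the shared on-bits divided by the summed on-bit counts."""
    size = len(fp_a) + len(fp_b)
    return 2 * len(fp_a & fp_b) / size if size else 0.0


# Invented fingerprints: each integer is the index of a bit that is set.
molecule_1 = frozenset({1, 4, 7, 9})
molecule_2 = frozenset({1, 4, 8, 9, 12})

print(tanimoto(molecule_1, molecule_2))  # 3 shared bits out of 6 distinct bits
print(dice(molecule_1, molecule_2))      # 2 * 3 shared bits over 9 set bits
```

Both metrics range from 0 (no bits in common) to 1 (identical fingerprints), so the choice between them mainly changes how strongly partial overlap is rewarded.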

In March 2018, Enamine, which has assembled the REAL database of 337 million unique small-molecular-weight compounds accessible through one-step synthesis, jointly launched an online resource for exploring the REAL collection. The resource provides drug discovery groups with querying capability to the chemical space within REAL via ChemAxon’s MadFast search tool.

Combining chemical similarity with immense structure libraries is a natural, Malatinszky says. “In similarity-based virtual screening it may be important to compare two large data sets pairwise, to rank compounds in a virtual library based on a molecule set of confirmed biological activity. Such comparison is usually visualized as a heat map and may help identify patterns of ‘hot groups’ in the data set. Similarly, clustering may help classify a library of compounds into groups based on various descriptors (similarity-based or structure-based) and clustering algorithms.”
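The pairwise comparison Malatinszky describes can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how MadFast works: fingerprints are sets of on-bit indices, similarity is the Tanimoto coefficient, each library compound is scored by its best match against the active set, and all compound names and bit values are invented.

```python
def tanimoto(fp_a: set[int], fp_b: set[int]) -> float:
    """Tanimoto similarity between two fingerprints stored as sets of on-bits."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def similarity_matrix(library: dict[str, set[int]],
                      actives: dict[str, set[int]]) -> list[list[float]]:
    """Rows are library compounds, columns are actives: the data behind a heat map."""
    return [[tanimoto(lib_fp, act_fp) for act_fp in actives.values()]
            for lib_fp in library.values()]


def rank_library(library: dict[str, set[int]],
                 actives: dict[str, set[int]]) -> list[tuple[str, float]]:
    """Rank library compounds by their highest similarity to any active (actives must be non-empty)."""
    scores = {name: max(tanimoto(fp, act_fp) for act_fp in actives.values())
              for name, fp in library.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented fingerprints for compounds with confirmed activity and for a virtual library.
actives = {"active_1": {1, 2, 3, 4}, "active_2": {10, 11, 12}}
library = {"lib_A": {1, 2, 3, 5}, "lib_B": {10, 11, 12}, "lib_C": {20, 21}}

matrix = similarity_matrix(library, actives)
ranked = rank_library(library, actives)
for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

A brute-force loop like this scales with the product of the two set sizes, which is why libraries with hundreds of millions of entries call for specialized search algorithms.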

A typical use case for searching the Enamine REAL database with MadFast might occur during live drug design, where optimal synthesis routes are selected for candidate molecules. ChemAxon’s idea management platform, Marvin Live, incorporates predicted chemical properties, drug likeness calculations, and structure-based information from a variety of knowledge bases. “The latter, such as access of large databases like Enamine’s REAL, containing more than 300 million compounds, is time-consuming for a standard similarity search. Thus, advanced search algorithms are required, capable of delivering a result set in milliseconds,” Malatinszky says.


Regulatory agencies require the completion of ADME studies before they will even consider an investigational new drug application, which allows human testing. Traditional ADME studies are done in animals, which is expensive and time-consuming. In vitro assays provide some level of predictability, such as a compound’s partitioning into oil and water. More recently, in silico predictions of PK and its components have empowered drug developers with very early-stage insights into the interactions between putative drug compounds and patients.

In February 2018, Simulations Plus upgraded its PKPlus pharmacokinetics software package to version 2.0. Enhancements in this version include nonparametric superposition (NPS) and compartmental multi-dose simulation.

In pharmacokinetics, superposition means that a drug’s concentration at any given time equals the sum of concentration contributions from all previous doses. Superposition assumes that each dose is an independent event, and that the clearance rate is constant or linear. “Compartmental” refers to regions of the body where a drug is relatively concentrated or depleted.

Because clearance is proportional to concentration, after a certain number of doses the concentration-time curve reaches a steady state with equal peaks and troughs. This method therefore calculates how many doses will be required to reach a steady state, as well as steady-state peak and trough concentration levels.
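The superposition principle can be illustrated with a short Python sketch. It is not PKPlus code. For simplicity it assumes the single-dose curve follows a one-compartment intravenous bolus model, C(t) = C0·exp(−k·t), whereas a nonparametric method would work from measured single-dose data. The parameter values (C0 = 10 mg/L, k = 0.1 per hour, dosing every 8 hours) are invented.

```python
import math
from typing import Callable


def single_dose_conc(t: float, c0: float = 10.0, k_el: float = 0.1) -> float:
    """Assumed single-dose curve: one-compartment IV bolus, zero before the dose."""
    return c0 * math.exp(-k_el * t) if t >= 0 else 0.0


def superposed_conc(t: float, tau: float, n_doses: int,
                    curve: Callable[[float], float] = single_dose_conc) -> float:
    """Concentration at time t as the sum of contributions from doses given every tau hours."""
    return sum(curve(t - i * tau) for i in range(n_doses))


def steady_state_peak_trough(c0: float, k_el: float, tau: float) -> tuple[float, float]:
    """Closed-form steady-state peak and trough for the assumed exponential curve."""
    accumulation = 1.0 / (1.0 - math.exp(-k_el * tau))
    peak = c0 * accumulation
    return peak, peak * math.exp(-k_el * tau)


def doses_to_fraction_of_steady_state(k_el: float, tau: float, fraction: float = 0.95) -> int:
    """Smallest number of doses whose peak reaches the given fraction of the steady-state peak."""
    return math.ceil(-math.log(1.0 - fraction) / (k_el * tau))


peak, trough = steady_state_peak_trough(c0=10.0, k_el=0.1, tau=8.0)
n_doses = doses_to_fraction_of_steady_state(k_el=0.1, tau=8.0)
print(f"steady-state peak {peak:.2f} mg/L, trough {trough:.2f} mg/L")
print(f"doses needed to reach 95% of steady state: {n_doses}")
```

Because `superposed_conc` accepts any single-dose function, the same summation applies to a curve interpolated from observed data, provided the linearity assumptions described above hold.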

NPS and compartmentalism are not impossible to predict, but their elucidation requires tedious experimentation or automation. “PKPlus automates the process for predicting concentration time for multiple doses from single-dose data using NPS, and also fits multiple-dose data with a parametric model to accommodate nonlinear pharmacokinetics,” explains Simulations Plus CEO Walt Woltosz. “PKPlus makes this a relatively simple point-and-click operation, including the generating of reports in various formats.”

While PKPlus facilitates rapid NPS and both compartmental and noncompartmental analysis, the models provide little, if any, mechanistic insight into where the drug goes within the body, Woltosz says, whereas physiologically based pharmacokinetic (PBPK) models do provide such insight. “When the pharmacokinetics for a drug are straightforward, simple compartmental models may suffice. But when modeling the effects of enzymes and/or transporters in different tissues is required to predict pharmacokinetics for different doses and formulations, PBPK models are required,” he explains.

As the scientific knowledge base of high-quality data for chemical properties, pharmacokinetics, and pharmacodynamics grows, companies like Simulations Plus incorporate more sophisticated models into software tools. While designing drugs from scratch in silico remains a distant goal, today’s software helps discovery scientists eliminate large numbers of new structures from consideration without ever synthesizing them, Woltosz says. “That means eliminating losers without spending the money to make and test those molecules. We can also show chemists the sensitivities of various properties to specific structural motifs in a molecule-specific manner. In other words, these sensitivities are not global but are specific to each structure. Thus, adding a carbonyl to a certain point on a structure could result in different effects for different molecules.”


Certara, which specializes in model-based drug development software, recently launched a consortium for studying quantitative systems pharmacology, one goal of which was to advance physiologically based pharmacokinetics. Drug developers increasingly use PBPK simulation to evaluate the effects of intrinsic and extrinsic factors on drug exposure in patients.

According to Suzanne Minton, PhD, Certara’s scientific communications manager, PBPK is not merely a refinement of PK modeling, but is instead a distinct biomathematical approach in model-informed drug development. “PBPK models include systems data, trial design, and drug data.”

Drug data refer to the molecular and physicochemical properties of the investigational drug. Systems data refer to the demographic, physiological, and environmental characteristics of the population being studied. Trial design information refers to the dose, route of administration, and co-administered drugs.

“These components are modular in the Simcyp Simulator PBPK platform. So if you have the information for a certain drug and trial design in one population, it’s easy to exchange one population for another. For instance, if you have developed a PBPK model looking at the effect of oseltamivir in healthy Northern European Caucasian adults, it’s easy to swap out the population file to determine whether there are PK differences in a virtual Japanese population.”
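The modular idea can be sketched in Python with three interchangeable inputs. This is not the Simcyp Simulator's interface, and it is far simpler than a real PBPK model, which solves differential equations across tissue compartments. Here exposure is reduced to the textbook relationship AUC = F × dose / clearance, and every class name, field, and number is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrugData:
    """Properties of the investigational drug (hypothetical fields)."""
    name: str
    clearance_l_per_h_per_kg: float
    oral_bioavailability: float


@dataclass(frozen=True)
class SystemsData:
    """Characteristics of the virtual population (hypothetical fields)."""
    label: str
    mean_body_weight_kg: float
    relative_clearance: float  # clearance relative to a reference population (1.0)


@dataclass(frozen=True)
class TrialDesign:
    """Dose and route of administration."""
    dose_mg: float
    route: str  # "iv" or "oral" in this sketch


def predicted_auc(drug: DrugData, population: SystemsData, trial: TrialDesign) -> float:
    """Toy exposure estimate in mg*h/L: AUC = F * dose / clearance."""
    clearance_l_per_h = (drug.clearance_l_per_h_per_kg
                         * population.mean_body_weight_kg
                         * population.relative_clearance)
    fraction_available = 1.0 if trial.route == "iv" else drug.oral_bioavailability
    return fraction_available * trial.dose_mg / clearance_l_per_h


# Invented inputs: one drug and one trial design, evaluated in two populations.
drug = DrugData("example_drug", clearance_l_per_h_per_kg=0.2, oral_bioavailability=0.8)
trial = TrialDesign(dose_mg=100.0, route="oral")
population_a = SystemsData("population_a", mean_body_weight_kg=80.0, relative_clearance=1.0)
population_b = SystemsData("population_b", mean_body_weight_kg=60.0, relative_clearance=1.0)

auc_a = predicted_auc(drug, population_a, trial)
auc_b = predicted_auc(drug, population_b, trial)
print(f"{population_a.label}: {auc_a:.2f} mg*h/L")
print(f"{population_b.label}: {auc_b:.2f} mg*h/L")
```

Swapping `population_a` for `population_b` changes the prediction without touching the drug or trial inputs, which is the kind of exchange Minton describes.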

PBPK modeling is appropriate at all stages of drug development, but probably most applicable to early clinical development. For example, if a target patient population is taking medications likely to cause drug-drug interactions, a PBPK model will estimate the likelihood this will occur with a test drug.

Certara is also active in early-stage computational discovery tools. In early 2018, it introduced version 10 of its D360 discovery informatics platform, a component of which is applying structure-activity relationships (SARs) to discovery.

Like the nonquantitative Lipinski’s rules, SARs have long been used to predict a drug’s activity based on its chemical and physical properties. Yet, says Minton, discovery scientists have struggled, even with computational assistance, to explicitly connect SAR data with pharmacology.

“Even if they use virtual compound screening techniques rather than wet lab drug screening, it’s still a nightmare to manage all that data and see the critical relationships between structure and activity. Scientific informatics solutions like D360 allow scientists to create queries to find the compounds that have the desired chemical and biological properties,” Minton explains.


At some point during the evolution of groundbreaking drug discovery tools, one is justified in asking whether these tools have helped and, if so, how.

Figures on time and cost for drug development are probably not very reliable, nor can one rely solely on approvals of New Molecular Entities (NMEs) as a guide. The FDA approved 46 NMEs in 2017 compared with 27 in 2013. But this analysis is confounded by the fact that just three biologicals were included in 2013, compared with 12 in 2017. And as noted by Frank S. David, MD, PhD, in a 2017 Forbes article, “With such relatively small numbers, there will always be noise (two more NMEs this year, three fewer next year, and an occasional year with many more or many fewer), but I don’t think the short-term comparisons mean much.”

And, given the relentless surge of biopharmaceuticals, those comparisons are likely to mean even less in the future.

In defense of drug discovery in general, it is unmistakable that discovery has become more difficult over the decades. At one time, natural products dominated our pharmacopeias, and for decades there seemed to be an endless supply of natural chemical structures that could be used directly or with modest chemical modification. Those days are gone, or perhaps the natural product route has simply been put on the back burner.

With the low-hanging fruit harvested long ago and regulators generally raising the bar for approval of me-too drugs, it is a remarkable accomplishment that discovery (as quantified by NMEs) has kept pace over the years, no doubt facilitated by the emergence of in silico tools.

Thanks to powers-of-ten advances in computing power, in silico drug discovery and development are miles ahead of where they were even five years ago. Algorithms are far more powerful, and the scientific knowledge base has grown.

Modern drug discovery demands access to and assessment of biological, chemical, logistical, and computational data from a wide variety of sources, the combination of which is beyond the capabilities of discovery scientists. “Often, they must rely on their IT staff to develop the infrastructure to access, integrate, analyze, and visualize scientific data to make critical decisions,” says Minton.

“I see no end in sight for simulation and modeling—it is clearly here to stay, and I predict that one day the cost and time to bring a new drug to market for a particular indication will be a fraction of what it is today,” adds Woltosz.