
Computer-Aided Synthesis Reduces Complexity and Accelerates Novel Chemical Discoveries

Modern synthesis planning software is best used as a decision support tool that reduces the risk of failure and the cost of research


We’ve come a long way from the earliest days of synthesis planning. Though ambitions of alchemical transmutation of base elements such as lead into precious metals such as gold have long since passed [1], the modern chemist still labors over a different kind of bubbling cauldron: one of cheminformatics. With hundreds of millions of known compounds and documented reactions, today’s synthesis planning relies on connecting the wisdom and practical expertise of an experienced organic chemist with the automation and advanced algorithms found in modern retrosynthesis software.

Chemical Production—Extraction to synthesis

The labor-intensive extraction of dyes from natural sources, such as vermilion from the insect Kermes vermilio [2] or Tyrian purple (6,6′-dibromoindigo) from predatory rock snails [3], accelerated the demand for novel approaches to the production of synthetic dyes in the 1800s [4]. In the early twentieth century, the development of complex medicinal compounds through multi-step organic synthesis led to a formalized process of “reverse engineering” such targets, paving the way for modern synthesis planning using what is now referred to as “retrosynthetic analysis” [5].

The benefits of synthetic planning were obvious. However, as with chemical extraction, organic synthesis is not without its own inherent risks. While a carefully planned synthesis centralized the production of a target at scale, a new set of challenges remained that threatened any successful operation. Scale-up was, and still is, easier planned than executed. A chemical reaction perceived to be “clean” at a small scale (generating undetectable amounts of side-products) could pose tremendous challenges when scaled up [6]. Planned starting materials and reagents that were later found to threaten the environment may have needed to be replaced due to ever-changing regulatory constraints. Supply chain disruptions due to natural disasters, geopolitical events, and other causes were, and remain, a significant threat to production [7].

Retrosynthetic analysis and mitigating risk

Addressing the many intrinsic and external challenges to chemical synthesis has led, in modern times, to a highly complex system of regulations involving numerous academic and governmental organizations. While the simplicity and elegance of E.J. Corey’s retrosynthetic analysis allow it to remain a pillar for the student of organic chemistry, it has been supplemented in the last few decades by powerful computer software with advanced algorithms to support the synthetic chemist. This new age of enhanced computer-aided retrosynthesis is still in its infancy and, like all new tools, has been both lauded and criticized. While the potential for these tools to further mitigate risks to industrial synthesis has already become clear, it would be an oversight to omit the impact on human capital, ingenuity, or resources when discussing risk in any economic endeavor, and nothing evokes this more strongly than tools claiming to have artificial intelligence.

Nevertheless, a more powerful calculator has yet to fully replace the mathematician. As with all tools, there is an art to being productive with each one, and art, in the sense of creativity, is a domain that will forever belong to human intellect. Moreover, we must remain cognizant of the specific use cases that have been successfully managed with automation and artificial intelligence. Many mundane physical tasks, once grudgingly performed by hand, have been taken over by machines, freeing human operators to direct their energy and creativity towards decision-making and other project-related goals.

Improving synthesis outcomes

It is with this in mind that we review several recent publications featuring the use of SYNTHIA™ Retrosynthesis Software. Perhaps the most obvious benefit of computer-aided synthetic planning is the efficient identification of a low-cost yet robust synthetic pathway. Chemistry has long drawn on culinary metaphors, referring to a chemical synthesis procedure as a “recipe.” In this manner, one might liken a computer-aided retrosynthesis plan to a catering guide for a successful 10-course meal. As demonstrated by Klucznik et al., such a “buffet” of synthetic targets could be readily produced in a chemical laboratory after using SYNTHIA™ Retrosynthesis Software (formerly Chematica) to devise efficient pathways [8].

"The Chematica program was used to autonomously design synthetic pathways to eight structurally diverse targets, including seven commercially valuable bioactive substances and one natural product. All of these computer-planned routes were successfully executed in the laboratory and offer significant yield improvements and cost savings over previous approaches, provide alternatives to patented routes, or produce targets that were not synthesized previously." [8]

The utility of a computer-aided synthesis planning tool is made clear by the work of Klucznik et al., which demonstrates more robust route planning with better yields and reduced costs, steering around patents and generating valuable intellectual property. What may be less clear is whether this success is “cherry-picked” or where the limits of the tool’s scope lie. However, this is an easy criticism to level at any tool, the boundaries of which may only be found through the skillful creativity of the user. One such example of creativity was demonstrated by Gajewska et al., who explored the landscape of tactical combinations (TCs) in organic synthesis and developed a computer-aided algorithm that revealed 4.85 million combinations within 46,000 reaction classes [9]. The authors assert that tactical combinations, although difficult to identify and with only 500 previously catalogued, are uniquely useful in the synthetic planning of complex organic targets, opening up possibilities for pronounced structural simplification in subsequent, downstream steps [9].

Though such creative research projects may seem to belong exclusively to the field of cheminformatics, their potential for identifying and capitalizing on new intellectual property is clear, surely extending to drug discovery, process chemistry, and beyond. Computer-aided synthesis planning promises to reduce complexity in this space and accelerate novel chemical discoveries. Molga et al. showed that navigating the whitespace around a synthetic target, to avoid routes with patents pending, was more easily accomplished with SYNTHIA™ Retrosynthesis Software [10].
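
One way to picture this kind of patent avoidance is as a filter over candidate routes: the bonds formed in a patented key step are marked as bonds to preserve, and any route that disconnects one of them is rejected. The Python sketch below is only an illustration under that assumption. The names Route, bond, and routes_preserving, the atom numbering, and the candidate routes are all hypothetical; none of this reflects the actual data model or interface of SYNTHIA™ Retrosynthesis Software.

```python
from typing import FrozenSet, List, NamedTuple, Set

# A bond is an unordered pair of atom indices in the (atom-mapped) target.
Bond = FrozenSet[int]


def bond(a: int, b: int) -> Bond:
    """Build an order-independent bond key, so bond(2, 3) == bond(3, 2)."""
    return frozenset((a, b))


class Route(NamedTuple):
    """Hypothetical candidate route: a name plus the target bonds it disconnects."""
    name: str
    disconnections: List[Bond]


def routes_preserving(routes: List[Route], preserved: Set[Bond]) -> List[Route]:
    """Keep only the routes that never disconnect a preserved bond."""
    return [
        route
        for route in routes
        if not any(cut in preserved for cut in route.disconnections)
    ]


# Assumed example: the patented synthesis builds bonds 2-3 and 4-5 in its key steps.
patented_key_bonds = {bond(2, 3), bond(4, 5)}

candidates = [
    Route("A", [bond(2, 3), bond(5, 6)]),  # reuses a patented disconnection
    Route("B", [bond(1, 2), bond(5, 6)]),  # touches no preserved bond
    Route("C", [bond(5, 4)]),              # same bond as 4-5, so also rejected
]

allowed = routes_preserving(candidates, patented_key_bonds)
print([route.name for route in allowed])
```

In this toy example only route B survives the filter. A real planner would apply such a constraint during the search itself rather than after it, and would work on atom-mapped reactions rather than hand-numbered bonds.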

"By keeping track of lists of specific bonds one wishes to preserve, a computer program is able to identify the key disconnections used in the patented syntheses and design synthetic routes that circumvent these approaches." [10]

The authors note the complexity of the syntheses of blockbuster drugs, which can be protected by tens to hundreds of patents claiming hundreds of synthetic steps and altogether forming quite complex reaction networks [10]. Moreover, they surveyed the state of the art across the relevant and available software tools, which allowed them to draw a stark comparison in efficiency and ease of use:

"Reactions from patented syntheses are easily available by querying repositories such as Reaxys or SciFinder; however, they come as individual steps rather than complete synthetic plans, and the atoms in them are not numbered. Therefore, our first step is to assign atom mappings to each of these reactions. We do so by using Chematica’s SMARTS reaction templates and our in-house atom-mapping codes (commercial mappers in ChemDraw or Marvin can also be used) to match both substrates and products and also unambiguously assign the reaction type." [10]

A new age in computer-aided synthesis planning

Navigating the murky waters of intellectual property is a complex problem. The savvy researcher may need to use numerous tools to find an optimal path, especially since many chemical patents far exceed 100 pages of complex language designed to protect seemingly all potential uses of a particular chemical entity. Various trades have long understood that similar tools complement one another. It is no wonder that research is being conducted to more precisely define the scope and limits of each synthesis planning tool.

If the ocean of intellectual property is murky, then supply chain disruptions are like icebergs. There is tremendous value in tactical agility around such unfortunate and often unforeseen events. It is therefore crucial to build contingencies into a synthesis plan from the very beginning. Moreover, multi-route synthesis planning can dramatically benefit the development of quick-to-market drugs required to combat a future pandemic. Such an approach was published by Szymkuć et al. in early 2020, a few months after the worldwide outbreak of SARS-CoV-2, the virus that causes COVID-19. The team used SYNTHIA™ Retrosynthesis Software to identify multiple viable routes to synthesize two potential COVID-19 therapeutics [11].

"A computer program for retrosynthetic planning helps develop multiple 'synthetic contingency' plans for hydroxychloroquine and also routes leading to remdesivir, both promising but yet unproven medications against COVID-19. These plans are designed to navigate, as much as possible, around known and patented routes and to commence from inexpensive and diverse starting materials, so as to ensure supply in case of anticipated market shortages of commonly used substrates." [11]

The authors acknowledge that such a strategy is merely one example of reducing the risk of loss of life from a global health crisis. They further note that the "development of similar contingency syntheses is advocated for other already-approved medications, in case such medications become urgently needed in mass quantities to face other public-health emergencies" [11]. Climate scientists largely agree that the frequency of pandemics is likely to increase, because microorganisms can diversify faster than other organisms and climate stressors such as food scarcity will affect population dynamics, creating more opportunities for interspecies interactions and the transfer of infectious disease [12].
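
The "synthetic contingency" idea that Szymkuć et al. describe can be pictured as choosing several routes that are inexpensive, avoid patented steps, and do not share starting materials, so that a shortage of one substrate cannot block every plan. The Python sketch below illustrates that selection with a simple greedy rule. It is a simplification rather than the published method: the Route fields, the cost figures, the starting-material labels, and the hard exclusion of patented steps (the authors avoid them only "as much as possible") are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class Route:
    """Hypothetical summary of one computer-proposed route."""
    name: str
    cost: float                       # assumed cost estimate per gram of target
    starting_materials: FrozenSet[str]
    uses_patented_step: bool


def contingency_plans(routes: List[Route], max_plans: int) -> List[Route]:
    """Greedily pick cheap, patent-free routes with disjoint starting materials."""
    chosen: List[Route] = []
    used_materials: Set[str] = set()
    for route in sorted(routes, key=lambda r: r.cost):
        if len(chosen) == max_plans:
            break
        if route.uses_patented_step:
            continue  # simplification: treat patent avoidance as a hard rule
        if route.starting_materials & used_materials:
            continue  # shares a substrate with a plan that is already chosen
        chosen.append(route)
        used_materials |= route.starting_materials
    return chosen


# Invented example data; the labels do not correspond to real compounds.
routes = [
    Route("R1", 12.0, frozenset({"SM-a", "SM-b"}), False),
    Route("R2", 9.0, frozenset({"SM-a", "SM-c"}), True),
    Route("R3", 15.0, frozenset({"SM-b", "SM-d"}), False),
    Route("R4", 20.0, frozenset({"SM-e", "SM-f"}), False),
]

plans = contingency_plans(routes, max_plans=2)
print([route.name for route in plans])
```

Here the cheapest route, R2, is skipped because it relies on a patented step, and R3 is skipped because it shares SM-b with R1, leaving R1 and R4 as two independent plans. The chemist still decides whether the more expensive fallback is acceptable for the purpose at hand.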

Like all technologies, computer-aided synthesis planning continues to evolve. Today’s synthesis planning software combines human ingenuity with artificial intelligence to quickly propose multi-step synthetic pathways. However, the proposed routes still require careful assessment and bench execution by a skilled synthetic organic chemist. It is at the discretion of the end user to determine which route is the best fit for the purpose at hand. Therefore, the state of the art is best considered a decision support tool that can reduce the risk of failure in the lab and plant, thereby reducing the cost of research and increasing profitability for commercial organizations. The game hasn’t yet changed, but this new age of synthesis planning feels more revolutionary than evolutionary.

Please visit SigmaAldrich.com/SYNTHIA for more information on SYNTHIA™ Retrosynthesis Software.



1.    Greenberg, A. A chemical history tour: Picturing chemistry from alchemy to modern molecular science. John Wiley & Sons, 2000.

2.    Eastaugh, N. Pigment Compendium: A Dictionary of Historical Pigments. Butterworth-Heinemann, 2004.

3.    Kassinger, R. G. Dyes: From Sea Snails to Synthetics. 21st century, 2003.

4.    Hagan, E., Poulin, J. Statistics of the early synthetic dye industry. Herit. Sci. [Online], 2021, 33. https://doi.org/10.1186/s40494-021-00493-5 (accessed 17 Mar 2022).

5.    Corey, E. J.; Jorgensen, William L. Computer-assisted synthetic analysis. Synthetic strategies based on appendages and the use of reconnective transforms. J. Am. Chem. Soc. [Online], 1976, 1, 189–203. https://doi.org/10.1021/ja00417a030 (accessed 17 Mar 2022).

6.    Bisio, A., Scaleup of Chemical Processes. Wiley Interscience, 1985.

7.    Jüttner, U., et al. Supply chain risk management: outlining an agenda for future research. International Journal of Logistics Research and Applications [Online], 2003, 4, 197-210. https://doi.org/10.1080/13675560310001627016 (accessed 17 Mar 2022).

8.    Klucznik, T., et al. Efficient syntheses of diverse, medicinally relevant targets planned by computer and executed in the laboratory. Chem [Online], 2018, 3, 522–532. https://doi.org/10.1016/j.chempr.2018.02.002

9.    Gajewska, E. P., et al. Algorithmic Discovery of Tactical Combinations for Advanced Organic Syntheses. Chem [Online], 2020, 1, pp. 280-293. https://doi.org/10.1016/j.chempr.2019.11.016 (accessed 17 Mar 2022).

10.    Molga, K., et al. Navigating around Patented Routes by Preserving Specific Motifs along Computer-Planned Retrosynthetic Pathways. Chem [Online], 2019, 5. https://doi.org/10.1016/j.chempr.2018.12.004 (accessed 17 Mar 2022).

11.    Szymkuć, S., et al. Computer-generated “synthetic contingency” plans at times of logistics and supply problems: scenarios for hydroxychloroquine and remdesivir. Chem. Sci. [Online], 2020, 11, 6736-6744. https://doi.org/10.1039/D0SC01799J (accessed 17 Mar 2022).

12.    Gallana, M., et al. Climate change and infectious diseases of wildlife: Altered interactions between pathogens, vectors and hosts. Current Zoology [Online], 2013, 3, pp. 427–437. https://doi.org/10.1093/czoolo/59.3.427 (accessed 17 Mar 2022).