
The Challenges and Opportunities of Generative AI in Life Science Research

A majority of scientists expect to use genAI within the next few years, but several hurdles may impact that progress

Written by Holden Galusha
Updated | 3 min read

Scientists are largely interested in generative artificial intelligence (genAI), but identifying suitable use cases for it within research or lab operations is proving to be a challenge. In this interview, Christian Baber, PhD, chief portfolio officer at the Pistoia Alliance, shares what they have learned about how life science researchers are leveraging genAI.

Q. Anecdotally, how are some scientists you've spoken to using genAI in life science research?

A.  From speaking to our members, we've learned that life sciences organizations are using genAI to support their work from both a data science perspective and a scientific research perspective. On the data side, we've seen companies using genAI to conduct natural language searches of their datasets and summarize responses. We've also seen genAI used to annotate datasets, support metadata processing, and generate computer code for a specific purpose. On the research side, organizations are using genAI to generate chemical structures [along with] protein and peptide sequences with desired properties.

Q. What challenges does genAI address for them?

A.  GenAI helps life sciences organizations by democratizing data access, reducing the pressure on a limited number of experts, and freeing them up to work on other tasks. Researchers can now use natural language search to query data and quickly extract insights, removing the delays associated with relying on overburdened data specialists. This accelerates workflows and enhances productivity. Additionally, genAI’s understanding of natural language empowers researchers to augment their creativity, making it particularly valuable for drafting scientific texts and papers.

Elsewhere, the technology’s ability to analyze large datasets is also supporting pattern recognition—particularly for sequencing chemical structures—and hypothesis generation. 

Q. What challenges does genAI create for them?

A.  Our members have reported several challenges resulting from the adoption of genAI into their workflows. In the following [list], we will speak mainly about large language models (LLMs), because many of the use cases we have seen recently involve LLMs. But this does not mean that the applications are limited to this model type.

Some examples of these challenges include:

Hallucinations: Most genAI models have some risk of hallucination, which is inherent in their ability to be generative. This creates risk for companies, especially for use cases that are considered high-risk and could directly impact patients. However, it is worth noting that hallucinations are required for truly generative use cases, such as the production of novel structures in previously unexplored chemical space, as the AI can suggest new hypotheses.

Computer code issues: In general, LLM-created code is not optimally efficient, but it is good for prototyping.
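To illustrate the point above, here is a hypothetical example (not drawn from the interview) of the pattern often seen in quickly generated code: a correct but quadratic first draft next to a linear, hand-optimized equivalent. The function names and sample data are invented for illustration.

```python
def find_duplicates_naive(items):
    """O(n^2): rescans the list for every element -- workable for a
    prototype, but slow on large datasets."""
    dupes = []
    for i, a in enumerate(items):
        if a in items[i + 1:] and a not in dupes:
            dupes.append(a)
    return dupes

def find_duplicates_fast(items):
    """O(n): tracks previously seen values in a set instead."""
    seen, dupes = set(), set()
    for a in items:
        if a in seen:
            dupes.add(a)
        seen.add(a)
    return sorted(dupes)

sample = ["ALA", "GLY", "ALA", "SER", "GLY"]
print(find_duplicates_naive(sample))  # ['ALA', 'GLY']
print(find_duplicates_fast(sample))   # ['ALA', 'GLY']
```

Both versions return the same answer; the difference only matters at scale, which is why prototype-quality generated code is often acceptable as a starting point.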

Lack of prompt engineering expertise: The ability to guide a genAI model to produce desired outputs is far more demanding, and far more critical to output accuracy, than we thought a few years ago. In fact, prompt engineering is so complex that it may become a competitive disadvantage for genAI tools compared with other search methods, such as structured query languages. LLM performance improves when a prompt reads like a story rather than a dry, skeletal query. Prompt engineering is becoming an expert skill in itself, reducing the democratization benefits of LLMs.
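As a hypothetical illustration of the "story" point above, compare a skeletal prompt with a context-rich one for the same task. The task and wording are invented, and no real LLM API is called; the sketch only shows the kind of context (role, audience, inputs, output format) that expert prompts tend to add.

```python
# A dry, skeletal prompt: the model must guess role, audience, and format.
skeletal_prompt = "Summarize assay results."

# A "story" prompt for the same task, carrying the missing context.
story_prompt = (
    "You are a medicinal chemist preparing a weekly update for your team. "
    "Below are raw assay results for three candidate compounds. "
    "Summarize the key trends in two sentences, flag any compound whose "
    "IC50 worsened since last week, and suggest one follow-up experiment."
)

# The richer prompt encodes role, audience, input description, output
# format, and success criteria -- the expertise the passage describes.
print(len(story_prompt) > len(skeletal_prompt))  # True
```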


Output differences caused by varying data structure: Not all data structures can be mined by LLMs with the same level of quality. LLMs operate well on text and have generally been trained on common vocabularies, so the closer a dataset's structure is to natural-language text, the easier it is for an LLM to process.
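One common workaround for the structure gap described above is to "verbalize" tabular records into sentences before handing them to an LLM. The sketch below is a minimal, hypothetical example; the field names and wording are invented for illustration.

```python
def verbalize(record: dict) -> str:
    """Render a tabular assay record as a natural-language sentence,
    the form LLMs generally handle best."""
    return (
        f"Compound {record['compound_id']} was tested against "
        f"{record['target']} and showed an IC50 of {record['ic50_nM']} nM "
        f"({record['outcome']})."
    )

# One row of a hypothetical assay table, flattened into prose.
row = {"compound_id": "CMPD-001", "target": "EGFR",
       "ic50_nM": 12.5, "outcome": "active"}
print(verbalize(row))
# Compound CMPD-001 was tested against EGFR and showed an IC50 of 12.5 nM (active).
```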

Lack of benchmarks: Many genAI use cases in life sciences lack proper benchmarks for validating AI outputs and evaluating the claims of vendors that sell commercial AI tools. This lack of benchmarking makes it challenging to prove the accuracy and reliability of models to regulators, which is becoming increasingly important as new legislation emerges.

Copyright issues: [As of December 2024,] Pistoia Alliance research found 42 percent of life science professionals do not consider copyright before sharing or using third-party information with AI tools, and only 40 percent of organizations report having a dedicated team or expert focused on AI copyright compliance. This gap could lead to infringement risks, fines, and reputational damage. Specialized knowledge on data licensing, text-mining rights, copyright, and IP legislation is becoming a must, but these skills are hard to acquire, and competition for hiring such experts is fierce. 


Q. Are these genAI solutions off-the-shelf or developed in-house? In either case, are they trained or fine-tuned on the lab's data?

A.  Currently, we are seeing companies using a mix of off-the-shelf solutions they have fine-tuned and models they have developed fully in-house. Each has its own pros and cons, and the choice depends on the level of expertise companies have access to and how many resources they are able to invest.

Christian Baber, PhD, has led both R&D and technology divisions for global pharmaceutical organizations focused on informatics and predictive modelling for drug discovery. Baber has also worked with the Pistoia Alliance for more than 15 years, including four years as a board director.

About the Author

Holden Galusha is an associate editor for Lab Manager. He was a freelance contributing writer for Lab Manager before joining the team full-time. Previously, he was the content manager for lab equipment vendor New Life Scientific, Inc., where he wrote articles covering lab instrumentation and processes. Additionally, Holden has an associate of science degree in web/computer programming from Rhodes State College, which informs his content regarding laboratory software, cybersecurity, and other related topics. In 2024, he was one of just three journalists awarded the Young Leaders Scholarship by the American Society of Business Publication Editors. You can reach Holden at holden.galusha@gmail.com.

