Lab Manager | Run Your Lab Like a Business

When Lab-Trained AI Meets the Real World, Mistakes Can Happen

Tissue contamination distracts AI models from making accurate real-world diagnoses

by Northwestern University

Human pathologists are extensively trained to detect when tissue samples from one patient mistakenly end up on another patient’s microscope slides (a problem known as tissue contamination). But such contamination can easily confuse artificial intelligence (AI) models, which are often trained in pristine, simulated environments, reports a new Northwestern Medicine study.

“We train AIs to tell ‘A’ versus ‘B’ in a very clean, artificial environment, but, in real life, the AI will see a variety of materials that it hasn’t trained on. When it does, mistakes can happen,” said corresponding author Dr. Jeffery Goldstein, director of perinatal pathology and an assistant professor of perinatal pathology and autopsy at Northwestern University Feinberg School of Medicine. 

“Our findings serve as a reminder that AI that works incredibly well in the lab may fall on its face in the real world. Patients should continue to expect that a human expert is the final decider on diagnoses made on biopsies and other tissue samples. Pathologists fear—and AI companies hope—that the computers are coming for our jobs. Not yet.”

In the new study, scientists trained three AI models to scan microscope slides of placenta tissue to (1) detect blood vessel damage, (2) estimate gestational age, and (3) classify macroscopic lesions. They trained a fourth AI model to detect prostate cancer in tissues collected from needle biopsies. When the models were ready, the scientists exposed each one to small portions of contaminant tissue (e.g., bladder or blood) that were randomly sampled from other slides. Finally, they tested the AIs’ reactions.

Each of the four AI models paid too much attention to the tissue contamination, which resulted in errors when diagnosing or detecting vessel damage, gestational age, lesions, and prostate cancer, the study found. 

The findings were published earlier this month in the journal Modern Pathology. The study is the first to examine how tissue contamination affects machine-learning models.

Tissue contamination is a well-known problem for pathologists, but it often comes as a surprise to non-pathologist researchers or doctors, the study points out. A pathologist examining 80 to 100 slides per day can expect to see two to three with contaminants, but they’ve been trained to ignore them.

When humans examine tissue on slides, they can only look at a limited field within the microscope, then move to a new field, and so on. After examining the entire sample, they combine all the information they’ve gathered to make a diagnosis. An AI model performs in the same way, but the study found AI was easily misled by contaminants. 

"The AI model has to decide which pieces to pay attention to and which ones not to, and that’s zero-sum,” Goldstein said. “If it’s paying attention to tissue contaminants, then it’s paying less attention to the tissue from the patient that is being examined. For a human, we’d call it a distraction, like a bright, shiny object.”

The AI models gave a high level of attention to contaminants, indicating an inability to encode biological impurities. Practitioners should work to quantify and improve upon this problem, the study authors said.
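
The zero-sum point can be pictured with a small sketch. It assumes the model pools slide patches with softmax attention, a common design, although the article does not specify the architecture the study used. The patch scores below are made-up illustrative numbers, not data from the study.

```python
import math


def attention_weights(scores):
    """Softmax: turn raw per-patch scores into weights that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical relevance scores for four patches of the patient's own tissue.
patient_scores = [1.0, 1.2, 0.8, 1.1]
clean = attention_weights(patient_scores)

# The same slide plus one contaminant patch that the model scores highly
# (an assumed value, chosen only to illustrate the effect).
contaminated = attention_weights(patient_scores + [3.0])

patient_share_clean = sum(clean)
patient_share_contaminated = sum(contaminated[:-1])
contaminant_share = contaminated[-1]

print(f"patient tissue share, clean slide:        {patient_share_clean:.2f}")
print(f"patient tissue share, contaminated slide: {patient_share_contaminated:.2f}")
print(f"contaminant share:                        {contaminant_share:.2f}")
```

Because the weights always sum to 1, any attention the contaminant patch receives is taken directly from the patient's tissue, which is the trade-off Goldstein describes.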

AI researchers in pathology have previously studied other kinds of image artifacts, such as blurriness, debris on the slide, folds, or bubbles, but this is the first time tissue contamination has been examined.

Perinatal pathologists, such as Goldstein, are incredibly rare. In fact, there are only 50 to 100 in the entire U.S., mostly located in big academic centers, Goldstein said. This means only 5% of placentas in the U.S. are examined by human experts. Worldwide, that number is even lower. Embedding this type of expertise into AI models can help pathologists across the country do their jobs better and faster, Goldstein said. 

“I’m actually very excited about how well we were able to build the models and how well they performed before we deliberately broke them for the study,” Goldstein said. “Our results make me confident that AI evaluations of placenta are doable. We ran into a real-world problem, but hitting that speedbump means we’re on the road to better integrating the use of machine learning in pathology.”

- This press release was originally published on the Northwestern University website