Can We Trust Scientific Discoveries Made Using Machine Learning?

The key is creating machine learning systems that question their own predictions, say Rice University experts

Rice University statistician Genevera Allen. Credit: Tommy LaVergne / Rice University

Rice University statistician Genevera Allen says scientists must keep questioning the accuracy and reproducibility of scientific discoveries made by machine learning techniques until researchers develop new computational systems that can critique themselves.

Allen, associate professor of statistics, computer science and electrical and computer engineering at Rice and of pediatrics-neurology at Baylor College of Medicine, will address the topic in both a press briefing and a general session today at the 2019 Annual Meeting of the American Association for the Advancement of Science (AAAS).

"The question is, 'Can we really trust the discoveries that are currently being made using machine learning techniques applied to large data sets?'" Allen said. "The answer in many situations is probably, 'Not without checking,' but work is underway on next-generation machine-learning systems that will assess the uncertainty and reproducibility of their predictions."

Machine learning (ML) is a branch of statistics and computer science concerned with building computational systems that learn from data rather than following explicit instructions. Allen said much attention in the ML field has focused on developing predictive models, which make predictions about future data based on patterns learned from the data they have studied.

"A lot of these techniques are designed to always make a prediction," she said. "They never come back with 'I don't know,' or 'I didn't discover anything,' because they aren't made to."

She said uncorroborated data-driven discoveries from recently published ML studies of cancer data are a good example.

"In precision medicine, it's important to find groups of patients that have genomically similar profiles so you can develop drug therapies that are targeted to the specific genome for their disease," Allen said. "People have applied machine learning to genomic data from clinical cohorts to find groups, or clusters, of patients with similar genomic profiles.

"But there are cases where discoveries aren't reproducible; the clusters discovered in one study are completely different than the clusters found in another," she said. "Why? Because most machine-learning techniques today always say, 'I found a group.' Sometimes, it would be far more useful if they said, 'I think some of these are really grouped together, but I'm uncertain about these others.'"
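
To make the distinction concrete, here is a minimal sketch in Python. It is not Allen's method or any system described at the AAAS meeting; it is a toy illustration under stated assumptions. The data are invented one-dimensional values (two tight groups plus one point halfway between them), the clustering is a bare-bones two-cluster k-means, and the uncertainty estimate is a simple bootstrap vote. All function and variable names are hypothetical. A single clustering run hands every point a hard label, including the in-between point, while the fraction of bootstrap resamples that put each point in the upper cluster shows which assignments are stable and which are not.

```python
import random


def two_means_1d(values, max_iter=100):
    """Lloyd's algorithm (k-means) for k=2 on 1-D data.

    Assumption for this sketch: centroids start at the min and max of the data.
    Returns (low_centroid, high_centroid).
    """
    low, high = min(values), max(values)
    for _ in range(max_iter):
        boundary = (low + high) / 2
        low_group = [v for v in values if v <= boundary]
        high_group = [v for v in values if v > boundary]
        # Keep the previous centroid if a group happens to be empty.
        new_low = sum(low_group) / len(low_group) if low_group else low
        new_high = sum(high_group) / len(high_group) if high_group else high
        if (new_low, new_high) == (low, high):
            break  # assignments no longer change
        low, high = new_low, new_high
    return low, high


def hard_labels(values, centroids):
    """Assign every value to the nearer centroid: 0 = low cluster, 1 = high cluster."""
    low, high = centroids
    boundary = (low + high) / 2
    return [0 if v <= boundary else 1 for v in values]


def bootstrap_vote(values, n_resamples=200, seed=0):
    """Fraction of bootstrap resamples in which each original value lands in the high cluster."""
    rng = random.Random(seed)
    votes = [0] * len(values)
    for _ in range(n_resamples):
        # Resample with replacement, refit, then label the original values.
        resample = rng.choices(values, k=len(values))
        labels = hard_labels(values, two_means_1d(resample))
        votes = [count + label for count, label in zip(votes, labels)]
    return [count / n_resamples for count in votes]


# Invented toy data: two tight groups and one point halfway between them.
group_a = [i * 0.05 for i in range(20)]         # values from 0.00 to 0.95
group_b = [9.05 + i * 0.05 for i in range(20)]  # values from 9.05 to about 10.00
in_between = [5.0]
data = group_a + in_between + group_b

single_run = hard_labels(data, two_means_1d(data))
fractions = bootstrap_vote(data)

print("hard label for the in-between point:", single_run[20])
print("bootstrap fraction for the in-between point:", fractions[20])
print("bootstrap fractions for group A:", sorted(set(fractions[:20])))
print("bootstrap fractions for group B:", sorted(set(fractions[21:])))
```

In this toy setup, a vote fraction of 0 or 1 corresponds to "these are really grouped together," and a fraction somewhere in between flags a point the procedure is unsure about. Resampling checks of this kind are only one possible approach, and they do not by themselves show that the groups are scientifically meaningful.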
