Ask the Expert: Keeping Up With LIMS Upgrades

David Dooling, Ph.D., assistant director of informatics at the Genome Institute at Washington University, discusses the challenges associated with tackling vast amounts of data in large sequencing labs and shares his experiences in building and utilizing an advanced LIMS for data tracking and storage.

Written by Lab Manager | 6 min read

David Dooling, Ph.D., is the assistant director of informatics at the Genome Institute at Washington University, where he oversees the Laboratory Information Management Systems (LIMS) and Information Systems. He has contributed to building one of the most advanced and powerful data-tracking systems at the institute and is now investigating methods to more efficiently store, compare, and operate on sequencing data. Here he discusses with Tanuja Koppal, Ph.D., contributing editor for Lab Manager, the challenges associated with tackling vast amounts of data in large sequencing labs and shares his experiences in building and utilizing an advanced LIMS for data tracking and storage.

Q: What are some of the data-related challenges you face in your projects?

A: We are a large-scale sequencing center, and thus, data really drives what we do. The main projects that we’re working on now are cancer and human health related. We are also sequencing pools of microbes, fungi, and bacteria for the Human Microbiome Project, using DNA captured from various body sites in humans. We have several hundred individual samples from about 15 body sites, and we are trying to figure out what microbes are there and how they’re interacting with the human body, whether it’s in the mouth, in the gut, or on the skin. As far as our LIMS goes, it has increased in complexity as the scale of sequencing has increased over the last three or four years. There are many more samples and many more projects that we now need to track. Previously, the data coming off the sequencers was on the order of kilobytes and megabytes, and now it’s on the order of terabytes. We have about 50 sequencing machines, the runs are about 10 days long, and each run generates somewhere around 1-2 terabytes of data. So every 10 days we’re generating about 50 to 100 terabytes’ worth of data.
