Getting Your Head Around Big Data

David Patterson, PhD, professor of computer science at the University of California at Berkeley, talks to contributing editor Tanuja Koppal, PhD, about big data: what it is, where it applies, and what lab managers can expect to gain by investing in it. He also provides guidance on where people can get more information about (and help with) big data and the possible concerns they need to be aware of.

Written by Tanuja Koppal, PhD

Q: What is big data?

A: From a computer science perspective, it’s more accurate to call it “unstructured data.” Big data is not the pristine data usually found in tables. Unstructured data is the messy, dirty, incomplete data that is collected from a lot of different sources. So the idea is that, rather than throwing away this messy data that does not fit into any relational database, it can be useful to keep it and process it in nontraditional ways. Big data often means large databases holding terabytes or petabytes of information, but it can also refer to small amounts of information that are hard to process with traditional software tools and relational databases. Big data is often complex, but that complexity comes from the incompleteness and inconsistency of the data rather than its size.

Q: Why should lab managers care about big data? Does big data impact all labs?

A: Presumably all labs get information from various data sources that they then have to store. So the question is, are they then able to analyze, compare, and correlate those datasets and results, possibly over time, to gain useful insights? The fundamental argument underlying big data is that, if we have the right tools to process and analyze all the data that we have, then we can get “gold” (from the data) to drive future discoveries.

Q: What can we truly expect to gain if we make all the right investments in big data?

A: It depends on the type of lab. If it’s a research lab and you are able to pore over all the data, then big data promises to uncover behaviors and patterns that could indicate some new phenomenon. If it’s a production-oriented lab, then the potential benefit of big data would be finding ways to improve lab processes by monitoring data from several different processes and machines.

Q: How have you exploited big data in your lab?
