Getting Your Head Around Big Data

David Patterson, PhD, professor of computer science at the University of California at Berkeley, talks to contributing editor Tanuja Koppal, PhD, about big data: what it is, where it applies, and what lab managers can expect to gain by investing in it. He also provides guidance on where people can get more information about (and help with) big data and the possible concerns they need to be aware of.

Written by Tanuja Koppal, PhD

Q: What is big data?

A: From a computer science perspective, it’s more accurate to call it “unstructured data.” Big data is not the pristine data usually found in tables. Unstructured data is the messy, dirty, incomplete data that is collected from a lot of different sources. So the idea is that, rather than throwing away this messy data that does not fit into any relational database, it is useful to keep it and process it in nontraditional ways. Big data often means large databases with terabytes or petabytes of information, but it can also refer to small amounts of information that are hard to process with traditional software tools and relational databases. Big data is often complex, but that has to do with the incompleteness and inconsistency of the data rather than its size.

Q: Why should lab managers care about big data? Does big data impact all labs?

A: Presumably all labs get information from various data sources that they then have to store. So the question is, are they then able to analyze, compare, and correlate those datasets and results, possibly over time, to gain useful insights? The fundamental argument underlying big data is that, if only we have the right tools to process and analyze all the data that we have, then we can get “gold” (from the data) to drive future discoveries.

Q: What can we truly expect to gain if we make all the right investments in big data?

A: It depends on the type of lab. If it’s a research lab and you are able to pore over all the data, then big data promises to uncover behaviors and patterns that indicate some new phenomenon. If it’s a production-oriented lab, then the potential benefit of big data would be to find ways to improve the lab processes by monitoring data from several different processes and machines.

Q: How have you exploited big data in your lab?

About the Author

Tanuja Koppal, PhD
