Lab Manager | Run Your Lab Like a Business

New Institute to Tackle 'Data Tsunami' Challenge


ARGONNE, Ill.—Researchers at the U.S. Department of Energy’s (DOE) Argonne National Laboratory have received part of a planned $25 million grant from the DOE Office of Science to tackle the problem of extracting knowledge from massive data sets.

The work is part of the DOE’s newly established Scalable Data Management, Analysis, and Visualization (SDAV) Institute. Researchers in Argonne’s Mathematics and Computer Science Division will receive a planned $3.4 million over five years for the research.

New computing advances are enabling researchers to attack important problems, from increasing the fuel efficiency of vehicles to making more aerodynamic airplane wings. The result is a veritable “tsunami of data.” Many simulations and experiments already generate petabytes of data—a single petabyte is 2,000 times more data than you can fit on a typical laptop—and they will soon be generating exabytes.
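To put the laptop comparison in concrete terms, the short sketch below works backward from the "2,000 times" figure. It assumes decimal (SI) units, which the article does not specify, and the function names are illustrative only.

```python
# Assumption: decimal (SI) units. The article does not say which convention it uses.
GB = 10**9   # bytes in a gigabyte
PB = 10**15  # bytes in a petabyte
EB = 10**18  # bytes in an exabyte


def implied_laptop_capacity_gb(ratio=2000):
    """Laptop capacity in GB implied by 'one petabyte = `ratio` laptops'."""
    return PB / ratio / GB


def laptops_per_exabyte(ratio=2000):
    """Number of such laptops needed to hold one exabyte."""
    return (EB // PB) * ratio


print(implied_laptop_capacity_gb())  # 500.0
print(laptops_per_exabyte())         # 2000000
```

Under that assumption, the comparison implies a laptop holding about 500 gigabytes, and a single exabyte would fill roughly two million of them.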

“The task of handling this data is overwhelming, forcing scientists to spend much of their time developing special-purpose solutions to store, access and manage the information,” said Robert Ross, Argonne computer scientist and deputy director of the new institute. “The SDAV teams will develop the necessary tools and software so that scientists can use their time more effectively for scientific investigation and discovery.”

The institute will address challenges in three areas. Data management covers the querying of scientific datasets; data analysis provides techniques for analyzing data both in situ, while a simulation is still running, and in postprocessing; and data visualization includes tools for identifying and understanding features in multiscale, multiphysics datasets.
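The difference between in situ analysis and postprocessing can be illustrated with a toy example. This is only a sketch and is not drawn from any SDAV software; `simulate` is a hypothetical stand-in for a real simulation code, and the "analysis" is simply finding the largest value produced.

```python
import random


def simulate(steps, seed=0):
    """Hypothetical stand-in for a simulation: yields one value per time step."""
    rng = random.Random(seed)
    for _ in range(steps):
        yield rng.random()


def in_situ_max(steps):
    """In situ: analyze each value as it is produced; nothing is stored."""
    running_max = float("-inf")
    for value in simulate(steps):
        running_max = max(running_max, value)
    return running_max


def postprocess_max(steps):
    """Postprocessing: store the full output first, then analyze it."""
    stored = list(simulate(steps))  # stands in for writing results to disk
    return max(stored)


# Both approaches give the same answer; they differ in how much data is kept.
print(in_situ_max(1000) == postprocess_max(1000))  # True
```

Both routes reach the same result. The in situ version never holds more than one value at a time, which is the appeal of the approach when the full output is too large to store conveniently.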

The SDAV Institute was announced at the White House on March 29 as part of a new $200 million Big Data Research and Development Initiative. Funded under the DOE Office of Science’s Scientific Discovery through Advanced Computing (SciDAC) program, the institute is led by Lawrence Berkeley National Laboratory (LBNL). In addition to Argonne and LBNL, four other national laboratories, as well as seven universities and one visualization software company, are participating in the collaboration.

“To make all this possible, we will actively work with applications teams, assisting them with the tools and ensuring that our efforts meet the high standards needed to ensure correctness and performance of the scientists’ codes,” said Ross. “In turn, we will gain critical feedback about scientists’ needs in addressing mission-critical challenges.”

Essential to successful deployment and adoption of SDAV tools are close ties to leading computational facilities. The institute includes partners from the Argonne Leadership Computing Facility, the National Energy Research Scientific Computing Center at LBNL and the Oak Ridge Leadership Computing Facility at Oak Ridge National Laboratory, who are responsible for installing the new technologies developed by the SDAV teams. All three supercomputing facilities are supported by DOE’s Office of Science. These partners will also inform SDAV team members of upcoming system architectures, guiding development of SDAV tools to ensure that they will be effective as new systems come online.

To reach an even broader community, the SDAV team plans to hold tutorials and workshops to gather information from other researchers and train potential users. These efforts will be coordinated with leading conferences and DOE computing facility activities.

SDAV combines the expertise from three successful SciDAC Centers and Institutes: the SciDAC Scientific Data Management Center for Enabling Technologies, the Visualization and Analytics Center for Enabling Technologies and the Institute for Ultra-Scale Visualization.

“Our successes in those earlier SciDAC programs provide the knowledge needed to achieve breakthrough science in this data-rich era,” said Ross.