The Computation Institute, a joint effort of the University of Chicago and the U.S. Department of Energy's Argonne National Laboratory, has received a grant for a computer system that will enable researchers to store, access and analyze massive datasets.
The system is made possible by a $1.5 million grant from the National Science Foundation, which includes cost-sharing support from the University of Chicago. The new system, called the Petascale Active Data Store (PADS), has been optimized for rapid data transactions, both on campus and around the globe.
Petascale computing involves the manipulation of petabytes of data. A petabyte is roughly the equivalent of the data contained on 1.5 million CD-ROMs.
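The CD-ROM comparison can be checked with quick arithmetic. The sketch below assumes a decimal petabyte (10^15 bytes) and a typical disc capacity of 650 to 700 megabytes; those capacities and the function name are illustrative assumptions, not figures from the PADS team.

```python
# Rough check of the CD-ROM comparison.
# Assumptions (not from the article): decimal units, 650-700 MB per disc.
PETABYTE_BYTES = 10**15            # one decimal petabyte
CD_ROM_700MB = 700 * 10**6         # assumed capacity of a common CD-ROM
CD_ROM_650MB = 650 * 10**6         # assumed capacity of an older CD-ROM


def cds_per_petabyte(cd_bytes: int = CD_ROM_700MB) -> float:
    """Return how many discs of the given capacity hold one petabyte."""
    return PETABYTE_BYTES / cd_bytes


print(f"700 MB discs: {cds_per_petabyte(CD_ROM_700MB):,.0f}")
print(f"650 MB discs: {cds_per_petabyte(CD_ROM_650MB):,.0f}")
```

Under these assumptions the count falls between about 1.4 and 1.5 million discs, which is consistent with the figure quoted above.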
The PADS design results from a study of the storage and analysis requirements of groups in astronomy and astrophysics, computer science, economics, evolutionary and organismal biology, geosciences, high-energy physics, linguistics, materials science, neuroscience, psychology and sociology.
For these groups, according to the PADS team, PADS represents a significant opportunity to look at their data in new ways, enabling new scientific insights and new collaborations across disciplines. PADS will also serve as a vehicle for computer science research into active data storage systems and will provide rich data with which to investigate new techniques.
Results will be released as open-source software that interested users can freely download and adapt for other purposes.
“PADS will bring a significant analysis resource to the University of Chicago campus and provide a testbed for research on high-performance analysis, a likely bottleneck in the scientific pipeline of the future,” said Michael Papka, Deputy Associate Laboratory Director for Computing, Environment, and Life Sciences at Argonne. Papka led the interdisciplinary team of University of Chicago researchers who developed the PADS proposal.
PADS will be a hybrid system with many layers of storage. These layers range from a large, tape-based system at Argonne to individual computers on campus and elsewhere. The intermediate layer is a rack of computer disks at Argonne containing duplicate data sets as insurance against hard-drive failure.
To University of Chicago scientists, PADS represents a dramatic improvement over current practice, which requires them to quickly analyze data and then remove it from the system to make room for new datasets. With the storage that PADS provides, groups will be able to keep data active for longer periods of analysis.
Source: Argonne National Laboratory