Building Research Infrastructure for the Infinite

Image courtesy of the University of Arizona

How do researchers turn mountains of raw data into tangible, applicable breakthroughs?

It's the core issue facing scientific exploration in the 21st century and beyond, and it's one the University of Arizona is prepared to address in a big way in 2016.

"We're in an era right now where collecting data has far outpaced our ability to house it, analyze it and translate it into something meaningful," said Parker Antin, professor at the UA College of Medicine, associate dean for research of the College of Agriculture and Life Sciences, a member of the UA Sarver Heart Center, an affiliate of the BIO5 Institute and president of the Federation of American Societies for Experimental Biology.

Parker Antin. Photo credit: Mark Thaler/UAHS BioCommunications

Think of it this way. The Internet is often thought of as a repository for the history of human knowledge. It has irrevocably altered every mode of communication and any research effort currently underway. The data that exists within this system, however, requires server space to house it, secure connections to share it and individuals with the technological expertise to access it.

But that is barely the tip of the iceberg.

Storage is finite, but data is not. Today's researchers across all disciplines are embarking upon unprecedented information collection efforts in search of generation-defining breakthroughs, which means the sheer amount of data at their disposal is simply impossible to comprehend.

This will, of course, lead to new lines of questioning requiring even more data-processing infrastructure. Information will increase exponentially in perpetuity. Current data management platforms inevitably will buckle under the weight of this new information, grinding research to a halt and slowing the Internet to a crawl.

"It is our job to connect data across all levels to see the big picture, to sift through the data and figure out 'why' instead of simply 'what,'" said Kimberly Andrews Espy, the UA's senior vice president for research. "Without the proper platforms, researchers spend too much time staring at frozen computer screens and not enough time making discoveries."

The University's leadership role in this effort began in 2008, when the UA-led iPlant Collaborative was launched with a $50 million grant from the National Science Foundation to provide computational infrastructure for plant sciences. The platform was so sound and robust that a number of institutions saw its potential to expand well beyond plant sciences.

iPlant recently transitioned into CyVerse, expanding its data management and computational infrastructure services across a variety of scientific disciplines. It is a continuing collaboration among four institutions, led by the UA. Partner sites are the Texas Advanced Computing Center, Cold Spring Harbor Laboratory and the University of North Carolina, Wilmington.

Antin, CyVerse's principal investigator, says flexibility is the program's greatest asset.

"We've created an environment where researchers can store, share and analyze large data sets without having to know all of the back-end functionality," Antin said. "Our stacked infrastructure takes advantage of a myriad of already available resources and leverages their strengths to unlock an entirely new set of capabilities."

Researchers can simply visit the CyVerse website at www.cyverse.org and sign up for an account. It's an open-source platform, and an Internet connection is all that is needed to access it.

"Fundamentally, it's people who answer questions," Espy said. "We're enabling a human interface that allows researchers to collaborate with each other. That's the power of this program — we put a human face to it."

In addition to its groundbreaking data storage efforts, the UA is among the leaders in the recently formed American Institute for Manufacturing Integrated Photonics (AIM Photonics) consortium. This New York-based public-private partnership is developing photonic integrated circuits, or PICs, which use light to transfer data quickly and securely at much higher speeds than the current fiber optic grid system allows.

Over the last 18 months, Thomas Koch, dean of the UA College of Optical Sciences, led the effort to establish both the technical concepts and the academic, industry and government partnerships to realize the prodigious manufacturing capabilities that AIM Photonics will represent.

"This is an absolutely thrilling project, and it will enable computing like nothing that came before it," Espy said. "UA researchers are at the cutting edge of this breakthrough, and our students are seeing it first."

Essentially, the UA is leading the effort to house and securely share the past and future sum of human knowledge at the speed of light.

"CyVerse is the most exciting project I've ever been involved with," Antin said. "This is the new frontier."