Data from millions of museum specimens, such as this Ziziphus celata or Florida jujube, are now available to scientists around the world via digital databases such as iDigBio. Credit: Florida Museum photo by Jeff Gage

A group of Florida Museum of Natural History scientists has issued a "call to action" to use big data to tackle longstanding questions about plant diversity and evolution and forecast how plant life will fare on an increasingly human-dominated planet.

In a commentary published in Nature Plants, the scientists urged their colleagues to take advantage of massive, open-access data resources in their research and help grow these resources by filling in remaining data gaps.

"Using big data to address major biodiversity issues at the global scale has enormous practical implications, ranging from conservation efforts to predicting and buffering the impacts of climate change," said study author Doug Soltis, a Florida Museum curator and distinguished professor in the University of Florida department of biology. "The links between big data resources we see now were unimaginable just a decade ago. The time is ripe to leverage these tools and applications, not just for plants but for all groups of organisms."

Over several centuries, natural history museums have built collections of billions of specimens and their associated data, much of which is now available online. New technologies such as remote sensors and drones allow scientists to monitor plants and animals and transmit data in real time. And citizen scientists are contributing biological data by recording and reporting their observations via digital tools such as iNaturalist.

Together, these data resources provide scientists and conservationists with a wealth of information about the past, present, and future of life on Earth. As these databases have grown, so have the computational tools needed not only to analyze but also to link immense data sets.

Studies that previously focused on a handful of species or a single plant community can now expand to a global level, thanks to the development of databases such as GenBank, which stores DNA sequences; iDigBio, a University of Florida-led effort to digitize U.S. natural history collections; and the Global Biodiversity Information Facility, a repository of species' location information.

These resources can be valuable to a wide range of users, from scientists in pursuit of fundamental insights into plant evolution and ecology to land managers and policymakers looking to identify the regions most in need of conservation, said Julie Allen, co-lead author and an assistant professor in the University of Nevada-Reno department of biology.

If Earth's plant life were a medical patient, small-scale studies might examine the plant equivalent of a cold sore or an ingrown toenail. With big data, scientists can gain a clearer understanding of global plant health as a whole, make timely diagnoses and prescribe the right treatment plans.

Such plans are urgently needed, Allen said.

"We're in this exciting and terrifying time in which the unprecedented amount of data available to us intersects with global threats to biodiversity such as habitat loss and climate change," said Allen, a former Florida Museum postdoctoral researcher and UF doctoral graduate. "Understanding the processes that have shaped our world—how plants are doing, where they are now and why—can help us get a handle on how they might respond to future changes."

Why is it so vital to track these regional and global changes?

"We can't survive without plants," said co-lead author and museum research associate Ryan Folk. "A lot of groups evolved in the shadow of flowering plants. As these plants spread and diversified, so did ants, beetles, ferns and other organisms. They are the base layer to the diversity of life we see on the planet today."

In addition to using and growing plant data resources, the authors hope the scientific community will address one of the toughest remaining obstacles to using biological big data: getting databases to work smoothly with each other.

"This is still a huge limitation," Allen said. "The data in each system are often collected in completely different ways. Integrating these to connect in seamless ways is a major challenge."