XSEDE Resource Provides Open-Access Phylogenetic Supercomputing
A new Web resource developed at the San Diego Supercomputer Center (SDSC) at the University of California, San Diego is helping thousands of researchers worldwide unravel the enigmas of phylogenetics, the study of evolutionary relationships among virtually every species on the planet.
The CIPRES Science Gateway (CIPRES stands for Cyber Infrastructure for Phylogenetic RESearch), created by SDSC researchers, allows researchers to complete these studies in significantly less time, without having to understand how to operate complex computers. Scientists anywhere in the world upload their data via a Web browser, free of charge, under a grant provided by the National Science Foundation (NSF).
CIPRES is part of the NSF’s Extreme Science and Engineering Discovery Environment (XSEDE), specifically its Science Gateway initiative, which is designed to provide scientists with broad and easy access to supercomputers.
Researchers say the gateway, and access to powerful supercomputers, are helping to answer increasingly sophisticated phylogenetic questions.
“The CIPRES Science Gateway makes it possible for researchers to make use of all this new information more quickly and effectively,” said Mark Miller, principal investigator of the CIPRES Gateway. “Our team is excited to have supported more than 300 publications of phylogenetic studies involving species in every branch of the Tree of Life.”
“It’s an important additional step in the conduct of science,” said Peter Nelson, a graduate student in the Department of Botany & Plant Pathology at Oregon State University in Corvallis. “This is a new opportunity for people who don’t yet have grant money, but who want to do meaningful research – and you don’t have to leave your computer.”
Nelson, a theorist in botany, is trying to understand the evolutionary processes that may operate one way in genetically homogeneous communities, but in a different way in more genetically diverse communities. He studies the divergence of tree species in North America. “We use GenBank and other sequence databases to gather the data, and free software is available to edit the sequences,” he said. “But the process is so computationally intensive I could never have accomplished it on a personal computer.”
Shedding new light on origins
All life forms, from simple bacteria to primates and plants, descended from a single common ancestor. A diagram of all the evolutionary relationships looks like a highly branched tree with the common ancestor at the base of the trunk, and extinct and living groups forming the branches. All living species are represented by leaves at the tips of the outermost limbs. This Tree of Life, like evolution itself, is not static; rather, the branching process continues today as groups of individuals in single species, such as the Eastern Meadowlark, appear to be splitting into two because of long-term geographical or environmental factors.
The phylogenetic history of each living species is contained in its DNA, and SDSC’s CIPRES Gateway is helping scientists analyze all the evolutionary relationships by making it possible for them to compare similarities and differences in the DNA among large numbers of species.
Phylogenetics is essential to understanding not only the history of life on earth, but also how populations of flowering plants, insects, crustaceans, fish, fungi and microorganisms slowly change in response to their surroundings.
Such studies can also challenge long-accepted theories and shed new light on how and where lineages began. Researchers, for example, are using the CIPRES Gateway to clarify the evolution of wild grapes; University of Florida botany professor J. Richard Abbott wrote that the results “indicate that American lineages could be older than Asian.” Abbott and his co-authors reported the controversial finding in the February 2012 issue of Molecular Phylogenetics and Evolution.
In another project, Andrew F. Hugall and Devi Stuart-Fox, researchers in the Department of Zoology at the University of Melbourne in Australia, used the CIPRES Gateway to provide the first phylogenetic analysis supporting an evolutionary theory that new species of birds are generated faster when the ancestral species exhibits color variations in its feathers.
Hugall and Stuart-Fox reported in the May 9, 2012, issue of Nature that speciation rates were almost three times higher for so-called color polymorphic species of birds of prey than similar monomorphic bird species. As the prevalence of feather-color polymorphism falls, so too does the rate of speciation.
The discipline of phylogenetic systematics combines taxonomy, or the description and naming of living species as well as fossilized life forms found in natural history museums, with modern phylogenetic studies. Systematic biologists combine a variety of sources of information, analyses and hypotheses to organize related groups of species, such as vertebrates, into clades and clades within clades. For example, the vertebrate clade is further subdivided into clades of amphibians, primates, rodents, and other groups of related species.
“Studies by systematic and evolutionary biologists have historically been limited by the number of available DNA sequences in public databases like GenBank,” said Miller. However, he added that modern DNA sequencing technologies generate data so quickly that analyzing all relevant data on conventional laptops can take weeks.
“There is a huge need in the community for easy access to computing resources,” said Miller. To meet that enthusiastic demand, Miller’s team at SDSC and their collaborators around the country continue to combine emerging techniques in computational biology with computer science.