Lab Manager | Run Your Lab Like a Business

Berkeley Lab Database Deciphers Secrets of Microscopic Life

Greengenes is helping in the search for biofuels and other important research goals.

by Other Author

A handful of muck or a bucket of water can teem with millions of microorganisms — a few of which could be the next big thing when it comes to learning how to create biofuels or understanding the planet’s carbon cycle.

This search for the movers and shakers of the microbial world is getting easier thanks to a database of “fingerprints” maintained by Lawrence Berkeley National Laboratory (Berkeley Lab) scientists that surpassed one million entries earlier this year.

The database, called Greengenes, is one of the world’s largest collections of high-quality DNA sequences of 16S ribosomal RNA genes. These genes encode part of the ribosome, the cell’s protein-making machinery. They are found in all bacteria and archaea, and in general each species has a unique variation. They’re genetic IDs, the one thing that can finger a specific microbe in a crowded lineup, if you know which 16S rRNA belongs to which microbe.

That’s where Greengenes comes in. Researchers from around the world can access the database online and enter 16S rRNA sequences extracted from samples of soil, water, and even intestinal bacteria. A match with a sequence in Greengenes is a giveaway that a specific microbe is in the sample. If there’s not a match, perhaps a new species has been discovered.
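To make the matching idea concrete, here is a minimal sketch in Python. It is an illustration only, not how Greengenes itself works: the function names, the 0.9 threshold, and the short made-up sequences are all assumptions, and a simple k-mer overlap score stands in for the proper sequence alignment that real tools use.

```python
# Toy illustration only: real 16S rRNA classification relies on sequence
# alignment against a large reference set, not on this k-mer comparison.

def kmers(seq, k=4):
    """Return the set of all length-k substrings of a DNA sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def similarity(a, b, k=4):
    """Jaccard similarity between the k-mer sets of two sequences (0.0 to 1.0)."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)

def best_match(query, references, threshold=0.9, k=4):
    """Return (name, score) for the closest reference.

    The name is None when no reference reaches the (hypothetical) threshold,
    which is the "perhaps a new species" case described above.
    """
    best_name, best_score = None, 0.0
    for name, ref in references.items():
        score = similarity(query, ref, k)
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= threshold:
        return best_name, best_score
    return None, best_score

# Made-up short sequences standing in for full-length (roughly 1,500-base)
# 16S rRNA genes. The names are hypothetical.
REFERENCES = {
    "microbe_A": "ACGTTGCAAGGCTTACGATCGGATCCA",
    "microbe_B": "TTGACCGGTAACGTTAGGCCATATGCA",
}
```

Calling `best_match(sample_sequence, REFERENCES)` returns the name of the closest reference when the score clears the threshold, and `None` otherwise, which would flag the sequence for a closer look.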

In this way, Greengenes is fast becoming a go-to resource for scientists seeking to better understand what microbes do, their diversity, and what we can learn from them. The database launched in 2002 and now gets about 100 citations per year in scientific papers.

“Our goal is to develop the highest quality reference set so scientists can use it to better understand life at the microscopic scale. We want to cover as much microbial diversity on Earth as possible,” says Todd DeSantis, a scientist in Berkeley Lab’s Earth Sciences Division who led the development of the database under the auspices of Gary Andersen’s lab.

Among the database’s many hits: Stanford University scientists used it to discover a microorganism in San Francisco Bay sediments that plays a role in the carbon and nitrogen cycles. The scientists could see the ammonia-oxidizing archaeon under the microscope, but they couldn’t grow it in the lab. They extracted its DNA, sequenced it, and compared it to known strains in Greengenes. It was unique, and a new organism was named: Candidatus Nitrosoarchaeum limnia SFB1.

A Cornell University-led team used Greengenes to identify microbes that efficiently convert industrial wastewater into methane. Their work could help scientists engineer microbial communities that are optimized to digest wastewater and emit methane for use as an energy source.

Elsewhere, a team from the University of Milan used the database to analyze bacterial DNA from stains on the pages of Leonardo da Vinci’s multi-volume Codex Atlanticus. They found matches to bacteria previously isolated from cleanrooms and human skin, which led the team to recommend new ways to protect texts from deterioration.

And a Danish team used the database to improve the treatment of necrotizing enterocolitis, a disease marked by inappropriate bacteria colonizing an infant’s intestines.

Expect more uses for Greengenes as it continues to grow. When scientists find a 16S rRNA gene in the course of their research, they submit its sequence to one of many gene databanks. Greengenes scours these databanks for new entries. When it finds one, it uses a computer program to compare the sequence to other 16S rRNA genes and to ensure its quality. Only the best and most complete sequences are added.
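The screening step can be pictured with a short Python sketch. The article does not say which quality checks Greengenes applies, so the two criteria here (a minimum length and a cap on ambiguous bases) and their threshold values are assumptions chosen purely for illustration, as are the function names.

```python
# Hypothetical quality screen: the criteria and thresholds below are
# illustrative assumptions, not the actual Greengenes rules.

def passes_quality(seq, min_length=1200, max_ambiguous_fraction=0.01):
    """Accept a sequence only if it is long enough and mostly unambiguous."""
    seq = seq.upper()
    if len(seq) < min_length:
        return False  # too short to count as a near-complete gene
    # Any character other than A, C, G, or T is treated as an ambiguous base.
    ambiguous = sum(1 for base in seq if base not in "ACGT")
    return ambiguous / len(seq) <= max_ambiguous_fraction

def curate(candidates, **thresholds):
    """Keep only the candidate sequences that pass the quality screen."""
    return {
        name: seq
        for name, seq in candidates.items()
        if passes_quality(seq, **thresholds)
    }
```

Passing a dictionary of newly found sequences to `curate` returns only those that meet both criteria; everything else is left out of the reference set.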

“There are tens of millions of 16S-like sequences in public databases, but we only want the highest quality sequences to use as references,” says DeSantis.

Lawrence Berkeley National Laboratory addresses the world’s most urgent scientific challenges by advancing sustainable energy, protecting human health, creating new materials, and revealing the origin and fate of the universe. Founded in 1931, Berkeley Lab’s scientific expertise has been recognized with 12 Nobel prizes. The University of California manages Berkeley Lab for the U.S. Department of Energy’s Office of Science. For more, visit

Additional information:

  • Learn more about Greengenes.
  • Greengenes is also instrumental in the development of the PhyloChip, which quickly and accurately identifies microbes in complex samples.