WALNUT CREEK, CA - The Earth is estimated to have about a nonillion (10 to the 30th) microbes in, on, around, and under it, comprising an unknown but very large number of distinct species. Despite the widespread availability of microbial genome data (close to 2,000 microbial genomes have been or are being decoded to date), a vast unknown realm awaits scientists intent on exploring the microorganisms that inhabit this “undiscovered country.”
Two thousand years after Pliny the Elder compiled one of the earliest surviving encyclopedic works, and in the spirit of his goal of providing “light to the obscure,” the Department of Energy Joint Genome Institute (DOE JGI) has published the initial “volume” of the Genomic Encyclopedia of Bacteria and Archaea (GEBA). Presenting a provocative glimpse into this uncharted territory, an analysis of the first 56 genomes representing two of the three domains of the tree of life appears in the December 24 edition of the journal Nature.
“Microbes mediate almost every conceivable biological process on the planet and genome sequencing has revolutionized our understanding of the diverse roles that they play,” said DOE JGI Director Eddy Rubin. “The information from this first set of organisms has provided a rich source of novel enzymes and detailed biochemical pathways that can help scientists optimize processes of critical importance to areas of the DOE mission, such as biofuels production, bioremediation, and how carbon is captured and cycled in the environment.”
Most studies in microbiology have exploited a narrow subset of the evolutionary diversity of bacteria and archaea known to exist, and the organisms studied were selected more for convenience (and because they cause diseases) than for the opportunity to advance discovery science. Genomes from only a few branches of the tree of microbial diversity have been sequenced. The DOE JGI is now exploring Earth’s microbial “dark matter” with a project to sequence little-studied microbial species, which will inform the study of other microbes and complex microbial communities.
“The main driver behind the GEBA project is that while the currently available sequenced genomes cover a wide range of biological and functional diversity, they have not covered a wide enough range of phylogenetic diversity,” said senior author Jonathan Eisen, DOE JGI Phylogenomics Program Head and University of California, Davis Professor. “What distinguishes GEBA is that it is less about the individual genomes and more about building a more balanced catalog of the diversity of genomes present on the planet which in turn should facilitate searches for novel functions and our understanding of the complex processes of the biosphere.”
Beyond filling in what he refers to as the “phylogenetic dark matter of the biological universe,” Eisen said that the information flowing from the project will shed light on the diversity of gene families and improve the understanding of how microbes acquire new functions. In addition, the newly sequenced organisms will provide urgently needed anchors for the improved annotation (assessment of biological function) of data emerging from the many ongoing projects that extend the study of individual microbes to entire communities, deciphering specific microbial capabilities from complex environmental samples. A key outcome will be new gene products and enzymes previously unknown to biologists.
“Microbes run the world. It’s that simple.” These bold words open a 2007 National Academy of Sciences report on this study of microbial communities or “Metagenomics.” The DOE has a well-established tradition of contributing to the advancement of microbial genomics for energy and environmental applications.
Already, several of the characterized microbes from the first GEBA “volume” are paying dividends. DOE JGI researchers Natalia Ivanova and Athanasios Lykidis discovered a novel set of cellulases—enzymes capable of breaking down plant material into sugars that can be rendered into transportation fuel—in a variety of GEBA organisms. In partnership with the DOE Joint BioEnergy Institute, researchers synthesized these genes and have begun to characterize them. These enzymes are of particular interest because they should be active in highly acidic environments, which could make them valuable for the liquid pretreatment of biomass feedstocks for biofuels.
The GEBA pilot was launched in May 2007 in collaboration with the non-profit German Collection of Microorganisms and Cell Cultures, DSMZ (http://www.dsmz.de/), to sequence 100 bacterial and archaeal genomes selected on the basis of the organisms’ phylogenetic positions.
“The GEBA project perfectly fits with our vision for the future of microbial taxonomy and the collection of type strains in general,” said Hans-Peter Klenk, Head of the Department of Microbiology at DSMZ. “DSMZ will provide easy and affordable access to biological material, cultures as well as DNA, of all GEBA pilot project strains to the worldwide scientific community—without any strings attached. Moreover, participation in the GEBA pilot project provides an excellent opportunity to train the next generation of genome scientists.”
“GEBA is a triumph of edgy science from two government institutions with perfect complementarities, forming an international partnership for the benefit of the entire community,” said Nikos Kyrpides, JGI Genome Biology Program Head, who helped launch the project and whose group designed and administers the GEBA data management and analysis system (http://img.jgi.doe.gov/geba) in collaboration with the Biological Data Management and Technology Center of LBNL.
“This is only the start,” said Eisen, reinforcing the magnitude of the project beyond the pilot phase. “The known phylogenetic diversity of bacteria and archaea is immense with hundreds of major lineages and probably millions if not hundreds of millions of species. This encyclopedia project is starting at the top – with the major phylogenetic groups – 100 genomes from across the tree. But we have barely scratched the surface of characterizing the diversity on the planet.” Eisen and his colleagues hope to extend GEBA beyond the pilot phase to sequence hundreds, and perhaps even thousands, of genomes from additional unknown microbes.
Detailed descriptions of all of the individual sequenced GEBA organisms are already being published in the recently launched journal Standards in Genomic Sciences (SIGS), the official open access online publication of the Genomic Standards Consortium (GSC).