DOE Joint Genome Institute


April 1, 2015

Longer DNA Fragments Reveal Rare Species Diversity

New sequence assembly technologies help reconstruct environmental microbial communities.

April 2015 cover of Genome Research features study evaluating sequencing technologies to accurately assess diversity of microbial communities

In a study published on the cover of the April 2015 edition of Genome Research, a team led by longtime DOE JGI collaborator Jill Banfield compared two ways of using the next generation Illumina sequencing machines, one of which produced significantly longer reads than the other. (Image courtesy of Genome Research)

Many microbes cannot be cultivated in a laboratory setting, hindering attempts to understand Earth’s microbial diversity. Microbes are heavily involved in, and critically important to, environmental processes ranging from nutrient recycling and carbon processing to the fertility of topsoils and the health and growth of plants and forests. Accurately characterizing them, as a basis for understanding their activities, is therefore a major goal of the Department of Energy (DOE). One approach has been to study the DNA extracted from a complex microbial community, known as the metagenome, in order to describe its DNA-coded “parts” catalog and understand how microbes respond and adapt to environmental changes. Studying a population rather than an individual raises different obstacles on the path to knowledge. The challenge of assembling genes and genomic fragments into meaningful sequence information for an unknown microbe has been likened to putting together a jigsaw puzzle without knowing what the final picture should look like, or even if you have all the pieces.

“For metagenomics,” said Jillian Banfield of the University of California, Berkeley and Lawrence Berkeley National Laboratory’s Earth Sciences Division, a longtime collaborator of the DOE Joint Genome Institute (DOE JGI), a DOE Office of Science User Facility, “it is like reconstructing puzzles from a mixture of pieces from many different puzzles, and not knowing what any of them look like.” Part of the problem is that the more commonly used sequencing machines generate data in short fragments, on the order of a few hundred base pairs of DNA. Additionally, short-read assemblers may not be able to distinguish among multiple occurrences of the same or similar sequences. They will therefore either fail to place them in the correct context or eliminate them entirely from the final assembly, in the same way that a jigsaw puzzle with many small pieces that look the same is difficult to put together. The result is gaps in the assembly, which mean that not all of the microbes in a community can be identified through environmental genomics.
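The ambiguity created by repeated sequences can be illustrated with a toy example. The sketch below is not the assembler or the data used in the study; the sequences, the 10-base shared segment, and the helper `reads` are hypothetical and chosen only to show the principle. Two different pairs of genomes share the same repeat, and idealized error-free reads shorter than the repeat cannot tell the two communities apart, while reads long enough to span the repeat can.

```python
def reads(genomes, length):
    """Return the set of all substrings of the given length (idealized, error-free reads)."""
    return {g[i:i + length] for g in genomes for i in range(len(g) - length + 1)}

# Hypothetical 10-base segment shared by every genome in this toy example.
REPEAT = "ACGGTCATGC"

# Two different communities: the same flanking sequences, paired differently
# around the shared repeat.
community_1 = ["TTTTT" + REPEAT + "GGGGG", "CCCCC" + REPEAT + "AAAAA"]
community_2 = ["TTTTT" + REPEAT + "AAAAA", "CCCCC" + REPEAT + "GGGGG"]

# Reads shorter than the repeat never connect a left flank to a right flank,
# so both communities produce exactly the same set of reads.
short_reads_identical = reads(community_1, 8) == reads(community_2, 8)

# Reads longer than the repeat span it and reveal which flanks belong together.
long_reads_identical = reads(community_1, 14) == reads(community_2, 14)

print("8-base reads identical: ", short_reads_identical)
print("14-base reads identical:", long_reads_identical)
```

In this toy case no amount of additional short-read sequencing resolves the ambiguity, because the information linking the flanks is simply absent from the reads.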

In a study published on the cover of the April 2015 edition of Genome Research, a team including DOE JGI and Berkeley Lab researchers compared two ways of using next-generation Illumina sequencing machines, one of which, TruSeq Synthetic Long-Reads, produced significantly longer reads than the other. A Banfield-led team generated the metagenome data from the Berkeley Lab-led DOE subsurface biogeochemistry field study site in Rifle, Colorado. The researchers evaluated the accuracy of the genomes reconstructed from the sequences produced by the two Illumina technologies to learn more about the microbes present in lower amounts than others and to better determine the species richness of the metagenome samples.

Berkeley Lab earth scientist Kenneth Hurst Williams describes the Genomes-to-Watershed Scientific Focus Area, and how the DOE JGI is contributing to the scientific effort. Watch the video at http://bit.ly/JGI15WIlliamsSFA.

The project is part of the Berkeley Lab Genomes-to-Watershed Scientific Focus Area (SFA), which involves over 50 scientists from Berkeley Lab and other institutions including UC Berkeley, Pacific Northwest National Laboratory, Colorado School of Mines, and Oak Ridge National Laboratory. The Genomes-to-Watershed SFA is led by geophysicist Susan Hubbard, the director of Berkeley Lab’s Earth Sciences Division. Its goal is to develop an approach for gaining a predictive understanding of complex, biologically based system interactions from the genome to the watershed scale. Jill Banfield is a co-lead of the Metabolic Potential component of this team project, which focuses on characterizing prevalent metabolic pathways in subsurface microbial communities that mediate carbon and electron flux, and using that information to inform genome-enabled watershed reactive transport simulators. Banfield describes the Metabolic Potential component of the SFA effort in this video, and some of her group’s other recent groundbreaking subsurface ecogenomic findings associated with this project can be found here.

Revisiting Microbial Communities in Rifle, Colorado

For the study, the team used sediment samples collected from an aquifer adjacent to the Colorado River, which had been used for previous experiments. For one of these earlier efforts, the DOE JGI sequenced Rifle Site microbial communities and was able to completely reconstruct a high quality genome of a previously unknown organism from short-read assemblies. Additionally, the findings revealed that many of the bacteria and archaea found in the samples had not been previously recognized or sampled.

For their study, the researchers compared the sequences and assemblies generated from Illumina’s short-read technology with the data from the newer, longer-read technology, which generates read lengths of up to around 8,000 base pairs. They found that the longer reads captured more of the community’s diverse species. For instance, using short-read technology, they had previously identified just over 160 microbial species within a sediment sample. Using the longer-read technology, over 400 microbial species from the sample could be phylogenetically classified, though some accounted for just 0.1 percent of the community.

The study’s first author, Itai Sharon of UC Berkeley, pointed out that they also identified species that previously failed to assemble due to the presence of closely related species within the sample. These close relatives, accounting for as much as 15 percent of the community, confounded the assembly algorithm. “These populations were pretty much missed by the short read assemblies because assemblers tend to fail at the presence of multiple closely related species and strains. Using algorithms that we developed for analyzing the long reads we were able to reconstruct genome architecture for these populations,” he said.

“Extending the analysis further to species with a lower abundance suggests that at least … 2,100 different species are present,” the team reported. “The true number of species is therefore expected to be much higher – probably at the range of several thousands or tens of thousands of different species.”

Longer Reads Add Value to Sequencing Capabilities

The difference between the results suggests that the assembly of thousands of rare genomes by short reads failed due to insufficient coverage despite significant sequencing efforts. On the other hand, the longer reads revealed this “long tail” of previously undetected microbial species that were present in very low abundance in the metagenome samples. In addition, short reads assembled poorly for closely related genomes even when enough sequencing coverage was available. Using the long reads, it was possible to reconstruct gene order for most of these genomes.
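A rough back-of-the-envelope calculation shows why rare genomes end up with too little coverage. The sketch below assumes, as a simplification, that an organism's share of the sequenced DNA equals its relative abundance. The sequencing depth and genome size are hypothetical values, not figures from the study; only the 0.1 percent abundance echoes the rarest classified species mentioned above.

```python
def expected_coverage(total_bases, relative_abundance, genome_size):
    """Average number of times each base of one genome is expected to be sequenced."""
    return total_bases * relative_abundance / genome_size

# Hypothetical values chosen only for illustration (not taken from the study).
TOTAL_BASES = 10e9   # 10 billion bases sequenced from the whole community
GENOME_SIZE = 3e6    # a 3-million-base microbial genome

# An organism making up 5 percent of the community versus one at 0.1 percent.
abundant = expected_coverage(TOTAL_BASES, 0.05, GENOME_SIZE)
rare = expected_coverage(TOTAL_BASES, 0.001, GENOME_SIZE)

print(f"abundant organism: about {abundant:.0f}x coverage")
print(f"rare organism:     about {rare:.1f}x coverage")
```

Under these assumptions the abundant organism is covered well over a hundred times, while the rare one is covered only about three times, which is generally too thin for short reads to overlap reliably into a genome.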

“The availability of both short and long read data allowed us to explore patterns of population diversity, taxonomic diversity, and organism abundance levels using genome sequence information for rare as well as more abundant organisms,” the team reported. “Overall, short and long read data provide complementary advantages for metagenome studies, thus making the use of both technologies together more powerful than use of one alone.”

DOE JGI Metagenome Program head Susannah Tringe noted that while the Rifle studies came out of the Community Science Program (CSP), the longer-read analyses conducted and reported in this study were motivated in part by the DOE JGI’s Emerging Technologies Opportunity Program (ETOP). Launched in 2013, the program seeks to develop and support selected new technologies that the DOE JGI could establish to add value to the high-throughput sequencing it currently carries out for its users. “We’re not just motivated by wanting to learn about Rifle, but how to use these technologies to learn about microbial communities through ETOP,” she added. The inaugural ETOP cycle focuses on six capabilities, one of them a project from Banfield. A key yield from this new sequencing approach is a much more detailed characterization of the microbial communities within a sampled site. This furnishes not only an improved understanding of the microbially mediated processes taking place at that site (which can include carbon capture, contaminant remediation, or the breakdown of plant and other organic materials of bioenergy interest) but also the discovery of new genes and enzymes of interest to DOE missions.

The DOE JGI Community Science Program (CSP) is now accepting letters of intent for large-scale sequence-based genomic science projects addressing DOE missions. Focus areas of this year’s call include extreme environments, including the deep subsurface. Additional information can be found at http://bit.ly/JGI-2016-CSP. The deadline for letters of intent is April 16, 2015.

Itai Sharon spoke at the 2014 DOE JGI Genomics of Energy & Environment Meeting on the benefits of multi-Kb Illumina reads. Watch his talk on the DOE JGI’s YouTube channel at http://bit.ly/JGIUM9_Sharon.

The research presented in the publication is supported by the Subsurface Biogeochemistry Program within the U.S. Department of Energy Office of Science, Office of Biological and Environmental Research.

The U.S. Department of Energy Joint Genome Institute, a DOE Office of Science User Facility at Lawrence Berkeley National Laboratory, is committed to advancing genomics in support of DOE missions related to clean energy generation and environmental characterization and cleanup. JGI provides integrated high-throughput sequencing and computational analysis that enable systems-based scientific approaches to these challenges. Follow @jgi on Twitter.

DOE’s Office of Science is the largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time. For more information, please visit science.energy.gov.
