DOE Joint Genome Institute


December 23, 2019

Expanding Virophage Diversity

Virophage discovery pipeline. (A) MCP amino acid sequences from reference isolated genomes and published metagenomic contigs were queried against the IMG/VR database with stringent e-value cutoffs. All homologous sequences detected were then clustered together to build four independent MCP profiles. (B) The resulting four MCP models were used to recruit additional homologous sequences from the entire IMG/M system. All new sequences were clustered, and models were built, creating a final set of 15 unique MCP HMMs. (C) These 15 unique MCP HMMs were then used to search two different databases for homologous sequences: the IMG/M system and a custom assembled human gut database containing 3771 samples from NCBI’s Sequence Read Archive (SRA). (D) The resulting set of 28,294 non-redundant (NR) sequences with stringent e-value cutoffs was filtered by size and (E) by the presence of the four core virophage genes (high-quality genomes; HQ virophages). Finally, completeness of novel metagenomic virophage genomes was predicted based on circularity or presence of inverted terminal repeats (ITR). (Figure from Paez-Espino et al. Microbiome (2019) 7:157 https://doi.org/10.1186/s40168-019-0768-5)
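The last step in the caption, predicting completeness from circularity or inverted terminal repeats, can be illustrated with a minimal Python sketch. This is not the pipeline used in the paper: the function names and the 20-base comparison window are assumptions chosen for illustration, and the sketch only tests for exact matches at the contig ends.

```python
# Hypothetical sketch of terminal-signature checks on an assembled contig.
# The 20-base window is an illustrative assumption, not a value from the paper.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def looks_circular(seq: str, window: int = 20) -> bool:
    """True if the first `window` bases are repeated exactly at the contig end."""
    if len(seq) < 2 * window:
        return False
    return seq[:window].upper() == seq[-window:].upper()


def has_itr(seq: str, window: int = 20) -> bool:
    """True if the contig ends with the reverse complement of its first `window` bases."""
    if len(seq) < 2 * window:
        return False
    return reverse_complement(seq[:window]).upper() == seq[-window:].upper()


def predicted_complete(seq: str, window: int = 20) -> bool:
    """Flag a contig as putatively complete if either terminal signature is present."""
    return looks_circular(seq, window) or has_itr(seq, window)
```

A contig whose two ends carry the same sequence is consistent with a circular genome that was linearized during assembly, while an end that mirrors the start as its reverse complement is consistent with an ITR. A production tool would likely tolerate mismatches and search for the repeat length rather than fixing it.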

Virophages are small viruses with double-stranded DNA genomes that co-infect eukaryotic cells along with giant viruses. Almost all known virophage genomes share only four genes in common: major and minor capsid proteins (MCP and mCP, respectively), ATPase involved in DNA packaging, and PRO, a cysteine protease involved in capsid maturation.

In a study recently reported in Microbiome, researchers from the US Department of Energy (DOE) Joint Genome Institute (JGI), a DOE Office of Science facility, used computational approaches to increase the number of known high quality virophage genome sequences 10-fold. They mined more than 14,000 publicly available metagenomic datasets in JGI’s Integrated Microbial Genomes & Microbiomes (IMG/M) data suite, which includes IMG/VR (for Virus), for the virophage marker gene MCP and identified 44,221 partial virophage sequences in total, including over 28,000 unique MCP sequences.

Further analysis identified 328 diverse new “high quality” (based on completeness) virophage genomes containing all four core genes. These virophages were found in diverse habitats including the air, plant rhizosphere, wastewater, and even the animal and human gut (the latter for the first time). They were taxonomically classified into 27 distinct clades, 17 of them without previously known representatives. Of the 328 genomes, 89 contigs were considered to be complete, and their discovery has extended the known virophage genome size range from 13.8-29.3 kilobases (Kb) to 10.9-42.3 Kb. The range of gene counts has similarly widened, from 13-25 to 12-39. “Overall,” the team concluded, “we provide a global analysis of the diversity, distribution, and evolution of virophages.”
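The “high quality” criterion described here, a size filter plus the presence of all four core genes, amounts to a simple set test. The sketch below is an illustration only: the data layout, the function names, and the caller-supplied `min_length_bp` are assumptions, since the post does not state the size cutoff or how gene hits were stored.

```python
from typing import Dict, Iterable, List, Tuple

# The four core virophage genes named in the post.
CORE_GENES = frozenset({"MCP", "mCP", "ATPase", "PRO"})


def is_high_quality(gene_hits: Iterable[str], length_bp: int, min_length_bp: int) -> bool:
    """True if the contig meets the size cutoff and carries all four core genes.

    `min_length_bp` is supplied by the caller; the post gives no cutoff value.
    """
    return length_bp >= min_length_bp and CORE_GENES.issubset(gene_hits)


def high_quality_contigs(
    contigs: Dict[str, Tuple[int, List[str]]], min_length_bp: int
) -> List[str]:
    """Return names of contigs passing both filters, in input order.

    `contigs` maps a contig name to (length in bp, list of annotated gene labels);
    this layout is hypothetical.
    """
    return [
        name
        for name, (length_bp, hits) in contigs.items()
        if is_high_quality(hits, length_bp, min_length_bp)
    ]
```

For example, given one contig with all four genes at 15,000 bp, one missing PRO, and one with all four genes at only 4,000 bp, a call with `min_length_bp=10000` keeps only the first.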

Lead author David Paez-Espino discussed an early version of the paper at the 2018 Viral EcoGenomics & Applications (VEGA) Symposium. Early-bird registration rates currently apply for the 2020 VEGA Symposium, which immediately precedes the 15th Annual JGI Genomics of Energy & Environment Meeting. Register now at https://usermeeting.jgi.doe.gov/vega

The work also used resources of the National Energy Research Scientific Computing Center (NERSC), which is supported by the Office of Science of the U.S. Department of Energy.

Publication:

  • Paez-Espino D, Zhou J, Roux S, Nayfach S, Pavlopoulos GA, Schulz F, McMahon KD, Walsh D, Woyke T, Ivanova NN, Eloe-Fadrosh EA, Tringe SG, Kyrpides NC. Diversity, evolution, and classification of virophages uncovered through global metagenomics. Microbiome. 2019 Dec 10;7(1):157. doi: 10.1186/s40168-019-0768-5.

Related Links:

  • David Paez-Espino at the 2018 VEGA Symposium: http://bit.ly/JGI2018PaezEspino2VEGA
  • VEGA Symposium at the 15th Annual JGI Genomics of Energy & Environment Meeting
  • JGI 2017 News Release: Tracking the Viral Parasites of Giant Viruses over Time

 

Lawrence Berkeley National Lab Biosciences Area
A project of the US Department of Energy, Office of Science

JGI is a DOE Office of Science User Facility managed by Lawrence Berkeley National Laboratory

© 1997-2021 The Regents of the University of California