DOE Joint Genome Institute

Benchmarks

The complex intron-exon structure of eukaryotic genes makes them challenging to predict. The quality of gene prediction in eukaryotic genomes can be improved by combining different gene prediction approaches (ab initio, homology-based, EST-based, synteny-based, or combinations of these) with experimental data (transcriptomics, proteomics, etc.). In the course of annotating fungal genomes, we compared different gene predictors and annotation pipelines to assess and refine our annotation strategies for future genomes. Results of two such tests are presented here:

1. Annotation of Heterobasidion annosum genome

Results: Several gene predictors and annotation pipelines were used to annotate the genome of the fungus H. annosum v1.0, and the accuracy of gene prediction was compared based on homology and EST support. The combination of tools used in the JGI annotation pipeline predicted the largest set of genes, with the best support by most measures.

| Metric | EuGene [1] | GeneMark [2] | FgenesH [3] | JGI Pipe [4,5] |
|---|---|---|---|---|
| Number of predicted gene models | 11,547 | 9,609 | 8,409 | 12,270 |
| with partial EST support | 5,544 | 3,829 | 4,567 | 5,248 |
| with full length EST support | 2,538 | 1,182 | 2,896 | 3,073 |
| with homology support | 6,758 | 6,043 | 5,750 | 7,214 |
| with strong homology support (>80% aa identity, >80% coverage) | 112 | 109 | 174 | 187 |
| with homology and EST support | 2,894 | 2,172 | 2,720 | 2,953 |
| Average EST coverage per gene | 77.7% | 68.2% | 80.8% | 79.1% |
| Supported splice sites | 41,581 | 40,808 | 45,498 | 47,671 |
| Average homology coverage per gene | 64% | 60% | 68% | 69% |

EuGene models were built and provided by a collaborator. All models were used in the JGI pipeline. EST support was computed from 40,807 ESTs and 10,126 EST cluster consensus sequences mapped by BLAT; protein homology was computed by BLAST against NCBI NR.
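Because the predictors differ in how many gene models they produce, the counts in the H. annosum table can also be read as fractions of each predictor's own gene set. The short sketch below does that arithmetic for two of the rows. The counts are copied from the table; the variable and function names are illustrative only and are not part of any JGI tool.

```python
# Counts copied from the H. annosum comparison table.
TOTAL_MODELS = {"EuGene": 11547, "GeneMark": 9609, "FgenesH": 8409, "JGI Pipe": 12270}
HOMOLOGY_SUPPORTED = {"EuGene": 6758, "GeneMark": 6043, "FgenesH": 5750, "JGI Pipe": 7214}
FULL_LENGTH_EST = {"EuGene": 2538, "GeneMark": 1182, "FgenesH": 2896, "JGI Pipe": 3073}


def support_fraction(supported, total):
    """Return, per predictor, the share of its gene models that have support."""
    return {name: supported[name] / total[name] for name in total}


homology_fraction = support_fraction(HOMOLOGY_SUPPORTED, TOTAL_MODELS)
est_fraction = support_fraction(FULL_LENGTH_EST, TOTAL_MODELS)

for name in TOTAL_MODELS:
    print(
        f"{name:9s} homology: {homology_fraction[name]:.1%}  "
        f"full-length EST: {est_fraction[name]:.1%}"
    )
```

Absolute counts tend to favor predictors that call more genes, while fractions tend to favor more conservative predictors, so the two views are best read together.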

2. Comparison of MAKER and JGI Annotation pipeline

Results: The publicly available annotation pipeline MAKER [6] was compared with the JGI annotation pipeline [4,5]. For the basidiomycete Dichomitus squalens, the JGI pipeline predicted more genes with better support across several lines of evidence.

| Metric | MAKER [6] | JGI Annotation pipeline [4,5] |
|---|---|---|
| Number of predicted gene models | 9,940 | 12,290 |
| with Swissprot hits | 6,521 | 7,356 |
| with non-repeat PFAM domains | 5,365 | 6,010 |
| with EST support | 9,252 | 10,796 |
| with >90% EST support | 7,729 | 9,178 |
| Number of unique PFAM domains | 2,207 | 2,245 |
| Average EST coverage per gene | 93.0% | 93.3% |
| Splice sites supported by ESTs | 99,627 | 102,200 |

Inputs: Assembly v1.0 of D. squalens, 359,410 protein seeds from NCBI NR, and 16,501 EST cluster consensus sequences mapped by BLAT to the assembly. MAKER used the following gene predictors: Exonerate, FgenesH (same parameters as in the JGI pipeline) and Augustus. All genes were blasted against the same Swissprot set of 530,264 protein sequences (downloaded July 5, 2011), EST sequences, and the PFAM database (Pfam_v21).
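The EST-based rows in these comparisons depend on how EST coverage of a gene is measured, which this page does not define. As a minimal sketch only, the code below assumes coverage means the fraction of a gene model's exonic bases overlapped by at least one mapped EST alignment block, and treats the ">90% EST support" row as a threshold on that fraction. The interval representation and the function names are hypothetical and are not taken from the JGI pipeline or from BLAT.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) in genome coordinates


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def est_coverage(exons: List[Interval], est_blocks: List[Interval]) -> float:
    """Fraction of exonic bases overlapped by at least one EST alignment block."""
    exons = merge_intervals(exons)
    est_blocks = merge_intervals(est_blocks)
    exon_length = sum(end - start for start, end in exons)
    if exon_length == 0:
        return 0.0
    covered = 0
    # Both lists are disjoint after merging, so pairwise overlaps never double count.
    for exon_start, exon_end in exons:
        for est_start, est_end in est_blocks:
            covered += max(0, min(exon_end, est_end) - max(exon_start, est_start))
    return covered / exon_length


# Toy gene with two 100-base exons; ESTs cover 50 bases of the first and 80 of the second.
exons = [(100, 200), (300, 400)]
est_blocks = [(150, 200), (300, 380)]
coverage = est_coverage(exons, est_blocks)   # 130 of 200 exonic bases
has_est_support = coverage > 0.0
has_high_est_support = coverage > 0.90       # assumed reading of ">90% EST support"
```

Under this assumed definition, averaging `est_coverage` over all gene models would give a figure comparable to the "Average EST coverage per gene" row, though the actual pipeline may count coverage differently.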

3. Comparative Analysis Methods and Tools

Motivation

Genome annotation and analysis require the development and validation of new algorithms and tools. Directions for this development include methods to analyze eukaryotic genome organization (tandem and segmental duplication; gene-based synteny, including across multiple related genomes), gene structure (intron conservation or loss across genomes), gene gain and loss (detecting possible errors in automated clustering results used for gene family analysis, building whole-genome phylogenetic trees from clustering results, and PFAM domain analysis to detect expanded and lost families), genome evolution, gene expression, genome variation, metabolic pathways, and regulatory elements. A further direction is testing new tools for accuracy and speed on validated gene sets: gene predictors (including those that use RNA-Seq data and synteny-based approaches), pipelines (e.g., MAKER), repeat-finding software, and non-coding RNA finding software. This project aims at (1) developing algorithms and prototypes of new genome analysis methods for publication; and (2) testing new gene prediction and genome analysis tools for possible integration into the production annotation process.

Comparative Gene Modeling

Comparative gene modeling aims to improve the initial gene predictions for a set of closely related organisms by correcting missing or incorrectly predicted genes (incorrect splice sites, chimeras, gene fragments, etc.). The idea is that, in closely related genomes, most orthologs share the same conserved gene structure. The algorithm maps all gene models predicted in all genomes onto each individual genome and, for each locus, selects from the potentially many competing models the one that most closely resembles the homologous genes from the other genomes. This procedure may be iterated several times, until no further change in the gene models is observed.
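The selection-and-iteration loop described above can be illustrated with a toy sketch. It is not the JGI implementation. It assumes that each gene model is reduced to a tuple of exon lengths, that loci are already grouped under a shared ortholog identifier, and that the competing models at each locus (a genome's own predictions plus models mapped from other genomes) are given as input. The similarity score and all names are hypothetical.

```python
from typing import Dict, List, Tuple

Model = Tuple[int, ...]                      # toy stand-in: exon lengths of one gene model
Candidates = Dict[str, Dict[str, List[Model]]]  # genome -> locus -> competing models
Selection = Dict[str, Dict[str, Model]]         # genome -> locus -> chosen model


def similarity(a: Model, b: Model) -> float:
    """Toy structure similarity: share of exons with identical length, 0 if exon counts differ."""
    if not a or len(a) != len(b):
        return 0.0
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def refine(candidates: Candidates, initial: Selection, max_rounds: int = 10) -> Selection:
    """Repeatedly pick, per locus, the model most similar to the homologs chosen elsewhere."""
    chosen: Selection = {genome: dict(loci) for genome, loci in initial.items()}
    for _ in range(max_rounds):
        changed = False
        for genome, loci in candidates.items():
            for locus, models in loci.items():
                current = chosen[genome][locus]
                homologs = [
                    chosen[other][locus]
                    for other in chosen
                    if other != genome and locus in chosen[other]
                ]
                # Highest total similarity wins; on ties, keep the current model.
                best = max(
                    models,
                    key=lambda m: (sum(similarity(m, h) for h in homologs), m == current),
                )
                if best != current:
                    chosen[genome][locus] = best
                    changed = True
        if not changed:  # converged: no gene model changed in this round
            break
    return chosen


# Genome C initially has a single-exon model where A and B have a two-exon gene.
two_exon, one_exon = (120, 300), (420,)
candidates = {g: {"og1": [two_exon, one_exon]} for g in ("A", "B", "C")}
initial = {"A": {"og1": two_exon}, "B": {"og1": two_exon}, "C": {"og1": one_exon}}
refined = refine(candidates, initial)
```

In this toy case the model in genome C is replaced by the two-exon structure shared by A and B, and a second round confirms that nothing else changes. A real implementation would compare aligned gene structures rather than raw exon lengths.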

Results

For the basidiomycete Dichomitus squalens, reannotation using comparative modeling is compared with the initial JGI production annotation:

| Metric | JGI Annotation pipeline | Comparative modeling |
|---|---|---|
| Number of predicted gene models | 12,290 | 12,802 |
| with Swissprot hits | 7,356 | 7,900 |
| with non-repeat PFAM domains | 6,010 | 6,353 |
| with EST support | 10,796 | 11,105 |
| with >90% EST support | 9,178 | 9,444 |
| Number of unique PFAM domains | 2,245 | 2,322 |
| Average EST coverage per gene | 93.3% | 93.3% |
| Splice sites supported by ESTs | 102,200 | 104,246 |

 

References:

  1. Schiex T, Moisan A, Rouzé P. (2001) EuGène: an eukaryotic gene finder that combines several sources of evidence. In Computational Biology, selected papers from JOBIM 2000, LNCS no. 2066, Springer Verlag, pp. 118-133.
  2. Ter-Hovhannisyan V, Lomsadze A, Chernoff YO, Borodovsky M. (2008) Gene prediction in novel fungal genomes using an ab initio algorithm with unsupervised training. Genome Res. 18(12):1979-90.
  3. Solovyev V, Kosarev P, Seledsov I, Vorobyev D. (2006) Automatic annotation of eukaryotic genes, pseudogenes and promoters. Genome Biol. 7 Suppl 1:S10.1-12.
  4. Grigoriev IV, Martinez DA, Salamov AA. (2006) Fungal genomic annotation. In Applied Mycology and Biotechnology (Eds. Arora DK, Berka RM, Singh GB), Elsevier Press, Vol 6 (Bioinformatics), 123-142.
  5. http://genome.jgi.doe.gov/programs/fungi/FungalGenomeAnnotationSOP.pdf
  6. Cantarel BL, Korf I, Robb SM, Parra G, Ross E, Moore B, Holt C, Sánchez Alvarado A, Yandell M. (2008) MAKER: an easy-to-use annotation pipeline designed for emerging model organism genomes. Genome Res. 18(1):188-96.

 

Lawrence Berkeley National Lab Biosciences Area
A project of the US Department of Energy, Office of Science

JGI is a DOE Office of Science User Facility managed by Lawrence Berkeley National Laboratory

© 1997-2023 The Regents of the University of California