DOE Joint Genome Institute


March 22, 2021

Webinar: MycoCosm Tutorial

Learn more about JGI’s MycoCosm data portal, an interactive collection of sequenced fungal genomes, omics data, and comparative analysis tools.

Questions and Answers

1. Q(uestion): Do you accept oomycetes to be proposed for genome sequencing, or only “true fungi”?
A(nswer): Our focus is on fungi, algae, plants, and the microbial world, but we have also annotated several Phytophthora genomes. Whether we accept a proposal depends on the scientific questions asked and whether they fall within the DOE mission areas of bioenergy, carbon cycling, and biogeochemistry. This is a good opportunity to mention our sister portal, PhycoCosm (https://phycocosm.jgi.doe.gov), for exploring algae and other eukaryotes. The PhycoCosm tree has an Oomycetes node (https://phycocosm.jgi.doe.gov/oomycota) with several sequenced genomes.

2. Q: How exactly are EcoGroups built?
A: EcoGroups are built from what is known about each organism’s ecology, mostly from the literature and input from the user community. We can also create new EcoGroups; reach out to us.

3. Q: Can you use NCBI taxonomy information to search your interactive tree?
A: Sure. If you enter ‘Mucorales’ in the search box, you will see only that part of the tree (see the video for the full answer).

4. Q: Will it support custom columns in the future? For other things than assembly size or gene count? (In the interactive tree)
A: Sajeet is presenting some additional ways to compare various statistics across genomes (see video on comparative tools)

5. Q: Is there any programmatic access to the portal? Or any data and metadata dumps available to download?
A: We have multiple ways for programmatic access to MycoCosm. Please contact us for specific details.

6. Q: Is there a tool in this platform to search fungal genomes in metagenomes?
A: Search in metagenomes is available from another JGI portal, IMG: https://img.jgi.doe.gov

7. Q: And what about myxomycetes (Amoebozoa), would you consider them for sequencing?
A: We do have one amoebozoan, Dictyostelium purpureum, at https://genome.jgi.doe.gov/Dicpu1, but because it is neither a fungus nor related to known algal clades, it is not included in either MycoCosm or PhycoCosm. At the moment there is no DOE mandate to sequence more Amoebozoa, but we are open to a dialogue about a possible rationale.

8. Q: Is there any way to get the most recent tree deployed on MycoCosm, or the tree for specific groups?
A: We are still finalizing the tree and the related publication, so the tree remains in development and is not yet available for download.

9. Q: Can I upload my own sequences?
A: We can add external annotated fungal genomes; please contact us. For unannotated genomes, our focus is on JGI-sequenced genomes and on external genomes that enable JGI CSP projects. Please first contact us at ivgrigoriev@lbl.gov and then submit the metadata and assembly at https://gold.jgi.doe.gov

10. Q: Is there any possibility to search for genes/proteins by their commonly used names in publications? e.g. erg-1 (instead of the Gene ID etc., which is sometimes not easy to find out)
A: If these names were assigned during annotation or manual curation, they are searchable. For model fungi that were not sequenced by JGI and have been adopted by a large research community, we are trying to link to the community standards for gene naming.

11. Q: In the INFO browser JGI should give credit to the sample provider who are doing the sampling, strain isolation, RNA/DNA extractions and quantifications, and shipping.
A: Absolutely!

12. Q: Is it possible to use the Browse page tool to compare your own data with the JGI database?
A: You can use the custom tracks menu item to load data in standard formats, or contact us if you would like to share your data with other researchers.

13. Q: Is it possible to download predicted transcripts/proteins with other info such as their corresponding NCBI taxid?
A: All download file formats are predefined but can certainly be improved in future developments of the portal.

14. Q: Has there been any interest in using JBrowse / JB2 for genomic views (e.g., https://mycocosm.jgi.doe.gov/cgi-bin/browserLoad/?db=Xylhe1&position=scaffold_1:33902-34627)?
A: It is in our development plans.
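
The example URL in question 14 carries a genome identifier (db) and a region (position) as query parameters. As a minimal sketch, assuming that positions always follow the scaffold:start-end pattern seen in that URL, the region can be pulled apart with the Python standard library. The names Region, parse_region, and parse_browser_url are hypothetical helpers for illustration, not part of any JGI interface.

```python
import re
from typing import NamedTuple, Tuple
from urllib.parse import parse_qs, urlsplit


class Region(NamedTuple):
    scaffold: str
    start: int
    end: int


# Assumed pattern, inferred only from the single example URL: name:start-end
_REGION_RE = re.compile(r"^(?P<scaffold>[^:]+):(?P<start>\d+)-(?P<end>\d+)$")


def parse_region(text: str) -> Region:
    """Split a 'scaffold:start-end' string into its three parts."""
    match = _REGION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a scaffold:start-end region: {text!r}")
    start, end = int(match["start"]), int(match["end"])
    if start > end:
        raise ValueError(f"start {start} is greater than end {end}")
    return Region(match["scaffold"], start, end)


def parse_browser_url(url: str) -> Tuple[str, Region]:
    """Return the (db, region) pair carried in the URL query string."""
    query = parse_qs(urlsplit(url).query)
    return query["db"][0], parse_region(query["position"][0])
```

Calling parse_browser_url on the URL from the question yields the database name Xylhe1 and a region on scaffold_1 spanning 33902 to 34627.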

15. Q: What is the user annotation process (are these functional annotations) and how are these integrated within the platform?
A: Please contact the Principal Investigator of the corresponding genome project, as they may coordinate users in the community annotation process for their genomes. User annotations are highlighted in the GeneCatalog track and may be included in GenBank submissions.

16. Q: Is there any information on the promoter of genes predicted by JGI?
A: We have 1 kb DNA fragments upstream of predicted genes available from the download sections of the corresponding portals.
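
For readers who want to see what a 1 kb upstream fragment means in terms of coordinates, here is a minimal sketch. It is an illustration only, not the procedure JGI uses to build its download files. It assumes 1-based inclusive gene coordinates, a strand given as "+" or "-", and truncation at the scaffold ends; the function name upstream_fragment is hypothetical.

```python
# Complement table for DNA; N is kept as N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_fragment(scaffold_seq: str, gene_start: int, gene_end: int,
                      strand: str, length: int = 1000) -> str:
    """Return up to `length` bases upstream of a gene, read 5' to 3'.

    Coordinates are assumed to be 1-based and inclusive, with
    gene_start <= gene_end regardless of strand.
    """
    if strand == "+":
        # Upstream lies before the gene start on the forward strand.
        lo = max(0, gene_start - 1 - length)
        return scaffold_seq[lo:gene_start - 1]
    if strand == "-":
        # Upstream lies after the gene end; flip to the gene's orientation.
        hi = min(len(scaffold_seq), gene_end + length)
        return reverse_complement(scaffold_seq[gene_end:hi])
    raise ValueError(f"strand must be '+' or '-', got {strand!r}")
```

A gene close to a scaffold end simply gets a shorter fragment in this sketch; whether the portal files pad, truncate, or skip such genes is not stated in the answer above.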

17. Q: Is the mitochondrial page available only for Neurospora, or for any genome sequenced by JGI?
A: We have only recently added the capability to display mitochondrial genome pages on our portals, and we are gradually adding them for both JGI-sequenced and external genomes (where available).

18. Q: How do we find the parameters, values, and software packages JGI used to create these synteny analyses, in case we want to reproduce them or use the results in papers?
A: Grigoriev et al. (2014) MycoCosm portal: gearing up for 1000 fungal genomes. Nucleic Acids Res. 42(1):D699-704 describes the pipeline and portal. Some parameters used in the annotation pipeline are also described in Haridas S, Salamov A, & Grigoriev IV. (2018) Fungal Genome Annotation. In Methods in molecular biology 1775:171-184. Furthermore, the Annotation SOP, with additional details, is available from the main menu of MycoCosm and from the Fungal Program pages (https://jgi.doe.gov/our-science/science-programs/fungal-genomics/). You can also contact us for additional details.

19. Q: Is there any way to download all the synteny results for further use, or can we only view the results in the JGI portal?
A: The Synteny browser shows VISTA alignments, which are available in various formats from the VISTA Point pages linked from both the Synteny pages and the VISTA tracks on the Genome Browser.

20. Q: When displaying protein info, is there any way to look up the same ID in a different strain assembly of the same organism without having to BLAST the sequence in the other strain’s portal? For example, between Mucor CBS v2 and MU402 v1 (MU402 derives from CBS).
A: No, protein IDs are independent across portals. Older protein IDs are often searchable in newer versions of an annotation. In addition, tools such as the synteny browser are especially good at grouping similar genes in different strains or related species.

21. Q: How can we select & change for other sequenced genomes in the synteny browser?
A: Many comparative genomes are available in the pull-down menu. If a genome is not listed in the pull-down selector, its synteny has not been computed yet. Contact us and we can add it.

22. Q: How many genes have unknown function in fungal genomes?
A: The extent of functional annotation varies widely across the fungal kingdom. Some genomes, such as those of yeasts, have many well-characterized genes. In other groups, such as rusts, perhaps up to 70% of genes are uncharacterized.
We are working on a tool to encourage community support in characterizing conserved fungal gene families of unknown function across the entirety of MycoCosm. We took 18 million proteins from 1,300 fungal genomes in MycoCosm, clustered them, and present a subset of 142 of them that are conserved across large phylogenetic distances, have an unknown or poorly characterized function, and are supported by transcriptomics data. At https://mycocosm.jgi.doe.gov/conserved-clusters/run/run-2020 you can indicate either your progress, if you are already working on one of these, or your intent to carry out manual curation, experimental curation, or other characterization. Reach out to us if you are interested in participating in this type of initiative. (See the video for a short demo.)

23. Q: Is there a good way to download fasta files of a bunch of genes?
A: Specific genes from a single genome can be downloaded using the MCL or search page. To download from several genomes, use the download page of the group page (see the video for additional details).

24. Q: The download section contains a lot of genome information and is confusing. Are there any instructions explaining what each file represents?
A: Some information is available in the Help pages.

25. Q: Annotation tracks such as CAZymes, Peptidases, TF, and Transporters are great.
Q1. Are they all based on PFAM classification only, or also on other (citable) reference databases?
Q2. Is it possible to download these annotations, e.g. by adding a table with these category-specific PFAM annotation details to the download section?

A: Annotations are available for download on the portal. PFAM domains are assigned using InterProScan. Transcription factors are identified from the subset of PFAM domains known to be present in transcription factors. Peptidases and transporters are assigned using the MEROPS (https://www.ebi.ac.uk/merops/) and TCDB (http://tcdb.org/) databases, respectively. For secondary metabolism clusters, we have developed in-house tools similar to SMURF. For CAZymes, we have support from Bernard Henrissat’s lab at http://www.cazy.org/. They predict CAZymes for JGI-sequenced genomes, so there may be a delay between our release of a genome and the CAZyme data becoming available, but all of these predictions are curated by the CAZyme team.

26. Q: You mentioned that many of these imported genomes are filtered. What does this filtering entail? Which features do you remove or keep, and is it possible to see the original vs. filtered features?
A: Filtering of imported genomes is an attempt to make gene sets in these externally sequenced genomes more comparable with our internal ones which are filtered by default. In the process of filtering we remove things like TEs, pseudogenes, overlapping models, alternative transcripts, alleles, and unsupported short models. We do keep the original external model set with no changes and no modifications at all, and provide both of these model sets on the browser and downloads pages.

28. Q: Does it contain all genome sequences, including those from plants, etc.?
A: MycoCosm is a fungal genomics and multi-omics resource. JGI also has dedicated resources for plants (Phytozome), algae (PhycoCosm), and the microbial world (IMG).

29. Q: Does MycoCosm support diploid genomes that are now common with long read technology? and how do you present the different alleles?
A: Yes, MycoCosm can support diploid genomes. Our annotation pipeline can call scaffolds as “primary” and “secondary”, and designates alleles accordingly. These primary and secondary alleles are presented as distinct tracks on the genome browser, and provided as separate download files. In instances where fully phased haplotypes are known, we also have the capability to set up “super-portals” to organize the two haplotypes in a central location.

30. Q: Is it always better to use filtered models than non-filtered?
A: Filtering is an automated attempt to identify the best representative gene model per locus, based on transcriptomics and protein similarity support. These can be further curated by the user community and result in GeneCatalog. If a gene of interest is missing in the FilteredModels track you may want to search AllModels too.
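
To make the relationship between AllModels and FilteredModels concrete, here is a toy sketch of picking one representative model per locus. The scoring is entirely hypothetical: the answer above says only that selection is based on transcriptomics and protein similarity support, so the two numeric fields, their equal weighting, and every name in the code are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class GeneModel:
    model_id: str
    locus: str
    transcript_support: float  # hypothetical score between 0 and 1
    protein_support: float     # hypothetical score between 0 and 1


def support(model: GeneModel) -> float:
    """Hypothetical combined score: the two kinds of support, equally weighted."""
    return model.transcript_support + model.protein_support


def filter_models(all_models: Iterable[GeneModel]) -> Dict[str, GeneModel]:
    """Keep the best-supported model at each locus (first one wins ties)."""
    best: Dict[str, GeneModel] = {}
    for model in all_models:
        current = best.get(model.locus)
        if current is None or support(model) > support(current):
            best[model.locus] = model
    return best
```

In this sketch, any model that loses to a better-supported alternative at its locus is absent from the filtered set but still present in the full list, which mirrors the advice to search AllModels when a gene of interest is missing from FilteredModels.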

31. Q: Is there a way to track the JGI ID changes as genomes get updated?
A: In many improved/reannotated genomes old protein ids are searchable in newer versions, although not necessarily included in GeneCatalog if there are better alternative models.

32. Q: For Annual CSP – How many letters of support from the community are really needed?
A: There is no ‘magic’ number. Through a combination of co-PIs, collaborators, and letters of support you can communicate the scale of community interest in the data that you would like to produce.

33. Q: How long does it take usually to get the sequencing results back?
A: This depends on multiple factors such as product type, sample quality and quantity, any failures in sequencing process, etc. We do not just sequence but deliver a complete package including standard bioinformatics analyses. Average times for different JGI products can be found at https://jgi.doe.gov/our-science/product-offerings

34. Q: Are scaffolds labeled similarly for closely related organisms, such as different strains of the same species? For example, will scaffold 2 of one strain (e.g., Trichoderma reesei ver 2.0) be comparable to scaffold 2 of another strain (e.g., Trichoderma reesei Rut C-30)?
A: No, scaffolds are ordered and numbered by length, from largest to smallest. They do not match each other across assemblies, but the synteny browser can show which scaffolds correspond.
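
The length-based numbering described in this answer can be sketched in a few lines. This is an illustration under stated assumptions: the scaffold_N naming follows the example URL in question 14, while the input names, the tie-breaking rule, and the function number_scaffolds are hypothetical.

```python
from typing import Dict


def number_scaffolds(lengths: Dict[str, int]) -> Dict[str, str]:
    """Map scaffold_1, scaffold_2, ... to input sequences, longest first.

    Ties are broken by input name, which is an assumption made only
    to keep this sketch deterministic.
    """
    ordered = sorted(lengths.items(), key=lambda item: (-item[1], item[0]))
    return {f"scaffold_{rank}": name
            for rank, (name, _) in enumerate(ordered, start=1)}


# Two hypothetical assemblies of related strains. The sequence named
# "chrX" is the second longest in strain A but the longest in strain B.
strain_a = {"chrX": 700, "chrY": 900, "chrZ": 500}
strain_b = {"chrX": 950, "chrY": 900, "chrZ": 500}

numbering_a = number_scaffolds(strain_a)
numbering_b = number_scaffolds(strain_b)
```

Here scaffold_2 refers to chrX in the first assembly and to chrY in the second, which is why a scaffold number alone says nothing about correspondence between assemblies.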

35. Q: Will you include cytochrome P450 in your annotation tools?
A: Annotations for cytochrome P450 and other families are in development and should appear in future versions of MycoCosm.

36. Q: Perhaps it should be clear on the webpage what has been manually curated and what is automatically annotated?
A: Manually curated genes are highlighted in the GeneCatalog track on the Genome Browser. The corresponding protein pages also display User Annotations.

Filed Under: Blog, Webinars

A project of the US Department of Energy, Office of Science

JGI is a DOE Office of Science User Facility managed by Lawrence Berkeley National Laboratory

© 1997-2021 The Regents of the University of California