To provide broad plant genomic capabilities, the DOE Joint Genome Institute works in partnership with the HudsonAlpha Institute for Biotechnology, which specializes in genome improvement for plants. As part of that partnership, the JGI and HudsonAlpha researchers led a team that recently published work on the JGI Plant Gene Atlas in Nucleic Acids Research. The project spans 15 years and involves more than 17 research groups. The team set out to characterize the transcriptome – the complete set of RNA molecules transcribed from all the genes within a cell or tissue at a specific time or under specific conditions – in over a dozen plants.
To learn more, HudsonAlpha’s Research Communication Manager Sarah Sharman recently interviewed study first author Avinash Sreedasyam, a HudsonAlpha senior scientist, about the project. The story is reposted here in modified format; their conversation has been lightly edited and condensed below.
Sarah Sharman: What is the JGI Plant Gene Atlas?
Avinash Sreedasyam: The JGI Plant Gene Atlas is a large, updateable transcriptome resource spanning diverse plant species. It was developed to improve plant genome annotations at the US Department of Energy (DOE) Joint Genome Institute (JGI)*, a national user facility located at Lawrence Berkeley National Laboratory, and to add gene function descriptions. This resource also supports cross-species comparative transcriptomics.
*The HudsonAlpha Genome Sequencing Center works closely with scientists at the JGI. In fact, HudsonAlpha Faculty Investigator Jeremy Schmutz is the Plant Program Lead at the JGI.
Sarah: Why is a resource like this important to the field of plant genetics/genomics?
Avinash: Having a better handle on gene function helps identify the molecular targets for plant improvement. Surprisingly, about 16 to 56 percent of plant genes are poorly characterized, meaning they have no known function. This is due to the overreliance on a few species like Arabidopsis or rice as homology models for computational function predictions, and also to the inability to link experimental evidence across species. Centralized databases with large-scale transcriptome projects, such as Expression Atlas and PPRD, could help with understanding gene functional roles, but experimental inconsistency makes interpretation and integration across studies difficult. Our Plant Gene Atlas resource addresses that by providing standardized experimental conditions, tissue types, and analytical protocols that permit gene expression analysis across plants and add experimentally derived biological roles to genes.
Sarah: How many plants did you all look at?
Avinash: We started off with 12 plants, which are JGI Flagship Plants, mostly related to biofuels and feedstocks. Then we expanded that to include six more species, so in total we are looking at 18 different species, comprising over 2,000 RNA-seq libraries. As I previously mentioned, this is an updateable resource. To demonstrate that, we included datasets from two species: one is sweet sorghum Rio, from a JGI Community Science Program project, and the other is Lupinus albus, from a non-JGI project.
Sarah: I assume you were not doing this alone. How many groups were you working with?
Avinash: In the initial planning phase, there were about 10 groups that came together to standardize the experimental protocols. Then, in 2020, seven more groups joined the project to contribute data for six new species. I must mention that some members of the initial team contributed additional sample sets. Dr. John Mullet contributed sorghum internode time course data from four different genotypes. Dr. Tom Juenger and his postdoc Xiaoyu Weng from the University of Texas at Austin, who previously led work on Arabidopsis and Panicum hallii, contributed panicle time course data from two ecotypes of Panicum hallii, as well as data from multiple switchgrass (Panicum virgatum) experiments.
Sarah: How important is collaboration in the big data and plant science community?
Avinash: Having collaboration within the plant science community is of utmost importance. The scope of handling 18 different species is beyond the capacity of a single lab. Growing different species poses significant challenges, as establishing standardized growth protocols demands time, and subjecting them to diverse conditions is a time-consuming process. Successfully accomplishing this requires a diverse team of specialists, each contributing their expertise to different aspects of the project.
Sarah: You all hit a big milestone getting the manuscript describing the atlas published in Nucleic Acids Research. Does this mean the project is over, or will you continue to update it as more plants are studied?
Avinash: JGI puts huge effort into generating reference genomes for new species, which allows it to fill gaps in under-sampled areas of the plant phylogeny and improve genome annotations. For that undertaking, new transcriptome datasets are generated through Community Science Program funding calls. JGI has research funding grants aimed specifically at “Gene function” and “Functional genomics.” We will keep updating the Gene Atlas with curated datasets from CSP projects and aim to improve plant gene function descriptions.
Sarah: Have you all learned anything from the data yet? Are there any specific examples you can share?
Avinash: Yes, I will mention two here. The main purpose of this project is to understand gene function and add biological information. As I said earlier, 16 to 56 percent of plant genes are poorly characterized. So the first thing we aimed for was to understand the functions of genes across the investigated plants. We did so by analyzing this huge sample set, specifically using results from tissue- and condition-specific expression groups, differential expression, co-expression network analysis, and ortholog function descriptions from nearest phylogenetic neighbors. Our pipeline allowed us to add expression-derived biological information to an average of 40 percent of genes across Gene Atlas plants. Comparing orthologs among common gene sets between species allowed us to pinpoint and rank biologically relevant and evolutionarily conserved genes that could be potential future targets for functional genomic studies.
We also looked at the cross-species comparable study, where plants were given one of three compounds (urea, ammonium, or nitrate) as the sole nitrogen source. We looked at the plants’ responses in the aboveground and root tissues. The striking thing we found was related to tissue-specific gene expression variation within genotypes. The root transcriptome was more responsive than aboveground tissues in all studied plants except Arabidopsis. We also observed that, relative to urea, nitrogen- and amino acid-specific metabolic pathways were overrepresented in the nitrate-treated plants but not in the ammonium-treated plants. These results highlight differences in plants’ response to nitrate compared to ammonium as the sole nitrogen source at the metabolic level.
Sarah: Can just anyone use Plant Gene Atlas, or do you have to be a subscriber?
Avinash: It has been publicly available since 2018. There were more than 15 citations even before this was published, and people regularly contact Jeremy [Schmutz] or me to seek more information on the usage of this resource.
There are two different portals where this data is currently hosted. The first one is the JGI plant portal called Phytozome. There, you can query a single gene and look at the expression across the different tissues and conditions available for a species. You can also look at its co-expressed genes. It provides detailed functional annotations, protein homologs, plant family information, and a genome browser view of gene models.
The other one is the JGI Plant Gene Atlas, a dedicated portal where you can do bulk downloads of data and look at the expression of a single gene or multiple genes across species. For your genes of interest, you can look at their expression in the 17 other currently available species. You can also access the differentially expressed genes and visualize plots representing the GO and KEGG pathway enrichments. Detailed documentation about using this resource is included under the “Help” tab on the portal.
Sreedasyam A et al. JGI Plant Gene Atlas: an updateable transcriptome resource to improve functional gene descriptions across the plant kingdom. Nucleic Acids Res. 2023 Aug 1;gkad616. doi: 10.1093/nar/gkad616. Online ahead of print.