Research in the OMICs Analysis group is focused on:
- Developing state-of-the-art data processing and analysis pipelines for the interpretation of microbiome omics data
- Unraveling biological phenomena in microbes and metagenome communities through large-scale analysis of omics data
- Developing forward-looking strategies for deploying computational workflows at peta- and exascale on multicore and manycore architectures, with the ultimate objective of facilitating omics-based scientific investigations
Overall, the OMICs group owns and maintains integral components of JGI’s production workflow. It is responsible for the annotation and analysis of genomic, transcriptomic, and functional genomic data, and for serving these data to users via the Integrated Microbial Genomes (IMG) system (https://img.jgi.doe.gov).
Selected ongoing research projects and related software resources are described below.
Genomics-directed discovery of novel chemical scaffolds through the identification of novel secondary metabolite biosynthetic gene clusters
This project develops computational strategies that leverage sequence features, enzymatic classes, gene expression, and synteny to detect secondary metabolite (SM) biosynthetic gene clusters (BCs) encapsulating novel enzymatic activity and producing SMs with novel chemical scaffolds. A comprehensive web-based resource (IMG-ABC, Atlas of Biosynthetic gene Clusters) has been developed to support this effort. Computational biologists within this effort are also studying large-scale networks of BCs in various environments. Investigators: Michalis Hadjithomas, Natalia Ivanova.
Average Nucleotide Identity
As part of this project, pairwise average nucleotide identity (ANI) and the fraction of orthologous genomic regions (alignment fraction, AF) have been computed for nearly 28,000 bacterial and archaeal genomes. By clustering genomes based on their pairwise AF and ANI values, we were able to identify misassigned species names in genomes spanning nearly 18% of all existing species. Additionally, complete linkage clustering made it possible to confidently assign species to nearly 326 genomes. Analysis of cliques has also made it possible to identify speciation events within existing species. Within the JGI’s production pipeline, ANI is used to ascertain the species specificity of single cells and of genomes extracted from metagenomes, and also as a quality control metric within IMG. While ANI is integrated within IMG, http://ani.jgi-psf.org serves the current data. Investigators: Neha Varghese, Supratim Mukherjee, Nikos Kyrpides.
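The clustering idea described above can be illustrated with a small sketch. It is not the JGI pipeline: the cutoff values, genome names, and pairwise numbers below are invented for illustration, and the greedy merge is a simplified, threshold-based form of complete linkage in which two groups are joined only when every cross-group pair meets both cutoffs, so each resulting group is a clique.

```python
from itertools import combinations

# Hypothetical cutoffs, chosen only for illustration.
ANI_CUTOFF = 96.5  # percent nucleotide identity
AF_CUTOFF = 0.6    # alignment fraction


def passes(pair_stats, a, b):
    """Return True if genomes a and b meet both cutoffs; unknown pairs fail."""
    stats = pair_stats.get(frozenset((a, b)))
    if stats is None:
        return False
    ani, af = stats
    return ani >= ANI_CUTOFF and af >= AF_CUTOFF


def complete_linkage_clusters(genomes, pair_stats):
    """Greedy complete-linkage grouping.

    Two clusters are merged only if every cross-cluster pair meets both
    cutoffs, so every final cluster is a clique of mutually similar genomes.
    The result can depend on input order; this is a sketch, not a
    full hierarchical clustering.
    """
    clusters = [{g} for g in genomes]
    merged = True
    while merged:
        merged = False
        for i, j in combinations(range(len(clusters)), 2):
            if all(passes(pair_stats, a, b)
                   for a in clusters[i] for b in clusters[j]):
                clusters[i] |= clusters[j]
                del clusters[j]
                merged = True
                break  # cluster indices changed, so restart the scan
    return clusters


# Toy input: (ANI, AF) per genome pair. All values are invented.
pair_stats = {
    frozenset(("g1", "g2")): (98.7, 0.91),
    frozenset(("g2", "g3")): (97.2, 0.85),
    frozenset(("g1", "g3")): (97.9, 0.88),
    frozenset(("g3", "g4")): (82.4, 0.40),
}
clusters = complete_linkage_clusters(["g1", "g2", "g3", "g4"], pair_stats)
print(clusters)
```

With this toy input, g1, g2, and g3 end up in one group and g4 stays on its own. A genome whose assigned species name differs from the rest of its group would be a candidate for the kind of mis-assignment described above.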
|Torben Nielsen, Group Lead||Neha Varghese, Software Developer||Marcel Huntemann, Software Developer||Michalis Hadjithomas,
|Neha’s research mainly focuses on the use of genome sequences to delineate prokaryotic organisms at different levels of taxonomic classification, towards identification and implementation of objective and robust measures of genomic distance. She is also actively involved in expression analysis of transcriptomic and metatranscriptomic data, specifically exploring, benchmarking and implementing user-based RNAseq data analysis tools.||Marcel is in charge of several production pipelines (gene calling, functional annotation and methylomics) that run on microbial genomes and metagenomes. He also works on automating intra-department data exchange and assists with large-scale computations on R&D projects.
|Michalis studies biosynthetic gene clusters for secondary metabolites in genomes and metagenomes. He is developing computational approaches for prediction of biosynthetic clusters producing novel secondary metabolites. Michalis is also studying the similarity networks between such clusters in various environments.|
- Ovchinnikov S. et al. (2017) Protein structure determination using metagenome sequence data. Science 355(6322):294-298
- Paez-Espino D. et al. (2017) IMG/VR: a database of cultured and uncultured DNA Viruses and retroviruses. Nucleic Acids Res. 45(D1):D457-D465.
- Chen IA. et al. (2017) IMG/M: integrated genome and metagenome comparative data analysis system. Nucleic Acids Res. 45(D1):D507-D516.
- Paez-Espino D. et al. (2016) Uncovering Earth’s virome. Nature 536:425-30
- Chen IM. et al. (2016) Supporting community annotation and user collaboration in the integrated microbial genomes (IMG) system. BMC Genomics. 17:307
- Huntemann M. et al. (2016) The standard operating procedure of the DOE-JGI Metagenome Annotation Pipeline (MAP v.4). Stand Genomic Sci. 11:17
- Huntemann M. et al. (2015) The standard operating procedure of the DOE-JGI Microbial Genome Annotation Pipeline (MGAP v.4). Stand Genomic Sci. 10:86.
- Insights into secondary metabolism from a global analysis of prokaryotic biosynthetic gene clusters. Cell 158(2):412-21 (2014). doi: 10.1016/j.cell.2014.06.034.
- IMG/M: the integrated metagenome data management and comparative analysis system. Nucleic Acids Research 42 (Database-Issue): 568-573 (2014).
- IMG 4 version of the integrated microbial genomes comparative analysis system. Nucleic Acids Research 42 (Database-Issue): 560-567 (2014).