The central mission of the Genomic Technologies Department is to develop and effectively apply genomic technologies to accelerate DOE user science. Our primary product continues to be robust sample management and library construction coupled with state-of-the-art high-throughput DNA sequencing and analysis. In addition, our group pursues research and development activities to support future user needs. Examples include platform development in nanopore sequencing, single cell genomics, transcriptomics, epigenomics, DNA synthesis, and metabolomics. The Genomic Technologies Department is divided into four groups to effectively delegate activities among a complementary set of scientific managers and their associated staff.
1. Sequencing Technologies
The key function of the Sequencing Technologies Group is to apply cutting-edge molecular capabilities and state-of-the-art DNA sequencing to enable biological discovery by JGI and external DOE users. We work closely with the Project Management Office to make our genomic capabilities efficiently and readily applicable to all of our user projects. We are also committed to continuous research and development on new sequencing technologies, both to stay competitive and to facilitate innovative scientific methods that benefit the DOE research community.
The Sequencing Technologies Group is organized into three highly interactive teams that reflect the natural workflow of the sequencing process. The Sample Management team is responsible for receiving user-submitted nucleic acid (RNA and DNA) samples, assessing quantity and quality, and generating aliquots for subsequent library creation. Aliquoted samples are then passed to either the Library Construction or Automated Library team to generate one of the 30 unique sequencing library types produced at the JGI. The Library Construction team focuses on challenging library types (e.g., low input, high molecular weight PacBio, ChIP-seq) that require individual care to ensure a high success rate. The Automated Library team uses end-to-end robotic protocols for high-throughput production of DNA-seq and RNA-seq library types. After library construction, the Sequencing Platform team performs quantitative PCR to determine library quality and quantity for optimal library pooling. Pooled libraries are then run on the appropriate 2nd and 3rd generation instruments to generate sequencing data. Additionally, the group includes team members who work closely with the PMO and Operations teams to ensure a smooth workflow, optimized systems, and rapid resolution of process issues that may emerge. Finally, we actively cross-train individuals across the teams to provide better coverage of duties when key personnel are temporarily absent, and to foster cross-group communication and education.
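As an illustration of how qPCR quantities can inform pooling, the following is a minimal sketch of equimolar pooling arithmetic. It is not the JGI's actual procedure: the function name, the nM units, and the example concentrations are all assumptions made for illustration. The idea is that each library's volume is set inversely proportional to its measured concentration, so every library contributes the same number of moles to the pool.

```python
def equimolar_volumes(conc_nm, pool_volume_ul):
    """Split pool_volume_ul among libraries so each contributes equal moles.

    conc_nm: dict mapping a (hypothetical) library name to its
             qPCR-derived concentration in nM.
    Returns a dict mapping library name to the volume to add, in uL.
    """
    if not conc_nm:
        raise ValueError("no libraries to pool")
    if any(c <= 0 for c in conc_nm.values()):
        raise ValueError("concentrations must be positive")
    # Moles contributed = concentration * volume, so for equal moles the
    # volume of each library must scale as 1 / concentration.
    weights = {name: 1.0 / c for name, c in conc_nm.items()}
    total = sum(weights.values())
    return {name: pool_volume_ul * w / total for name, w in weights.items()}


if __name__ == "__main__":
    # Illustrative values only, not real measurements.
    volumes = equimolar_volumes({"lib_A": 2.0, "lib_B": 4.0}, pool_volume_ul=30.0)
    for name, vol in volumes.items():
        print(f"{name}: {vol:.1f} uL")
```

In this sketch the more dilute library receives proportionally more volume, and the volumes always sum to the requested pool volume.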
In addition to production sequencing responsibilities, team members also participate in R&D projects to improve workflows, decrease costs, and expand capabilities at the JGI. Members from all three production groups participate in one of three R&D working groups: Short Read, Long Read, and Chromatin Technologies. The Short Read and Long Read groups focus on optimizing existing methods and testing new approaches for the Illumina and PacBio/Oxford Nanopore platforms, respectively. The Chromatin Technologies group focuses on leveraging sequencing technologies to interrogate the structural and regulatory properties of prokaryotic and eukaryotic genomes and epigenomes. These three working groups are composed of members from across the Sequencing Technologies Group to take advantage of expertise across all stages of sequencing, and to further increase communication between the teams.
2. Institutional Informatics
The Institutional Informatics group maintains and extends the software systems used to carry out User Projects. This includes support of proposal management and project specification by the Project Management Office, lab work by production sequencing, post-sequencing automated data processing and analysis, and managing the availability of data on portals and GenBank. Collectively, this system is known as ITS (the Integrated Tracking System). The Institutional Informatics group is organized into six key subgroups, described below:
Project Management Office Services (PMOS): This group is responsible for supporting the software systems used by the Project Management Office (PMO), our users (to access web resources at the JGI, including for proposal submission), and JGI analysts (to manage automated analysis tasks). These systems include the Proposal Management System, Project Navigator, Single Sign On and Analysis Projects Management.
Production Process Support (PPS): This team maintains and extends the software systems used by the Sequencing Technologies Group. This includes Sample Management, Libraries, and Sequencing Operations. In the past two years, the JGI has released an updated Integrated Tracking System (ITS), and the LIMS supported by this group is one part of this larger system.
Sequence Data Management (SDM): The Data Management System manages all data produced from our DNA sequencing platforms and analysis outputs. Vendor-provided pipelines are wrapped with in-house software to allow for integration within the JGI’s larger infrastructure. All analyzed and raw data are stored within our in-house built hierarchical file system called JAMO. All users access the JAMO tool chain to retrieve data for downstream processing.
Data Warehouse: Since various teams maintain their own data stores, it is not possible to query across the systems without a higher-level schema map. To facilitate this, a Data Warehouse is built that performs the necessary mapping across schemas and de-normalizes the data for fast read queries. Preconfigured reports are built using the open source version of Jaspersoft or the commercial Tableau reporting tool, and are available for users to run on demand. Ad hoc queries can be built against the data warehouse using SQL or Tableau’s report builder.
Software Quality Assurance (SQA): The SQA team is responsible for testing ITS releases and ensuring that requirements are met before their release. They perform end-to-end integration tests, and when problems are found they are raised with the developers and users. This team also responds to user support requests that arise in the production environment.
Genome Portal: The JGI Genome Portal provides unified access to all of the JGI genomic datasets. A user can search, download, and explore the datasets available for all JGI sequencing projects, including their status and the assemblies and annotations of sequenced genomes. The framework uses access control to restrict the visibility of a portal to the PI and associates while work is in progress, allowing them to track progress. Once the project is completed and released to the public, the access restriction is lifted.
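The schema mapping and de-normalization described for the Data Warehouse above can be sketched as follows. This is a minimal illustration using Python's built-in `sqlite3` module, not the JGI's actual schema or tooling: the table names (`lims_library`, `seq_run`, `dw_run_report`), the column names, and the sample rows are all hypothetical. Two source stores name the same library key differently; the warehouse table records that mapping once in a join and stores the flattened result so that reports can read a single table.

```python
import sqlite3


def build_warehouse():
    """Create two hypothetical source tables and one de-normalized report table."""
    con = sqlite3.connect(":memory:")
    cur = con.cursor()
    # Hypothetical source store 1: library records keyed by lib_id.
    cur.execute(
        "CREATE TABLE lims_library (lib_id TEXT PRIMARY KEY, sample_ref TEXT, lib_type TEXT)"
    )
    # Hypothetical source store 2: run records that call the same key library_name.
    cur.execute(
        "CREATE TABLE seq_run (run_id INTEGER PRIMARY KEY, library_name TEXT, platform TEXT)"
    )
    cur.executemany(
        "INSERT INTO lims_library VALUES (?, ?, ?)",
        [("LIB1", "S1", "RNA-seq"), ("LIB2", "S2", "DNA-seq")],
    )
    cur.executemany(
        "INSERT INTO seq_run VALUES (?, ?, ?)",
        [(1, "LIB1", "Illumina"), (2, "LIB2", "Illumina"), (3, "LIB2", "PacBio")],
    )
    # Schema map: seq_run.library_name corresponds to lims_library.lib_id.
    # The join result is materialized so read queries need no further joins.
    cur.execute(
        """
        CREATE TABLE dw_run_report AS
        SELECT r.run_id, r.platform, l.lib_id, l.lib_type, l.sample_ref
        FROM seq_run AS r
        JOIN lims_library AS l ON l.lib_id = r.library_name
        """
    )
    con.commit()
    return con


def runs_per_platform(con):
    """Example ad hoc SQL query against the de-normalized table."""
    return con.execute(
        "SELECT platform, COUNT(*) FROM dw_run_report GROUP BY platform ORDER BY platform"
    ).fetchall()


if __name__ == "__main__":
    print(runs_per_platform(build_warehouse()))
```

The trade-off in a design like this is that the flattened table duplicates library attributes on every run row, which costs storage and requires a refresh when sources change, in exchange for simpler and faster read queries.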
3. Production Analysis
The primary mission of the Production Analysis Group is to support user programs and production by providing a suite of quality checks and standardized analyses for the JGI’s large variety of sequencing platforms and products. The general responsibilities of the Production Analysis group are broad in scope and customer base. Customers include but are not limited to a) the JGI internal laboratory groups conducting process development experiments, b) the JGI Scientific Programs in need of standard analysis and new capabilities, c) internal and external users requiring custom analysis, and d) internal informatics groups requiring sharing of data and information across groups. One important ongoing duty is continuous data processing for the approximately 70 standard analysis products offered by the JGI to users, many through automated pipelines. The group has five subgroups:
Automation: The Automation Group’s primary responsibility is to develop, maintain and improve the 25 automated analysis pipelines that continually process most of the JGI standard product data. Furthermore, the group is responsible for producing, from the raw sequence data file, the fundamental sequence “filtered” file which is the primary file used as input for any downstream analysis across most groups. The framework that runs these analysis pipelines is called Rolling QC (RQC). As part of the larger data tracking and analysis process at the JGI, the Automation group responsibilities include communicating with upstream and downstream groups to acquire, track, report on, and deposit data (Figure 55).
QA/QC: The central goal of the Quality Control and Assurance group (QA/QC) is to ensure that the Genomic Technologies Department delivers timely and high quality data to the JGI’s scientific programs, users, and partner institutions. To this end, the group provides laboratory quality control as well as data analysis and pipeline automation support for sequence generated by the Sequencing Technologies Group. Responsibilities include: a) validation and monitoring of sequence data, b) troubleshooting, c) laboratory process improvement support, and d) microbial program sequence data standard analysis (minimal drafts, improved drafts, single cell (SAG), cell enrichment (CE), single particle sort (SPS), and iTags).
Genome Assembly: The primary role of the Genome Assembly group is to perform de novo genome assemblies. This includes production scaling, programmatic support, and process improvement. The JGI programs supported (and their products) include: Metagenome Program (metagenome and metatranscriptome QC and assembly), Fungal Program (fungal standard, minimal and improved drafts), and Microbial Program (RNA data processing). The Genome Assembly group also works closely with the Sequencing Technologies Group on platform validation and process improvement by performing experimental design and data analysis. Internal process improvement efforts focus on identifying efficient tools and methods to reduce analysis time and increase throughput; new tools are integrated into our pipelines after extensive benchmarking and validation. Finally, the Genome Assembly group works closely with the JGI science programs, and frequent check-in meetings provide a forum to gather user and program suggestions for improvements in methods and reporting.
User Support Analysis: The User Support Analysis (USA) group provides computational analysis for new technology development and JGI user science projects. We perform analysis for diverse experiment types, with main focus areas being genome variation, transcriptomics and epigenomics. Our work varies in scope from the development of highly customized analysis workflows for new product types, to the development of automated analysis pipelines for standard sequencing products performed at scale.
Research & Development Analysis: The R&D group has the following responsibilities: 1) to propose or identify new tools, techniques, methods and technology that create new capabilities for Scientific Programs, 2) to develop and test projects to determine what is feasible and scalable, and 3) for successful projects, to work with User Support and/or the Automation group to develop automated, production-ready versions.
4. Functional Genomics
Adding functional information to digital sequences has been the major strategic growth area for the JGI in recent years. Several pilot efforts have been launched to explore new genomics technologies for their utility and feasibility to scale. The Functional Genomics Group, established in 2011, was tasked with expanding two areas that have attracted significant interest from the scientific community. The first area is to develop single cell genomics technologies to explore the hidden genomes of the microbial world. The second area is to develop a synthetic DNA assembly pipeline to express and characterize genes or pathways of interest. More recently we have focused on a third area, metabolomics, which has also attracted strong user interest. In all three areas we have hired production staff, developed production SOPs, and offered these capabilities as services to JGI users. An overview of these three groups is provided below, followed by an in-depth description of their operations. These include:
- The Microscale Applications Group has been tasked with isolating single cells from complex environmental samples and enzymatically amplifying single cell DNA or RNA to a sufficient quantity so that their genomes or transcriptomes can be determined by DNA sequencing. This capability has allowed JGI users to explore genomic and transcriptomic information from organisms that is otherwise not attainable. This group has now evolved to provide additional services, including targeted sequencing of translationally active single cells or mini-metagenomes, and isolating individual mutant strains from Tn-seq libraries for users to interrogate specific gene function.
- The DNA Synthesis Group has been tasked with helping users generate synthetic DNA constructs for downstream functional interrogation. We differentiate ourselves from commercial vendors in that we offer sequence data mining, gene or pathway design, DNA synthesis, expression host selection, and limited analytic capability as a whole service package. The goal is to provide JGI investigators access to not only DNA synthesis, but also expertise in sequence data analysis and successful design of genes and operons for downstream functional studies. Since the JGI started offering DNA synthesis as a service to users in 2012, we have seen a continuous increase in demand. Another promising new capability is high-throughput host engineering. In collaboration with the DNA Synthesis Science Program, we have developed a landing pad technology to enable insertion of foreign DNA into many bacterial genomes and express genes heterologously in non-model species where genetic tools are not well developed. A pilot effort designed to couple this landing pad technology with CRISPR-Cas9 technology was recently launched. This newly developed technology enables us to up- and down-regulate gene expression at whole-genome scale to investigate gene function in many organisms by employing high-throughput phenotyping technologies.
- The Metabolomic Technologies Group now supports four products that give JGI users diverse technologies for functional genomic characterization. The first two products, polar and non-polar metabolite analysis, were released in 2016. These are now widely used for analysis of polar metabolites within or outside cells, typically to examine the catabolic capabilities of organisms and cross-feeding between plants and microbes. Secondary metabolite analysis is the most common application of the non-polar metabolomic analysis product and is the most rapidly growing. Often this analysis is performed in conjunction with transcriptomics to identify compounds that mediate microbial interactions, or with DNA synthesis to discover novel biosynthetic clusters.