The central mission of the Genomic Technologies Department is to develop and effectively apply genomic technologies to accelerate DOE users’ science. Our primary product continues to be robust sample management and library construction coupled to state-of-the-art high-throughput DNA sequencing and analysis. Our group has also diversified to explore new technological opportunities for our user services (including single cell genomics, epigenomics and DNA synthesis). We work closely with the Project Management Office, which initiates projects and communicates about them with external and internal scientific users. The Genomic Technologies Department is divided into five groups to delegate activities effectively among a complementary set of scientific managers and their associated staff.
1. Sequencing Technologies
The key function of the Sequencing Technologies Group is to apply cutting-edge molecular capabilities and state-of-the-art DNA sequencing to enable biological discovery by JGI and external DOE users. We actively engage with the Project Management Office to make our genomic capabilities efficiently and readily applicable to all of our user projects. We are also committed to continuous research and development on new sequencing technologies, both to stay competitive and to facilitate innovative scientific methods that benefit the DOE research community.
To efficiently integrate front-end sample processing with our library construction capabilities and high-throughput sequencing platforms, we are organized into three functionally discrete but highly interactive units: nucleic acid-based Sample Management, large-scale, high-quality Library Construction, and various 2nd and 3rd generation DNA Sequencing Platforms. Through this integrated operation, we partner with all the JGI internal research programs and reach out to the external user community to perform sequencing-based analysis for genome (DNA) and transcriptome (RNA) characterization. To ensure operational efficiency and accountability, our team leads frequently report project cycle times, failure rates, instrument utilization and sequencing output to the Project Management Office and Operations Department so that workflows are managed properly.
2. Sequencing Informatics
The Institutional Informatics group maintains and extends the software systems required to support proposal management and project specification by the Project Management Office, lab work by production sequencing, automated post-sequencing primary data processing and analysis, and publication of data to portals and GenBank. The Institutional Informatics group is organized into six key subgroups:
Project Management Office Services (PMOS): This group is responsible for supporting the software systems used by the Project Management Office (PMO), and a single sign on system for users to access web resources at the JGI. These systems include the Proposal Management System, Project Navigator, and Single Sign On.
Production Process Support (PPS): This team maintains and extends the software systems used by the Sequencing Technologies Group. These include Sample Management, Libraries, and Sequencing Operations. In the past two years, the JGI has released a new Integrated Tracking System (ITS), and the LIMS supported by this team is one part of that larger system.
Sequence Data Management (SDM): This team manages all data produced from our DNA sequencing platforms. Vendor-provided pipelines are wrapped with in-house software to allow for integration within the JGI’s larger infrastructure. All analyzed and raw data are stored in JAMO, a hierarchical file system built in-house. All users access the JAMO tool chain to retrieve data for downstream processing.
Data Warehouse: Since various teams maintain their own data stores, it is not possible to query across the systems without a higher-level schema map. To provide this, a data warehouse is built nightly that performs the necessary mapping across schemas and denormalizes the data for fast read queries. Preconfigured reports are built using the open source version of Jaspersoft and are available for users to run on demand. Ad hoc queries can be made against the data warehouse using SQL.
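As a rough illustration of the nightly build described above, the sketch below joins two separately maintained tables into one denormalized reporting table so that read queries need no joins. It uses Python's built-in sqlite3 module as a stand-in; the table names (pmo_projects, lims_libraries, wh_library_report), the columns and the sample rows are hypothetical and do not reflect the actual JGI schemas or warehouse technology.

```python
import sqlite3


def build_warehouse(conn: sqlite3.Connection) -> None:
    """Rebuild a denormalized reporting table from two source tables.

    All table and column names here are hypothetical placeholders.
    """
    cur = conn.cursor()
    # A nightly rebuild can simply drop and recreate the flattened table.
    cur.execute("DROP TABLE IF EXISTS wh_library_report")
    cur.execute(
        """
        CREATE TABLE wh_library_report AS
        SELECT p.project_id, p.project_name,
               l.library_id, l.library_type, l.status
        FROM pmo_projects AS p
        JOIN lims_libraries AS l ON l.project_ref = p.project_id
        """
    )
    conn.commit()


def demo() -> list:
    """Create toy source tables, build the warehouse, run an ad hoc query."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE pmo_projects (
            project_id INTEGER PRIMARY KEY, project_name TEXT);
        CREATE TABLE lims_libraries (
            library_id TEXT PRIMARY KEY, project_ref INTEGER,
            library_type TEXT, status TEXT);
        INSERT INTO pmo_projects VALUES
            (1, 'Soil metagenome'), (2, 'Fungal isolate');
        INSERT INTO lims_libraries VALUES
            ('LIB-A', 1, 'fragment', 'sequenced'),
            ('LIB-B', 1, 'fragment', 'failed'),
            ('LIB-C', 2, 'long-read', 'sequenced');
        """
    )
    build_warehouse(conn)
    # Ad hoc SQL against the flattened table: no join needed at read time.
    rows = conn.execute(
        """
        SELECT project_name, COUNT(*)
        FROM wh_library_report
        WHERE status = 'sequenced'
        GROUP BY project_name
        ORDER BY project_name
        """
    ).fetchall()
    conn.close()
    return rows
```

Calling demo() returns the number of sequenced libraries per project from the flattened table. Dropping and rebuilding the table each night trades freshness for simplicity; an incremental load would be a reasonable alternative if the source tables were large.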
Software Quality Assurance (SQA): The SQA team is responsible for testing ITS releases and ensuring that requirements are met before release. The team performs end-to-end integration tests and raises any problems it finds with the developers and users. This team also responds to user support requests that arise in the production environment.
Genome Portal: The JGI Genome Portal provides unified access to all JGI genomic datasets. A user can search, download and explore multiple data sets available for all DOE JGI sequencing projects, including their status and the assemblies and annotations of sequenced genomes. The framework uses access control to restrict the visibility of a portal to the PI and associates while work is in progress, allowing them to track progress. Once the project is completed and released to the public, the access restriction is lifted.
Certain functions of these systems depend heavily on one another. To avoid unnecessarily tight coupling yet support the dependencies, we have implemented RESTful web services where performance permitted. This has given each team the flexibility to address the evolving business needs of its customer base while allowing other parts of the organization to access the required data through formal, supported interfaces.
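The Genome Portal visibility rule described above (restricted to the PI and associates while work is in progress, open to everyone after public release) can be sketched as a small predicate. This is only an illustration under assumed names: the Portal class, its fields and the can_view function are hypothetical and are not the portal's actual data model or API.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Portal:
    """Hypothetical record for one project portal."""
    name: str
    pi: str                                   # principal investigator's user name
    associates: Set[str] = field(default_factory=set)
    released: bool = False                    # True once released to the public


def can_view(portal: Portal, user: Optional[str] = None) -> bool:
    """Return True if the given user (None = anonymous) may see the portal."""
    if portal.released:
        # After public release the restriction is lifted for everyone.
        return True
    # While work is in progress, only the PI and associates may view it.
    return user is not None and (user == portal.pi or user in portal.associates)
```

For example, with portal = Portal("demo", pi="pi_user", associates={"postdoc"}), can_view(portal, "postdoc") is True and can_view(portal, "stranger") is False until portal.released is set to True.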
3. Quality Control & Assembly
We perform sequence QA analyses, which function as a checkpoint to measure progress against goals prescribed by the Project Management Office at the beginning of each project. Sequence QA analyses result in recommendations to the sequencing coordinator and project managers on the data types and sequencing effort needed to reach project goals. Since we are the first to analyze data coming off the JGI’s sequencers, we are the first to recognize and provide feedback on quality issues that could affect our ability to complete projects successfully. In addition to supporting JGI’s programmatic microbial, fungal, plant, and metagenome projects, we provide ad hoc data analysis support for laboratory R&D, process improvement, and new product development projects for internal laboratory scientists.
The central goal of the Quality Control and Assurance group is to ensure that the Genomic Technologies Department delivers high quality data to the JGI’s scientific programs, users, and partner institutions. To this end, the group provides laboratory quality control as well as bioinformatic data analysis and pipeline automation support for sequence data generated by the Sequencing Technologies Group.
We developed tools to monitor the performance and output of JGI’s sequencing operation. The tools are used to monitor quality trends and help identify processes that have become unstable or have drifted out of specification. The control chart interface has been expanded to support interactive querying over date range, product type, platform, sequencing run mode, library type and other dimensions.
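To make the idea of a control chart concrete, the following minimal sketch computes control limits from a baseline period, flags later measurements that fall outside them, and filters run records by dimensions such as platform or library type. It is an assumption-laden illustration, not the JGI tooling: the function names, the record fields and the choice of mean plus or minus three sample standard deviations are all hypothetical (production charts often estimate spread from moving ranges instead).

```python
from statistics import mean, stdev


def control_limits(baseline, n_sigma=3.0):
    """Return (lower, center, upper) limits from baseline measurements.

    Uses the sample standard deviation as a simple spread estimate.
    """
    center = mean(baseline)
    spread = stdev(baseline)
    return center - n_sigma * spread, center, center + n_sigma * spread


def out_of_control(values, limits):
    """Return the indices of values falling outside the control limits."""
    lower, _, upper = limits
    return [i for i, v in enumerate(values) if v < lower or v > upper]


def filter_runs(runs, **criteria):
    """Keep run records (dicts) whose fields match every given criterion.

    Example criteria: platform="A", library_type="fragment".
    """
    return [r for r in runs if all(r.get(k) == v for k, v in criteria.items())]


if __name__ == "__main__":
    # Hypothetical per-run quality scores from a stable baseline period.
    limits = control_limits([30.0, 31.0, 29.0, 30.0, 30.0])
    # Later runs: report which ones drifted out of specification.
    print(out_of_control([30.5, 27.0, 31.9, 33.0], limits))
```

In an interactive setting, filter_runs would be applied first to narrow the records by date range, platform or library type, and the limits would then be computed per subset so that different products are not judged against each other's baselines.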
4. Genome Analysis
The goal of the Genome Analysis group is to perform custom R&D to develop new analysis capabilities for JGI users. The team also assesses the data being generated to enable deployment of new molecular methods and new sequencing platforms. The group consists of two teams: a research and development team and a user scientific support team.
Our responsibilities lie in three major areas. First, we provide support for the department to adopt new sequencing technologies. This includes analysis support for new experimental protocol development and support for new sequencing instruments (such as Pacific Biosciences). Second, we provide data analysis as an evolving service to internal and external users to deliver on JGI products. This includes implementing or developing data analysis workflows and software algorithms, researching and developing tools that can take advantage of the latest high performance computing capabilities, and engineering an integrated data analysis framework to streamline our data analysis service. Finally, we actively participate in JGI’s ongoing flagship and grand challenge projects.
5. Functional Genomics
Adding functional information to digital sequences generated by the JGI is a major strategic growth area for our institute. Under this framework, two major programs have been established. These include:
- The Microscale Applications Group was formed in 2011 and tasked with isolating single cells from complex environmental samples and enzymatically amplifying DNA or RNA from those cells to a quantity sufficient for their genomes or transcriptomes to be determined by DNA sequencing. This capability has allowed JGI users to explore genomic and transcriptomic information from organisms that are otherwise not reachable.
- The Synthetic Biology Group was also formed in 2011 and is responsible for building synthetic DNA constructs. We differentiate ourselves from commercial vendors by offering sequence data mining, gene or pathway structure design, DNA synthesis, limited host selection, and limited analytic capability as a complete service package. The goal is to provide DOE investigators access to expertise in sequence data analysis and design of DNA substrates for downstream functional studies. Since JGI started offering DNA synthesis as a service to users in 2012, demand has increased.