WALNUT CREEK, CA—The Department of Energy Joint Genome Institute (JGI) and the National Energy Research Scientific Computing Center (NERSC) Division have joined forces to create a more robust computational infrastructure for the world’s leading generator of DNA sequence information for bioenergy and environmental applications.
“We evaluated a wide variety of options,” said Vito Mangiardi, JGI’s Deputy Director for Business Operations & Production. “After a thorough review, it made perfect sense to partner with NERSC.”
The NERSC Division will take on six JGI staff to consolidate and address the Institute’s ever-escalating high-performance and scientific computing needs, as well as computer and network security and instrumentation computer systems. Desktop support services will remain under local control at the JGI.
While NERSC has been providing data storage services to the JGI for years, this new partnership represents a more comprehensive and integrated approach. “With NERSC’s significant experience in providing these kinds of services, they don’t have the steep learning curve to climb,” said Mangiardi. “Under this arrangement the JGI will receive 24/7/365 coverage, which minimizes the amount of disruption and provides us with much better backup.”
In the last year alone, JGI generated over one terabase (one trillion nucleotide letters of genetic code) from its plant, microbe, fungal and metagenome (community of microbes) projects for its various user programs. This total represents an eightfold productivity increase over the previous year, with a further fivefold increase anticipated this year. To accommodate the “Niagara” torrent of data flowing from the advanced sequencing platforms that the JGI has incorporated into its production process over the last two years, the Institute went searching for help.
“We simply didn’t have the capacity to keep up,” Mangiardi said.
NERSC represented a clear choice.
“This represents an exciting coming of age for the combined genomics and high performance computing enterprise,” said NERSC Division Director Kathy Yelick. “NERSC has a successful track record in responding to these kinds of challenges. This move will bring to NERSC the domain-specific knowledge within the JGI team about the genomics requirements, which is quite different from NERSC’s more traditional HPC workload. The science is heavy in data analytics and JGI runs several dozen databases and web portals providing access to genomic information. Data-centric computing is an important future direction for NERSC, and genomics is seeing exponential increases in computation and storage.”
NERSC will provide a high-quality, high-efficiency environment by marshaling such resources as:
- A dedicated 10 Gbps (gigabits per second) link between the two institutions, deployed by the engineers at the Energy Sciences Network (ESnet) on its high-bandwidth Science Data Network (SDN), which is designed to haul massive scientific datasets such as those from The Great Prairie Soil Metagenomes project that JGI is piloting for the DOE Grand Challenge program;
- Important infrastructure features such as redundant cooling, seismic isolation and a pre-action fire sprinkler that employs a dry pipe system tailored for use in locations where accidental activation is undesirable;
- An Uninterruptible Power Supply to maintain data continuity and integrity;
- Extensive environmental and energy-usage monitoring; and
- A central help desk function.
“These are critically important assets that we can now bring to bear on some of the most complex questions in biology,” said JGI Director Eddy Rubin. “We really need the massive computational ‘horsepower’ and supporting infrastructure that NERSC offers to help us advance our understanding of the carbon cycle and many of the other biogeochemical processes in which microbes play a starring role and that the JGI is characterizing in a massively parallel fashion.”