Accomplishments at a Glance
Below are brief accounts of some of the scientific collaborations that came out of the JGI in 2023, along with highlights of our outreach efforts.
Impact: By the Numbers
Spending Profile
Users on the Global Map
| Region | Count | Region | Count | Region | Count | Region | Count |
|---|---|---|---|---|---|---|---|
| North America | 1,694 | Asia | 82 | Europe | 479 | | |
| United States | 1,592 | China | 24 | Austria | 13 | Norway | 18 |
| Canada | 95 | India | 10 | Belgium | 17 | Poland | 2 |
| Mexico | 7 | Israel | 9 | Croatia | 1 | Portugal | 7 |
| | | Japan | 29 | Czech Republic | 13 | Russia | 5 |
| South America | 34 | Malaysia | 1 | Denmark | 17 | Serbia | 2 |
| Argentina | 1 | Singapore | 3 | Estonia | 2 | Slovenia | 2 |
| Brazil | 23 | South Korea | 6 | Finland | 13 | Spain | 43 |
| Chile | 2 | | | France | 62 | Sweden | 28 |
| Colombia | 1 | Africa | 14 | Germany | 86 | Switzerland | 12 |
| Ecuador | 1 | Morocco | 1 | Greece | 10 | Turkey | 1 |
| Peru | 1 | South Africa | 12 | Hungary | 10 | United Kingdom | 60 |
| Uruguay | 5 | Tunisia | 1 | Iceland | 1 | | |
| | | | | Ireland | 3 | Australia + New Zealand | 70 |
| | | | | Italy | 22 | Australia | 53 |
| | | | | Netherlands | 29 | New Zealand | 17 |
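
The regional totals in the table can be cross-checked against the per-country counts. The sketch below is only an illustration of that arithmetic, not a JGI tool: the figures are transcribed from the table, and the names `COUNTRY_COUNTS`, `REPORTED_TOTALS`, and `regional_totals` are assumptions made for this example.

```python
# User counts per country, transcribed from the table above and grouped by region.
# The dictionary layout and all names here are illustrative assumptions.
COUNTRY_COUNTS = {
    "North America": {"United States": 1592, "Canada": 95, "Mexico": 7},
    "South America": {
        "Argentina": 1, "Brazil": 23, "Chile": 2, "Colombia": 1,
        "Ecuador": 1, "Peru": 1, "Uruguay": 5,
    },
    "Asia": {
        "China": 24, "India": 10, "Israel": 9, "Japan": 29,
        "Malaysia": 1, "Singapore": 3, "South Korea": 6,
    },
    "Africa": {"Morocco": 1, "South Africa": 12, "Tunisia": 1},
    "Europe": {
        "Austria": 13, "Belgium": 17, "Croatia": 1, "Czech Republic": 13,
        "Denmark": 17, "Estonia": 2, "Finland": 13, "France": 62,
        "Germany": 86, "Greece": 10, "Hungary": 10, "Iceland": 1,
        "Ireland": 3, "Italy": 22, "Netherlands": 29, "Norway": 18,
        "Poland": 2, "Portugal": 7, "Russia": 5, "Serbia": 2,
        "Slovenia": 2, "Spain": 43, "Sweden": 28, "Switzerland": 12,
        "Turkey": 1, "United Kingdom": 60,
    },
    "Australia + New Zealand": {"Australia": 53, "New Zealand": 17},
}

# Regional totals as printed in the table.
REPORTED_TOTALS = {
    "North America": 1694,
    "South America": 34,
    "Asia": 82,
    "Africa": 14,
    "Europe": 479,
    "Australia + New Zealand": 70,
}


def regional_totals(counts):
    """Sum the per-country counts within each region."""
    return {region: sum(countries.values()) for region, countries in counts.items()}


if __name__ == "__main__":
    totals = regional_totals(COUNTRY_COUNTS)
    for region, total in totals.items():
        status = "matches" if total == REPORTED_TOTALS[region] else "differs from"
        print(f"{region}: {total} ({status} reported {REPORTED_TOTALS[region]})")
    print("All regions combined:", sum(totals.values()))
```

Keeping the counts grouped by region makes any transcription slip show up as a mismatch in a single region rather than only in the overall sum.
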
Users on the U.S. Map
Cumulative Number of Projects Completed
Cumulative Number of Scientific Publications
Sequencing Output
(in billions of bases or GB)
The JGI supports short- and long-read sequencers, where a read is a sequence of DNA bases. Short-read sequencers produce billions of paired-end, 150-base-pair (bp) reads used for quantification, such as in gene expression analysis. Long-read sequencers currently produce reads averaging 60,000–70,000 bp and are used for de novo genome assembly. The combined short-read and long-read totals for each year give the JGI’s annual sequence output. The total sequence output in 2023 was 716,929 GB.
Users Letters of Intent/Proposals Submitted & Approved
Computational Infrastructure: Users of JGI Tools & Data
The Genome Portal provides unified access to all JGI genomic databases and analytical tools. Users can search, download, and explore data sets available for all JGI sequencing projects, including the status, assemblies, and annotations of sequenced genomes. The Data Portal allows JGI users to more easily access public data sets through a common set of metadata across files submitted by each scientific program. The Genome Portal will be retired once the Data Portal reaches data and feature parity with its predecessor. FY2023 improvements to the Data Portal include closer data parity with the Genome Portal, cart download, paginated navigation, and significant progress on privileged access and access management.
- JGI Archive and Metadata Organizer (JAMO): 15.192 million file records
- JAMO Archived Data Footprint: 15.952 Petabytes
- Data Downloads in FY23:
  - Genome Portal: 7.286 million file-downloads
  - Data Portal: 0.633 million file-downloads
  - Total: 7.919 million file-downloads
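
The FY23 download figures above can be re-added to confirm the stated total and to see each portal's share of it. This is a minimal sketch of that arithmetic; the names `DOWNLOADS_MILLIONS` and `download_shares` are hypothetical and chosen only for this example.

```python
# FY23 file-download counts in millions, as listed above.
# Names are illustrative assumptions, not part of any JGI system.
DOWNLOADS_MILLIONS = {"Genome Portal": 7.286, "Data Portal": 0.633}


def download_shares(downloads):
    """Return the combined total and each portal's percentage share of it."""
    total = sum(downloads.values())
    shares = {name: 100 * count / total for name, count in downloads.items()}
    return total, shares


if __name__ == "__main__":
    total, shares = download_shares(DOWNLOADS_MILLIONS)
    print(f"Total: {total:.3f} million file-downloads")
    for name, pct in shares.items():
        print(f"{name}: {pct:.1f}% of file-downloads")
```
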
Since the retirement of NERSC’s Cori system, a number of JGI’s pipelines and processes have moved out of NERSC to the JGI’s new informatics cluster, Dori, situated at LBL’s LabIT, and to the computing infrastructure at IGB. This has required JGI’s data management system, JAMO (JGI Archive and Metadata Organizer), to expand operations across data centers, both ingesting new data and delivering data stored at NERSC to processes running at these other centers. This is our first step toward a distributed version of JAMO, which in the future will be capable of sharing data across registered lab members.
As part of our business continuity planning, JGI has worked with the Environmental Molecular Sciences Laboratory to enable JAMO to automatically transmit and store files on EMSL’s HPSS tape system via Globus. Currently, all new raw sequencing data is being transmitted. Over the next year, all legacy data will be restored from NERSC’s tape system and transmitted automatically to EMSL.
Computational graphic by Neil Byers, JGI. Photography and cinemagraphs by Thor Swift, Berkeley Lab. ‘By the Numbers’ infographics by Creative Services, IT Division, Berkeley Lab.