The U.S. Department of Energy (DOE) Joint Genome Institute (JGI) is committed to advancing the field of genomics through open access to resources and promoting data sharing to accelerate discovery and maximize the benefit to society.
Inspired by the Human Genome Project (HGP), and building upon the principles for open access in the field of genomics established in Bermuda (1996-1997) and reinforced by the Fort Lauderdale Agreement (2003), the JGI recognized the need to engage the community to inform necessary changes to JGI data policies.
Since completion of the HGP, the JGI has sustained its commitment to uploading sequence data generated on behalf of its user community to NCBI, as well as to the data portals that the JGI manages, immediately after data processing. The JGI has sought to balance its Data Release and Utilization Policies with the DOE mandate to provide data access widely to the scientific community as soon as possible after data generation, while retaining the ability of the principal investigators of approved JGI user projects to publish the primary report of their data without concern about preemption by others.
NCBI databases provide rapid, unrestricted public access to submitted data unless data are requested to be held until a specified date, or the publication date, whichever comes first. For nearly 40 years, the International Nucleotide Sequence Database Collaboration (INSDC) has managed the growing corpus of sequence data for all organisms as a public good available to all users for any application while also encouraging data users to cite the original data source. Differences between JGI’s and NCBI’s data use policies have resulted in conflicting assumptions and expectations by the community regarding the use of JGI data prior to publication by the principal investigators. To resolve these differences and provide a consistent policy framework for JGI customers, the JGI and NCBI have initiated conversations to explore a path to reconcile the immediate availability of JGI data on SRA and GenBank with the period of exclusive publication rights under JGI’s policy. In parallel, as of March 2nd, 2021, the JGI temporarily halted all automated submissions to NCBI (including SRA and GenBank).
The JGI and NCBI have discussed a path forward in which SRA and GenBank records that are under JGI use restrictions are moved to private status. This will remove the records from search indexes and BLAST databases until the JGI indicates that the data should be publicly released. Over the next several weeks, the JGI will contact JGI PIs to resolve the status of the available data and determine when these records will be made available again within NCBI systems.
In addition, the JGI issued a Request for Information (RFI), closing April 21st, 2021, to solicit additional community feedback on proposed changes to current data policies and to better understand how those changes may affect data-producing users and other users.
After the RFI closes on April 21, 2021, the JGI will review the feedback and consider policy changes in coordination with its stakeholder at the DOE Office of Biological and Environmental Research program. Following these discussions, the JGI will coordinate with NCBI to communicate improvements that facilitate the open use of data for the benefit of the worldwide research community.
Additional references: The field of genomics has guidelines for the use of pre-publication genomic data: Sharing Data from Large-scale Biological Research Projects: A System of Tripartite Responsibility, and the Toronto Data Release Workshop (Nature 461, 168-170, 10 September 2009). As has been discussed on multiple occasions (e.g., BMC Genomics 2014, 15:5), these guidelines place specific responsibilities on users of genomic data, who benefit from using data that they did not need to generate, to follow normal standards of scientific etiquette and fair use of unpublished data and to recognize that “resource producers have a legitimate interest in publishing prominent peer-reviewed reports describing and analyzing the resource that they have produced.”