DOE Joint Genome Institute

June 26, 2000

International Human Genome Sequencing Consortium Announces “Working Draft” of Human Genome

The Human Genome Project public consortium today announced that it has assembled a working draft of the sequence of the human genome, the genetic blueprint for a human being.

This major milestone involved two tasks: placing large fragments of DNA in the proper order to cover all of the human chromosomes, and determining the DNA sequence of these fragments.

The assembly reported today consists of overlapping fragments covering 97 percent of the human genome; sequence has already been assembled for approximately 85 percent of the genome. The sequence has been threaded together into a string of As, Ts, Cs, and Gs arrayed along the length of the human chromosomes.

Production of genome sequence has skyrocketed over the past year, with more than 60 percent of the sequence having been produced in the past six months alone. During this time, the consortium has been producing 1,000 bases a second of raw sequence, 7 days a week, 24 hours a day.

The average quality of the “working draft” sequence far exceeds the consortium’s original expectations for this intermediate product. (Note to journalists: Human Genome Project fact sheet in press kit contains definitions of “working draft,” etc.)

Consortium centers have produced far more sequence data than expected (over 22.1 billion bases of raw sequence data, comprising overlapping fragments totaling 3.9 billion bases and providing 7-fold sequence coverage of the human genome).
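As a rough check on these figures, fold coverage can be read as total raw bases divided by genome length, and the stated production rate can be converted to a daily total. The sketch below uses the numbers quoted in this release; the genome length of about 3.2 billion bases is an assumption added for illustration and is not stated in the release.

```python
# Figures quoted in the release.
RAW_BASES = 22.1e9        # raw sequence produced by the consortium centers
BASES_PER_SECOND = 1000   # stated round-the-clock production rate

# ASSUMPTION (not stated in the release): a human genome of ~3.2 billion bases.
ASSUMED_GENOME_BASES = 3.2e9


def fold_coverage(raw_bases, genome_bases):
    """Average number of times each base of the genome has been sequenced."""
    return raw_bases / genome_bases


def bases_per_day(bases_per_second):
    """Convert a per-second production rate into a per-day total."""
    return bases_per_second * 60 * 60 * 24


coverage = fold_coverage(RAW_BASES, ASSUMED_GENOME_BASES)
daily = bases_per_day(BASES_PER_SECOND)

print(f"Approximate coverage: {coverage:.1f}-fold")
print(f"Raw bases per day: {daily:,}")
```

Under the assumed genome length, the coverage comes out just under 7-fold, in line with the figure reported above, and the production rate works out to 86.4 million raw bases a day.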

As a result, the “working draft” is substantially closer to the ultimate “finished” form than the consortium expected at this stage. Approximately 50 percent of the genome sequence is in near-“finished” form or better, and 24 percent of it is in completely “finished” form. Across the genome, the average DNA segment resides in a continuous gapless sequence “contig” of 200,000 bases. The average accuracy of all of the DNA sequence in this assembly is 99.9 percent.

The sequence information from the public project has been continuously, immediately and freely released to the world, with no restrictions on its use or redistribution. The information is scanned daily by scientists in academia and industry, as well as by commercial database companies providing information services to biotechnologists.

Already, many tens of thousands of genes have been identified from the genome sequence. Analysis of the current sequence shows 38,000 predicted genes confirmed by experimental evidence. There are many thousands of additional gene predictions to be tested experimentally. Dozens of disease genes have been pinpointed by access to the working draft.

Consortium goals

The consortium’s goal for the spring of 2000 was to produce a “working draft” version of the human sequence, an assembly containing overlapping fragments that cover approximately 90 percent of the genome and that are sequenced in “working draft” form, i.e., with some gaps and ambiguities. The consortium’s ultimate goal is to produce a completely “finished” sequence, i.e., one with no gaps and 99.99 percent accuracy. The target date for this ultimate goal had been 2003, but today’s results mean that the final, stand-the-test-of-time sequence will likely be produced considerably ahead of that schedule.

Complementary approaches

In a related announcement, Celera Genomics announced today that it has completed its own first assembly of the human genome DNA sequence.

The public and private projects use similar automation and sequencing technology, but different approaches to sequencing the human genome. The public project uses a “hierarchical shotgun” approach in which individual large DNA fragments of known position are subjected to shotgun sequencing (i.e., shredded into small fragments that are sequenced, and then reassembled on the basis of sequence overlaps).

The Celera project uses a “whole genome shotgun” approach, in which the entire genome is shredded into small fragments that are sequenced and put back together on the basis of sequence overlaps.

The hierarchical shotgun method has the advantage that the global location of each individual sequence is known with certainty, but it requires constructing a map of large fragments covering the genome. The whole genome shotgun method does not require this step, but presents other challenges in the assembly phase.
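The shotgun step common to both approaches, shredding a sequence into small fragments and reassembling them from their overlaps, can be illustrated with a toy example. This is a minimal sketch of a greedy overlap merge on a short made-up sequence, not the assembly software used by either project. The function names, the example sequence, and the read length are all hypothetical. In the hierarchical approach, a step like this would be applied to each large fragment of known position; in the whole genome approach, to the entire genome at once.

```python
def shred(sequence, read_len, step):
    """Cut a sequence into overlapping reads of fixed length (toy 'shotgun' step)."""
    return [sequence[i:i + read_len]
            for i in range(0, len(sequence) - read_len + 1, step)]


def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that matches a prefix of b, or 0."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps: the result stays in several pieces
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads)
                 if idx not in (best_i, best_j)] + [merged]
    return reads


# Hypothetical 24-base sequence, chosen to contain no repeats.
toy_sequence = "ATGCGTACCTGAAGTCCGATTAGC"
toy_reads = shred(toy_sequence, read_len=8, step=4)
assembled = greedy_assemble(list(reversed(toy_reads)))
print(assembled)
```

The toy sequence has no repeated stretches, so the overlaps are unambiguous and the reads merge back into a single piece. Real genomes contain many repeats, which is one reason assembly is harder when the position of each fragment is not known in advance.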

Both approaches align the sequence along the human chromosomes by using landmarks contained in the physical map produced by the Human Genome Project.

“The two approaches are quite complementary. The public project and Celera plan to discuss the relative scientific merits of the methods employed by the two projects. In the end, the best approach may well be to use a combination of the methods for sequencing future genomes,” said Francis Collins, M.D., Ph.D., director of the National Human Genome Research Institute of the National Institutes of Health. In fact, current plans by the public project to sequence the genome of the laboratory mouse involve this hybrid strategy.

Next phase

The Human Genome Project will now focus on converting the “working draft” and near-“finished” sequences to a “finished” form. This will be done by filling the gaps in the “working draft” sequence and by increasing the overall sequence accuracy to 99.99 percent. Although the “working draft” version is useful for most biomedical research, a highly accurate sequence that is as close to perfect as possible is critical for obtaining all the information there is to get from human sequence data. This has already been achieved for chromosomes 21 and 22, as well as for 24% of the entire genome.
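To put the two accuracy figures in concrete terms, 99.9 percent accuracy corresponds to about one error per 1,000 bases and 99.99 percent to about one error per 10,000 bases. The sketch below applies those rates to the average contig length of 200,000 bases quoted earlier in this release; the helper name is hypothetical and the calculation is illustrative only.

```python
CONTIG_BASES = 200_000  # average contig length quoted in the release


def expected_errors(bases, bases_per_error):
    """Expected number of erroneous bases, given one error per `bases_per_error`."""
    return bases // bases_per_error


# 99.9 percent accuracy: one error per 1,000 bases ("working draft" average).
draft_errors = expected_errors(CONTIG_BASES, 1_000)

# 99.99 percent accuracy: one error per 10,000 bases ("finished" target).
finished_errors = expected_errors(CONTIG_BASES, 10_000)

print(f"Working draft: about {draft_errors} errors per average contig")
print(f"Finished: about {finished_errors} errors per average contig")
```

Moving from 99.9 to 99.99 percent accuracy therefore cuts the expected number of errors in a stretch of sequence tenfold.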

Human DNA variation

The greater-than-expected sequence production has also yielded a bumper crop of human genetic variations, called single nucleotide polymorphisms or SNPs. The Human Genome Project had set a goal of discovering 100,000 SNPs by 2003. Already, with today’s assembled sequences and other data accumulated by The SNP Consortium, scientists have found more than 300,000 SNPs and will likely have 1 million SNPs by year-end. These SNPs provide a powerful tool for studies of human disease and human history.

Background

Sequencing, which is determining the exact order of DNA’s four chemical bases, commonly abbreviated A, T, C and G, has been expedited in the Human Genome Project by technological advances in deciphering DNA and the collaborative nature of the effort, which includes about 1,000 scientists worldwide working together effectively.

The Human Genome Sequencing Project aims to determine the sequence of the euchromatic portion of the human genome. The euchromatic portion excludes certain regions consisting of long stretches of highly repetitive DNA that encode little genetic information, and that are not recovered in the vector systems used by the genome project. Such regions account for about 10% of the genome, and are said to be heterochromatic. (For example, the centers of chromosomes, called centromeres, consist of heterochromatic DNA.)

The international Human Genome Sequencing consortium includes scientists at 16 institutions in France, Germany, Japan, China, Great Britain and the United States. The five largest centers are located at: Baylor College of Medicine, Houston, Texas; Joint Genome Institute in Walnut Creek, CA; Sanger Centre near Cambridge, England; Washington University School of Medicine, St. Louis; and Whitehead Institute, Cambridge, Massachusetts. Together, these five centers have generated about 82% of the sequence. The following list provides more detail about the 16 centers and their individual contributions to the Human Genome Project.

The project has been tightly coordinated so that no region of the genome is left unattended and duplication is minimized. Participants in the international consortium have all adhered to the project’s quality standards and to the daily data release policy. The project is funded by grants from government agencies and public charities in the various countries. These include the National Human Genome Research Institute at the National Institutes of Health, the Wellcome Trust in England, and the US Department of Energy.

The total cost for the working draft is approximately $300 million worldwide, with roughly half ($150 million) being funded by the US National Institutes of Health. The cost of sequencing the human genome is sometimes reported as $3 billion. However, this figure refers to the original estimate of total funding for the Human Genome Project over a 15-year period (1990-2005) for a wide range of scientific activities related to genomics. These include studies of human diseases, experimental organisms (such as bacteria, yeast, worms, flies and mice), development of new technologies for biological and medical research, computational methods to analyze genomes, and ethical, legal and social issues related to genetics.

The sixteen institutions that form the Human Genome Sequencing Consortium are:

  1. Baylor College of Medicine, Houston, Texas, USA
  2. Beijing Human Genome Center, Institute of Genetics, Chinese Academy of Sciences, China
  3. Gesellschaft für Biotechnologische Forschung mbH, Braunschweig, Germany
  4. Genoscope, Evry, France
  5. Genome Therapeutics Corporation, Waltham, MA, USA
  6. Institute for Molecular Biotechnology, Jena, Germany
  7. Joint Genome Institute, U.S. Department of Energy, Walnut Creek, CA, USA
  8. Keio University, Tokyo, Japan
  9. Max Planck Institute for Molecular Genetics, Berlin, Germany
  10. RIKEN Genomic Sciences Center, Saitama, Japan
  11. The Sanger Centre, Hinxton, U.K.
  12. Stanford DNA Sequencing and Technology Development Center, Palo Alto, CA, USA
  13. University of Washington Genome Center, Seattle, WA, USA
  14. University of Washington Multimegabase Sequencing Center, Seattle, WA, USA
  15. Whitehead Institute for Biomedical Research, MIT, Cambridge, MA, USA
  16. Washington University Genome Sequencing Center, St. Louis, MO, USA

In addition, two institutions played a key role in providing computational support and analysis for the Human Genome Project over the course of the past eighteen months:

  • The National Center for Biotechnology Information at NIH
  • The European Bioinformatics Institute in Cambridge, UK

Scientists at the University of California, Santa Cruz, and Neomorphic, Inc. also assisted in the assembly of the genome sequence across chromosomes.

The U.S. Department of Energy Joint Genome Institute, a DOE Office of Science User Facility at Lawrence Berkeley National Laboratory, is committed to advancing genomics in support of DOE missions related to clean energy generation and environmental characterization and cleanup. JGI provides integrated high-throughput sequencing and computational analysis that enable systems-based scientific approaches to these challenges. Follow @jgi on Twitter.

DOE’s Office of Science is the largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time. For more information, please visit science.energy.gov.

© 1997-2023 The Regents of the University of California