WALNUT CREEK, CA— The U.S. Department of Energy Joint Genome Institute (DOE JGI) has released a complete draft assembly of the soybean (Glycine max) genetic code, making it widely available to the research community to advance new breeding strategies for one of the world’s most valuable plant commodities. Soybean not only accounts for 70 percent of the world’s edible protein, but also is an emerging feedstock for biodiesel production. Soybean is second only to corn as an agricultural commodity and is the leading U.S. agricultural export.
DOE JGI’s interest in sequencing the soybean centers on its use for biodiesel, a renewable, alternative fuel with the highest energy content of any alternative fuel. According to 2007 U.S. Census data, soybean is estimated to be responsible for more than 80 percent of biodiesel production.
“The genome sequence is the direct result of a memorandum of understanding between DOE and USDA to increase interagency collaboration in plant genomics,” said DOE Under Secretary for Science Dr. Raymond L. Orbach. “We are proud to support this major scientific breakthrough that will not only advance our knowledge of a key agricultural commodity but also lead to new insights into biodiesel production.”
“Soybeans have been an important food plant providing essential protein to people for hundreds of years,” said USDA Chief Scientist and Under Secretary for Research, Education, and Economics Dr. Gale A. Buchanan. “Now, with the new knowledge available through this joint DOE/USDA genome sequencing project, researchers everywhere will be able to further enhance important traits that make the soybean such a valuable plant. It’s a great day for agriculture and people everywhere.”
This effort was led by Dan Rokhsar and Jeremy Schmutz of the DOE JGI, Gary Stacey of the University of Missouri-Columbia, Randy Shoemaker of the U.S. Department of Agriculture (USDA)-Agricultural Research Service (USDA-ARS), and Scott Jackson of Purdue University, with support from the DOE, the USDA, and the National Science Foundation (NSF). In addition, the United Soybean Board, the North Central Soybean Research Program, and the Gordon and Betty Moore Foundation have supported the soybean genome effort.
“Soybean is one of the largest and most complex plant genomes sequenced by the whole genome shotgun strategy,” noted Rokhsar. The process entails shearing the DNA into small fragments, enabling the order of the nucleotides to be read and interpreted. Steven Cannon of the USDA-ARS collaborated with the DOE team to ensure the accuracy of the assembly.
Preliminary scientific details emerging from the sequence analysis will be presented by Schmutz at the International Conference on Legume Genomics and Genetics in Puerto Vallarta, Mexico, December 8, 2008. The soybean genome sequence information can be browsed at http://www.phytozome.net/soybean.
Schmutz and colleagues have begun to analyze the soybean genome, which at one billion nucleotides is roughly one-third the size of the human genome. Preliminary studies suggest as many as 66,000 genes—more than twice the number identified in the human genome sequence, and nearly half-again as many as the poplar genome, sequenced by DOE JGI and published in the journal Science in 2006.
“We have ordered and localized about 5,500 genetic markers on the sequence, which promise to be of particular importance to those researchers seeking to optimize certain qualities in soybean,” said Schmutz. Thousands of these markers were developed by Perry Cregan and colleagues of the USDA-ARS with support of the United Soybean Board. A genetic marker represents a known location on a chromosome that can be associated with a particular gene or trait. Prospective genome pathways of interest are those that directly influence yield, oil and protein content, as well as drought tolerance and resistance to nematodes and diseases such as the water mold Phytophthora sojae, previously sequenced by DOE JGI, which causes stem and root rot of soybean.
In 2007, soybean accounted for 56 percent of the world’s oilseed production. James Specht, Professor at the University of Nebraska, said that this nitrogen-fixing legume crop offers the dual benefit of a seed high in protein and oil—with room for improvement. “With the advent of low-cost re-sequencing technologies, soybean scientists now have the means to identify sequence differences responsible for yield potential–the most desired of all crop traits, but to date the most intractable.”
“The soybean genome sequence will be a valuable resource for the basic researcher and soybean breeder alike,” said Jim Collins, Assistant Director for the Biology Directorate at the NSF. Collins and Judith St. John of USDA Agricultural Research Service co-chair the Interagency Working Group on Plant Genomes, which oversees the National Plant Genome Initiative. “The close coordination between the DOE sequencing project and the NSF SoyMap project facilitated through the National Plant Genome Initiative has added value to the sequence and physical map resources for this important crop,” Collins said.
The soybean genome project is already making its mark out in the field.
“It’s tremendous that the soybean genome is out in the public’s hands,” said Rick Stern, a New Jersey soybean farmer and chair of the Production Research program for the United Soybean Board (USB). “Now every breeder can go into this valuable library for the information that will help speed up the breeding process. It should cut traditional breeding time by half from the typical 15 years.”
The U.S. Department of Energy Joint Genome Institute, supported by the DOE Office of Science, unites the expertise of five national laboratories (Lawrence Berkeley, Lawrence Livermore, Los Alamos, Oak Ridge, and Pacific Northwest), along with the HudsonAlpha Institute for Biotechnology, to advance genomics in support of the DOE missions related to clean energy generation and environmental characterization and cleanup. DOE JGI’s Walnut Creek, CA, Production Genomics Facility provides integrated high-throughput sequencing and computational analysis that enable systems-based scientific approaches to these challenges.