Multi-species assays yield an unprecedented view of plant evolution and adaptation.
Understanding how genes are regulated is like deciphering the complex wiring behind a city’s power grid. Just as the flow of electricity controls when and where lights turn on or machines start humming, gene regulation determines when and where genes are activated to sustain life’s processes.
Traditionally, mapping this intricate network of regulatory “switches” in plants at a large scale was akin to trying to chart every electrical connection in a sprawling metropolis — an overwhelming challenge due to technical limitations and the sheer complexity of plant genomes.
Now, with improvements to DNA Affinity Purification Sequencing (DAP-seq), these barriers are falling. As published online in Nature Plants on July 15, researchers have developed multiplexed assays (experimental techniques that measure multiple samples simultaneously in a single experiment) and integrated the resulting data with single-cell gene expression maps. Together, these approaches generate data at a scale that lets researchers chart regulatory networks across evolutionary time and diverse plant lineages. In line with U.S. Department of Energy (DOE) interests, this advance could expand opportunities for bioenergy and bioengineering applications.
“We worked with 10 species ranging from grasses to broccoli and trees like poplar, spanning about 150 million years of evolution,” said Leo Baumgart, first author on the paper and Technical Implementation Lead at the DOE Joint Genome Institute (JGI), a DOE Office of Science User Facility located at Lawrence Berkeley National Laboratory. “These proteins’ binding preferences are very, very stable over long evolutionary times. But, there’s a lot of action happening in terms of where those binding sites are in the genome — those sequences are moving around, disappearing and appearing in new places.”
With new insights available, the power grid of gene regulation is finally coming into clear view.
Behind the Breakthroughs: Advancing gene regulation mapping
Previous methods to identify where transcription factors (TFs) — the proteins that flip genetic switches — bind DNA were limited to a few model species and involved slow, labor-intensive experiments. The original DAP-seq method overcame some of these challenges by using in vitro-expressed TFs and fragmented genomic DNA, eliminating the need for living cells. This allowed researchers to map regulatory “switches” in species beyond traditional models.
Updates to DAP-seq build upon the previous version in several key ways. Baumgart, a research biologist by training, and his colleagues have adapted the method to handle the complexity of large plant genomes. In addition, multiplexed barcoding enables the simultaneous profiling of multiple species; stricter filtering criteria provide a robust community dataset of improved quality; trends in evolutionary conservation help identify the most important binding sites; and integration with single-cell RNA sequencing reveals how transcription factors control gene expression and cell identity.
DAP-seq now integrates transcription factor binding data with single-cell gene expression maps, providing insight into how regulatory pathways control specific cell types.
“A lot of these technologies were designed for samples from humans or mice,” said Sharon Greenblum, a co-first author on the paper and computational biologist at the JGI. “We had to work really hard to make them work for the kinds of plant samples we study — and even go beyond, putting four different species in there all at once. That let us profile more cells for the same cost, and it actually made the data cleaner.”
By integrating DAP-seq binding maps with single-cell RNA profiles, researchers can pinpoint which transcription factors regulate specific genes in distinct cell types, shedding light on the molecular basis of plant development and adaptation.
“You learn a lot more than you would from each one individually,” Baumgart added. “By combining these data, we can now infer which of the transcription factors are responsible for shaping what a certain cell type looks like.”
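To make the idea of combining the two data types concrete, here is a minimal sketch in Python. It is not the method used in the paper: the scoring rule (mean expression of a TF's target genes minus mean expression of all other genes), the function name target_enrichment, and every TF, gene, and cell-type name and value are hypothetical and chosen only for illustration.

```python
from statistics import mean
from typing import Dict, Set


def target_enrichment(expr: Dict[str, float], targets: Set[str]) -> float:
    """Mean expression of a TF's target genes minus the mean of all other genes.

    A simple, assumed stand-in for an enrichment statistic.
    """
    inside = [value for gene, value in expr.items() if gene in targets]
    outside = [value for gene, value in expr.items() if gene not in targets]
    if not inside or not outside:
        raise ValueError("need at least one target and one non-target gene")
    return mean(inside) - mean(outside)


# Hypothetical binding maps: genes with a binding site for each TF.
targets = {
    "TF_A": {"g1", "g2"},
    "TF_B": {"g3", "g4"},
}

# Hypothetical average expression per gene in two cell types.
expression = {
    "root_hair": {"g1": 8.0, "g2": 6.0, "g3": 1.0, "g4": 1.0},
    "guard_cell": {"g1": 1.0, "g2": 1.0, "g3": 5.0, "g4": 7.0},
}

# For each cell type, pick the TF whose targets are most enriched.
best_tf = {
    cell: max(targets, key=lambda tf: target_enrichment(expr, targets[tf]))
    for cell, expr in expression.items()
}
print(best_tf)
```

In this toy data, the genes bound by TF_A are highly expressed in one cell type and those bound by TF_B in the other, so each cell type is assigned a different candidate regulator. Real analyses would need to account for noise, gene-set size, and statistical significance.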
These recent innovations have propelled the creation of a vast resource: nearly 3,000 genome-wide binding maps for 360 transcription factors across 10 plant species, illuminating thousands of regulatory “switches” within multiple complex genomes. The end product is an expansive, multi-species dataset enabled by multiplexed barcoding.
Complementing this, researchers have profiled over 160,000 high-quality single-nuclei transcriptomes, enabling the identification of dozens of unique cell types in plants such as Arabidopsis and sorghum. Together, these datasets allow scientists to explore how gene regulation varies across cell types, species and evolutionary history.
DAP-seq’s broad applicability extends beyond plants. Currently, this technology is available to JGI users for studies on organisms with smaller genomes, such as bacteria and certain fungi. Scaling the technology’s availability to users is a long-term goal; its robustness will enable researchers to propose multi-genome projects, opening new avenues for comparative regulatory studies across diverse species.
“DAP-seq is a very generalizable method,” Baumgart said. “We’ve done it in fungi, plants, bacteria and archaea. It works on basically anything we’ve tried.”
DAP-seq Revelations Provide Holistic View of Plant Gene Regulation and Beyond
DAP-seq provides a comparative atlas for understanding plant evolution and adaptation. More specifically, it’s enabled several scientific discoveries that are key to understanding how to engineer similar plants with different traits.
One key discovery relates to conservation scores, which measure how well transcription factor binding sites are preserved across different species over evolutionary time. A high conservation score indicates that a binding site has been maintained through millions of years of evolution, suggesting it plays an important functional role in regulating gene expression. By focusing on these conserved sites, researchers can home in on the most critical regulatory elements that contribute to plant resilience and adaptability. And because data from many species are analyzed side by side in a single multiplexed experiment, they can confidently pinpoint vital regulatory elements while avoiding technical variability between separate experiments.
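As a rough illustration of what a conservation score can look like, the Python sketch below scores each group of orthologous genes by the fraction of species in which a given TF binds near it. This is an assumed, simplified definition, not the scoring scheme from the paper; the function name conservation_score, the orthogroup labels, and the binding data are all made up for the example.

```python
from typing import Dict, Set


def conservation_score(bound_by_species: Dict[str, Set[str]], orthogroup: str) -> float:
    """Fraction of species in which the TF binds near a gene in `orthogroup`."""
    if not bound_by_species:
        raise ValueError("need at least one species")
    hits = sum(1 for groups in bound_by_species.values() if orthogroup in groups)
    return hits / len(bound_by_species)


# Made-up data: orthogroups carrying a binding site for one TF in each species.
bound = {
    "arabidopsis": {"OG1", "OG2"},
    "sorghum": {"OG1", "OG3"},
    "poplar": {"OG1", "OG2"},
    "broccoli": {"OG1"},
}

scores = {og: conservation_score(bound, og) for og in ("OG1", "OG2", "OG3")}

# Rank orthogroups from most to least conserved binding.
ranked = sorted(scores, key=scores.get, reverse=True)
print(scores, ranked)
```

Under this toy definition, a site found in every species scores 1.0 and would be prioritized as a likely functional element, while a site found in a single species scores low and may reflect a lineage-specific gain.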
DAP-seq has revealed nearly identical binding sites for proteins from grasses and trees that diverged 150 million years ago, highlighting the deep evolutionary conservation of gene regulation. Together, these binding sites mark ancient regulons, or gene networks conserved throughout evolution.
The tool has also shed light on the widespread gain and loss of binding sites, a process known as regulatory rewiring that underpins plant adaptability. Although TF binding sites are scattered widely across genomes, the team found only a subset were functionally relevant. Those conserved across species tend to have strong effects on gene expression.
“We’re seeing how evolution can repurpose ancient gene networks for new functions, sometimes in a way that’s almost like flipping a switch between which TF is in charge,” Greenblum said, citing drought tolerance in grasses as an example, where these ancient networks are recruited to help plants survive environmental stress.
Unlocking the Future of Genome-Enabled Discovery
All data and code from the DAP-seq study are publicly available for researchers to build upon, allowing the resource to enable discoveries in plant biology and beyond at an unprecedented scale.
“All of the data are publicly accessible,” Baumgart stressed. “We want people to use it, find new things and push the field forward.”
By integrating transcription factor binding maps with single-cell gene expression data, DAP-seq enables researchers to map gene regulation across diverse species and cell types with exceptional scope and detail. This opens new pathways to design and engineer organisms tailored for bioproducts and ecosystem resilience. Aligned with DOE Biological and Environmental Research program goals, this work helps advance a fundamental understanding of genome biology and develop genome-scale engineering technologies to produce biofuels and bioproducts.
“We developed these methods and found some really exciting biology in our datasets, but we know that we’ve only scratched the surface,” Baumgart said. “By putting out this resource, we want to enable experts to investigate the specific organisms and genes they know best.”

