Cas14 proteins discovered from JGI’s IMG/M database and biochemically characterized at UC Berkeley and the Innovative Genomics Institute.
Researchers report the discovery of miniature Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-associated (Cas) proteins that can target single-stranded DNA (ssDNA). The discovery was made possible by mining the datasets in the Integrated Microbial Genomes and Microbiomes (IMG/M) suite of tools managed by the U.S. Department of Energy (DOE) Joint Genome Institute (JGI), a DOE Office of Science User Facility. The sequences were then biochemically characterized by Jennifer Doudna’s group at the University of California (UC), Berkeley, which is also affiliated with the Innovative Genomics Institute.
The ability to accurately edit genomes (that is, to repair gene mutations and to delete or add genes in a precise fashion) has applications across many areas. In agriculture, gene editing is being used to develop crops that are drought-, flood-, and pest-resistant and better-yielding. On the clinical side, gene editing is being advanced as a potential therapy for both genetic and complex diseases. Finally, gene editing is being used to understand how a person’s genetic makeup predisposes them to, or protects them from, disease. Much of the work on genome editing has focused on the seminal CRISPR-Cas9 system, which targets double-stranded DNA. The discovery of Cas proteins that can target single-stranded DNA molecules broadens the range of applications for CRISPR-Cas systems. It also underscores the untapped potential waiting to be unearthed by sequencing and analyzing uncultivated microbes.
The CRISPR-Cas system is an immune mechanism in bacteria that confers resistance to foreign genetic elements by incorporating short sequences from infecting viruses and phages. In the event of a new infection, the microbes use the genetic information encoded in CRISPR sequences to recognize the virus and direct Cas enzymes to cut its DNA, disabling the virus. In Science, a team led by researchers from the University of California, Berkeley, reports the identification of active Cas enzymes, dubbed Cas14, that target ssDNA. In contrast, the seminal Cas9 proteins cleave double-stranded DNA.
Co-first author Lucas Harrington was a graduate student in the lab of the study’s senior author, Jennifer Doudna, and he worked with co-first author David Burstein, then a postdoctoral fellow with Doudna and longtime JGI collaborator Jill Banfield, also at UC Berkeley. Harrington is now at Mammoth Biosciences, while Burstein is now at Tel Aviv University.
The Cas14 proteins are ~400–700 amino acids (aa) in size, roughly half the size of previously known class 2 CRISPR enzymes, which are typically 950–1400 aa. They were initially identified by searching for Cas12d homologs across IMG/M’s assembled metagenomic data. Some of the hits turned out to be shorter than typical Cas12d proteins and were also found next to cas1 genes. Further analysis led to the identification of a new family of Cas proteins, named Cas14. Using these sequences as a starting point, JGI data scientist David Paez-Espino in Nikos Kyrpides’ Microbiome Data Science group mined the IMG/M system, with its large collection of publicly accessible metagenomic datasets from a wide variety of ecosystems around the world, conducting iterative searches that used statistical analyses to continuously refine and improve the process.
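The two screening criteria described above (a compact protein size and a location next to a cas1 gene) can be illustrated with a minimal Python sketch. This is not the pipeline used in the study: the `Gene` record, the `compact_candidates` function, the `max_gap` parameter, and the toy gene list are all hypothetical and exist only to show the idea. Only the 400–700 aa size window comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical record for one predicted gene on an assembled contig."""
    contig: str
    index: int      # order of the gene along its contig (0-based)
    name: str       # annotation label, e.g. "cas1"
    length_aa: int  # protein length in amino acids


def compact_candidates(genes, min_aa=400, max_aa=700, max_gap=1):
    """Return non-cas1 genes in the size window that lie within
    max_gap gene positions of a cas1 gene on the same contig."""
    # Collect the positions of every cas1 gene, grouped by contig.
    cas1_positions = {}
    for g in genes:
        if g.name == "cas1":
            cas1_positions.setdefault(g.contig, []).append(g.index)

    hits = []
    for g in genes:
        if g.name == "cas1" or not (min_aa <= g.length_aa <= max_aa):
            continue
        neighbours = cas1_positions.get(g.contig, [])
        if any(abs(g.index - p) <= max_gap for p in neighbours):
            hits.append(g)
    return hits


# Toy data, invented for illustration only.
toy_genes = [
    Gene("contig_A", 0, "cas1", 330),
    Gene("contig_A", 1, "candidate_1", 529),   # in range, adjacent to cas1
    Gene("contig_A", 5, "candidate_2", 610),   # in range, far from cas1
    Gene("contig_B", 0, "candidate_3", 1200),  # adjacent to cas1, too long
    Gene("contig_B", 1, "cas1", 325),
]

print([g.name for g in compact_candidates(toy_genes)])
```

On the toy data, only `candidate_1` passes both filters. A real search would begin from sequence-similarity hits rather than annotation labels, and would be repeated iteratively as the text describes.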
The results yielded several CRISPR-Cas systems, and based on several experiments conducted by Doudna’s lab at UC Berkeley, close to 40 CRISPR-Cas14 systems belonging to eight subtypes were identified. Additionally, using Cas14a, the team was able to develop a Cas14-DETECTR that allows for CRISPR-based detection of ssDNA pathogens.
With few exceptions, the Cas14 proteins identified were found within the archaeal superphylum DPANN, named by a JGI-led team for the first five groups discovered: Diapherotrites, Parvarchaeota, Aenigmarchaeota, Nanoarchaeota, and Nanohaloarchaea. This work showcases the unique capabilities of the IMG/M database in enabling new discoveries and continues the JGI’s previous collaboration with the Doudna lab on the discovery of thermostable Cas9 genes.
The work also used resources at the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility at Lawrence Berkeley National Laboratory.
Ramana Madupu, Ph.D.
Biological Systems Sciences Division
Office of Biological and Environmental Research
Office of Science
US Department of Energy
University of California, Berkeley
The work conducted by the US Department of Energy Joint Genome Institute, a DOE Office of Science User Facility, is supported by the Office of Science of the US Department of Energy under contract no. DE-AC02-05CH11231. The work was supported in part by grants from the Paul Allen Frontiers Group, the National Science Foundation Graduate Research Fellowships and the Howard Hughes Medical Institute. Support was also provided by the National Science Foundation, and the Lawrence Berkeley National Laboratory’s Sustainable Systems Scientific Focus Area is funded by the U.S. Department of Energy.
- Harrington LB et al. Programmed DNA destruction by miniature CRISPR-Cas14 enzymes. Science. 2018 October 18. doi: 10.1126/science.aav4294.
- UC Berkeley News Release: “Smallest life forms have smallest working CRISPR system”
- Jennifer Doudna’s Lab at UC Berkeley: http://doudnalab.org/
- Jill Banfield’s Lab at UC Berkeley: http://geomicrobiology.berkeley.edu/
- Innovative Genomics Institute: https://innovativegenomics.org/
- IGI video: “CRISPR-Based Diagnostic Tools – A CRISPR Whiteboard Lesson”
- Integrated Microbial Genomes and Microbiomes (IMG/M): https://img.jgi.doe.gov/
- JGI News Release: “Tracking Microbial Diversity Through the Terrestrial Subsurface”
- JGI News Release: “Unveiled: Earth’s Viral Diversity”
- JGI News Release: “Boldly Illuminating Biology’s ‘Dark Matter’”
- Supercomputing 17 (SC17) video: “#HPCConnects – CRISPR”