Genome Research 22(4), 802-809 (Apr 2012)
In the process of clone-based genome sequencing, initial assemblies frequently contain cloning gaps that can be resolved using cloning-independent methods, but the reason for their occurrence is largely unknown. By analyzing 9,328,693 sequencing clones from 393 microbial genomes, we systematically mapped more than 15,000 genes residing in cloning gaps and experimentally showed that their expression products are toxic to the Escherichia coli host. A subset of these toxic sequences was further evaluated through a series of functional assays exploring the mechanisms of their toxicity. Among these genes, our assays revealed novel toxins and restriction enzymes, and new classes of small, non-coding toxic RNAs that reproducibly inhibit E. coli growth. Further analyses also revealed abundant, short, toxic DNA fragments that were predicted to suppress E. coli growth by interacting with the replication initiator DnaA. Our results show that cloning gaps, once considered the result of technical problems, actually serve as a rich source for the discovery of biotechnologically valuable functions, and suggest new modes of antimicrobial intervention.