Published in:
mSystems 9(9), e00242-24 (2024)
Author(s):
DOI:
10.1128/msystems.00242-24
Abstract:
Although long-read sequencing has made it possible to obtain high-quality, complete genomes from metagenomes, many challenges remain in completely decomposing a metagenome into its constituent prokaryotic and viral genomes. This study focuses on decomposing an estuarine metagenome to obtain a more accurate estimate of microbial diversity. To achieve this, we developed a new bead-based DNA extraction method and a novel bin refinement method and obtained 150 Gbp of Nanopore sequencing data. We estimate that there are ~500 bacterial and archaeal species in our sample, and we obtained 68 high-quality bins (>90% complete, <5% contamination, ≤5 contigs, contig length of >100 kbp, and all ribosomal and tRNA genes). We also obtained many contigs of picoeukaryotes, environmental DNA of larger eukaryotes such as mammals, and complete mitochondrial and chloroplast genomes, and we detected ~40,000 viral populations. Our analysis indicates that only a few strains make up most of the species abundances.
IMPORTANCE: Ocean and estuarine microbiomes play critical roles in global element cycling and ecosystem function. Despite the importance of these microbial communities, many of their species have not yet been cultured in the lab. Environmental sequencing is the primary way to study the function and population dynamics of these communities. Long-read sequencing offers a way to overcome the limitations of short-read technologies and obtain complete microbial genomes, but it comes with its own technical challenges, such as the sequencing depth required and the need for high-quality DNA. We present here new sampling and bioinformatics methods for attempting to decompose an estuarine microbiome into its constituent genomes. Our results suggest that only a few strains make up most of the species abundances, from viruses to picoeukaryotes, and that fully decomposing a metagenome of this diversity requires 1 Tbp of long-read sequencing. We anticipate that as long-read sequencing technologies continue to improve, less sequencing will be needed.
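
The high-quality bin definition given in the abstract (>90% complete, <5% contamination, ≤5 contigs, contig length of >100 kbp, and all ribosomal and tRNA genes) can be read as a simple filter. The following is a minimal sketch of that reading, not the authors' pipeline: the `BinStats` record, its field names, and the example values are hypothetical, the contig-length criterion is assumed to apply to the shortest contig in a bin, and the rRNA and tRNA checks are assumed to have been computed upstream as booleans.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BinStats:
    """Hypothetical per-bin summary; field names are illustrative only."""
    name: str
    completeness: float           # percent
    contamination: float          # percent
    contig_lengths_bp: List[int]  # one entry per contig in the bin
    has_all_rrna: bool            # assumed to be computed upstream
    has_all_trna: bool            # assumed to be computed upstream


def is_high_quality(b: BinStats) -> bool:
    """Apply the thresholds quoted in the abstract.

    Assumption: "contig length of >100 kbp" is taken to mean that the
    shortest contig in the bin exceeds 100,000 bp.
    """
    return (
        b.completeness > 90.0
        and b.contamination < 5.0
        and 0 < len(b.contig_lengths_bp) <= 5
        and min(b.contig_lengths_bp) > 100_000
        and b.has_all_rrna
        and b.has_all_trna
    )


# Made-up example bins, for illustration only.
bins = [
    BinStats("bin_a", 98.2, 0.7, [3_400_000], True, True),
    BinStats("bin_b", 95.0, 1.0, [2_000_000, 80_000], True, True),   # short contig
    BinStats("bin_c", 91.0, 6.2, [1_500_000, 900_000], True, True),  # contaminated
    BinStats("bin_d", 97.5, 0.3, [2_800_000], True, False),          # missing tRNAs
]

high_quality = [b.name for b in bins if is_high_quality(b)]
print(high_quality)
```

Only `bin_a` passes in this made-up example; each of the other bins fails exactly one criterion, which makes the effect of each threshold easy to see.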