A sequence similarity network of a family of enzymes from the nitroreductase superfamily (some nitroreductases can reduce TNT, a significant soil contaminant). Nodes represent enzyme sequences, and edges represent pairwise similarities more significant than a BLAST E-value of 1e-42. Red and blue nodes represent enzymes found in public sequence databases that belong to two sub-families; white nodes represent sequences found only in the JGI’s metagenomic database (IMG/M). Large nodes represent experimentally characterized enzymes of diverse functions. Notably, the sequence space expands significantly (from 300 enzymes to >10,000), revealing a potential new group of enzymes found only in IMG/M. “Linkers” that are also unique to metagenomes display sequence similarity to experimentally characterized enzymes of diverse functions and serve as attractive targets for synthesis and biochemical assays for intermediate function. (Eyal Akiva and Patsy Babbitt)
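
The thresholding rule in the caption (an edge is drawn only when a pairwise BLAST E-value is more significant than 1e-42) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `build_ssn`, the `(query, subject, evalue)` tuple layout, and the sequence identifiers below are all hypothetical and chosen only for illustration.

```python
def build_ssn(hits, evalue_cutoff=1e-42):
    """Build an undirected similarity network from pairwise hits.

    `hits` is an iterable of (query, subject, evalue) tuples; this layout
    is an assumption made for the sketch, not a format from the source.
    """
    graph = {}
    for query, subject, evalue in hits:
        # Every sequence becomes a node, even if it ends up with no edges.
        graph.setdefault(query, set())
        graph.setdefault(subject, set())
        # Keep an edge only when the hit is more significant than the
        # cutoff (smaller E-value), and ignore self-hits.
        if query != subject and evalue < evalue_cutoff:
            graph[query].add(subject)
            graph[subject].add(query)
    return graph


# Hypothetical hits, invented for illustration only.
hits = [
    ("seqA", "seqB", 1e-60),  # kept: below the cutoff
    ("seqB", "seqC", 1e-50),  # kept: below the cutoff
    ("seqC", "seqD", 1e-20),  # dropped: not significant enough
    ("seqA", "seqA", 0.0),    # dropped: self-hit
]
network = build_ssn(hits)
```

In this toy input, `seqB` connects `seqA` and `seqC`, loosely analogous to how a "linker" sequence bridges otherwise separate groups, while `seqD` remains an isolated node.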