Massive genomic data sets from high-throughput sequencing allow for new insights into complex biological systems such as microbial communities. Analyses of their diversity and structure are typically preceded by clustering millions of 16S rRNA gene sequences into OTUs. |