dRep: a tool for fast and accurate genomic comparisons that enables improved genome recovery from metagenomes through de-replication

MR Olm, CT Brown, B Brooks, JF Banfield - The ISME journal, 2017 - academic.oup.com
Abstract
The number of microbial genomes sequenced each year is expanding rapidly, in part due to genome-resolved metagenomic studies that routinely recover hundreds of draft-quality genomes. Rapid algorithms have been developed to comprehensively compare large genome sets, but they are not accurate with draft-quality genomes. Here we present dRep, a program that reduces the computational time for pairwise genome comparisons by sequentially applying a fast, inaccurate estimation of genome distance, and a slow, accurate measure of average nucleotide identity. dRep achieves a 28× increase in speed with perfect recall and precision when benchmarked against previously developed algorithms. We demonstrate the use of dRep for genome recovery from time-series datasets. Each metagenome was assembled separately, and dRep was used to identify groups of essentially identical genomes and select the best genome from each replicate set. This resulted in recovery of significantly more and higher-quality genomes compared to the set recovered using co-assembly.
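
The two-stage idea in the abstract (a cheap distance estimate to rule out clearly unrelated pairs, then an expensive comparison only on the survivors, then one representative per group) can be illustrated with a toy sketch. This is not dRep's implementation or API. The function names, the k-mer Jaccard estimate, the position-wise identity measure, both cutoffs, and the quality scores are all assumptions chosen for illustration, and the sequences are made-up 20-base strings rather than genomes.

```python
from itertools import combinations


def kmer_set(seq, k=4):
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def fast_distance(a, b, k=4):
    """Cheap, rough estimate (hypothetical stand-in): 1 - Jaccard of k-mer sets."""
    ka, kb = kmer_set(a, k), kmer_set(b, k)
    union = ka | kb
    return 1.0 - len(ka & kb) / len(union) if union else 0.0


def slow_identity(a, b):
    """Toy stand-in for an accurate identity measure: fraction of matching positions."""
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def two_stage_pairs(genomes, coarse_cutoff=0.5, identity_cutoff=0.9):
    """Run the slow comparison only on pairs that pass the fast filter."""
    similar, slow_calls = [], 0
    for (name_a, seq_a), (name_b, seq_b) in combinations(genomes.items(), 2):
        if fast_distance(seq_a, seq_b) > coarse_cutoff:
            continue  # clearly unrelated: skip the expensive step
        slow_calls += 1
        if slow_identity(seq_a, seq_b) >= identity_cutoff:
            similar.append((name_a, name_b))
    return similar, slow_calls


def dereplicate(genomes, quality, coarse_cutoff=0.5, identity_cutoff=0.9):
    """Group near-identical genomes and keep the highest-quality member of each group."""
    similar, slow_calls = two_stage_pairs(genomes, coarse_cutoff, identity_cutoff)

    # Union-find over genome names to merge similar pairs into groups.
    parent = {name: name for name in genomes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in similar:
        parent[find(a)] = find(b)

    groups = {}
    for name in genomes:
        groups.setdefault(find(name), []).append(name)

    winners = sorted(max(members, key=quality.__getitem__) for members in groups.values())
    return winners, slow_calls


# Made-up example data: A1 and A2 differ at one position; B1 is unrelated.
genomes = {
    "A1": "ACGTACGTTGCAAGCTTAGC",
    "A2": "ACGTACGTTGCAAGCTTAGT",
    "B1": "GGGGCCCCGGGGCCCCGGGG",
}
quality = {"A1": 0.90, "A2": 0.95, "B1": 0.80}  # hypothetical quality scores

winners, slow_calls = dereplicate(genomes, quality)
print(winners, slow_calls)
```

In this sketch the saving comes from the `continue` branch: only pairs that survive the coarse filter reach `slow_identity`, so the number of expensive comparisons depends on how many genomes are plausibly related, not on the total number of pairs.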
Oxford University Press