BCFtools/RoH: a hidden Markov model approach for detecting autozygosity from next-generation sequencing data

V Narasimhan, P Danecek, A Scally, Y Xue, C Tyler-Smith, R Durbin
Bioinformatics, 2016 - academic.oup.com
Abstract
Summary: Runs of homozygosity (RoHs) are genomic stretches of a diploid genome that show identical alleles on both chromosomes. Longer RoHs are unlikely to have arisen by chance but are likely to denote autozygosity, whereby both copies of the genome descend from the same recent ancestor. Early tools to detect RoH used genotype array data, but substantially more information is available from sequencing data. Here, we present and evaluate BCFtools/RoH, an extension to the BCFtools software package, that detects regions of autozygosity in sequencing data, in particular exome data, using a hidden Markov model. By applying it to simulated data and real data from the 1000 Genomes Project, we estimate its accuracy and show that it has higher sensitivity and specificity than existing methods under a range of sequencing error rates and levels of autozygosity.
Availability and implementation: BCFtools/RoH and its associated binary/source files are freely available from https://github.com/samtools/BCFtools.
Contact: vn2@sanger.ac.uk or pd3@sanger.ac.uk
Supplementary information: Supplementary data are available at Bioinformatics online.
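Illustrative sketch: the abstract names a hidden Markov model but does not give its emission or transition models, so the following is only a toy two-state Viterbi decoder showing the general idea, not the BCFtools/RoH implementation. The state names (`AZ`, `HW`), the functions `emission` and `viterbi`, the fixed switch probability `p_switch`, the genotyping error rate `err`, and the use of hard heterozygous/homozygous calls are all assumptions made for this example.

```python
import math

# Hypothetical state labels: AZ = autozygous, HW = non-autozygous
# (genotypes assumed to follow Hardy-Weinberg proportions).
STATES = ("AZ", "HW")


def emission(state, is_het, alt_af, err=1e-3):
    """P(site is het / hom | state) for a site with alternate allele frequency alt_af.

    Assumption: inside an autozygous region a heterozygous call can only be
    an error; outside it, heterozygosity is expected at rate 2p(1-p).
    """
    p_het = err if state == "AZ" else 2.0 * alt_af * (1.0 - alt_af)
    p = p_het if is_het else 1.0 - p_het
    return max(p, 1e-12)  # guard against log(0) when alt_af is 0 or 1


def viterbi(sites, p_switch=0.01, err=1e-3):
    """Most likely state path for sites given as (is_het, alt_af) tuples."""
    if not sites:
        return []
    log_stay = math.log(1.0 - p_switch)
    log_move = math.log(p_switch)

    def trans(prev, cur):
        return log_stay if prev == cur else log_move

    # Uniform prior over the two states at the first site.
    first_het, first_af = sites[0]
    score = {s: math.log(0.5) + math.log(emission(s, first_het, first_af, err))
             for s in STATES}
    backpointers = []

    for is_het, alt_af in sites[1:]:
        new_score, ptr = {}, {}
        for cur in STATES:
            best_prev = max(STATES, key=lambda prev: score[prev] + trans(prev, cur))
            new_score[cur] = (score[best_prev] + trans(best_prev, cur)
                              + math.log(emission(cur, is_het, alt_af, err)))
            ptr[cur] = best_prev
        backpointers.append(ptr)
        score = new_score

    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    # Synthetic data: mixed het/hom flanks around a 30-site homozygous stretch.
    flank = [(i % 2 == 0, 0.3) for i in range(10)]
    sites = flank + [(False, 0.3)] * 30 + flank
    print("".join("A" if s == "AZ" else "." for s in viterbi(sites)))
```

In this toy model a long homozygous stretch is labelled `AZ` because paying the switch penalty twice is cheaper than explaining many consecutive homozygous sites under Hardy-Weinberg expectations, while a short homozygous run between heterozygous sites stays `HW`. A model working directly on sequencing data would more plausibly use genotype likelihoods rather than hard calls; that refinement is omitted here.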
Oxford University Press