OptiType: precision HLA typing from next-generation sequencing data

A Szolek, B Schubert, C Mohr, M Sturm, M Feldhahn, O Kohlbacher
Bioinformatics, 2014 - academic.oup.com
Abstract
Motivation: The human leukocyte antigen (HLA) gene cluster plays a crucial role in adaptive immunity and is thus relevant in many biomedical applications. While next-generation sequencing data are often available for a patient, deducing the HLA genotype is difficult because of substantial sequence similarity within the cluster and exceptionally high variability of the loci. Established approaches, therefore, rely on specific HLA enrichment and sequencing techniques, coming at an additional cost and extra turnaround time.
Results: We present OptiType, a novel HLA genotyping algorithm based on integer linear programming, capable of producing accurate predictions from NGS data not specifically enriched for the HLA cluster. We also present a comprehensive benchmark dataset consisting of RNA, exome and whole-genome sequencing data. OptiType significantly outperformed previously published in silico approaches, achieving an overall accuracy of 97% and enabling its use in a broad range of applications.
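The abstract does not spell out the integer linear program, so the following is only an illustrative sketch of the general idea, not the authors' formulation. It assumes the objective is to choose two alleles per locus (possibly the same allele twice) so that as many reads as possible align to at least one chosen allele. The function name `best_genotype`, the toy read-to-allele hits, and the use of exhaustive enumeration in place of an ILP solver are all assumptions made for illustration.

```python
from itertools import combinations_with_replacement, product

# Hypothetical toy input: which candidate alleles each read aligns to.
read_hits = {
    "r1": {"A*01:01"},
    "r2": {"A*01:01", "A*02:01"},
    "r3": {"A*02:01"},
    "r4": {"A*03:01", "B*07:02"},
    "r5": {"B*07:02"},
    "r6": {"B*08:01"},
}
# Hypothetical candidate alleles, grouped by locus.
alleles_by_locus = {
    "A": ["A*01:01", "A*02:01", "A*03:01"],
    "B": ["B*07:02", "B*08:01"],
}


def best_genotype(read_hits, alleles_by_locus):
    """Return (genotype, score): two alleles per locus explaining the most reads.

    A read counts as explained if it aligns to at least one selected allele.
    This enumerates every combination, which is only feasible for toy inputs;
    an ILP solver would optimize the same objective without enumeration.
    """
    loci = sorted(alleles_by_locus)
    # Every unordered allele pair per locus, homozygous pairs included.
    per_locus_pairs = [
        list(combinations_with_replacement(sorted(alleles_by_locus[locus]), 2))
        for locus in loci
    ]
    best_score, best_choice = -1, None
    for choice in product(*per_locus_pairs):
        selected = {allele for pair in choice for allele in pair}
        score = sum(1 for hits in read_hits.values() if hits & selected)
        if score > best_score:
            best_score, best_choice = score, choice
    return dict(zip(loci, best_choice)), best_score


genotype, score = best_genotype(read_hits, alleles_by_locus)
print(genotype, score)
```

In this toy instance, read `r4` is ambiguous between an A allele and a B allele; scoring all loci jointly lets the selection account for such cross-locus ambiguity, which is one plausible reason to pose the problem as a single optimization rather than typing each locus separately.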
Contact: szolek@informatik.uni-tuebingen.de
Supplementary information: Supplementary data are available at Bioinformatics online.
Oxford University Press