Detecting virus integration sites based on multiple related sequencing data by VirTect

Y Xia, Y Liu, M Deng, R Xi - BMC medical genomics, 2019 - Springer
Background
Because tumors often show a high level of intra-tumor heterogeneity, multiple tumor samples from the same patient, taken at different locations or different time points, are often sequenced to study intra-tumor heterogeneity or tumor evolution. In virus-related tumors, such as human papillomavirus- and hepatitis B virus-related tumors, virus genome integrations can be critical driving events. It is thus important to investigate the integration sites of the virus genomes. A few algorithms for detecting virus integration sites from high-throughput sequencing data have been developed, but their limited sensitivity and specificity and their high computational complexity hinder their application to sequencing data from multiple related tumor samples.
Results
We develop VirTect to detect virus integration sites simultaneously from the data of multiple related samples. The algorithm is based mainly on the joint analysis of short reads that span the breakpoints of integration sites across multiple samples. To achieve high specificity and breakpoint accuracy, it uses a local precise sandwich alignment algorithm. Simulation and real data analyses show that, compared with other algorithms, VirTect is significantly more sensitive and has a similar or lower false discovery rate.
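The idea of jointly analysing breakpoint-spanning reads from related samples can be illustrated with a minimal sketch. This is not VirTect's implementation, and it does not reproduce the sandwich alignment step. The function names `joint_call` and `_summarize`, the input layout (one `(chrom, position)` tuple per breakpoint-spanning read), and the `window` and `min_total_support` parameters are all assumptions made for illustration. The sketch only shows why pooling helps: a site supported by too few reads in any single sample can still pass a support threshold once reads from all samples are combined.

```python
from collections import Counter, defaultdict


def _summarize(chrom, cluster, min_total_support):
    """Turn one cluster of (position, sample) pairs into at most one call."""
    if len(cluster) < min_total_support:
        return []
    positions = [pos for pos, _sample in cluster]  # already sorted
    return [{
        "chrom": chrom,
        # Middle element of the sorted positions as a simple representative.
        "position": positions[len(positions) // 2],
        # Number of supporting reads contributed by each sample.
        "support": dict(Counter(sample for _pos, sample in cluster)),
    }]


def joint_call(samples, window=10, min_total_support=3):
    """Pool breakpoint-spanning reads across samples and call candidate sites.

    samples: dict mapping a sample name to a list of (chrom, position)
             tuples, one per read spanning a human-virus breakpoint
             (hypothetical input format).
    window:  reads whose positions differ by at most this many bases from
             the previous read in sorted order join the same cluster.
    """
    pooled = defaultdict(list)
    for sample, reads in samples.items():
        for chrom, pos in reads:
            pooled[chrom].append((pos, sample))

    calls = []
    for chrom in sorted(pooled):
        items = sorted(pooled[chrom])
        cluster = [items[0]]
        for item in items[1:]:
            if item[0] - cluster[-1][0] <= window:
                cluster.append(item)
            else:
                calls.extend(_summarize(chrom, cluster, min_total_support))
                cluster = [item]
        calls.extend(_summarize(chrom, cluster, min_total_support))
    return calls


# Two related samples: neither has three supporting reads on its own,
# but together they do.
example = {
    "T1": [("chr1", 1000), ("chr1", 1002)],
    "T2": [("chr1", 1001), ("chr2", 500)],
}
calls = joint_call(example)
```

With these toy inputs, calling `joint_call` on `T1` alone yields no site, while the pooled call reports one candidate on `chr1` with per-sample support counts. A real caller would also need to handle read orientation, the viral-side coordinate, and alignment quality, none of which are modelled here.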
Conclusions
VirTect can provide more accurate breakpoint positions and is computationally much more efficient in terms of both memory requirement and computational time.