Critical evaluation of copy number variant calling methods using DNA methylation

V Kilaru, AK Knight, S Katrinli, D Cobb… - Genetic …, 2020 - Wiley Online Library
Genetic Epidemiology, 2020, Wiley Online Library
Abstract
Recent technological and methodological developments have enabled the use of array-based DNA methylation data to call copy number variants (CNVs). ChAMP, Conumee, and cnAnalysis450k are popular methods for calling CNVs from methylation data. However, no studies have yet analyzed the reliability of these methods using real samples. We used data from a cohort of individuals with both genotype and DNA methylation data, generated using the HumanMethylation450 and MethylationEPIC BeadChips, to assess the consistency between CNV calls derived from methylation data and those derived from genotype data. We also took advantage of repeated measures of methylation data collected from the same individuals to compare the reliability of CNVs called by ChAMP, Conumee, and cnAnalysis450k on both methylation arrays. ChAMP identified more CNVs than Conumee and cnAnalysis450k on both arrays and, as a consequence, had a higher overlap (~62%) with the calls from the genotype data. However, all methods had relatively low reliability. For the MethylationEPIC array, Conumee had the highest reliability (57.6%), whereas for the HumanMethylation450 array, cnAnalysis450k had the highest reliability (43.0%). Overall, the MethylationEPIC array provided significant gains in reliability for CNV calling over the HumanMethylation450 array but not for overlap with CNVs called using genotype data.
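The abstract reports two kinds of percentages: overlap between methylation-based and genotype-based CNV calls, and reliability across repeated measures from the same individual. It does not state how a "match" between two calls was defined, so the following is only a minimal sketch of one common way such figures could be computed, not the authors' method. The 50% reciprocal-overlap criterion, the function names `overlaps` and `fraction_matched`, and all coordinates are assumptions introduced purely for illustration.

```python
from typing import List, Tuple

# A CNV call: (chromosome, start, end, kind), where kind is "gain" or "loss".
# Coordinates are treated as half-open intervals [start, end).
Cnv = Tuple[str, int, int, str]


def overlaps(a: Cnv, b: Cnv, min_reciprocal: float = 0.5) -> bool:
    """Return True if a and b share chromosome and kind and overlap reciprocally.

    The 50% reciprocal-overlap default is an assumption, not taken from the paper.
    """
    if a[0] != b[0] or a[3] != b[3]:
        return False
    shared = min(a[2], b[2]) - max(a[1], b[1])
    if shared <= 0:
        return False
    return (shared / (a[2] - a[1]) >= min_reciprocal
            and shared / (b[2] - b[1]) >= min_reciprocal)


def fraction_matched(targets: List[Cnv], candidates: List[Cnv],
                     min_reciprocal: float = 0.5) -> float:
    """Fraction of target calls matched by at least one candidate call."""
    if not targets:
        return 0.0
    hits = sum(
        any(overlaps(t, c, min_reciprocal) for c in candidates)
        for t in targets
    )
    return hits / len(targets)


# Hypothetical calls for one individual (made-up coordinates).
genotype_calls = [("chr1", 1000, 2000, "loss"), ("chr2", 5000, 9000, "gain")]
methylation_visit1 = [("chr1", 1100, 2100, "loss"), ("chr3", 100, 500, "gain")]
methylation_visit2 = [("chr1", 1050, 1950, "loss")]

# Overlap: share of genotype-based CNVs recovered by the methylation-based calls.
overlap_with_genotype = fraction_matched(genotype_calls, methylation_visit1)

# Reliability: share of visit-1 calls reproduced at visit 2 for the same person.
reliability = fraction_matched(methylation_visit1, methylation_visit2)
```

Under this definition, a method that emits more calls can raise its overlap with the genotype-based set without improving its reliability, which is consistent with the pattern the abstract describes for ChAMP.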