A Bayesian hierarchical model to detect differentially methylated loci from single nucleotide resolution sequencing data

H Feng, KN Conneely, H Wu - Nucleic acids research, 2014 - academic.oup.com
Abstract
DNA methylation is an important epigenetic modification that has essential roles in cellular processes including gene regulation, development and disease, and is widely dysregulated in most types of cancer. Recent advances in sequencing technology have enabled the measurement of DNA methylation at single nucleotide resolution through methods such as whole-genome bisulfite sequencing and reduced representation bisulfite sequencing. In DNA methylation studies, a key task is to identify differences under distinct biological contexts, for example, between tumor and normal tissue. A challenge in sequencing studies is that the number of biological replicates is often limited by the costs of sequencing. The small number of replicates leads to unstable variance estimation, which can reduce the accuracy of detecting differentially methylated loci (DML). Here we propose a novel statistical method to detect DML when comparing two treatment groups. The sequencing counts are described by a lognormal-beta-binomial hierarchical model, which provides a basis for information sharing across different CpG sites. A Wald test is developed for hypothesis testing at each CpG site. Simulation results show that the proposed method yields improved DML detection compared to existing methods, particularly when the number of replicates is low. The proposed method is implemented in the Bioconductor package DSS.
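As a rough illustration of the kind of model the abstract describes, the Python sketch below simulates methylated read counts at one CpG site from a beta-binomial distribution whose dispersion is drawn from a lognormal prior, then computes a simple Wald-type statistic for the difference in mean methylation between two groups. It is not the DSS implementation (DSS is an R/Bioconductor package). The function names, the hyperparameters `m0` and `r0`, the cap on the dispersion, and the plug-in variance formula are all assumptions made for this example, and the dispersion is passed in directly rather than estimated with shrinkage across sites.

```python
import numpy as np


def draw_dispersion(rng, m0=-3.0, r0=1.0):
    """Draw a site-level dispersion phi from a lognormal prior.

    m0 and r0 are hypothetical hyperparameters (mean and sd on the log scale).
    The value is capped below 1 so the beta parameters stay positive.
    """
    phi = float(np.exp(rng.normal(m0, r0)))
    return min(phi, 0.99)


def simulate_counts(rng, mu, phi, coverage):
    """Simulate methylated read counts for one CpG site across replicates.

    mu: mean methylation in (0, 1); phi: dispersion in (0, 1);
    coverage: integer array of total reads per replicate.
    """
    coverage = np.asarray(coverage)
    # Beta parameterised by mean and dispersion, with a + b = 1/phi - 1.
    a = mu * (1.0 / phi - 1.0)
    b = (1.0 - mu) * (1.0 / phi - 1.0)
    p = rng.beta(a, b, size=coverage.shape[0])  # replicate-level methylation
    return rng.binomial(coverage, p)            # observed methylated reads


def group_mean_and_var(x, n, phi):
    """Pooled methylation estimate and its beta-binomial plug-in variance."""
    x = np.asarray(x, dtype=float)
    n = np.asarray(n, dtype=float)
    mu = x.sum() / n.sum()
    # Var(X_j) = N_j * mu * (1 - mu) * (1 + (N_j - 1) * phi) for a beta-binomial.
    var = (n * mu * (1.0 - mu) * (1.0 + (n - 1.0) * phi)).sum() / n.sum() ** 2
    return mu, var


def wald_statistic(x1, n1, x2, n2, phi):
    """Wald-type statistic for a difference in mean methylation.

    Assumes a common, known dispersion phi for both groups. The variance is
    zero when both pooled estimates are exactly 0 or 1, so callers should
    guard against that case.
    """
    mu1, v1 = group_mean_and_var(x1, n1, phi)
    mu2, v2 = group_mean_and_var(x2, n2, phi)
    return (mu1 - mu2) / np.sqrt(v1 + v2)


# Small demonstration with two replicates per group (values are arbitrary).
rng = np.random.default_rng(0)
phi = draw_dispersion(rng)
coverage = np.array([30, 25])
tumor = simulate_counts(rng, mu=0.7, phi=phi, coverage=coverage)
normal = simulate_counts(rng, mu=0.3, phi=phi, coverage=coverage)
print(wald_statistic(tumor, coverage, normal, coverage, phi))
```

With only a few replicates per group, a per-site estimate of `phi` would be noisy, which is the instability the abstract points to. A prior shared across CpG sites is one way to stabilize it.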
Oxford University Press