A practical solution to pseudoreplication bias in single-cell studies

KD Zimmerman, MA Espeland, CD Langefeld - Nature Communications, 2021 - nature.com
Abstract
Cells from the same individual share common genetic and environmental backgrounds and are not statistically independent; therefore, they are subsamples or pseudoreplicates. Thus, single-cell data have a hierarchical structure that many current single-cell methods do not address, leading to biased inference, highly inflated type 1 error rates, and reduced robustness and reproducibility. This includes methods that use a batch effect correction for individual as a means of accounting for within-sample correlation. Here, we document this dependence across a range of cell types and show that pseudo-bulk aggregation methods are conservative and underpowered relative to mixed models. To compute differential expression within a specific cell type across treatment groups, we propose applying generalized linear mixed models with a random effect for individual, to properly account for both zero inflation and the correlation structure among measures from cells within an individual. Finally, we provide power estimates across a range of experimental conditions to assist researchers in designing appropriately powered studies.
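The inflation of type 1 error described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' method: it uses Gaussian values rather than zero-inflated counts, and it compares a naive cell-level test against a test on per-individual means (a simple aggregation stand-in) rather than fitting the generalized linear mixed model the abstract proposes. All names, sample sizes, and variance settings are assumptions chosen for illustration.

```python
import math
import random

# Illustrative settings (assumptions, not values from the paper).
N_INDIVIDUALS = 5   # individuals per treatment group
N_CELLS = 50        # cells sampled per individual
Z_CRIT = 1.96       # two-sided 5% normal cutoff, adequate for 250 cells per group
T_CRIT = 2.306      # two-sided 5% Student t cutoff for 8 df (5 + 5 - 2)


def mean_and_var(values):
    """Sample mean and unbiased sample variance."""
    n = len(values)
    m = sum(values) / n
    v = sum((x - m) ** 2 for x in values) / (n - 1)
    return m, v


def two_sample_t(a, b):
    """Two-sample t statistic (pooled and Welch coincide for equal group sizes)."""
    ma, va = mean_and_var(a)
    mb, vb = mean_and_var(b)
    return (ma - mb) / math.sqrt(va / len(a) + vb / len(b))


def simulate_group(rng, sd_individual, sd_cell):
    """One list of cell values per individual; there is no treatment effect."""
    group = []
    for _ in range(N_INDIVIDUALS):
        shift = rng.gauss(0.0, sd_individual)  # shared by every cell of this individual
        group.append([shift + rng.gauss(0.0, sd_cell) for _ in range(N_CELLS)])
    return group


def false_positive_rates(n_sims=1000, sd_individual=0.5, sd_cell=1.0, seed=0):
    """Fraction of null simulations rejected at the 5% level by each analysis."""
    rng = random.Random(seed)
    naive_hits = 0
    per_individual_hits = 0
    for _ in range(n_sims):
        control = simulate_group(rng, sd_individual, sd_cell)
        treated = simulate_group(rng, sd_individual, sd_cell)

        # Naive analysis: pool all cells and treat them as independent replicates.
        cells_c = [x for ind in control for x in ind]
        cells_t = [x for ind in treated for x in ind]
        if abs(two_sample_t(cells_c, cells_t)) > Z_CRIT:
            naive_hits += 1

        # Aggregated analysis: one mean per individual, so n = 5 per group.
        means_c = [sum(ind) / len(ind) for ind in control]
        means_t = [sum(ind) / len(ind) for ind in treated]
        if abs(two_sample_t(means_c, means_t)) > T_CRIT:
            per_individual_hits += 1

    return naive_hits / n_sims, per_individual_hits / n_sims


if __name__ == "__main__":
    naive, per_individual = false_positive_rates()
    print(f"cell-level false positive rate:       {naive:.3f}")
    print(f"individual-level false positive rate: {per_individual:.3f}")
```

Because the data are generated with no treatment effect, every rejection is a false positive. Under these assumed settings the cell-level analysis should reject far more often than the nominal 5%, while the individual-level analysis should stay close to it. A mixed model with a random effect for individual targets the same correlation structure while still using the cell-level data.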