Differential gene expression in disease: a comparison between high-throughput studies and the literature

R Rodriguez-Esteban, X Jiang - BMC medical genomics, 2017 - Springer
Background
Differential gene expression is important for understanding the biological differences between healthy and diseased states. Two common sources of differential gene expression data are microarray studies and the biomedical literature.
Methods
With the aid of text mining and gene expression analysis we have examined the comparative properties of these two sources of differential gene expression data.
Results
The literature shows a preference for reporting genes associated with higher fold changes in microarray data, rather than genes that are simply significantly differentially expressed. Thus, the resemblance between the literature and microarray data increases when the fold-change threshold for microarray data is increased. Moreover, the literature has a reporting preference for differentially expressed genes that (1) are overexpressed rather than underexpressed; (2) are overexpressed in multiple diseases; and (3) are popular in the biomedical literature at large. Additionally, the degree to which diseases are similar depends on whether microarray data or the literature is used to compare them. Finally, vaguely qualified reports of differential expression magnitudes in the literature correlate only weakly with microarray fold-change data.
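As an illustration of the fold-change threshold effect described above, the following minimal sketch scores the resemblance between a microarray gene set and a literature gene set at several thresholds. It rests on assumptions not stated in the abstract: resemblance is measured here with the Jaccard index, fold changes are given as log2 values, and the function names (`jaccard`, `resemblance_by_threshold`) and the toy genes are hypothetical. The toy values are invented only to show the computation and are not results from the study.

```python
def jaccard(a, b):
    """Jaccard index of two collections: |A & B| / |A | B| (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def resemblance_by_threshold(log2_fold_changes, literature_genes, thresholds):
    """Score microarray/literature resemblance at each fold-change threshold.

    log2_fold_changes: dict mapping gene symbol -> log2 fold change (assumed input format).
    literature_genes:  set of genes reported as differentially expressed in the literature.
    thresholds:        iterable of minimum absolute log2 fold changes to test.
    """
    scores = {}
    for t in thresholds:
        # Keep only genes whose absolute log2 fold change reaches the threshold.
        selected = {g for g, lfc in log2_fold_changes.items() if abs(lfc) >= t}
        scores[t] = jaccard(selected, literature_genes)
    return scores


# Hypothetical toy data (placeholder gene names, invented values).
toy_log2fc = {"GENE_A": 3.1, "GENE_B": 2.4, "GENE_C": 1.2, "GENE_D": 0.6, "GENE_E": -0.4}
toy_literature = {"GENE_A", "GENE_B"}

scores = resemblance_by_threshold(toy_log2fc, toy_literature, [0.5, 1.0, 2.0])
for threshold, score in scores.items():
    print(f"|log2FC| >= {threshold}: Jaccard = {score:.2f}")
```

In this toy case the score rises as the threshold rises because the literature set contains only the genes with the largest fold changes. Other overlap measures, such as precision or recall against the literature set, could be substituted for the Jaccard index.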
Conclusions
Reporting biases of differential gene expression in the literature may be affecting our appreciation of disease biology and of the degree of similarity that actually exists between different diseases.