Positional gene enrichment analysis of gene sets for high-resolution identification of overrepresented chromosomal regions

K De Preter, R Barriot, F Speleman… - Nucleic acids …, 2008 - academic.oup.com
Abstract
The search for feature enrichment is a widely used method to characterize a set of genes. While several tools have been designed for nominal features such as Gene Ontology annotations or KEGG Pathways, very little has been proposed to tackle numerical features such as the chromosomal positions of genes. For instance, microarray studies typically generate lists of genes that are differentially expressed between the sample subgroups under investigation, and when studying diseases caused by genome alterations, it is of great interest to delineate the chromosomal regions that are significantly enriched in these lists. In this article, we present a positional gene enrichment analysis method (PGE) for the identification of chromosomal regions that are significantly enriched in a given set of genes. The strength of our method lies in an original query optimization approach that makes it possible to consider virtually all possible chromosomal regions for enrichment, and in the multiple testing correction, which distinguishes truly enriched regions from those that can occur by chance. We have developed a Web tool implementing this method, applied to the human genome (http://www.esat.kuleuven.be/~bioiuser/pge). We validated PGE on published lists of differentially expressed genes. These analyses showed significant overrepresentation of known aberrant chromosomal regions.
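The general idea of positional enrichment can be illustrated with a minimal sketch. This is not the PGE implementation: the abstract does not specify the statistical test or the correction procedure, so the sketch assumes a hypergeometric upper-tail test and a Bonferroni correction, and it enumerates regions by brute force rather than using the authors' query optimization. The function names `hypergeom_sf` and `enriched_regions` and all parameters are hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """Upper tail P(X >= k) for X ~ Hypergeometric.

    N: genes in the genome, K: genes in the query set,
    n: genes in the region, k: query genes observed in the region.
    """
    total = comb(N, n)
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


def enriched_regions(in_set, N, K, alpha=0.05):
    """Brute-force scan of every contiguous gene region on one chromosome.

    in_set: list of booleans, one per gene, ordered by chromosomal position;
            True means the gene belongs to the query set.
    Returns (start, end, k, n, adjusted_p) tuples with adjusted_p <= alpha,
    sorted by adjusted p-value. Indices are inclusive gene ranks.
    """
    n_genes = len(in_set)

    # Prefix sums give the number of query genes in any region in O(1).
    prefix = [0]
    for flag in in_set:
        prefix.append(prefix[-1] + int(flag))

    # Assumption: Bonferroni correction over all contiguous regions.
    n_tests = n_genes * (n_genes + 1) // 2

    hits = []
    for start in range(n_genes):
        for end in range(start, n_genes):
            n = end - start + 1
            k = prefix[end + 1] - prefix[start]
            if k == 0:
                continue  # a region without query genes cannot be enriched
            p_adj = min(1.0, hypergeom_sf(k, N, K, n) * n_tests)
            if p_adj <= alpha:
                hits.append((start, end, k, n, p_adj))
    return sorted(hits, key=lambda hit: hit[4])
```

Calling `enriched_regions(flags, N, K)` with `flags` marking a tight cluster of query genes would rank the region spanning exactly that cluster first. The quadratic scan is only practical for small inputs, which is the kind of cost a query optimization such as the one described in the abstract is meant to avoid.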
Oxford University Press