Predicting survival from microarray data—a comparative study

HM Bøvelstad, S Nygård, HL Størvold, M Aldrin, Ø Borgan, A Frigessi, OC Lingjærde
Bioinformatics, 2007
Abstract
Motivation: Survival prediction from gene expression data and other high-dimensional genomic data has been the subject of much research in recent years. Such data pose the methodological problem of having many more gene expression values than individuals. In addition, the responses are censored survival times. Most of the proposed methods handle this by using Cox's proportional hazards model and obtaining parameter estimates by some dimension reduction or parameter shrinkage estimation technique. Using three well-known microarray gene expression data sets, we compare the prediction performance of seven such methods: univariate selection, forward stepwise selection, principal components regression (PCR), supervised principal components regression, partial least squares regression (PLS), ridge regression and the lasso.
Results: Statistical learning from subsets should be repeated several times in order to get a fair comparison between methods. Methods using coefficient shrinkage or linear combinations of the gene expression values have much better performance than the simple variable selection methods. For our data sets, ridge regression has the overall best performance.
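Illustration: as a rough sketch of what parameter shrinkage in Cox's proportional hazards model can look like when there are more genes than individuals, the following Python code minimizes a ridge-penalized negative log partial likelihood on simulated data. It is not the Matlab or R code referred to under Availability. The function names (cox_ridge_objective, fit_cox_ridge), the penalty parameter lam, the gradient-descent optimizer, the simulated data and the assumption of no tied survival times are all choices made for this example only.

```python
import numpy as np

def cox_ridge_objective(beta, X, time, event, lam):
    """Value and gradient of the ridge-penalized negative log partial likelihood.

    Illustrative assumptions: no tied survival times; event is True/1 for an
    observed death and False/0 for a censored observation.
    """
    order = np.argsort(-time)                  # longest follow-up first
    Xs = X[order]
    ds = event[order].astype(float)
    eta = Xs @ beta                            # linear predictors
    # With this ordering, the risk set of subject i is positions 0..i.
    log_risk = np.logaddexp.accumulate(eta)    # log of sum of exp(eta) over the risk set
    value = -np.sum(ds * (eta - log_risk)) + 0.5 * lam * (beta @ beta)
    # W[i, j] = exp(eta_j) / sum over risk set of i, and 0 outside the risk set.
    in_risk_set = np.tri(len(eta), dtype=bool)
    W = np.exp(np.where(in_risk_set, eta[None, :] - log_risk[:, None], -np.inf))
    risk_mean = W @ Xs                         # weighted covariate mean per risk set
    grad = -(ds[:, None] * (Xs - risk_mean)).sum(axis=0) + lam * beta
    return value, grad

def fit_cox_ridge(X, time, event, lam, n_iter=500):
    """Gradient descent with backtracking line search (a simple stand-in optimizer)."""
    beta = np.zeros(X.shape[1])
    value, grad = cox_ridge_objective(beta, X, time, event, lam)
    step = 1.0
    for _ in range(n_iter):
        if np.linalg.norm(grad) < 1e-6:
            break
        while True:                            # shrink the step until the objective decreases enough
            candidate = beta - step * grad
            new_value, new_grad = cox_ridge_objective(candidate, X, time, event, lam)
            if new_value <= value - 0.5 * step * (grad @ grad):
                break
            step *= 0.5
        beta, value, grad = candidate, new_value, new_grad
        step *= 2.0                            # allow the step to grow again
    return beta

# Simulated toy data with more covariates than individuals (p > n).
rng = np.random.default_rng(0)
n, p = 40, 100
X = rng.standard_normal((n, p))
true_beta = np.zeros(p)
true_beta[:5] = 1.0
time = rng.exponential(1.0 / np.exp(X @ true_beta))   # hazard proportional to exp(x'beta)
event = rng.random(n) < 0.8                            # toy censoring indicator

beta_weak = fit_cox_ridge(X, time, event, lam=1.0)
beta_strong = fit_cox_ridge(X, time, event, lam=1000.0)
print("coefficient norm, lam=1:   ", np.linalg.norm(beta_weak))
print("coefficient norm, lam=1000:", np.linalg.norm(beta_strong))
```

In practice the penalty parameter would be chosen from the training data, for example by cross-validation, and prediction performance would be assessed on held-out individuals over repeated random splits, in line with the Results above.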
Availability: Matlab and R code for the prediction methods are available at http://www.med.uio.no/imb/stat/bmms/software/microsurv/.
Contact: hegembo@math.uio.no
Oxford University Press