Survival associated pathway identification with group Lp penalized global AUC maximization
1 Greenebaum Cancer Center, University of Maryland, 22 South Greene Street, Baltimore, MD 21201, USA
2 Department of Epidemiology and Preventive Medicine, The University of Maryland, Baltimore, MD 21201, USA
3 Division of Biostatistics, Department of Pharmacology and Experimental Therapeutics, Thomas Jefferson University, Philadelphia, PA 19107, USA
4 Department of Oncology and Diagnosis Sciences, The University of Maryland Dental School, Baltimore, MD 21201, USA
Algorithms for Molecular Biology 2010, 5:30 doi:10.1186/1748-7188-5-30
Published: 16 August 2010
It has been demonstrated that genes in a cell do not act independently. They interact with one another to complete certain biological processes or to implement certain molecular functions. How to incorporate biological pathways or functional groups into a model and identify survival-associated gene pathways remains a challenging problem. In this paper, we propose a novel iterative gradient-based method for survival analysis with group Lp penalized global AUC summary maximization. Unlike the LASSO, the Lp penalty with p < 1 (of which the adaptive LASSO is a special implementation) is asymptotically unbiased and has oracle properties. We first extend the Lp penalty for individual gene identification to a group Lp penalty for pathway selection, and then develop a novel iterative gradient algorithm for penalized global AUC summary maximization (IGGAUCS). This method incorporates genetic pathways into global AUC summary maximization and identifies survival-associated pathways instead of individual genes. The tuning parameters are determined by 10-fold cross-validation on the training data only, and prediction performance is evaluated on separate test data. We apply the proposed method to survival outcome analysis with gene expression profiles and identify multiple pathways simultaneously. Experimental results with simulated and real gene expression data demonstrate that the proposed procedure can be used to identify important biological pathways related to the survival phenotype and to build a parsimonious model for predicting survival times.
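To make the shape of the objective concrete, the following is a minimal Python sketch of a group-penalized concordance criterion. It is an illustration under stated assumptions, not the IGGAUCS algorithm itself: the abstract does not give the exact form of the group Lp penalty or of the global AUC summary, so the sketch assumes the penalty is the sum over pathways of the group L2 norm raised to the power p, and it substitutes a Harrell-style concordance over comparable pairs for the global AUC summary. The function names `group_lp_penalty`, `concordance`, and `penalized_objective`, and the weight `lam`, are hypothetical.

```python
import math


def group_lp_penalty(w, groups, p=0.5):
    """Assumed penalty form: sum over groups of (L2 norm of the group) ** p."""
    total = 0.0
    for idx in groups:
        norm = math.sqrt(sum(w[j] ** 2 for j in idx))
        total += norm ** p
    return total


def concordance(scores, times, events):
    """Harrell-style concordance, used here as a stand-in for the global AUC summary.

    A pair (i, j) is comparable when subject i has an observed event and
    times[i] < times[j]. The pair is concordant when the subject who failed
    earlier has the higher risk score; tied scores count one half.
    """
    num, den = 0.0, 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                den += 1
                if scores[i] > scores[j]:
                    num += 1.0
                elif scores[i] == scores[j]:
                    num += 0.5
    return num / den if den else float("nan")


def penalized_objective(w, X, times, events, groups, lam=0.1, p=0.5):
    """Concordance of the linear risk score X @ w minus lam times the group penalty."""
    scores = [sum(wj * xj for wj, xj in zip(w, row)) for row in X]
    return concordance(scores, times, events) - lam * group_lp_penalty(w, groups, p)
```

Under this assumed form, a pathway whose coefficients are all zero contributes nothing to the penalty, which is why maximizing such an objective can drop whole pathways rather than individual genes. In practice `lam` and `p` would be chosen by cross-validation on the training data, as the abstract describes.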