Time course gene expression profiling is increasingly applied in biomedical research to monitor disease progression and the effects of drug treatments. An important goal of the computational analysis is to predict clinical outcomes from patients' microarray data collected over time. However, most existing prediction approaches use data from a single time point only. Here we propose a method for patient outcome prediction using longitudinal gene expression.
A dataset of 454 arrays was used in this study, comprising 429 arrays of blood samples from trauma patients at days 0, 1, and 4 post-injury and 25 arrays from healthy controls. The clinical outcome variable is time of recovery (ToR).
Given longitudinal microarray data and a survival-type outcome variable, the main objective is to exploit the temporal pattern of gene expression in feature selection for the prediction model. First, we find the projection of the longitudinal gene expression that is most correlated with the outcome variable. Second, we compute the Cox score for each feature using the projected data. Third, we form a reduced data matrix consisting of only those features whose score exceeds a threshold. Fourth, we compute the first or first few principal components of the reduced data matrix to form the predictors. Finally, we use these principal components in a Cox proportional hazards model to make predictions.
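The first four steps can be illustrated with a minimal NumPy sketch on synthetic data. Several choices here are assumptions, since the abstract does not specify them: the projection is taken per gene as the least-squares combination of time points against log outcome time (ignoring censoring), the threshold is set at the 90th percentile of absolute scores, and all function names (`project_time_course`, `cox_scores`, `supervised_pc`) are hypothetical. The final Cox model fit is omitted and would normally be done with a survival analysis library.

```python
import numpy as np

def project_time_course(X, y):
    """X: (patients, genes, time points); y: (patients,) outcome proxy.
    For each gene, find the linear combination of its time points that is
    most correlated with y (the least-squares direction). Assumption: this
    is one simple way to realize the projection step."""
    n, g, t = X.shape
    Xc = X - X.mean(axis=0, keepdims=True)
    yc = y - y.mean()
    Z = np.empty((n, g))
    W = np.empty((g, t))
    for j in range(g):
        w, *_ = np.linalg.lstsq(Xc[:, j, :], yc, rcond=None)
        W[j] = w
        Z[:, j] = Xc[:, j, :] @ w
    return Z, W

def cox_scores(Z, time, event):
    """Univariate Cox score statistic (evaluated at beta = 0) per column of Z."""
    U = np.zeros(Z.shape[1])
    V = np.zeros(Z.shape[1])
    for i in np.flatnonzero(event):
        at_risk = time >= time[i]            # risk set at this event time
        m1 = Z[at_risk].mean(axis=0)
        m2 = (Z[at_risk] ** 2).mean(axis=0)
        U += Z[i] - m1                       # observed minus risk-set mean
        V += m2 - m1 ** 2                    # risk-set variance
    return U / np.sqrt(V + 1e-12)

def supervised_pc(Z, scores, threshold, n_components=1):
    """Keep features with |score| > threshold, return leading principal components."""
    keep = np.abs(scores) > threshold
    if not keep.any():
        raise ValueError("no feature exceeds the threshold")
    R = Z[:, keep] - Z[:, keep].mean(axis=0)
    U, s, _ = np.linalg.svd(R, full_matrices=False)
    return U[:, :n_components] * s[:n_components], keep

# Synthetic example (not the trauma data): 60 patients, 50 genes, 3 time points.
rng = np.random.default_rng(0)
n, g, t = 60, 50, 3
X = rng.normal(size=(n, g, t))
# Hypothetical signal: change from first to last time point in the first 5 genes.
signal = (X[:, :5, 2] - X[:, :5, 0]).sum(axis=1)
time = np.exp(0.5 * signal + 0.3 * rng.normal(size=n))
event = rng.random(n) < 0.8  # roughly 20% of rows are treated as censored

Z, W = project_time_course(X, np.log(time))
scores = cox_scores(Z, time, event)
threshold = np.quantile(np.abs(scores), 0.9)  # assumed cutoff
pcs, keep = supervised_pc(Z, scores, threshold)
```

In practice the threshold and the number of components would likely be chosen by cross-validation, and the projection weights `W` would be estimated on training patients only and then applied to held-out patients.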
Our results on simulated data and real trauma data show that using the longitudinal structure in the data yields a significant improvement in prediction. On simulated data, the method achieved a highly significant log-rank test (p-value < 0.001) in prediction, much better than that obtained using data from individual time points (p-value ~ 0.1). On the trauma dataset of three time points (days 0, 1, and 4) after injury, the log-rank p-value of prediction was 0.004 using the proposed method. In comparison, the p-value was about 0.02 using data from individual time points. In addition, we observed that the longitudinal prediction method yields more stable results.
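For readers unfamiliar with the log-rank p-values quoted here, the following is a minimal sketch of a two-group log-rank test. It assumes that patients are split into two groups at the median of a predicted risk score, which the abstract does not state; the function name `logrank_test` and the toy data are hypothetical.

```python
import math
import numpy as np

def logrank_test(time, event, group):
    """Two-group log-rank test. `group` is boolean (True = group 1).
    Returns the chi-square statistic and its p-value (1 degree of freedom)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    group = np.asarray(group, dtype=bool)
    obs_minus_exp = 0.0
    var = 0.0
    for t in np.unique(time[event]):
        at_risk = time >= t
        n = at_risk.sum()                    # patients at risk
        n1 = (at_risk & group).sum()         # of which in group 1
        happened = event & (time == t)
        d = happened.sum()                   # events at time t
        d1 = (happened & group).sum()        # of which in group 1
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            # Hypergeometric variance of d1 given the margins
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if var == 0:
        return 0.0, 1.0
    stat = obs_minus_exp ** 2 / var
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2))
    return stat, math.erfc(math.sqrt(stat / 2))

# Toy example: higher predicted risk corresponds to earlier events.
risk = np.array([0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0])
time = np.arange(1, 11)
event = np.ones(10, dtype=bool)
stat, p = logrank_test(time, event, risk > np.median(risk))
```

A small p-value indicates that the two predicted groups have clearly different time-to-event curves, which is how the prediction quality is summarized above.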
We have presented a method that uses temporal gene expression patterns for survival-type prediction of patient clinical outcomes. In our analysis of simulated data and trauma patient data, we showed that better and more robust predictions of outcomes can be achieved using time course gene expression information.