A New Approach for Predicting the Value of Gene Expression: Two-way Collaborative Filtering


Tuncay Bayrak and Hasan Oğul*   Pages 1-11 (11)


Background: Predicting the exact value of gene expression in a given condition is a challenging problem in computational systems biology. Only a limited number of studies in this area have provided solutions, and these predict expression as a particular pattern rather than as an exact value, regardless of how effectively they do so. However, the exact value of the measurement is usually needed for further meta-data analysis.

Method: We treat the problem as a regression task in which a feature representation of the gene under consideration is fed into a trained model to predict a continuous variable, namely its exact expression level. To support this task, we introduce a novel feature representation scheme based on two-way collaborative filtering. Our main argument is that the expressions of other genes in the current condition are as important as the expression of the current gene in other conditions. For regression, we use linear regression and a recently popular method, the Relevance Vector Machine (RVM). Pearson and Spearman correlation coefficients and Root Mean Squared Error are used for evaluation. The effects of regression model type, RVM kernel function, and parameters are analyzed on gene expression profiling data comprising a set of prostate cancer samples.
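To make the two-way idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes the expression data is a genes-by-conditions matrix and that the feature vector for one entry is simply the current gene's values in the other conditions followed by the other genes' values in the current condition; the abstract does not specify the exact construction. The function names (two_way_features, pearson, spearman, rmse) and the toy matrix are hypothetical, and the three evaluation measures named above are written in plain Python so the block has no dependencies.

```python
import math

def two_way_features(matrix, gene, cond):
    """Assumed feature layout for entry (gene, cond): the gene's values in
    the other conditions, then the other genes' values in this condition."""
    row_part = [v for j, v in enumerate(matrix[gene]) if j != cond]
    col_part = [matrix[i][cond] for i in range(len(matrix)) if i != gene]
    return row_part + col_part

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))

def rmse(y_true, y_pred):
    """Root Mean Squared Error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n)

# Toy matrix with made-up values: rows are genes, columns are conditions.
expr = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
]
# Features for gene 1 in condition 2; the regression target is expr[1][2].
features = two_way_features(expr, gene=1, cond=2)
```

In this sketch each feature vector would be paired with its held-out target value and passed to a regressor such as linear regression or an RVM; predictions on held-out entries could then be scored with pearson, spearman, and rmse.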

Results: In addition to the promising results of the experimental studies, the findings show that integrating data from another disease type, colon cancer in our case, can significantly improve the prediction performance of the regression model.

Conclusion: The results also show that the proposed feature representation approach and the RVM regression model are promising for many machine learning problems in microarray and high-throughput sequencing analysis.


Relevance vector machine, Two-way collaborative filtering, Gene expression predicting, Regression, Microarray


Computer Engineering Department, Baskent University, Eskisehir Road 20. Km Baglica Campus, 06560, Ankara; Faculty of Computer Science, Østfold University College, P.O. Box 700, 1757 Halden

