Yunyun Liang and Shengli Zhang*
The function of a protein is closely related to its subcellular location, so predicting protein subcellular localization is a meaningful and challenging task. Apoptosis proteins play a key role in the development and homeostasis of the organism, and are important for understanding the mechanisms of cell proliferation and death. In this paper, we develop a novel model for predicting apoptosis protein subcellular location that combines a PSSM-based second-order moving average descriptor, nonnegative matrix factorization based on the Kullback-Leibler divergence, and over-sampling algorithms. The model is named SOMAP-KLNMF-OS and is constructed on the ZD98, ZW225 and CL317 benchmark datasets. A support vector machine is adopted as the classifier, and the bias-free jackknife test is used to evaluate accuracy. Our prediction system achieves favorable overall accuracy on the three datasets and outperforms the other models listed for comparison. The results show that our model offers a high-throughput tool for identifying apoptosis protein subcellular localization.
Subcellular localization, Position-specific scoring matrix, Second-order moving average, Nonnegative matrix factorization, Over-sampling
School of Science, Xi’an Polytechnic University, Xi’an 710048, P. R. China; School of Mathematics and Statistics, Xidian University, Xi’an 710071, P. R. China
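As a rough illustration of one component named in the abstract, nonnegative matrix factorization under the Kullback-Leibler divergence, the following is a minimal sketch using the standard multiplicative update rules. It is not the authors' implementation: the function names `kl_nmf` and `kl_divergence`, the rank, the iteration count, and the random input matrix are all assumptions made for illustration, and the abstract does not state how the factorization is applied to the PSSM-derived features.

```python
import numpy as np


def kl_divergence(V, WH, eps=1e-12):
    """Generalized KL divergence D(V || WH) between nonnegative matrices."""
    V = np.asarray(V, dtype=float)
    return float(np.sum(V * np.log((V + eps) / (WH + eps)) - V + WH))


def kl_nmf(V, rank, n_iter=200, seed=0, eps=1e-12):
    """Factor V (n x m, nonnegative) as W @ H with W (n x rank), H (rank x m).

    Uses multiplicative updates for the KL objective. All parameter
    defaults here are illustrative assumptions, not values from the paper.
    """
    V = np.asarray(V, dtype=float)
    rng = np.random.default_rng(seed)
    n, m = V.shape
    # Strictly positive initialization keeps the updates well defined.
    W = rng.random((n, rank)) + 0.1
    H = rng.random((rank, m)) + 0.1
    history = []
    for _ in range(n_iter):
        # Update H with W fixed.
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        # Update W with the new H fixed.
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
        history.append(kl_divergence(V, W @ H))
    return W, H, history


# Placeholder data: 20 hypothetical samples with 12 nonnegative features each.
V = np.random.default_rng(1).random((20, 12))
W, H, history = kl_nmf(V, rank=4)
print("KL divergence after first and last iteration:", history[0], history[-1])
```

In a sketch like this, the rows of `W` could serve as reduced-dimension representations of the samples, which would then be passed to a classifier; whether the paper uses the factors in this way is not stated in the abstract.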