Article Details


Identification of Cancerlectins by Using Cascade Linear Discriminant Analysis and Optimal G-Gap Tripeptide Composition

Author(s):

Liangwei Yang, Hui Gao*, Keyu Wu, Haotian Zhang, Changyu Li and Lixia Tang

Abstract:


Background: Lectins are a diverse group of glycoproteins or glycoconjugate proteins that can be extracted from plants, invertebrates and higher animals. Cancerlectins are a class of lectins that play a key role in the interactions between tumor cells and are being employed as therapeutic agents. A full understanding of cancerlectins is important because it can inform the direction of future cancer therapy.

Objective: To develop an accurate, practical and time-saving tool for identifying cancerlectins. A novel sequence-based method is proposed, together with a webserver that provides access to it.

Method: First, protein features were extracted using a new feature construction scheme termed g-gap tripeptide composition. A proposed cascade linear discriminant analysis (Cascade LDA) was then used to alleviate the difficulties of high dimensionality, with analysis of variance (ANOVA) as the feature importance criterion. Finally, a support vector machine (SVM) was used as the classifier to identify cancerlectins.
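The abstract does not define g-gap tripeptide composition, so the following is only a minimal sketch under a stated assumption: a g-gap tripeptide is taken to be three residues at positions i, i + g + 1 and i + 2(g + 1), and the feature vector holds the normalized frequency of each of the 20^3 = 8000 possible tripeptides. The function name `g_gap_tripeptide_composition` and the constant `AMINO_ACIDS` are hypothetical, and the authors' exact definition may differ.

```python
from itertools import product

# The 20 standard amino acids, in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# All 8000 tripeptides in a fixed order, so every sequence maps to the same layout.
TRIPEPTIDES = ["".join(t) for t in product(AMINO_ACIDS, repeat=3)]


def g_gap_tripeptide_composition(sequence, g=0):
    """Return an 8000-dimensional frequency vector for one protein sequence.

    Assumption (not taken from the abstract): consecutive residues of a
    g-gap tripeptide are separated by g intervening positions.
    """
    step = g + 1
    counts = dict.fromkeys(TRIPEPTIDES, 0)
    total = 0
    for i in range(len(sequence) - 2 * step):
        tri = sequence[i] + sequence[i + step] + sequence[i + 2 * step]
        # Skip windows containing non-standard residues such as 'X'.
        if tri in counts:
            counts[tri] += 1
            total += 1
    if total == 0:
        return [0.0] * len(TRIPEPTIDES)
    return [counts[t] / total for t in TRIPEPTIDES]


features = g_gap_tripeptide_composition("ACDACD", g=0)
print(len(features), features[TRIPEPTIDES.index("ACD")])
```

With g = 0 this reduces to the ordinary tripeptide composition. A vector of this size is much larger than a typical training set, which is the high-dimensionality problem that the feature selection and reduction steps described above are meant to address.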

Results: In jackknife cross-validation, the proposed method achieved an accuracy of 91.34%, a sensitivity of 89.89%, a specificity of 92.48% and a Matthews correlation coefficient of 0.8318 using only 13 fusion features. This result is superior to those of other published methods in this domain.
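For readers unfamiliar with the four reported metrics, the sketch below computes them from confusion-matrix counts using their standard definitions. The function name `classification_metrics` and the example counts are hypothetical and chosen only for illustration; they are not the study's data and do not reproduce the figures above.

```python
import math


def classification_metrics(tp, fn, tn, fp):
    """Compute sensitivity, specificity, accuracy and MCC from raw counts."""
    sensitivity = tp / (tp + fn)   # fraction of positives correctly identified
    specificity = tn / (tn + fp)   # fraction of negatives correctly identified
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally reported as 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Illustrative counts only (hypothetical, not from the paper).
metrics = classification_metrics(tp=8, fn=2, tn=9, fp=1)
print(metrics)
```

In a jackknife (leave-one-out) test, each sequence is held out once and predicted by a model trained on all the others, so the counts passed to such a function would be accumulated over all held-out predictions.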

Conclusion: This study proposes a new method based only on the primary structure of proteins, and the experimental results show that it could be a promising tool for identifying cancerlectins. An open-access webserver is made available to facilitate related work.

Keywords:

cancerlectin, Cascade LDA, g-gap tripeptide composition, ANOVA.

Affiliation:

Center for Informational Biology, School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu; Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu


