Komal Patil* and Usha Chouhan
Protein fold prediction is a fundamental step in structural bioinformatics. The tertiary structure of a protein determines its function, and fold prediction plays an important role in predicting that tertiary structure. A protein fold is the arrangement of secondary structure elements relative to each other in space. Numerous studies have been carried out to date by research groups worldwide, using combinations of different benchmark datasets, descriptors, features, and classification techniques. In this study, we bring these contributions together, analyze them, and compare the techniques they use. Features are derived from the protein sequence, its secondary structure, the physicochemical properties of amino acids, domain composition, the Position-Specific Scoring Matrix (PSSM), profiles, and threading techniques. Combining these features can improve classification accuracy to a large extent. This survey is intended to help readers identify the most suitable feature set and classification technique for the multi-class protein fold classification problem.
Protein Fold, Protein Features, Descriptors, Data mining, Machine learning, Classification
Department of Mathematics, Maulana Azad National Institute of Technology (MANIT), Bhopal 462003, M.P.