List of Publications That Used the Features in PROFEAT

Feature Group 1,2 [G1, G2]: Amino Acid Composition, Dipeptide Composition

  • Reczko, M. and Bohr, H. (1994) The DEF data base of sequence based protein fold class predictions. Nucleic Acids Res, 22, 3616-3619.

  • Grassmann, J., Reczko, M., Suhai, S. and Edler, L. (1999) Protein fold class prediction: new methods of statistical classification. Proc Int Conf Intell Syst Mol Biol, 106-112.

  • Hua, S. and Sun, Z. (2001) Support vector machine approach for protein subcellular localization prediction. Bioinformatics, 17, 721-728.

  • Chou, K.C. and Cai, Y.D. (2002) Using functional domain composition and support vector machines for prediction of protein subcellular location. J Biol Chem, 277, 45765-45769.

  • Bhasin, M. and Raghava, G.P. (2004) Classification of nuclear receptors based on amino acid composition and dipeptide composition. J Biol Chem, 279, 23262-23266.

Feature Group 3 [G3]: Autocorrelation Descriptor

  • Feng, Z.P. and Zhang, C.T. (2000) Prediction of membrane protein types based on the hydrophobic index of amino acids. J Protein Chem, 19, 269-275.

  • Lin, Z. and Pan, X.M. (2001) Accurate prediction of protein secondary structural content. J Protein Chem, 20, 217-220.

  • Horne, D.S. (1988) Prediction of protein helix content from an autocorrelation analysis of sequence hydrophobicities. Biopolymers, 27, 451-477.

  • Sokal, R.R. and Thomson, B.A. (2006) Population structure inferred by local spatial autocorrelation: an example from an Amerindian tribal population. Am J Phys Anthropol, 129, 121-131.

Feature Group 4 [G4]: Composition, Transition, Distribution

  • Dubchak, I., Muchnik, I., Holbrook, S.R. and Kim, S.H. (1995) Prediction of protein folding class using global description of amino acid sequence. Proc Natl Acad Sci U S A, 92, 8700-8704.

  • Dubchak, I., Muchnik, I., Mayor, C., Dralyuk, I. and Kim, S.H. (1999) Recognition of a protein fold in the context of the Structural Classification of Proteins (SCOP) classification. Proteins, 35, 401-407.

  • Bock, J.R. and Gough, D.A. (2001) Predicting protein-protein interactions from primary structure. Bioinformatics, 17, 455-460.

  • Cai, C.Z., Han, L.Y., Ji, Z.L., Chen, X. and Chen, Y.Z. (2003) SVM-Prot: Web-based support vector machine software for functional classification of a protein from its primary sequence. Nucleic Acids Res, 31, 3692-3697.

  • Cai, C.Z., Han, L.Y., Ji, Z.L. and Chen, Y.Z. (2004) Enzyme family classification by support vector machines. Proteins, 55, 66-76.

  • Han, L.Y., Cai, C.Z., Lo, S.L., Chung, M.C. and Chen, Y.Z. (2004) Prediction of RNA-binding proteins from primary sequence by a support vector machine approach. RNA, 10, 355-368.

  • Han, L.Y., Cai, C.Z., Ji, Z.L., Cao, Z.W., Cui, J. and Chen, Y.Z. (2004) Predicting functional family of novel enzymes irrespective of sequence similarity: a statistical learning approach. Nucleic Acids Res, 32, 6437-6444.

  • Lo, S.L., Cai, C.Z., Chen, Y.Z. and Chung, M.C. (2005) Effect of training datasets on support vector machine prediction of protein-protein interactions. Proteomics, 5, 876-884.

  • Lin, H.H., Han, L.Y., Cai, C.Z., Ji, Z.L. and Chen, Y.Z. (2006) Prediction of transporter family from protein sequence by support vector machine approach. Proteins, 62, 218-231.

  • H.H. Lin, L.Y. Han, H.L. Zhang, C.J. Zheng, B. Xie, and Y.Z. Chen. (2006) Prediction of the Functional Class of Lipid-Binding Proteins from Sequence Derived Properties Irrespective of Sequence Similarity. J. Lipid Res. 47(4):824-31.

  • H.H. Lin, L.Y. Han, H.L. Zhang, C.J. Zheng, B. Xie, and Y.Z. Chen. (2006) Prediction of the Functional Class of Metal-Binding Proteins from Sequence Derived Physicochemical Properties by Support Vector Machine Approach. BMC Bioinformatics 7(Suppl 5): S13.

  • Cui, J., Han, L.Y., Lin, H.H., Zhang, H.L., Tang, Z.Q., Zheng, C.J., Cao, Z.W. and Chen, Y.Z. (2007) Prediction of MHC-Binding Peptides of Flexible Lengths from Sequence-Derived Structural and Physicochemical Properties. Mol. Immunol. 44: 866-877.

  • J. Cui, L.Y. Han, H.H. Lin, Z.Q. Tang, C.J. Zheng, Z.W. Cao, and Y.Z. Chen (2007). Computer Prediction of Allergen Proteins from Sequence-Derived Protein Structural and Physicochemical Properties. Mol. Immunol. 44(4): 514-520.

  • L.Y. Han, C.J. Zheng, B. Xie, J. Jia, X.H. Ma, F. Zhu, H.H. Lin, X. Chen, and Y.Z. Chen. (2007) Support vector machines approach for predicting druggable proteins: recent progress in its exploration and investigation of its usefulness. Drug Discovery Today 12(7-8): 304-313.

Feature Group 5 [G5]: Quasi-Sequence-Order (QSO) Descriptors

  • Chou, K.C. (2000) Prediction of protein subcellular locations by incorporating quasi-sequence-order effect. Biochem Biophys Res Commun, 278, 477-483.

  • Chou, K.C. and Cai, Y.D. (2004) Prediction of protein subcellular locations by GO-FunD-PseAA predictor. Biochem Biophys Res Commun, 320, 1236-1239.

Feature Group 6 [G6]: Pseudo Amino Acid Composition Descriptor

  • Cai YD, Chou KC. (2005) Predicting enzyme subclass by functional domain composition and pseudo amino acid composition. J Proteome Res. 4(3):967-71.

  • Gao Y, Shao S, Xiao X, Ding Y, Huang Y, Huang Z, Chou KC. (2005) Using pseudo amino acid composition to predict protein subcellular location: approached with Lyapunov index, Bessel function, and Chebyshev filter. Amino Acids. 28(4):373-6.

  • Liu H, Yang J, Wang M, Xue L, Chou KC. (2005) Using Fourier spectrum analysis and pseudo amino acid composition for prediction of membrane protein types. Protein J. 24(6):385-9.

  • Shen HB, Chou KC. (2005) Predicting protein subnuclear location with optimized evidence-theoretic K-nearest classifier and pseudo amino acid composition. Biochem Biophys Res Commun. 337(3):752-6.

  • Xiao X, Shao S, Ding Y, Huang Z, Chou KC. (2006) Using cellular automata images and pseudo amino acid composition to predict protein subcellular location. Amino Acids. 30(1):49-54.

  • Cai YD, Chou KC. (2006) Predicting membrane protein type by functional domain composition and pseudo-amino acid composition. J Theor Biol. 238(2):395-400.

  • Shen HB, Yang J, Chou KC. (2006) Fuzzy KNN for predicting membrane protein types from pseudo-amino acid composition. J Theor Biol. 240(1):9-13.

  • Chou KC, Cai YD. (2006) Predicting protein-protein interactions from sequences in a hybridization space. J Proteome Res. 5(2):316-22.

  • Zhou GP, Cai YD. (2006) Predicting protease types by hybridizing gene ontology and pseudo amino acid composition. Proteins. 63(3):681-4.

  • Xiao X, Shao SH, Huang ZD, Chou KC. (2006) Using pseudo amino acid composition to predict protein structural classes: approached with complexity measure factor. J Comput Chem. 27(4):478-82.

  • Zhang SW, Pan Q, Zhang HC, Shao ZC, Shi JY. (2006) Prediction of protein homo-oligomer types by pseudo amino acid composition: Approached with an improved feature extraction and Naive Bayes Feature Fusion. Amino Acids. 30(4):461-8.

  • Zhang T, Ding Y, Chou KC. (2006) Prediction of protein subcellular location using hydrophobic patterns of amino acid sequence. Comput Biol Chem. 30(5):367-71.

  • Chen C, Zhou X, Tian Y, Zou X, Cai P. (2006) Predicting protein structural class with pseudo-amino acid composition and support vector machine fusion network. Anal Biochem. 357(1):116-21.

  • Chen C, Tian YX, Zou XY, Cai PX, Mo JY. (2006) Using pseudo-amino acid composition and support vector machine to predict protein structural class. J Theor Biol. 243(3):444-8.

  • Mondal S, Bhavna R, Mohan Babu R, Ramakumar S. (2006) Pseudo amino acid composition and multi-class support vector machines approach for conotoxin superfamily classification. J Theor Biol. 243(2):252-60.

  • Shen HB, Chou KC. (2007) Using ensemble classifier to identify membrane protein types. Amino Acids. 32(4):483-8.

  • Lin H, Li QZ. (2007) Using pseudo amino acid composition to predict protein structural class: approached by incorporating 400 dipeptide components. J Comput Chem. 28(9):1463-6.

Feature Group 7 [G7]: Amphiphilic Pseudo-Amino Acid Composition

  • Chou KC. (2005) Using amphiphilic pseudo amino acid composition to predict enzyme subfamily classes. Bioinformatics, 21(1):10-19.

  • Ding H, Luo L, Lin H. (2009). Prediction of cell wall lytic enzymes using Chou's amphiphilic pseudo amino acid composition. Protein Pept Lett. 16(4):351-5.

  • Zhou XB, Chen C, Li ZC, Zou XY. (2007) Using Chou’s amphiphilic pseudo-amino acid composition and support vector machine for prediction of enzyme subfamily classes. Journal of Theoretical Biology 248(3): 546–551.

  • Huang WL, Tung CW, Huang HL, Ho SY. (2009) Predicting protein subnuclear localization using GO-amino-acid composition features. Biosystems. 98(2):73-9.

  • Khan A, Majid A, Choi TS. (2010) Predicting protein subcellular location: exploiting amino acid based sequence of feature spaces and fusion of diverse classifiers. Amino Acids. 38(1):347-50.

  • Huang WL, Tung CW, Ho SW, Hwang SF, Ho SY. (2008) ProLoc-GO: utilizing informative Gene Ontology terms for sequence-based prediction of protein subcellular localization. BMC Bioinformatics. 9:80.

  • Zhang GY, Fang BS. (2008) Predicting the cofactors of oxidoreductases based on amino acid composition distribution and Chou's amphiphilic pseudo-amino acid composition. J Theor Biol. 253(2):310-5.

  • Chou KC, Shen HB. (2006) Predicting protein subcellular location by fusing multiple classifiers. J Cell Biochem. 99(2):517-27.

  • Chou KC, Cai YD. (2005) Prediction of membrane protein types by incorporating amphipathic effects. J Chem Inf Model. 45(2):407-13.

Feature Group 8 [G8]: Topological Descriptors at Atomic Level

  • Philip D. Mosier, Anne E. Counterman and Peter C. Jurs (2002) Prediction of peptide ion collision cross sections from topological molecular structure and amino acid parameters. Anal Chem, 74:1360-1370.

  • Mao S, Huo DD, Mei H, Liang GZ, Zhang M, Li ZL. (2008) New descriptors of amino acids and its applications to peptide quantitative structure-activity relationships. Chinese J. Struct. Chem. 27:1375-1383.

  • Zhao C, Zhang H, Luan F, Zhang R, Liu M, Hu Z, Fan B. (2007) QSAR method for prediction of protein-peptide binding affinity: application to MHC class I molecule HLA-A*0201. J Mol Graph Model. 26(1):246-54.

  • Todeschini R, Consonni V. Handbook of Molecular Descriptors; Wiley-VCH: Weinheim, 2000.

Feature Group 9 [G9]: Total Amino Acid Properties

  • Gromiha MM, Suwa M. (2006) Influence of amino acid properties for discriminating outer membrane proteins at better accuracy. Biochimica et Biophysica Acta, 1764, 1493-1497.

  • Huang LT, Gromiha MM. (2008) Analysis and Prediction of Protein Folding Rates Using Quadratic Response Surface Models. J Comput Chem 29: 1675–1683.

  • Gromiha MM. (2003) Importance of Native-State Topology for Determining the Folding Rate of Two-State Proteins. J. Chem. Inf. Comput. Sci. 43(5):1481-1485.

Feature Group 10,11 [G10, G11]: Network Descriptor

  • Barabási AL, Oltvai ZN. (2004) Network Biology: Understanding the Cell's Functional Organization. Nat Rev Genet. 5(2):101-13.

  • Barabási AL, Gulbahce N, Loscalzo J. (2011) Network Medicine: a Network-Based Approach to Human Disease. Nat Rev Genet. 12(1):56-68.

  • Hopkins AL. (2008) Network Pharmacology: The Next Paradigm in Drug Discovery. Nat Chem Biol. 4(11):682-90.

  • Yildirim MA, Goh KI, et al. (2007) Drug-target network. Nat Biotechnol. 25(10):1119-26.

  • Hopkins AL. (2007) Network Pharmacology. Nat Biotechnol. 25(10):1110-1.

  • Goh KI, Cusick ME, Valle D, Vidal M, Barabási AL. (2007) The Human Disease Network. Proc Natl Acad Sci USA. 104(21):8685-90.

  • Stelzl U, Worm U, et al. (2005) A Human Protein-Protein Interaction Network: a Resource for Annotating the Proteome. Cell. 122(6):957-68.

  • Pujol A, Mosca R, Farrés J, Aloy P. (2010) Unveiling the Role of Network and Systems Biology in Drug Discovery. Trends Pharmacol Sci. 31(3):115-23.

  • Chandra N, Padiadpu J. (2013) Network Approaches to Drug Discovery. Expert Opin Drug Discov. 8(1):7-20.

  • Yook SH, Oltvai ZN, Barabási AL. (2004) Functional and Topological Characterization of Protein Interaction Networks. Proteomics. 4(4):928-42.

  • Dong J, Horvath S. (2007) Understanding Network Concepts in Modules. BMC Syst Biol. 1:24.

  • Rubinov M, Sporns O. (2010) Complex Network Measures of Brain Connectivity: Uses and Interpretations. Neuroimage. 52(3):1059-69.

  • Pritykin Y, Singh M. (2013) Simple topological features reflect dynamics and modularity in protein interaction networks. PLoS Comput Biol. 9(10):e1003243.

  • Haiyuan Yu, Philip M Kim, Mark Gerstein, et al. (2007) The Importance of Bottlenecks in Protein Networks: Correlation with Gene Essentiality and Expression Dynamics. PLoS Comput Biol. 3(4): e59.

  • Dyer MD, Murali TM, Sobral BW. (2008) The Landscape of Human Proteins Interacting With Viruses and Other Pathogens. PLoS Pathog. 4(2):e32.

  • Joyce KE, Laurienti PJ, Burdette JH, Hayasaka S. (2010) A New Measure of Centrality for Brain Networks. PLoS One. 5(8):e12200.

  • Zhang B, Horvath S. (2005) A General Framework for Weighted Gene Co-expression Network Analysis. Stat Appl Genet Mol Biol. 4: Article17.

  • Emig D, Ivliev A, Pustovalova O, Nikolsky Y, Bessarabova M. (2013) Drug Target Prediction and Repositioning Using an Integrated Network-Based Approach. PLoS One. 8(4):e60618.

  • Hsu CL, Huang YH, Hsu CT, Yang UC. (2011) Prioritizing Disease Candidate Genes by a Gene Interconnectedness-Based Approach. BMC Genomics. 12 Suppl 3:S25.

  • Zhu C, Kushwaha A, Berman K, Jegga AG. (2012) A Vertex Similarity-Based Framework to Discover and Rank Orphan Disease-Related Genes. BMC Syst Biol. 6 Suppl 3:S8.

  • David F. Gleich. (2015) PageRank Beyond the Web. SIAM Rev. 57(3), 321-363.

  • Koschützki D, Schreiber F. (2008) Centrality Analysis Methods for Biological Networks and Their Application to Gene Regulatory Networks. Gene Regul Syst Bio. 2:193-201.

  • Bánky D, Iván G, Grolmusz V. (2013) Equal Opportunity for Low-Degree Network Nodes: a PageRank-Based Method for Protein Target Identification in Metabolic Graphs. PLoS One. 8(1):e54204.

  • Iván G, Grolmusz V. (2010) When the Web Meets the Cell: Using Personalized PageRank for Analyzing Protein Interaction Networks. Bioinformatics. 27 (3): 405-407.

Department of Computational Science | National University of Singapore | Blk S17, 3 Science Drive 2, Singapore 117543