CMASA: an accurate algorithm for detecting local protein structural similarity and its application to enzyme catalytic site annotation

Gong-Hua Li; Jing-Fei Huang
January 2010
BMC Bioinformatics;2010, Vol. 11, p439
Academic Journal
Background: The rapid development of structural genomics has resulted in many proteins of "unknown function" being deposited in the Protein Data Bank (PDB); predicting the function of these proteins has therefore become a challenge for structural bioinformatics. Several sequence-based and structure-based methods have been developed to predict protein function, but they still need improvement in accuracy, sensitivity, and computational speed. Here, an accurate algorithm, CMASA (Contact MAtrix based local Structural Alignment algorithm), has been developed to predict the unknown functions of proteins based on local protein structural similarity. The algorithm was evaluated on a test set of 164 enzyme families and compared with other methods.

Results: The evaluation shows that CMASA is highly accurate (0.96), sensitive (0.86), and fast enough for large-scale functional annotation. Compared with both sequence-based and global structure-based methods, CMASA can not only find remotely homologous proteins but also detect active site convergence. Compared with other local structure comparison-based methods, CMASA performs better than both FFF (a method using geometry to predict protein function) and SPASM (a local structure alignment method); it is also more sensitive than PINTS and more accurate than JESS (both local structure alignment methods). CMASA was applied to annotate the enzyme catalytic sites of the nonredundant PDB, and at least 166 putative catalytic sites were suggested; these sites cannot be observed by the Catalytic Site Atlas (CSA).

Conclusions: CMASA is an accurate algorithm for detecting local protein structural similarity, and it holds several advantages in predicting enzyme active sites. It can be used in large-scale enzyme active site annotation. CMASA is available through a mail-based server ( htm).
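
The abstract does not describe how CMASA builds or compares its contact matrices. Purely as an illustration of the general notion of a contact matrix (not the authors' implementation), the sketch below marks two points as "in contact" when they lie within a distance cutoff. The function name `contact_matrix`, the 8.0 angstrom cutoff, the use of one representative point per residue, and the coordinates are all assumptions made for this example.

```python
import math


def contact_matrix(coords, cutoff=8.0):
    """Return a symmetric 0/1 matrix: 1 where two distinct points are within `cutoff`.

    `coords` is a list of (x, y, z) tuples, one hypothetical representative
    point per residue. The cutoff value is an assumption, not taken from CMASA.
    """
    n = len(coords)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                matrix[i][j] = matrix[j][i] = 1
    return matrix


# Hypothetical coordinates in angstroms for three residues of a candidate site.
site = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (20.0, 0.0, 0.0)]
cm = contact_matrix(site)
for row in cm:
    print(row)
```

Two local sites with similar residue arrangements would yield similar matrices of this kind, which is the intuition behind comparing local structures without a global alignment; how CMASA actually scores that similarity is not stated in the abstract.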


Related Articles

  • TOPSAN: a collaborative annotation environment for structural genomics. Weekes, Dana; Krishna, S. Sri; Bakolitsa, Constantina; Wilson, Ian A.; Godzik, Adam; Wooley, John // BMC Bioinformatics;2010, Vol. 11, p426 

    Background: Many protein structures determined in high-throughput structural genomics centers, despite their significant novelty and importance, are available only as PDB depositions and are not accompanied by a peer-reviewed manuscript. Because of this they are not accessible by the standard...

  • Cloud4Psi: cloud computing for 3D protein structure similarity searching. Mrozek, Dariusz; Małysiak-Mrozek, Bożena; Kłapciński, Artur // Bioinformatics;Oct2014, Vol. 30 Issue 19, p2822 

    Summary: Popular methods for 3D protein structure similarity searching, especially those that generate high-quality alignments such as Combinatorial Extension (CE) and Flexible structure Alignment by Chaining Aligned fragment pairs allowing Twists (FATCAT) are still time consuming. As a...

  • ETAscape: analyzing protein networks to predict enzymatic function and substrates in Cytoscape. Bachman, Benjamin J.; Venner, Eric; Lua, Rhonald C.; Erdin, Serkan; Lichtarge, Olivier // Bioinformatics;Aug2012, Vol. 28 Issue 16, p2186 

    Summary: Most proteins lack experimentally validated functions. To address this problem, we implemented the Evolutionary Trace Annotation (ETA) method in the Cytoscape network visualization environment. The result is the ETAscape plugin, which builds a structural genomics network based on local...

  • Fast and accurate protein substructure searching with simulated annealing and GPUs. Stivala, Alex D.; Stuckey, Peter J.; Wirth, Anthony I. // BMC Bioinformatics;2010, Vol. 11, p446 

    Background: Searching a database of protein structures for matches to a query structure, or occurrences of a structural motif, is an important task in structural biology and bioinformatics. While there are many existing methods for structural similarity searching, faster and more accurate...

  • Algorithms for optimal protein structure alignment. Poleksic, Aleksandar // Bioinformatics;Nov2009, Vol. 25 Issue 21, p2751 

    Motivation: Structural alignment is an important tool for understanding the evolutionary relationships between proteins. However, finding the best pairwise structural alignment is difficult, due to the infinite number of possible superpositions of two structures. Unlike the sequence alignment...

  • Towards the development of standardized methods for comparison, ranking and evaluation of structure alignments. Slater, Alex W.; Castellanos, Javier I.; Sippl, Manfred J.; Melo, Francisco // Bioinformatics;Jan2013, Vol. 29 Issue 1, p47 

    Motivation: Pairwise alignment of protein structures is a fundamental task in structural bioinformatics. There are numerous computer programs in the public domain that produce alignments for a given pair of protein structures, but the results obtained by the various programs generally differ...

  • TOPS++FATCAT: Fast flexible structural alignment using constraints derived from TOPS+ Strings Model. Veeramalai, Mallika; Yuzhen Ye; Godzik, Adam // BMC Bioinformatics;2008, Vol. 9, Special section p1 

    Background: Protein structure analysis and comparison are major challenges in structural bioinformatics. Despite the existence of many tools and algorithms, very few of them have managed to capture the intuitive understanding of protein structures developed in structural biology, especially in...

  • Multiple structure alignment with msTALI. Shealy, Paul; Valafar, Homayoun // BMC Bioinformatics;2012, Vol. 13 Issue 1, p105 

    Background: Multiple structure alignments have received increasing attention in recent years as an alternative to multiple sequence alignments. Although multiple structure alignment algorithms can potentially be applied to a number of problems, they have primarily been used for protein core...

  • Fast protein structure alignment using Gaussian overlap scoring of backbone peptide fragment similarity. Ritchie, David W.; Ghoorah, Anisah W.; Mavridis, Lazaros; Venkatraman, Vishwesh // Bioinformatics;Dec2012, Vol. 28 Issue 24, p3274 

    Motivation: Aligning and comparing protein structures is important for understanding their evolutionary and functional relationships. With the rapid growth of protein structure databases in recent years, the need to align, superpose and compare protein structures rapidly and accurately has never...

