Introducing Sequence-Order Constraint into Prediction of Protein Binding Sites with Automatically Extracted Templates

Yi-Zhong Weng; Chien-Kang Huang; Yu-Feng Huang; Chi-Yuan Yu; Darby Tien-Hao Chang
May 2009
Proceedings of World Academy of Science: Engineering & Technology; May 2009, Vol. 53, p284
Academic Journal
Searching for a tertiary substructure that geometrically matches the 3D pattern of a well-studied protein's binding site is one way to predict protein function. In our previous work, we built a web server that predicts protein-ligand binding sites from automatically extracted templates. A drawback of such templates, however, is that the web server was prone to producing many false positive matches. In this study, we present a sequence-order constraint that reduces the false positive matches produced when automatically extracted templates are used to predict protein-ligand binding sites. The binding site predictor comprises i) an automatically constructed template library and ii) a local structure alignment algorithm for querying the library. The sequence-order constraint is used to identify inconsistencies between the local regions of the query protein and those of the templates. Experimental results show that the sequence-order constraint substantially reduces false positive matches and is effective for template-based binding site prediction.
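
The abstract does not define the sequence-order constraint precisely, so the following is only a minimal sketch of one plausible reading: a structural match is kept only if the matched residues appear in the same relative order along the chain in both the template and the query. The function names and the representation of a match as (template residue index, query residue index) pairs are assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Tuple

# A match is a list of (template_residue_index, query_residue_index) pairs.
# This representation is hypothetical, chosen only for illustration.
Match = List[Tuple[int, int]]


def satisfies_sequence_order(match: Match) -> bool:
    """Return True if the query residues are strictly increasing
    when the pairs are sorted by template residue index."""
    ordered = sorted(match)  # sorts by template index first
    query_positions = [q for _, q in ordered]
    return all(a < b for a, b in zip(query_positions, query_positions[1:]))


def filter_matches(matches: List[Match]) -> List[Match]:
    """Discard candidate matches that violate the assumed order constraint."""
    return [m for m in matches if satisfies_sequence_order(m)]


consistent = [(3, 40), (10, 57), (25, 102)]    # same order in both chains
inconsistent = [(3, 40), (10, 102), (25, 57)]  # order swapped in the query
kept = filter_matches([consistent, inconsistent])
```

Under this assumption, a geometrically good match whose residues are permuted along the query chain would be treated as a likely false positive and dropped; the actual predictor may apply the constraint differently.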


Related Articles

  • Probing Metagenomics by Rapid Cluster Analysis of Very Large Datasets. Weizhong Li; Wooley, John C.; Godzik, Adam // PLoS ONE;2008, Vol. 3 Issue 10, p1 

    Background: The scale and diversity of metagenomic sequencing projects challenge both our technical and conceptual approaches in gene and genome annotations. The recent Sorcerer II Global Ocean Sampling (GOS) expedition yielded millions of predicted protein sequences, which significantly altered...

  • A Deterministic Algorithm for Alpha-Numeric Sequence Comparison with Application to Protein Sequence Detection. Kheniche, A.; Brahimi, N.; Salhi, A. // Journal of Algorithms & Computational Technology;Sep2015, Vol. 9 Issue 3, p323 

    This paper is an extension of a deterministic algorithm, [1, 2], that was initially designed to measure the rate of similarity between DNA sequences, and any sequences made up with symbols of alphabets of cardinality 4. Here, a modified and extended version to handle sequences of symbols from...

  • Multiple graph regularized protein domain ranking. Jim Jing-Yan Wang; Bensmail, Halima; Xin Gao // BMC Bioinformatics;2012, Vol. 13 Issue 1, p1 

    Background: Protein domain ranking is a fundamental task in structural biology. Most protein domain ranking methods rely on the pairwise comparison of protein domains while neglecting the global manifold structure of the protein domain database. Recently, graph regularized ranking that exploits...

  • Family classification without domain chaining. Jacob M. Joseph; Dannie Durand // Bioinformatics;Jun2009, Vol. 25 Issue 12, pi45 

    Motivation: Classification of gene and protein sequences into homologous families, i.e. sets of sequences that share common ancestry, is an essential step in comparative genomic analyses. This is typically achieved by construction of a sequence homology network, followed by clustering to...

  • Recent advances in developing web-servers for predicting protein attributes. Chou, K. C.; Shen, H. B. // Natural Science;Sep2009, Vol. 1 Issue 2, p63 

    Recent advance in large-scale genome sequencing has generated a huge volume of protein sequences. In order to timely utilize the information hidden in these newly discovered sequences, it is highly desired to develop computational methods for efficiently identifying their various attributes...

  • Pairwise Protein Substring Alignment With Latent Semantic Analysis and Support Vector Machines To Detect Remote Protein Homology. Ismail, Surayati; Othman, Razib M.; Kasim, Shahreen; Hassan, Rohayanti; Asmuni, Hishammuddin; Taliba, Jumail // International Journal of Bio-Science & Bio-Technology;Sep2011, Vol. 3 Issue 3, p17 

    Remote protein homology detection has been widely used as a part of the analysis of protein structure and function. In this study, the good quality of protein feature vectors is the main aspect to detect remote protein homology; as it will assist discriminative classifier model to discriminate...

  • SAF: A Substitution and Alignment Free Similarity Measure for Protein Sequences. Kelil, Abdellali; Shengrui Wang; Brzezinski, Ryszard // International Journal of Biological & Life Sciences;Nov2010, Vol. 6 Issue 4, p181 

    The literature reports a large number of approaches for measuring the similarity between protein sequences. Most of these approaches estimate this similarity using alignment-based techniques that do not necessarily yield biologically plausible results, for two reasons. First, for the case of...

  • A FAST SEARCH METHOD FOR DNA SEQUENCE DATABASE USING HISTOGRAM INFORMATION. Qiu Chen; Kotani, Koji; Feifei Lee; Ohmi, Tadahiro // International Journal of Bioinformatics Research;2011, Vol. 3 Issue 1, p161 

    DNA sequence search is a fundamental topic in bioinformatics. The Smith-Waterman algorithm achieved highest accuracy among various sequence alignment tools, but it usually spends much computational time to search on large DNA sequence database. On the contrary, BLAST and FASTA have improved the...

  • Improving pairwise sequence alignment accuracy using near-optimal protein sequence alignments. Sierk, Michael L.; Smoot, Michael E.; Bass, Ellen J.; Pearson, William R. // BMC Bioinformatics;2010, Vol. 11, p146 

    Background: While the pairwise alignments produced by sequence similarity searches are a powerful tool for identifying homologous proteins - proteins that share a common ancestor and a similar structure; pairwise sequence alignments often fail to represent accurately the structural alignments...

