HangOut: generating clean PSI-BLAST profiles for domains with long insertions

Kim, Bong-Hyun; Cong, Qian; Grishin, Nick V.
June 2010
Bioinformatics;Jun2010, Vol. 26 Issue 12, p1564
Academic Journal
Summary: Profile-based similarity search is an essential step in structure-function studies of proteins. However, inclusion of non-homologous sequence segments in a profile corrupts it and results in false positives. Profile corruption is common in multidomain proteins, and single domains with long insertions are a significant source of errors. We developed a procedure (HangOut) that, for a single domain with a specified insertion position, cleans erroneously extended PSI-BLAST alignments to generate better profiles.


Related Articles

  • Directionality in protein fold prediction. Ellis, Jonathan J.; Huard, Fabien P. E.; Deane, Charlotte M.; Srivastava, Sheenal; Wood, Graham R. // BMC Bioinformatics;2010, Vol. 11, p172 

    Background: Ever since the ground-breaking work of Anfinsen et al. in which a denatured protein was found to refold to its native state, it has been frequently stated by the protein fold prediction community that all the information required for protein folding lies in the amino acid sequence....

  • Length Variations amongst Protein Domain Superfamilies and Consequences on Structure and Function. Sandhya, Sankaran; Rani, Saane Sudha; Pankaj, Barah; Govind, Madabosse Kande; Offmann, Bernard; Srinivasan, Narayanaswamy; Sowdhamini, Ramanathan // PLoS ONE;2009, Vol. 4 Issue 3, p1 

    Background: Related protein domains of a superfamily can be specified by proteins of diverse lengths. The structural and functional implications of indels in a domain scaffold have been examined. Methodology: In this study, domain superfamilies with large length variations (more than 30%...

  • APIS: accurate prediction of hot spots in protein interfaces by combining protrusion index with solvent accessibility. Jun-Feng Xia; Xing-Ming Zhao; Jiangning Song; De-Shuang Huang // BMC Bioinformatics;2010, Vol. 11, p174 

    Background: It is well known that most of the binding free energy of protein interaction is contributed by a few key hot spot residues. These residues are crucial for understanding the function of proteins and studying their interactions. Experimental hot spots detection methods such as alanine...

  • Loose and Strict Repeats in Weighted Sequences of Proteins. Hui Zhang; Qing Guo; Jing Fan; Iliopoulos, Costas S. // Protein & Peptide Letters;Sep2010, Vol. 17 Issue 9, p1136 

    A weighted sequence is a string in which a set of characters may appear at each position with respective probabilities of occurrence. Weighted sequences are able to summarize poorly defined short sequences, as well as the profiles of protein families and complete chromosome sequences. Thus it is...

  • A Comprehensive Benchmark Study of Multiple Sequence Alignment Methods: Current Challenges and Future Perspectives. Thompson, Julie D.; Linard, Benjamin; Lecompte, Odile; Poch, Olivier // PLoS ONE;2011, Vol. 6 Issue 3, p1 

    Multiple comparison or alignment of protein sequences has become a fundamental tool in many different domains in modern molecular biology, from evolutionary studies to prediction of 2D/3D structure, molecular function and intermolecular interactions etc. By placing the sequence in the framework...

  • Improving pairwise sequence alignment accuracy using near-optimal protein sequence alignments. Sierk, Michael L.; Smoot, Michael E.; Bass, Ellen J.; Pearson, William R. // BMC Bioinformatics;2010, Vol. 11, p146 

    Background: While the pairwise alignments produced by sequence similarity searches are a powerful tool for identifying homologous proteins - proteins that share a common ancestor and a similar structure; pairwise sequence alignments often fail to represent accurately the structural alignments...

  • In Silico Protein-Protein Interaction Prediction with Sequence Alignment and Classifier Stacking. Marini, Simone; Qian Xu; Qiang Yang // Current Protein & Peptide Science;Nov2011, Vol. 12 Issue 7, p614 

    Protein-Protein Interaction (PPI) prediction is a well known problem in Bioinformatics, for which a large number of techniques have been proposed in the past. However, prediction results have not been sufficiently satisfactory for guiding biologists in wet-lab experiments. One reason is that not...

  • pGenTHREADER and pDomTHREADER: new methods for improved protein fold recognition and superfamily discrimination. Anna Lobley; Michael I. Sadowski; David T. Jones // Bioinformatics;Jul2009, Vol. 25 Issue 14, p1761 

    Motivation: Generation of structural models and recognition of homologous relationships for unannotated protein sequences are fundamental problems in bioinformatics. Improving the sensitivity and selectivity of methods designed for these two tasks therefore has downstream benefits for many other...

  • Protein sequences classification by means of feature extraction with substitution matrices. Saidi, Rabie; Maddouri, Mondher; Nguifo, Engelbert Mephu // BMC Bioinformatics;2010, Vol. 11, p175 

    Background: This paper deals with the preprocessing of protein sequences for supervised classification. Motif extraction is one way to address that task. It has been largely used to encode biological sequences into feature vectors to enable using well-known machine-learning classifiers which...

