CUSP: an algorithm to distinguish structurally conserved and unconserved regions in protein domain alignments and its application in the study of large length variations

Sandhya, Sankaran; Pankaj, Barah; Govind, Madabosse Kande; Offmann, Bernard; Srinivasan, Narayanaswamy; Sowdhamini, Ramanathan
January 2008
BMC Structural Biology;2008, Vol. 8, Special section p1
Academic Journal
Background: Distantly related proteins adopt and retain similar structural scaffolds despite length variations that can be as much as two-fold in some protein superfamilies. In this paper, we describe an analysis of indel regions that accommodate length variations amongst related proteins. We have developed an algorithm, CUSP, that examines multi-membered PASS2 superfamily alignments to identify indel regions in an automated manner. Further, we have used the method to characterize the length, structural type and biochemical features of indels in related protein domains.

Results: CUSP examines protein domain structural alignments to distinguish regions of conserved structure common to related proteins from structurally unconserved regions that vary in length and type of structure. On a non-redundant dataset of 353 domain superfamily alignments from PASS2, we find that 'length-deviant' protein superfamilies show > 30% length variation from their average domain length. 60% of additional lengths that occur in indels are short-length structures (< 5 residues) while 6% of indels are > 15 residues in length. Structural types in indels also show class-specific trends.

Conclusion: The extent of length variation varies across different superfamilies, and indels show class-specific trends for preferred lengths and structural types. Such indels of different lengths, even within a single protein domain superfamily, could have structural and functional consequences that drive their selection, underlining their importance in similarity detection and computational modelling. The availability of systematic algorithms, like CUSP, should enable decision making in a domain superfamily-specific manner.
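The quantities mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the CUSP algorithm: CUSP works on structural alignments, whereas the sketch below uses gap columns in a plain sequence alignment as a rough proxy for indel regions. The function names, the reading of "length variation" as the largest percentage deviation of any member from the mean domain length, and the "medium" label for the 5 to 15 residue range are all assumptions made for illustration.

```python
from statistics import mean


def length_deviation(domain_lengths):
    """Largest percentage deviation of any member from the mean length.

    Assumption: this is one plausible reading of "length variation from
    the average domain length"; the abstract does not give the formula.
    """
    avg = mean(domain_lengths)
    return max(abs(n - avg) for n in domain_lengths) / avg * 100


def is_length_deviant(domain_lengths, threshold=30.0):
    """Flag a superfamily whose deviation exceeds the 30% cut-off."""
    return length_deviation(domain_lengths) > threshold


def indel_runs(aligned, gap="-"):
    """Return half-open (start, end) column ranges containing any gap.

    Assumption: gap columns stand in for structurally unconserved
    regions; CUSP itself uses structural information instead.
    """
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned sequences must share one length")
    width = len(aligned[0])
    runs, start = [], None
    for col in range(width):
        gapped = any(s[col] == gap for s in aligned)
        if gapped and start is None:
            start = col                      # a new indel run opens
        elif not gapped and start is not None:
            runs.append((start, col))        # the current run closes
            start = None
    if start is not None:
        runs.append((start, width))          # run reaches the last column
    return runs


def bin_indel_length(n_residues):
    """Bin an indel by length using the cut-offs quoted in the abstract."""
    if n_residues < 5:
        return "short"
    if n_residues <= 15:
        return "medium"  # hypothetical label for the unnamed middle range
    return "long"


toy_alignment = ["ACD--EFG", "ACDKLEFG", "AC-KLEF-"]
runs = indel_runs(toy_alignment)
bins = [bin_indel_length(end - start) for start, end in runs]
```

On the toy alignment, the gap columns group into two runs, both shorter than five residues. Real use would need per-residue structural equivalences rather than sequence gaps alone.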


Related Articles

  • Predicting intrinsic disorder in proteins: an overview. Bo He; Kejun Wang; Yunlong Liu; Bin Xue; Uversky, Vladimir N.; Dunker, A. Keith // Cell Research;Aug2009, Vol. 19 Issue 8, p929 

    The discovery of intrinsically disordered proteins (IDP) (i.e., biologically active proteins that do not possess stable secondary and/or tertiary structures) came as an unexpected surprise, as the existence of such proteins is in contradiction to the traditional...

  • Polymer Uncrossing and Knotting in Protein Folding, and Their Role in Minimal Folding Pathways. Mohazab, Ali R.; Plotkin, Steven S.; Levy, Yaakov Koby // PLoS ONE;Jan2013, Vol. 8 Issue 1, Special section p1 

    We introduce a method for calculating the extent to which chain non-crossing is important in the most efficient, optimal trajectories or pathways for a protein to fold. This involves recording all unphysical crossing events of a ghost chain, and calculating the minimal uncrossing cost that would...

  • Protein structure prediction via combinatorial assembly of sub-structural units. Y. Inbar; H. Benyamini; R. Nussinov; H.J. Wolfson // Bioinformatics;Jan2009 Supplement, Vol. 19, p158 

    Following the hierarchical nature of protein folding, we propose a three-stage scheme for the prediction of a protein structure from its sequence. First, the sequence is cut to fragments that are each assigned a structure. Second, the assigned structures are combinatorially assembled to form the...

  • Fast tree search for enumeration of a lattice model of protein folding. Cejtin, Henry; Edler, Jan; Gottlieb, Allan; Helling, Robert; Li, Hao; Philbin, James; Wingreen, Ned; Tang, Chao // Journal of Chemical Physics;1/1/2002, Vol. 116 Issue 1, p352 

    Using a fast tree-searching algorithm and a Pentium cluster, we enumerated all the sequences and compact conformations (structures) for a protein folding model on a cubic lattice of size 4×3×3. We used two types of amino acids—hydrophobic (H) and polar (P)—to make up the...

  • Fold and sequence independent protein binding sites prediction algorithm. Janežič, Dušanka; Konc, Janez; Konc, Joanna Trykowska; Penca, Matej; Janežič, Matej // AIP Conference Proceedings;Dec2012, Vol. 1504 Issue 1, p729 

    We have developed a fold- and sequence-independent algorithm for structural similarity search in large databases of protein structures to find conserved binding regions on proteins involved in protein-protein interactions. We have used this algorithm to find conserved regions on protein surfaces...

  • Intrinsic disorder prediction from the analysis of multiple protein fold recognition models.  // Bioinformatics;Aug2008, Vol. 24 Issue 16, p1798 

    Motivation: Intrinsic protein disorder is functionally implicated in numerous biological roles and is, therefore, ubiquitous in proteins from all three kingdoms of life. Determining the disordered regions in proteins presents a challenge for experimental methods and so recently there has been...

  • Protein Folding.  // Encyclopedic Reference of Molecular Pharmacology;2004, p757 

    An encyclopedia entry for protein folding is presented. It explains that proteins fold on a time scale from microseconds to seconds. Starting from a random coil conformation, proteins can find their stable fold quickly, although the number of possible conformations is astronomically high.

  • When is a potential accurate enough for structure prediction? Theory and application to a random heteropolymer model of protein folding. Bryngelson, Joseph D. // Journal of Chemical Physics;4/15/1994, Vol. 100 Issue 8, p6038 

    Attempts to predict molecular structure often try to minimize some potential function over a set of structures. Much effort has gone into creating potential functions and into creating algorithms for minimizing these potential functions. This paper develops a formalism that addresses a...

  • Protein Folding, Misfolding and Aggregation: Evolving Concepts and Conformational Diseases. Ramos, Carlos H. I.; Ferreira, Sérgio T. // Protein & Peptide Letters;Apr2005, Vol. 12 Issue 3, p213 

    Proteins carry out many vital cellular functions determined by their precise 3-dimensional structures (the native conformations). Understanding how proteins fold has long been a major goal and can be of great therapeutic value. Failure to reach or maintain the correct folded structure can have...

