Systematic analysis of short internal indels and their impact on protein folding

RyangGuk Kim; Jun-tao Guo
January 2010
BMC Structural Biology;2010, Vol. 10, p24
Academic Journal
Background: Protein sequence insertions/deletions (indels) can be introduced during evolution or through alternative splicing (AS). Alternative splicing is an important biological phenomenon and is considered the major means of expanding structural and functional diversity in eukaryotes. Knowledge of the structural changes due to indels is critical to our understanding of the evolution of protein structure and function. In addition, it can help us probe the evolution of alternative splicing and the diversity of functional isoforms. However, little is known about the effects of indels, in particular those involving core secondary structures, on the folding of protein structures. The long-term goal of our study is to accurately predict the structures of protein AS isoforms. As a first step towards this goal, we performed a systematic analysis of the structural changes caused by short internal indels by mining highly homologous proteins in the Protein Data Bank (PDB).

Results: We compiled a non-redundant dataset of short internal indels (2-40 amino acids) from highly homologous protein pairs and analyzed the sequence and structural features of the indels. We found that about one third of indel residues are in a disordered state and the majority of the residues are exposed to solvent, suggesting that these indels are generally located on the surface of proteins. Though naturally occurring indels are fewer than engineered ones in the dataset, there are no statistically significant differences in amino acid frequencies or secondary structure types between the "Natural" indels and "All" indels in the dataset. Structural comparisons show that all the protein pairs with short internal indels in the dataset preserve their structural folds and about 85% of protein pairs have global RMSDs (root mean square deviations) of 2 Å or less, suggesting that protein structures tend to be conserved and can tolerate short insertions and deletions. A few pairs have high RMSDs that result from differences in the relative positions of their domains, probably due to the intrinsically dynamic nature of the proteins.

Conclusions: The analysis demonstrated that protein structures have the "plasticity" to tolerate short indels. This study can provide valuable guidance for modeling protein AS isoform structures and homologous proteins with indels by placing the indels at the right locations, since the accuracy of sequence alignments dictates model quality in homology modeling.
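As an illustration of what "short internal indels (2-40 amino acids)" could mean operationally, the sketch below scans a gapped pairwise alignment for gap runs of that length. It is not the authors' pipeline, which the abstract does not describe. The function name find_internal_indels, the "-" gap character, and the rule that an indel is "internal" when it lies between the first and last columns in which both sequences have a residue are all assumptions made for this example.

```python
from typing import List, Tuple


def find_internal_indels(
    aln_a: str, aln_b: str, min_len: int = 2, max_len: int = 40
) -> List[Tuple[str, int, int]]:
    """Find internal gap runs in a gapped pairwise alignment.

    Returns tuples (gapped_sequence, start_column, length), where
    gapped_sequence is "A" or "B" (hypothetical labels for the two inputs)
    and start_column is a 0-based alignment column.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")

    # Assumption: the internal region is bounded by the first and last
    # columns where both sequences have a residue, so terminal gaps are
    # never reported.
    both = [i for i in range(len(aln_a)) if aln_a[i] != "-" and aln_b[i] != "-"]
    if not both:
        return []
    first, last = both[0], both[-1]

    indels = []
    for name, seq in (("A", aln_a), ("B", aln_b)):
        i = first
        while i <= last:
            if seq[i] == "-":
                j = i
                # Column `last` holds a residue, so the run ends before it.
                while seq[j] == "-":
                    j += 1
                if min_len <= j - i <= max_len:
                    indels.append((name, i, j - i))
                i = j
            else:
                i += 1
    return indels


if __name__ == "__main__":
    # Toy alignment: sequence A lacks two residues present in sequence B.
    print(find_internal_indels("MKT--LVAG", "MKTAALVAG"))
```

Gap runs shorter than min_len or longer than max_len are skipped, which mirrors the 2-40 residue window stated in the Results; the exact alignment method and homology threshold used to build the dataset are not given in the abstract and are left out here.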


Related Articles

  • Protein structure alignment considering phenotypic plasticity. Gergely Csaba; Fabian Birzele; Ralf Zimmer // Bioinformatics;Aug2008, Vol. 24 Issue 16, pi98 

    Motivation: Protein structure comparison exhibits differences and similarities of proteins and protein families and may help to elucidate protein sequence and structure evolution. Despite many methods to score protein structure similarity with and without flexibility and to align proteins...

  • APDB: a novel measure for benchmarking sequence alignment methods without reference alignments. O. O'Sullivan; M. Zehnder; D. Higgins; P. Bucher; A. Grosdidier; C. Notredame // Bioinformatics;Jan2009 Supplement, Vol. 19, p215 

    Motivation: We describe APDB, a novel measure for evaluating the quality of a protein sequence alignment, given two or more PDB structures. This evaluation does not require a reference alignment or a structure superposition. APDB is designed to efficiently and objectively benchmark multiple...

  • ProtSA: a web application for calculating sequence specific protein solvent accessibilities in the unfolded ensemble. Estrada, Jorge; Bernadó, Pau; Blackledge, Martin; Sancho, Javier // BMC Bioinformatics;2009, Vol. 10, Special section p1 

    Background: The stability of proteins is governed by the heat capacity, enthalpy and entropy changes of folding, which are strongly correlated to the change in solvent accessible surface area experienced by the polypeptide. While the surface exposed in the folded state can be easily determined,...

  • Helical ambivalency induced by point mutations. Bhattacharjee, Nicholus; Biswas, Parbati // BMC Structural Biology;2013, Vol. 13 Issue 1, p1 

    Background: Mutation of amino acid sequences in a protein may have diverse effects on its structure and function. Point mutations of even a single amino acid residue in the helices of the non-redundant database may lead to sequentially identical peptides which adopt different secondary...

  • Parameters of in vitro Evolution of Proteins and Peptides. Valuev, V. P.; Afonnikov, D. A.; Petrenko, O. I.; Grigorovich, D. A. // Molecular Biology;Nov2001, Vol. 35 Issue 6, p898 

    In vitro evolution is used to study protein sequences, structures, and interactions and to obtain proteins with new properties. To analyze the specific features of this process in phage display experiments, we studied the amino acid composition of selected sequences, constructed a matrix of amino...

  • Dependency of codon usage on protein sequence patterns: a statistical study. Foroughmand-Araabi, Mohammad-Hadi; Goliaei, Bahram; Alishahi, Kasra; Sadeghi, Mehdi // Theoretical Biology & Medical Modelling;2014, Vol. 11 Issue 1, p1 

    Background: Codon degeneracy and codon usage by organisms is an interesting and challenging problem. Researchers demonstrated the relation between codon usage and various functions or properties of genes and proteins, such as gene regulation, translation rate, translation efficiency, mRNA...

  • Statistical analysis of unstructured amino acid residues in protein structures. Lobanov, M. Yu.; Garbuzynskiy, S.; Galzitskaya, O. // Biochemistry (00062979);Feb2010, Vol. 75 Issue 2, p192 

    We have performed a statistical analysis of unstructured amino acid residues in protein structures available in the databank of protein structures. Data on the occurrence of disordered regions at the ends and in the middle part of protein chains have been obtained: in the regions near the ends...

  • Phylogenetic Analysis of Protein Sequences Based on Distribution of Length About Common Substring. Guisong Chang; Tianming Wang // Protein Journal;Mar2011, Vol. 30 Issue 3, p167 

    Up to now, various approaches for phylogenetic analysis have been developed. Almost all of them put stress on analyzing nucleic acid sequences or protein primary sequences. In this paper, we propose a new sequence distance for efficient reconstruction of phylogenetic trees based on the...

  • Primary sequence contribution to the optical function of the eye lens. Mahendiran, K.; Elie, C.; Nebel, J.-C.; Ryan, A.; Pierscionek, B. K. // Scientific Reports;6/6/2014, p1 

    The crystallins have relatively high refractive increments compared to other proteins. The Greek key motif in βγ-crystallins was compared with that in other proteins, using predictive analysis from a protein database, to see whether this may be related to the refractive increment. Crystallins...

