PURE: A webserver for the prediction of domains in unassigned regions in proteins

Reddy, Chilamakuri C. S.; Shameer, Khader; Offmann, Bernard O.; Sowdhamini, Ramanathan
January 2008
BMC Bioinformatics;2008, Vol. 9, Special section p1
Academic Journal
Background: Protein domains are the structural and functional units of proteins. The ability to parse proteins into domains is important for effective classification and for understanding protein structure, function, and evolution, and is hence biologically relevant. Several computational methods are available to identify domains in a sequence. Domain-finding algorithms often employ stringent thresholds to recognize sequence domains. Identifying additional domains can be tedious, involving intensive computation and manual intervention, but can lead to a better understanding of overall biological function. In this context, the problem of identifying new domains in the unassigned regions of a protein sequence assumes crucial importance. Results: We had earlier demonstrated that accumulating domain information from sequence homologues can substantially aid the prediction of new domains. In this paper, we propose a computationally intensive, multi-step bioinformatics protocol, implemented as a web server named PURE (Prediction of Unassigned REgions in proteins), for the detailed examination of stretches of unassigned regions in proteins. The query sequence is processed through automated filtering steps based on length, presence of coiled-coil regions, transmembrane regions, homologous sequences, and percentage of secondary structure content. The filtered sequence segments and their sequence homologues are then fed to PSI-BLAST, cd-hit, and Hmmpfam. Data from the various programs are integrated, and information regarding the probable domains predicted from the sequence is reported. Conclusion: We have implemented the PURE protocol as a web server for rapid and comprehensive analysis of unassigned regions in proteins. This server integrates data from different programs and provides information about the domains encoded in the unassigned regions.
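
The abstract names length as one of the filters applied to unassigned regions. The following is a minimal Python sketch of that idea only, not the PURE implementation: given a protein length and the residue ranges already assigned to domains, it lists the gaps and keeps those at least `min_len` residues long. The function name, the 1-based inclusive coordinates, and the default cutoff of 30 residues are all assumptions made for illustration; the abstract does not state the threshold the server uses.

```python
from typing import List, Tuple

# Hypothetical convention: 1-based, inclusive residue positions.
Interval = Tuple[int, int]


def unassigned_regions(seq_len: int,
                       assigned: List[Interval],
                       min_len: int = 30) -> List[Interval]:
    """Return the stretches of a sequence not covered by any assigned domain.

    min_len is an illustrative cutoff, not a value taken from the PURE paper.
    """
    regions: List[Interval] = []
    cursor = 1  # first residue not yet known to be covered
    for start, end in sorted(assigned):
        if start > cursor:
            # Gap between the previous domain (or the N-terminus) and this one.
            regions.append((cursor, start - 1))
        # Advance past this domain; max() handles overlapping assignments.
        cursor = max(cursor, end + 1)
    if cursor <= seq_len:
        # Trailing gap up to the C-terminus.
        regions.append((cursor, seq_len))
    # Length filter: drop gaps too short to plausibly hold a domain.
    return [(s, e) for s, e in regions if e - s + 1 >= min_len]


if __name__ == "__main__":
    # A 300-residue protein with two overlapping domain assignments.
    print(unassigned_regions(300, [(50, 120), (100, 180)]))
```

In a fuller pipeline, each region that survives this step would be passed to the remaining filters (coiled-coil, transmembrane, and secondary structure content) before any homology search; those steps depend on external predictors and are not sketched here.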


Related Articles

  • iDBPs: a web server for the identification of DNA binding proteins. Nimrod, Guy; Schushan, Maya; Szilágyi, András; Leslie, Christina; Ben-Tal, Nir // Bioinformatics;Mar2010, Vol. 26 Issue 5, p692 

    Summary: The iDBPs server uses the three-dimensional (3D) structure of a query protein to predict whether it binds DNA. First, the algorithm predicts the functional region of the protein based on its evolutionary profile; the assumption is that large clusters of conserved residues are good...

  • MINS2: Revisiting the molecular code for transmembrane-helix recognition by the Sec61 translocon. Yungki Park; Volkhard Helms // Bioinformatics;Aug2008, Vol. 24 Issue 16, p1819 

    Summary: To be fully functional, membrane proteins should not only fold, but also get inserted into the membrane, which is mediated by the Sec61 translocon. Recent experimental studies have attempted to elucidate how the Sec61 translocon accomplishes this delicate task by measuring the...

  • MICAlign: a sequence-to-structure alignment tool integrating multiple sources of information in conditional random fields. Xuefeng Xia; Song Zhang; Yu Su; Zhirong Sun // Bioinformatics;Jun2009, Vol. 25 Issue 11, p1433 

    Summary: Sequence-to-structure alignment in template-based protein structure modeling for remote homologs remains a difficult problem even following the correct recognition of folds. Here we present MICAlign, a sequence-to-structure alignment tool that incorporates multiple sources of...

  • PINALOG: a novel approach to align protein interaction networks—implications for complex detection and function prediction. Phan, Hang T. T.; Sternberg, Michael J. E. // Bioinformatics;May2012, Vol. 28 Issue 9, p1239 

    Motivation: Analysis of protein–protein interaction networks (PPINs) at the system level has become increasingly important in understanding biological processes. Comparison of the interactomes of different species not only provides a better understanding of species evolution but also...

  • Protein contact order prediction from primary sequences. Yi Shi; Jianjun Zhou; Arndt, David; Wishart, David S.; Guohui Lin // BMC Bioinformatics;2008, Vol. 9, Special section p1 

    Background: Contact order is a topological descriptor that has been shown to be correlated with several interesting protein properties such as protein folding rates and protein transition state placements. Contact order has also been used to select for viable protein folds from ab initio protein...

  • SALIGN: a web server for alignment of multiple protein sequences and structures. Braberg, Hannes; Webb, Benjamin M.; Tjioe, Elina; Pieper, Ursula; Sali, Andrej; Madhusudhan, M.S. // Bioinformatics;8/1/2012, Vol. 28 Issue 15, p2072 

    Summary: Accurate alignment of protein sequences and/or structures is crucial for many biological analyses, including functional annotation of proteins, classifying protein sequences into families, and comparative protein structure modeling. Described here is a web interface to SALIGN, the...

  • 3dswap-pred: Prediction of 3D Domain Swapping from Protein Sequence Using Random Forest Approach. Shameer, Khader; Pugalenthi, Ganesan; Kandaswamy, Krishna Kumar; Sowdhamini, Ramanathan // Protein & Peptide Letters;Oct2011, Vol. 18 Issue 10, p1010 

    3D domain swapping is a protein structural phenomenon that mediates the formation of the higher order oligomers in a variety of proteins with different structural and functional properties. 3D domain swapping is associated with a variety of biological functions ranging from oligomerization to...

  • Prediction of protein structural classes for low-homology sequences based on predicted secondary structure. Jian-Yi Yang; Zhen-Ling Peng; Xin Chen // BMC Bioinformatics;2010 Supplement 1, Vol. 11, Special section p1 

    Background: Prediction of protein structural classes (α, β, α + β and α/β) from amino acid sequences is of great importance, as it is beneficial to study protein function, regulation and interactions. Many methods have been developed for high-homology protein sequences, and the...

  • TESE: generating specific protein structure test set ensembles. Francesco Sirocco; Silvio C. E. Tosatto // Bioinformatics;Nov2008, Vol. 24 Issue 22, p2632 

    Summary: TESE is a web server for the generation of test sets of protein sequences and structures fulfilling a number of different criteria. At least three different use cases can be envisaged: (i) benchmarking of novel methods; (ii) test sets tailored for special needs and (iii) extending...

