Improving protein structure similarity searches using domain boundaries based on conserved sequence information

Thompson, Kenneth Evan; Wang, Yanli; Madej, Tom; Bryant, Stephen H.
January 2009
BMC Structural Biology;2009, Vol. 9, Special section p1
Academic Journal
Background: The identification of protein domains plays an important role in protein structure comparison. Domain query size and composition are critical to structure similarity search algorithms such as the Vector Alignment Search Tool (VAST), the method used to compute related protein structures in the NCBI Entrez system. Currently, VAST computations use domains identified on the basis of structural compactness. In this study, we investigated how alternative domain definitions, derived from conserved sequence alignments in the Conserved Domain Database (CDD), affect the domain comparisons and structure similarity search performance of VAST.

Results: Alternative domains, whose secondary structure composition differs significantly from that of domains based on structurally compact units, were identified from the alignment footprints of curated protein sequence domain families. Our analysis indicates that the two kinds of domain boundaries disagree on roughly 8% of protein chains in the medium redundancy subset of the Molecular Modeling Database (MMDB). These conflicting sequence-based domain boundaries perform slightly better than structure domains in structure similarity searches, and there are interesting cases in which search performance is markedly improved.

Conclusion: Structure similarity searches using domain boundaries based on conserved sequence information can give investigators an additional method for identifying interesting similarities between proteins with known structures. Because sequence domain boundaries improve the performance of structure similarity searches, we are in the process of including them in the VAST search and MMDB resources in the NCBI Entrez system.
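The abstract reports that structure-based and sequence-based domain boundaries disagree on a fraction of chains, but it does not say how disagreement was scored. As a rough illustration only, the sketch below flags a chain as conflicting when some sequence-based domain has no structure-based domain with a sufficiently high residue overlap. The Jaccard measure, the 0.8 threshold, the single-range domain model, and all function names are assumptions made for this example, not the authors' criterion and not part of VAST, CDD, or MMDB.

```python
from typing import Dict, List, Tuple

# A domain is modeled as one inclusive residue range (start, end).
# Real domains can be discontinuous; this simplification is an assumption.
Domain = Tuple[int, int]


def jaccard_overlap(a: Domain, b: Domain) -> float:
    """Residues shared by both ranges divided by residues covered by either."""
    shared = max(min(a[1], b[1]) - max(a[0], b[0]) + 1, 0)
    len_a = a[1] - a[0] + 1
    len_b = b[1] - b[0] + 1
    return shared / (len_a + len_b - shared)


def boundaries_conflict(
    structure_domains: List[Domain],
    sequence_domains: List[Domain],
    threshold: float = 0.8,  # hypothetical cutoff, chosen for illustration
) -> bool:
    """True if any sequence domain lacks a well-matching structure domain."""
    for seq_dom in sequence_domains:
        best = max(
            (jaccard_overlap(seq_dom, s) for s in structure_domains),
            default=0.0,
        )
        if best < threshold:
            return True
    return False


def conflict_rate(chains: Dict[str, Tuple[List[Domain], List[Domain]]]) -> float:
    """Fraction of chains whose two sets of domain boundaries conflict."""
    if not chains:
        return 0.0
    conflicts = sum(
        boundaries_conflict(structure, sequence)
        for structure, sequence in chains.values()
    )
    return conflicts / len(chains)


# Made-up example chains: (structure-based domains, sequence-based domains).
example_chains = {
    "chainA": ([(1, 100), (101, 200)], [(1, 200)]),  # one sequence domain spans two compact units
    "chainB": ([(1, 100)], [(3, 98)]),               # boundaries nearly identical
}
example_rate = conflict_rate(example_chains)
```

In this toy input, the first chain conflicts because a single sequence-based domain covers two structurally compact units, while the second chain does not. A stricter or looser threshold would change which chains count as disagreeing.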


Related Articles

  • SANA: an algorithm for sequential and non-sequential protein structure alignment. Lin Wang; Ling-Yun Wu; Yong Wang; Xiang-Sun Zhang; Luonan Chen // Amino Acids;Jul2010, Vol. 39 Issue 2, p417 

    Protein structure alignment algorithms play an important role in the studies of protein structure and function. In this paper, a novel approach for structure alignment is presented. Specifically, core regions in two protein structures are first aligned by identifying connected components in a...

  • Predicting intrinsic disorder in proteins: an overview. Bo He; Kejun Wang; Yunlong Liu; Bin Xue; Uversky, Vladimir N.; Dunker, A. Keith // Cell Research;Aug2009, Vol. 19 Issue 8, p929 

    The discovery of intrinsically disordered proteins (IDP) (i.e., biologically active proteins that do not possess stable secondary and/or tertiary structures) came as an unexpected surprise, as the existence of such proteins is in contradiction to the traditional...

  • Into the fold. Hunter, Philip // EMBO Reports;Mar2006, Vol. 7 Issue 3, p249 

    The article focuses on the role of advances in technology and algorithms in facilitating development in protein structure prediction. There are three categories of structure prediction methods namely comparative modelling, fold recognition and ab initio. The methods may combine more than one...

  • Membranes: Shaping biological matter. Frolov, Vadim A.; Zimmerberg, Joshua // Nature Materials;Mar2009, Vol. 8 Issue 3, p173 

    The article discusses the results of studies concerning membranes and the shaping of biological matter. Biological membranes form an extremely dynamic and complex network in cells, guided by specialized protein machinery. One study has developed a novel algorithm that analyzes membrane shape to...

  • A binary matrix factorization algorithm for protein complex prediction. Shikui Tu; Runsheng Chen; Lei Xu // Proteome Science;2011 Supplement 1, Vol. 9 Issue Suppl 1, p1 

    Background: Identifying biologically relevant protein complexes from a large protein-protein interaction (PPI) network, is essential to understand the organization of biological systems. However, high-throughput experimental techniques that can produce a large amount of PPIs are known to yield...

  • Automated functional classification of experimental and predicted protein structures. Kai Wang; Samudrala, Ram // BMC Bioinformatics;2006, Vol. 7, p278 

    Background: Proteins that are similar in sequence or structure may perform different functions in nature. In such cases, function cannot be inferred from sequence or structural similarity. Results: We analyzed experimental structures belonging to the Structural Classification of Proteins (SCOP)...

  • Unsupervised Integration of Multiple Protein Disorder Predictors: The Method and Evaluation on CASP7, CASP8 and CASP9 Data. Ping Zhang; Obradovic, Zoran // Proteome Science;2011 Supplement 1, Vol. 9 Issue Suppl 1, p1 

    Background: Studies of intrinsically disordered proteins that lack a stable tertiary structure but still have important biological functions critically rely on computational methods that predict this property based on sequence information. Although a number of fairly successful models for...

  • A global optimization algorithm for protein surface alignment. Bertolazzi, Paola; Guerra, Concettina; Liuzzi, Giampaolo // BMC Bioinformatics;2010, Vol. 11, p488 

    Background: A relevant problem in drug design is the comparison and recognition of protein binding sites. Binding sites recognition is generally based on geometry often combined with physico-chemical properties of the site since the conformation, size and chemical composition of the protein...

  • Analysing the origin of long-range interactions in proteins using lattice models. Noivirt-Brik, Orly; Unger, Ron; Horovitz, Amnon // BMC Structural Biology;2009, Vol. 9, Special section p1 

    Background: Long-range communication is very common in proteins but the physical basis of this phenomenon remains unclear. In order to gain insight into this problem, we decided to explore whether long-range interactions exist in lattice models of proteins. Lattice models of proteins have proven...

