Linear-time protein 3-D structure searching with insertions and deletions

Shibuya, Tetsuo; Jansson, Jesper; Sadakane, Kunihiko
January 2010
Algorithms for Molecular Biology;2010, Vol. 5, p1
Academic Journal
Background: Two biomolecular 3-D structures are said to be similar if the RMSD (root mean square deviation) between the two molecules' sequences of 3-D coordinates is less than or equal to some given constant bound. Tools for searching for similar structures in biomolecular 3-D structure databases are becoming increasingly important in the structural biology of the post-genomic era. Results: We consider an important, fundamental problem of reporting all substructures in a 3-D structure database of chain molecules (such as proteins) which are similar to a given query 3-D structure, with consideration of indels (i.e., insertions and deletions). This problem has been believed to be very difficult but its exact computational complexity has not been known. In this paper, we first prove that the problem in unbounded dimensions is NPhard. We then propose a new algorithm that dramatically improves the average-case time complexity of the problem in 3-D in case the number of indels k is bounded by a constant. Our algorithm solves the above problem for a query of size m and a database of size N in average-case O(N) time, whereas the time complexity of the previously best algorithm was O(Nmk+1). Conclusions: Our results show that although the problem of searching for similar structures in a database based on the RMSD measure with indels is NP-hard in the case of unbounded dimensions, it can be solved in 3-D by a simple average-case linear time algorithm when the number of indels is bounded by a constant.


Related Articles

  • Function prediction from networks of local evolutionary similarity in protein structure. Erdin, Serkan; Venner, Eric; Lisewski, Andreas Martin; Lichtarge, Olivier // BMC Bioinformatics;2013, Vol. 14 Issue S3, p1 

    Background: Annotating protein function with both high accuracy and sensitivity remains a major challenge in structural genomics. One proven computational strategy has been to group a few key functional amino acids into templates and search for these templates in other protein structures, so as...

  • A multi-template combination algorithm for protein comparative modeling. Jianlin Cheng // BMC Structural Biology;2008, Vol. 8, Special section p1 

    Background: Multiple protein templates are commonly used in manual protein structure prediction. However, few automated algorithms of selecting and combining multiple templates are available. Results: Here we develop an effective multi-template combination algorithm for protein comparative...

  • Research Notes.  // Nature Structural & Molecular Biology;May2004, Vol. 11 Issue 5, p393 

    Presents several research on molecular biology in the U.S. Interpretation of the attachment of ubiquitin and ubiquitin-like modifier proteins; Detection of misfolded proteins in neurodegenerative diseases; Proteins in transport vesicles.

  • Pursuing proteins' potential. Adams, Brent // Indianapolis Business Journal;3/8/2004, Vol. 24 Issue 53, p19 

    Focuses on proteomics in Indianapolis, Indiana. Study of how different proteins interact with other molecules in cells and organs; One of the fastest growing disciplines of life sciences; Efforts of dozens of researchers from private laboratories and public universities to search for ways to...

  • Alternating evolutionary pressure in a genetic algorithm facilitates protein model selection. Offman, Marc N.; Tournier, Alexander L.; Bates, Paul A. // BMC Structural Biology;2008, Vol. 8, Special section p1 

    Background: Automatic protein modelling pipelines are becoming ever more accurate; this has come hand in hand with an increasingly complicated interplay between all components involved. Nevertheless, there are still potential improvements to be made in template selection, refinement and protein...

  • A Review of the Internet Site “Practical Molecular Biology”. Georgieva, S. G.; Nabirochkina, E. N.; Soldatov, A. V.; Krasnov, A. N. // Molecular Biology;Nov2001, Vol. 35 Issue 6, p961 

    In 2000, in the middle of September, the Russian Web site Practical Molecular Biology (http://molbiol.edu.ru) was opened. The main tasks of the project are (i) distribution of experimental work experience among Russian and foreign laboratories and (ii) organization of an information database in...

  • Clustering of gene expression data: performance and similarity analysis. Longde Yin; Chun-Hsi Huang; Ni, Jun // BMC Bioinformatics;2006 Supplement 4, Vol. 7, pS19 

    Background: DNA Microarray technology is an innovative methodology in experimental molecular biology, which has produced huge amounts of valuable data in the profile of gene expression. Many clustering algorithms have been proposed to analyze gene expression data, but little guidance is...

  • The Chemical Information Ontology: Provenance and Disambiguation for Chemical Data on the Biological Semantic Web. Hastings, Janna; Chepelev, Leonid; Willighagen, Egon; Adams, Nico; Steinbeck, Christoph; Dumontier, Michel // PLoS ONE;2011, Vol. 6 Issue 10, p1 

    Cheminformatics is the application of informatics techniques to solve chemical problems in silico. There are many areas in biology where cheminformatics plays an important role in computational research, including metabolism, proteomics, and systems biology. One critical aspect in the...

  • Solving Hard Computational Problems Efficiently: Asymptotic Parametric Complexity 3-Coloring Algorithm. H., José Antonio Martín // PLoS ONE;Jan2013, Vol. 8 Issue 1, Special section p1 

    Many practical problems in almost all scientific and technological disciplines have been classified as computationally hard (NP-hard or even NP-complete). In life sciences, combinatorial optimization problems frequently arise in molecular biology, e.g., genome sequencing; global alignment of...


Read the Article


Sorry, but this item is not currently available from your library.

Try another library?
Sign out of this library

Other Topics