Saturating representation of loop conformational fragments in structure databanks

Fernandez-Fuentes, Narcis; Fiser, András
January 2006
BMC Structural Biology;2006, Vol. 6, p15
Academic Journal
Background: Short fragments of proteins are fundamental starting points in various structure prediction applications, such as fragment-based loop modeling methods and various full-structure build-up procedures. The applicability and performance of these approaches depend on the availability of short fragments in structure databanks.
Results: We studied the representation of protein loop fragments up to 14 residues in length. All possible query fragments found in sequence databases (Sequence Space) were clustered and cross-referenced with the structural fragments available in the Protein Data Bank (Structure Space). We found that the expansion of the PDB in the last few years has resulted in a dense coverage of loop conformational fragments. For every loop of length 8 in the current Sequence Space there is at least one loop in Structure Space with 50% or higher sequence identity. By correlating sequence and structure clusters of loops, we found that 50% sequence identity generally guarantees structural similarity. Coverage at the 50% sequence identity cutoff drops to 96, 94, 68, 53, 33 and 13% for loops of length 9, 10, 11, 12, 13 and 14, respectively. There is not a single loop in the current Sequence Space, at any length up to 14 residues, that is not matched by a conformational segment sharing at least 20% sequence identity. This minimum observed identity is 40% for loops of 12 residues or shorter and is as high as 50% for loops of 10 residues or shorter. We also assessed the impact of rapidly growing sequence databanks on the estimated number of new loop conformations and found that, while the number of unique sequence segments increased about sixfold during the last five years, there are almost no unique conformational segments among these fragments of up to 12 residues.
Conclusion: The results suggest that fragment-based prediction approaches are no longer limited by the completeness of fragments in databanks, but rather by the effectiveness of the scoring and search algorithms used to locate them. The current favorable coverage and the observed trends will be further accentuated by the progress of the Protein Structure Initiative, which targets new protein folds and ultimately aims to provide an exhaustive coverage of the structure space.
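
As an illustration of the coverage measure reported in the Results, the following minimal Python sketch computes the fraction of query loops that have at least one same-length structural loop at or above a sequence identity cutoff. It assumes ungapped, position-by-position identity between equal-length fragments; the abstract does not state how identity was computed or how the fragments were clustered, so this is not the authors' procedure. The function names and the toy sequences are hypothetical.

```python
from collections import defaultdict


def sequence_identity(a: str, b: str) -> float:
    """Fraction of aligned positions with identical residues.

    Assumption: both fragments have the same length and are compared
    without gaps, position by position.
    """
    if len(a) != len(b):
        raise ValueError("fragments must have the same length")
    if not a:
        raise ValueError("fragments must not be empty")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def coverage(queries, structures, cutoff=0.5):
    """Fraction of query loops matched by at least one same-length
    structural loop with identity >= cutoff."""
    if not queries:
        raise ValueError("at least one query loop is required")
    # Group structural loops by length so that each query is only
    # compared against fragments of its own length.
    by_length = defaultdict(list)
    for s in structures:
        by_length[len(s)].append(s)
    covered = sum(
        1
        for q in queries
        if any(sequence_identity(q, s) >= cutoff for s in by_length[len(q)])
    )
    return covered / len(queries)


# Hypothetical toy data: two 8-residue query loops, one structural loop.
queries = ["GSGDKTPE", "WWWWWWWW"]
structures = ["GSGDAAAA"]
print(coverage(queries, structures, cutoff=0.5))
```

With real data, the query set would come from sequence databases and the structural set from loops extracted from PDB entries, with coverage evaluated separately for each loop length.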


