PCAS -- a precomputed proteome annotation database resource

Yong Zhang; Yanbin Yin; Yunjia Chen; Ge Gao; Peng Yu; Jingchu Luo; Ying Jiang
January 2003
BMC Genomics;2003, Vol. 4, p42
Academic Journal
Background: Many model proteomes, or "complete" sets of proteins of given organisms, are now publicly available. Much effort has been invested in the computational annotation of those "draft" proteomes. Motif- or domain-based algorithms play a pivotal role in the functional classification of proteins. Employing most available computational algorithms, mainly motif or domain recognition algorithms, we set out to develop an online proteome annotation system with integrated proteome annotation data to complement existing resources.

Results: We report here the development of PCAS (ProteinCentric Annotation System) as an online resource of precomputed proteome annotation data. We applied most available motif or domain databases and their analysis methods, including hmmpfam search of HMMs in Pfam, SMART and TIGRFAM; RPS-PSIBLAST search of PSSMs in CDD; pfscan of PROSITE patterns and profiles; and PSI-BLAST search of SUPERFAMILY PSSMs. In addition, signal peptides and transmembrane (TM) helices are predicted using SignalP and TMHMM, respectively. We mapped SUPERFAMILY and COGs to InterPro, so the motif or domain databases are integrated through InterPro. PCAS displays table summaries of the precomputed data and a graphical presentation of motifs or domains relative to the protein. At present, PCAS contains the human, mouse, and rat IPI protein sets and the A. thaliana, C. elegans, D. melanogaster, S. cerevisiae, and S. pombe proteomes. PCAS is available at http://pak.cbi.pku.edu.cn/proteome/gca.php

Conclusion: PCAS gives better annotation coverage for model proteomes by employing a wider collection of available algorithms. Besides presenting the most confident annotation data, PCAS also allows customized queries, so users can inspect statistically less significant boundary information as well. Therefore, in addition to providing general annotation information, PCAS could be used as a discovery platform. We plan to update PCAS twice a year and to upgrade it when new proteome annotation algorithms are identified.
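The distinction the abstract draws between the most confident annotations and statistically less significant boundary hits can be pictured as a filter over precomputed hits. The following is a minimal sketch only, not the PCAS implementation: the `DomainHit` record, the `split_hits` function, the sample hits, and both E-value cutoffs are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainHit:
    """One precomputed motif/domain hit (hypothetical record layout)."""
    protein_id: str
    source_db: str   # e.g. "Pfam", "SMART", "PROSITE"
    domain: str
    start: int
    end: int
    evalue: float


def split_hits(hits, confident_cutoff=1e-3, boundary_cutoff=10.0):
    """Separate hits into confident ones and less significant boundary ones.

    Both cutoffs are illustrative assumptions, not values taken from PCAS.
    """
    confident = [h for h in hits if h.evalue <= confident_cutoff]
    boundary = [
        h for h in hits
        if confident_cutoff < h.evalue <= boundary_cutoff
    ]
    return confident, boundary


# Made-up example data for a single hypothetical protein.
hits = [
    DomainHit("PROT_A", "Pfam", "domain_1", 10, 120, 1e-30),
    DomainHit("PROT_A", "SMART", "domain_2", 150, 210, 0.5),
    DomainHit("PROT_A", "PROSITE", "domain_3", 300, 320, 50.0),
]

confident, boundary = split_hits(hits)
print([h.domain for h in confident])
print([h.domain for h in boundary])
```

Loosening `boundary_cutoff` in such a query is what would let a user inspect weaker candidate domains, which is the sense in which the abstract describes PCAS as a possible discovery platform.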

