RAId_DbS: Peptide Identification using Database Searches with Realistic Statistics

Alves, Gelio; Ogurtsov, Aleksey Y.; Yu, Yi-Kuo
January 2007
Biology Direct;2007, Vol. 2, p25
Academic Journal
Background: The key to mass-spectrometry-based proteomics is peptide identification. A major challenge in peptide identification is obtaining realistic E-values when assigning statistical significance to candidate peptides. Results: Using a simple scoring scheme, we propose a database search method with theoretically characterized statistics. Taking into account possible skewness in the random variable distribution and the effect of finite sampling, we provide a theoretical derivation for the tail of the score distribution. For every experimental spectrum examined, we collect the scores of peptides in the database and find good agreement between the collected score statistics and our theoretical distribution. Using Student's t-tests, we quantify the degree of agreement between the theoretical distribution and the collected score statistics. The t-tests may be used to measure the reliability of reported statistics. When combined with the P-value reported for a peptide hit under a score distribution model, this new measure prevents exaggerated statistics. Another feature of RAId_DbS is its ability to detect multiple co-eluted peptides. The peptide identification performance and statistical accuracy of RAId_DbS are assessed and compared with those of several other search tools. The executables and data related to RAId_DbS are freely available upon request.
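
As a rough illustration of the quantities named in the abstract, the sketch below relates a per-peptide P-value to an E-value for a single spectrum. It is not the RAId_DbS method: the abstract describes a theoretically derived tail distribution, whereas this example substitutes a plain empirical tail fraction for it. The function names `empirical_p_value` and `e_value` are hypothetical, and the relation E = N × P (with N the number of candidate peptides scored) is the standard database-search convention, assumed here rather than taken from the abstract.

```python
from typing import Sequence


def empirical_p_value(score: float, background_scores: Sequence[float]) -> float:
    """Fraction of background scores that are at least as high as `score`.

    Assumption: this empirical tail fraction stands in for the theoretical
    tail distribution described in the abstract; it is NOT that derivation.
    """
    if len(background_scores) == 0:
        raise ValueError("background_scores must not be empty")
    hits = sum(1 for s in background_scores if s >= score)
    return hits / len(background_scores)


def e_value(p_value: float, num_candidates: int) -> float:
    """Expected number of random candidates scoring at least as well.

    Uses the conventional relation E = N * P, where N is the number of
    candidate peptides scored against the spectrum (an assumption here).
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if num_candidates < 0:
        raise ValueError("num_candidates must be non-negative")
    return p_value * num_candidates
```

For example, a hit with P-value 0.001 among 5000 scored candidates would have an E-value of about 5 under this convention, meaning roughly five random peptides would be expected to score as well by chance.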


Related Articles

  • ASSESSING METHODS--MANY VARIABLES. Gore, Sheila M. // British Medical Journal (Clinical Research Edition);10/3/1981, Vol. 283 Issue 6296, p901 

    Focuses on methods in assessing dependent or outcome variables. Efficacy of statistical method; Importance of tables or graphs in multivariate methods; Details on the methods of analysis.

  • New Filter method for categorical variables' selection. Bouhamed, Heni; Lecroq, Thierry; Rebaï, Ahmed // International Journal of Computer Science Issues (IJCSI);May2012, Vol. 9 Issue 3, p10 

    It is worth noting that the variable-selection process has become an increasingly exciting challenge, given the dramatic increase in the size of databases and the number of variables to be explored and modelized. Therefore, several strategies and methods have been developed with the aim of...

  • A shorter proof of Kanter's Bessel function concentration bound. Mattner, Lutz; Roos, Bero // Probability Theory & Related Fields;Sep2007, Vol. 139 Issue 1/2, p191 

    We give a shorter proof of Kanter's (J. Multivariate Anal. 6, 222–236, 1976) sharp Bessel function bound for concentrations of sums of independent symmetric random vectors. We provide sharp upper bounds for the sum of modified Bessel functions I0(x) + I1(x), which might be of...

  • The negative association property for the absolute values of random variables equidistributed on a generalized Orlicz ball. Pilipczuk, Marcin; Wojtaszczyk, Jakub // Positivity;Jul2008, Vol. 12 Issue 3, p421 

    Random variables equidistributed on convex bodies have received quite a lot of attention in the last few years. In this paper we prove the negative association property (which generalizes the subindependence of coordinate slabs) for generalized Orlicz balls. This allows us to...

  • Selective influence through conditional independence. Dzhafarov, Ehtibar // Psychometrika;Mar2003, Vol. 68 Issue 1, p7 

    Let each of several (generally interdependent) random vectors, taken separately, be influenced by a particular set of external factors. Under what kind of the joint dependence of these vectors on the union of these factor sets can one say that each vector is selectively influenced by “its...

  • ON THE NUMBER OF COMPONENT FAILURES IN SYSTEMS WHOSE COMPONENT LIVES ARE EXCHANGEABLE. Ross, Sheldon M.; Shahshahanis, Mehrdad; Weiss, Gideon // Mathematics of Operations Research;Aug80, Vol. 5 Issue 3, p358 

    We consider a system that is composed of finitely many independent components each of which is either "on" or "off" at any time. The components are initially on and they have common on-time distributions. Once a component goes off, it remains off forever. The system is monotone in the sense that...

  • Testing unidimensionality in polytomous Rasch models. Christensen, Karl Bang; Bjorner, Jakob Bue; Kreiner, Svend; Petersen, Jørgen Holm // Psychometrika;Dec2002, Vol. 67 Issue 4, p563 

    A fundamental assumption of most IRT models is that items measure the same unidimensional latent construct. For the polytomous Rasch model two ways of testing this assumption against specific multidimensional alternatives are discussed. One, a marginal approach assuming a multidimensional...

  • Deterministic Transformations of Random Variables and the Comparative Statics of Risk. Meyer, Jack; Ormiston, Michael B. // Journal of Risk & Uncertainty;Jun1989, Vol. 2 Issue 2, p179 

    In this article, a general class of deterministic transformations that can be interpreted as changes in risk are identified. This provides a fourth characterization of a Rothschild-Stiglitz increase in risk. In addition, a particular subclass of these transformations, termed simple...

  • Combining assays for estimating prevalence of human herpesvirus 8 infection using multivariate mixture models. Pfeiffer, Ruth M.; Carroll, Raymond J.; Wheeler, William; Whitby, Denise; Mbulaiteye, Sam // Biostatistics;Jan2008, Vol. 9 Issue 1, p137 

    For many diseases, it is difficult or impossible to establish a definitive diagnosis because a perfect “gold standard” may not exist or may be too costly to obtain. In this paper, we propose a method to use continuous test results to estimate prevalence of disease in a given...

