Power and Predictive Accuracy of Polygenic Risk Scores

Dudbridge, Frank
March 2013
PLoS Genetics;Mar2013, Vol. 9 Issue 3, Special section p1
Academic Journal
Polygenic scores have recently been used to summarise genetic effects among an ensemble of markers that do not individually achieve significance in a large-scale association study. Markers are selected using an initial training sample and used to construct a score in an independent replication sample by forming the weighted sum of associated alleles within each subject. Association between a trait and this composite score implies that a genetic signal is present among the selected markers, and the score can then be used for prediction of individual trait values. This approach has been used to obtain evidence of a genetic effect when no single markers are significant, to establish a common genetic basis for related disorders, and to construct risk prediction models. In some cases, however, the desired association or prediction has not been achieved. Here, the power and predictive accuracy of a polygenic score are derived from a quantitative genetics model as a function of the sizes of the two samples, explained genetic variance, selection thresholds for including a marker in the score, and methods for weighting effect sizes in the score. Expressions are derived for quantitative and discrete traits, the latter allowing for case/control sampling. A novel approach to estimating the variance explained by a marker panel is also proposed. It is shown that published studies with significant association of polygenic scores have been well powered, whereas those with negative results can be explained by low sample size. It is also shown that useful levels of prediction may only be approached when predictors are estimated from very large samples, up to an order of magnitude greater than currently available. Therefore, polygenic scores currently have more utility for association testing than predicting complex traits, but prediction will become more feasible as sample sizes continue to grow.
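The score construction described in the abstract (select markers in a training sample, then form a weighted sum of alleles for each subject in a replication sample) can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes that selection uses a single p-value threshold and that the weights are the training-sample effect estimates. The function names and all numbers are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence


def select_markers(p_values: Sequence[float], threshold: float) -> List[int]:
    """Return indices of markers whose training-sample p-value is below the threshold."""
    return [j for j, p in enumerate(p_values) if p < threshold]


def polygenic_score(
    allele_counts: Sequence[int],
    weights: Sequence[float],
    selected: Sequence[int],
) -> float:
    """Weighted sum of allele counts (0, 1 or 2) over the selected markers for one subject."""
    return sum(weights[j] * allele_counts[j] for j in selected)


# Hypothetical training-sample summary statistics for four markers.
train_effects = [0.10, -0.05, 0.20, 0.02]
train_p_values = [0.03, 0.40, 0.001, 0.70]

# Hypothetical replication-sample genotypes: one row per subject,
# one column per marker, each entry a count of the associated allele.
replication_genotypes = [
    [2, 1, 0, 1],
    [0, 2, 1, 0],
    [1, 0, 2, 2],
]

# Step 1: select markers using the training sample only (assumed threshold 0.05).
selected = select_markers(train_p_values, threshold=0.05)

# Step 2: score each subject in the independent replication sample.
scores = [
    polygenic_score(genotype, train_effects, selected)
    for genotype in replication_genotypes
]
```

The resulting `scores` would then be tested for association with the trait in the replication sample. Keeping selection and scoring in separate samples, as the abstract specifies, is what makes that test a valid check for genetic signal among the selected markers.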


Related Articles

  • An Empirical Bayes Mixture Model for Effect Size Distributions in Genome-Wide Association Studies. Thompson, Wesley K.; Wang, Yunpeng; Schork, Andrew J.; Witoelar, Aree; Zuber, Verena; Xu, Shujing; Werge, Thomas; Holland, Dominic; Andreassen, Ole A.; Dale, Anders M. // PLoS Genetics;12/29/2015, Vol. 11 Issue 12, p1 

    Characterizing the distribution of effects from genome-wide genotyping data is crucial for understanding important aspects of the genetic architecture of complex traits, such as number or proportion of non-null loci, average proportion of phenotypic variance explained per non-null effect, power...

  • Comparison of statistical procedures for estimating polygenic effects using dense genome-wide marker data. Pimentel, Eduardo C. G.; König, Sven; Schenkel, Flavio S.; Simianer, Henner // BMC Proceedings;2009 Supplement 1, Vol. 3, p1 

    In this study we compared different statistical procedures for estimating SNP effects using the simulated data set from the XII QTL-MAS workshop. Five procedures were considered and tested in a reference population, i.e., the first four generations, from which phenotypes and genotypes were...

  • Pre-selection of most significant SNPS for the estimation of genomic breeding values. Macciotta, Nicolò P. P.; Gaspa, Giustino; Steri, Roberto; Pieramati, Camillo; Carnier, Paolo; Dimauro, Corrado // BMC Proceedings;2009 Supplement 1, Vol. 3, p1 

    The availability of a large amount of SNP markers throughout the genome of different livestock species offers the opportunity to estimate genomic breeding values (GEBVs). However, the estimation of many effects in a data set of limited size represents a severe statistical problem. A preselection...

  • Microphthalmia associated with Neurofibromatosis 1 and PAX6 mutation. Henderson, Alex; Lynch, S.A.; Clarke, M.; Williamson, K.; van Heyningen, V. // Journal of Medical Genetics;Sep2003 Supplement, Vol. 40, pS33 

    The aetiology of microphthalmia is poorly understood but it is likely that both environmental and genetic mechanisms are involved. Recently interest has focussed on polygenic inheritance as an explanation for variable expressivity of a number of genetic disorders. We present a family with a...

  • A new SNP-based vision of the genetics of sex determination in European sea bass (Dicentrarchus labrax). Palaiokostas, Christos; Bekaert, Michaël; Taggart, John B.; Gharbi, Karim; McAndrew, Brendan J.; Chatain, Béatrice; Penman, David J.; Vandeputte, Marc // Genetics Selection Evolution;9/4/2015, Vol. 47 Issue 1, p1 

    Background: European sea bass (Dicentrarchus labrax) is one of the most important farmed species in Mediterranean aquaculture. The observed sexual growth and maturity dimorphism in favour of females adds value towards deciphering the sex determination system of this species. Current knowledge...

  • Molecular analysis of new sources of resistance to Pseudoperonospora cubensis (Berk. et Curt.) Rostovzev in cucumber. Szczechura, W.; Staniaszek, M.; Klosinska, U.; Kozik, E. // Russian Journal of Genetics;Oct2015, Vol. 51 Issue 10, p974 

    Downy mildew of cucumber ( Cucumis sativus L.), caused by Pseudoperonospora cubensis (Berk. et Curt.) Rostovzev, is one of the most important foliar diseases of cucurbit crops. Two parental lines resistant PI 197085, susceptible PI 175695 and their F2 generation were used in our study....

  • A unified test of linkage analysis and rare-variant association for analysis of pedigree sequence data. Hu, Hao; Roach, Jared C; Coon, Hilary; Guthery, Stephen L; Voelkerding, Karl V; Margraf, Rebecca L; Durtschi, Jacob D; Tavtigian, Sean V; Shankaracharya; Wu, Wilfred; Scheet, Paul; Wang, Shuoguo; Xing, Jinchuan; Glusman, Gustavo; Hubley, Robert; Li, Hong; Garg, Vidu; Moore, Barry; Hood, Leroy; Galas, David J // Nature Biotechnology;Jul2014, Vol. 32 Issue 7, p663 

    High-throughput sequencing of related individuals has become an important tool for studying human disease. However, owing to technical complexity and lack of available tools, most pedigree-based sequencing studies rely on an ad hoc combination of suboptimal analyses. Here we present...

  • The Genetics of Hypodontia. SUAREZ, BRIAN K.; SPENCE, M. ANNE // Journal of Dental Research;Jul1974, Vol. 53 Issue 4, p781 

    A large body of family data was analyzed to explain the genetics of hypodontia. Two multiple threshold models that were developed for quasicontinuous traits were used. The data fit the polygenic model much better than they fit the single major gene model.

  • Estimating genomic breeding values from the QTL-MAS Workshop Data using a single SNP and haplotype/IBD approach. Calus, Mario P. L.; de Roos, Sander P. W.; Veerkamp, Roel F. // BMC Proceedings;2009 Supplement 1, Vol. 3, p1 

    Genomic breeding values were estimated using a Gibbs sampler that avoided the use of the Metropolis-Hastings step as implemented in the BayesB model of Meuwissen et al., Genetics 2001, 157:1819-1829. Two models that estimated genomic estimated breeding values (EBVs) were applied: one used...

