Clustering by genetic ancestry using genome-wide SNP data

Solovieff, Nadia; Hartley, Stephen W.; Baldwin, Clinton T.; Perls, Thomas T.; Steinberg, Martin H.; Sebastiani, Paola
January 2010
BMC Genetics;2010, Vol. 11, p108
Academic Journal
Background: Population stratification can cause spurious associations in a genome-wide association study (GWAS), and occurs when differences in allele frequencies of single nucleotide polymorphisms (SNPs) are due to ancestral differences between cases and controls rather than the trait of interest. Principal components analysis (PCA) is the established approach to detect population substructure using genome-wide data and to adjust the genetic association for stratification by including the top principal components in the analysis. An alternative solution is genetic matching of cases and controls that requires, however, well defined population strata for appropriate selection of cases and controls.

Results: We developed a novel algorithm to cluster individuals into groups with similar ancestral backgrounds based on the principal components computed by PCA. We demonstrate the effectiveness of our algorithm in real and simulated data, and show that matching cases and controls using the clusters assigned by the algorithm substantially reduces population stratification bias. Through simulation we show that the power of our method is higher than adjustment for PCs in certain situations.

Conclusions: In addition to reducing population stratification bias and improving power, matching creates a clean dataset free of population stratification which can then be used to build prediction models without including variables to adjust for ancestry. The cluster assignments also allow for the estimation of genetic heterogeneity by examining cluster specific effects.
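The stratification effect described in the abstract can be illustrated with a small worked example. The sketch below is not the authors' algorithm: it assumes that ancestry cluster labels ("A" and "B") are already available, for instance from clustering on principal components, and all counts and allele frequencies are hypothetical. The SNP is constructed to have no effect on the trait, so any pooled case-control difference comes from ancestry alone.

```python
# Hypothetical data: (cluster, status) -> (number of individuals, allele frequency).
# Within each cluster, cases and controls share the same allele frequency,
# so the SNP is unrelated to the trait by construction.
groups = {
    ("A", "case"): (800, 0.10),
    ("A", "control"): (200, 0.10),
    ("B", "case"): (200, 0.50),
    ("B", "control"): (800, 0.50),
}


def allele_freq(cells):
    """Sample-size-weighted allele frequency over a list of (n, freq) cells."""
    total = sum(n for n, _ in cells)
    return sum(n * f for n, f in cells) / total


def pooled_difference(groups):
    """Case minus control allele frequency, ignoring ancestry clusters."""
    cases = [v for (_, status), v in groups.items() if status == "case"]
    controls = [v for (_, status), v in groups.items() if status == "control"]
    return allele_freq(cases) - allele_freq(controls)


def matched_difference(groups):
    """Case minus control allele frequency after 1:1 matching within clusters."""
    cases, controls = [], []
    for cluster in sorted({c for c, _ in groups}):
        n_case, f_case = groups[(cluster, "case")]
        n_ctrl, f_ctrl = groups[(cluster, "control")]
        n = min(n_case, n_ctrl)  # keep equal numbers of cases and controls
        cases.append((n, f_case))
        controls.append((n, f_ctrl))
    return allele_freq(cases) - allele_freq(controls)


if __name__ == "__main__":
    print("pooled difference: ", round(pooled_difference(groups), 4))
    print("matched difference:", round(matched_difference(groups), 4))
```

In this toy setup the pooled comparison shows a large allele frequency difference between cases and controls even though the SNP is unrelated to the trait, while the within-cluster matched comparison shows none. Real analyses would work from individual genotypes and a formal association test rather than summary frequencies.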


Related Articles

  • A hypervariable STR polymorphism in the complement factor I (CFI) gene: Asian-specific alleles. Yuasa, Isao; Irizawa, Yoshito; Nishimukai, Hiroaki; Fukumori, Yasuo; Umetsu, Kazuo; Nakayashiki, Nori; Saitou, Naruya; Henke, Lotte; Henke, Jürgen // International Journal of Legal Medicine;Jan2011, Vol. 125 Issue 1, p121 

    In this study, a short tandem repeat (STR) polymorphism in intron 7 of the human complement factor I (CFI) gene was studied in 637 DNA samples obtained from African, German, Thai, and Japanese populations and German and Japanese families. A total of 41 alleles were observed and classified into...

  • Highlight: Family Ties—New Evidence Simplifies Human Evolutionary Tree. Venton, Danielle // Genome Biology & Evolution;Sep2012, Vol. 4 Issue 9, p1146 

    A letter to the editor is presented in response to the article "Reconstructing the demographic history of the human lineage using whole-genome sequences from human and three great apes" by Y. Hara, T. Imanishi, and Y. Satta in the previous issue.

  • Low-Pass Genome-Wide Sequencing and Variant Inference Using Identity-by-Descent in an Isolated Human Population. Gusev, A.; Shah, M. J.; Kenny, E. E.; Ramachandran, A.; Lowe, J. K.; Salit, J.; Lee, C. C.; Levandowsky, E. C.; Weaver, T. N.; Doan, Q. C.; Peckham, H. E.; McLaughlin, S. F.; Lyons, M. R.; Sheth, V. N.; Stoffel, M.; De La Vega, F. M.; Friedman, J. M.; Breslow, J. L.; Pe'er, I. // Genetics;Feb2012, Vol. 190 Issue 2, p679 

    Whole-genome sequencing in an isolated population with few founders directly ascertains variants from the population bottleneck that may be rare elsewhere. In such populations, shared haplotypes allow imputation of variants in unsequenced samples without resorting to complex statistical methods...

  • Population genetics - making sense out of sequence. Chakravarti, Aravinda // Nature Genetics;Jan99 Supplement, Vol. 21, p56 

    The complete human genome nucleotide sequence and technologies for assessing sequence variation on a genome-scale will prompt comprehensive studies of comparative genomic diversity in human populations across the globe. These studies, besides rejuvenating population genetics and our interest in...

  • The Human Genome Diversity Project as a Complement to Human Population Genetics. Bowman, James E. // Politics & the Life Sciences;Sep99, Vol. 18 Issue 2, p289 

    Presents the Human Genome Diversity Project as a complement to human population genetics. Prevalence of genetic disorders in certain groups of people; Futility of associating genes with crimes and intelligence levels; Concerns about a European bias in the study of genomes.

  • Genome-wide analysis of the human Alu Yb-lineage. Carter, Anthony B.; Salem, Abdel-Halim; Hedges, Dale J.; Keegan, Catherine Nguyen; Kimball, Beth; Walker, Jerilyn A.; Watkins, W. Scott; Jorde, Lynn B.; Batzer, Mark A. // Human Genomics;Mar2004, Vol. 1 Issue 3, p167 

    The Alu Yb-lineage is a 'young' primarily human-specific group of short interspersed element (SINE) subfamilies that have integrated throughout the human genome. In this study, we have computationally screened the draft sequence of the human genome for Alu Yb-lineage subfamily members present on...

  • Evolution of genetic and genomic features unique to the human lineage. O'Bleness, Majesta; Searles, Veronica B.; Varki, Ajit; Gagneux, Pascal; Sikela, James M. // Nature Reviews Genetics;Dec2012, Vol. 13 Issue 12, p853 

    Given the unprecedented tools that are now available for rapidly comparing genomes, the identification and study of genetic and genomic changes that are unique to our species have accelerated, and we are entering a golden age of human evolutionary genomics. Here we provide an overview of these...

  • SCIENCE AND SOCIETY: Human genome diversity: What about the other human genome project? Greely, Henry T. // Nature Reviews Genetics;Mar2001, Vol. 2 Issue 3, p222 

    Although the Human Genome Project has been successful, the Human Genome Diversity Project, proposed in 1991, has so far failed to thrive. One of the main values in studying the human genome, however, will come from examining its variations and their effects. To do that in a systematic way, an...

  • Paleo-Balkan and Slavic Contributions to the Genetic Pool of Moldavians: Insights from the Y Chromosome. Varzari, Alexander; Kharkov, Vladimir; Nikitin, Alexey G.; Raicu, Florina; Simonova, Kseniya; Stephan, Wolfgang; Weiss, Elisabeth H.; Stepanov, Vadim // PLoS ONE;Jan2013, Vol. 8 Issue 1, Special section p1 

    Moldova has a rich historical and cultural heritage, which may be reflected in the current genetic makeup of its population. To date, no comprehensive studies exist about the population genetic structure of modern Moldavians. To bridge this gap with respect to paternal lineages, we analyzed 37...

