Silhouette scores for assessment of SNP genotype clusters

Lovmar, Lovisa; Ahlford, Annika; Jonsson, Mats; Syvänen, Ann-Christine
January 2005
BMC Genomics;2005, Vol. 6, p35
Academic Journal
Background: High-throughput genotyping of single nucleotide polymorphisms (SNPs) generates large amounts of data. In many SNP genotyping assays, the genotype assignment is based on scatter plots of signals corresponding to the two SNP alleles. In a robust assay, the three clusters that define the genotypes are well separated and the distances between the data points within a cluster are short. "Silhouettes" is a graphical aid for the interpretation and validation of data clusters that provides a measure of how well a data point was classified when it was assigned to a cluster. Thus "Silhouettes" can potentially be used as a quality measure for SNP genotyping results and for objective comparison of the performance of SNP assays under different circumstances.
Results: We created a program (ClusterA) for calculating "Silhouette scores" and applied it to assess the quality of SNP genotype clusters obtained by single nucleotide primer extension ("minisequencing") in the Tag-microarray format. A Silhouette score condenses the quality of the genotype assignment for each SNP assay into a single numeric value, which ranges from 1.0, when the genotype assignment is unequivocal, down to -1.0, when the genotype assignment has been arbitrary. In the present study we applied Silhouette scores to compare the performance of four DNA polymerases in our minisequencing system by analyzing 26 SNPs in both DNA polarities in 16 DNA samples. We found that Silhouettes provide a relevant measure of the quality of SNP assays under different reaction conditions, illustrated here by the four DNA polymerases. According to our results, the genotypes can be unequivocally assigned without manual inspection when the Silhouette score for a SNP assay is > 0.65. All four DNA polymerases performed satisfactorily in our Tag-array minisequencing system.
Conclusion: "Silhouette scores" are a convenient means of evaluating the quality of SNP genotype assignment and provide an objective, numeric measure for comparing the performance of SNP assays. The program we created for calculating Silhouette scores is freely available and can be used for quality assessment of the results from any genotyping system in which the genotypes are assigned by cluster analysis using scatter plots.
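
To make the score described in the abstract concrete, here is a minimal Python sketch of the standard Silhouette definition (Rousseeuw, 1987) applied to two-allele signal data. It is not the ClusterA program: the function names, the example signal values, and the choice to average the per-point values into one score per assay are assumptions made for illustration, since the abstract does not say how ClusterA aggregates them. Only the 0.65 cut-off comes from the abstract.

```python
import math
from collections import defaultdict


def silhouette_values(points, labels):
    """Per-point Silhouette values s(i) = (b - a) / max(a, b).

    a = mean distance from point i to the other points in its own cluster
    b = smallest mean distance from point i to the points of any other cluster
    """
    clusters = defaultdict(list)
    for idx, label in enumerate(labels):
        clusters[label].append(idx)
    if len(clusters) < 2:
        raise ValueError("Silhouettes need at least two clusters")

    def mean_dist(i, members):
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    values = []
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            values.append(0.0)  # usual convention for a one-point cluster
            continue
        a = mean_dist(i, own)
        b = min(mean_dist(i, m) for other, m in clusters.items() if other != label)
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


def assay_score(points, labels):
    """Assumed aggregation: mean of the per-point Silhouette values."""
    vals = silhouette_values(points, labels)
    return sum(vals) / len(vals)


# Hypothetical (allele 1 signal, allele 2 signal) pairs for six samples.
signals = [(0.10, 1.00), (0.12, 0.95),   # homozygous for allele 2
           (0.50, 0.50), (0.52, 0.48),   # heterozygous
           (1.00, 0.10), (0.95, 0.12)]   # homozygous for allele 1
well_assigned = ["AA", "AA", "AB", "AB", "BB", "BB"]
badly_assigned = ["AA", "AB", "AB", "BB", "BB", "AA"]

THRESHOLD = 0.65  # cut-off reported in the abstract
good_score = assay_score(signals, well_assigned)
bad_score = assay_score(signals, badly_assigned)
accept_without_inspection = good_score > THRESHOLD
```

With tight, well-separated clusters the score is close to 1.0 and clears the cut-off; scrambling the genotype labels on the same signals pulls it well below 0.65, which is the case the abstract suggests should be inspected manually.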


Related Articles

  • SNP detection exploiting multiple sources of redundancy in large EST collections improves validation rates. Ben J. Hayes; Kjetil Nilsen; Paul R. Berg; Eli Grindflek; Sigbjørn Lien // Bioinformatics;Jul2007, Vol. 23 Issue 13, p1692 

    Motivation: Single nucleotide polymorphism (SNP) detection exploiting redundancy in expressed sequence tag (EST) collections that arises from the presence of transcripts of the same gene from different individuals has been used to generate large collections of SNPs for many species. A second...

  • Polymorphisms of Nucleotide Excision Repair Genes Predict Melanoma Survival. Li, Chunying; Yin, Ming; Wang, Li-E; Amos, Christopher I; Zhu, Dakai; Lee, Jeffrey E; Gershenwald, Jeffrey E; Grimm, Elizabeth A; Wei, Qingyi // Journal of Investigative Dermatology;Jul2013, Vol. 133 Issue 7, p1813 

    Melanoma is the most highly malignant skin cancer, and nucleotide excision repair (NER) is involved in melanoma susceptibility. In this analysis of 1,042 melanoma patients, we evaluated whether genetic variants of NER genes may predict survival outcome of melanoma patients. We used genotyping...

  • High Fidelity SNP Genotyping Using Sequence-Specific Primer Elongation and Fluorescence Correlation Spectroscopy. Hori, K.; Shin, W.S.; Hemmi, C.; Toyo-oka, T.; Makino, T. // Current Pharmaceutical Biotechnology;Dec2003, Vol. 4 Issue 6, p477 

    Reliable, efficient and cost-effective modalities are urgently needed for mass screening of gene mutations. Previous reports have shown that SSCP or genechip methods require substantial time and monetary costs, thus limiting their appeal. Sequence Specific Primer Polymerase Chain Reaction...

  • On the Selection and Evolution of Regulatory DNA Motifs. Gerland, Ulrich; Hwa, Terence // Journal of Molecular Evolution;Oct2002, Vol. 55 Issue 4, p386 

    The mutation and selection of regulatory DNA sequences are presented as an ideal model system of molecular evolution where genotype, phenotype, and fitness can be explicitly and independently characterized. In this theoretical study, we construct an explicit model for the evolution of regulatory...

  • Clock T3111C and Per2 C111G SNPs do not influence circadian rhythmicity in healthy Italian population. Choub, Anna; Mancuso, Michelangelo; Coppedè, Fabio; LoGerfo, Annalisa; Orsucci, Daniele; Petrozzi, Lucia; DiCoscio, Elisa; Maestri, Michelangelo; Rocchi, Anna; Bonanni, Enrica; Siciliano, Gabriele; Murri, Luigi // Neurological Sciences;Feb2011, Vol. 32 Issue 1, p89 

    A possible relationship between human circadian rhythmicity and polymorphisms in clock genes has been documented. However, these data are controversial, and studies both corroborating and denying them have been reported. T3111C Clock polymorphism had been associated with the human evening...

  • The Cys allele (the Ser311Cys polymorphism) of the dopamine D2 receptor is associated with schizophrenia and impairments to selective attention in patients. Golimbet, V.; Lebedeva, I.; Monakhov, M.; Korovaitseva, G.; Lezheiko, T.; Abramova, L.; Kaleda, V.; Karpov, V. // Neuroscience & Behavioral Physiology;Jan2011, Vol. 41 Issue 1, p22

    We report here our studies of the Ser311Cys polymorphism of the D2 dopamine receptor gene in 366 patients with schizophrenia and 387 control subjects. The incidence of the Cys allele was found to be greater (p < 0.009) among patients than controls (8.5% and 3.9%, respectively). Selective...

  • The Complex Genetic Architecture of the Metabolome.  // PLoS Genetics;Nov2010, Vol. 6 Issue 11, p1 

    No abstract available.

  • Rapid Mapping and Identification of Mutations in Caenorhabditis elegans by Restriction Site-Associated DNA Mapping and Genomic Interval Pull-Down Sequencing. O'Rourke, Sean M.; Yochem, John; Connolly, Amy A.; Price, Meredith H.; Carter, Luke; Lowry, Joshua B.; Turnbull, Douglas W.; Kamps-Hughes, Nick; Stiffler, Nicholas; Miller, Michael R.; Johnson, Eric A.; Bowerman, Bruce // Genetics;Nov2011, Vol. 189 Issue 3, p767 

    Forward genetic screens provide a powerful approach for inferring gene function on the basis of the phenotypes associated with mutated genes. However, determining the causal mutation by traditional mapping and candidate gene sequencing is often the rate-limiting step, especially when analyzing...

  • Systematic Inference of Copy-Number Genotypes from Personal Genome Sequencing Data Reveals Extensive Olfactory Receptor Gene Content Diversity. Waszak, Sebastian M.; Hasin, Yehudit; Zichner, Thomas; Olender, Tsviya; Keydar, Ifat; Khen, Miriam; Stütz, Adrian M.; Schlattl, Andreas; Lancet, Doron; Korbel, Jan O. // PLoS Computational Biology;Nov2010, Vol. 6 Issue 11, p1 

    Copy-number variations (CNVs) are widespread in the human genome, but comprehensive assignments of integer locus copy-numbers (i.e., copy-number genotypes) that, for example, enable discrimination of homozygous from heterozygous CNVs, have remained challenging. Here we present CopySeq, a novel...

