Comparison of gene coverage of mouse oligonucleotide microarray platforms

Verdugo, Ricardo A; Medrano, Juan F
January 2006
BMC Genomics;2006, Vol. 7, p58
Academic Journal
Background: The increasing use of DNA microarrays for genetical genomics studies generates a need for platforms with complete coverage of the genome. We have compared the effective gene coverage in the mouse genome of different commercial and noncommercial oligonucleotide microarray platforms by performing an in-house gene annotation of probes. We used only the probe information that is available from vendors and followed a process that any researcher could use to find the gene targeted by a given probe. To make consistent comparisons between platforms, probes in each microarray were annotated with an Entrez Gene ID, and the chromosomal position of each gene was obtained from the UCSC Genome Browser Database. Gene coverage was estimated as the percentage of Entrez Genes with a unique position in the UCSC Genome Browser Database that are tested by a given microarray platform.

Results: A MySQL relational database was created to store the mapping information for 25,416 mouse genes and for the probes in five microarray platforms (gene coverage level in parentheses): Affymetrix430 2.0 (75.6%), ABI Genome Survey (81.24%), Agilent (79.33%), Codelink (78.09%), Sentrix (90.47%); and four array-ready oligosets: Sigma (47.95%), Operon v.3 (69.89%), Operon v.4 (84.03%), and MEEBO (84.03%). The differences in coverage between platforms were highly conserved across chromosomes. Differences in the number of redundant and nonspecific probes were also found among arrays. The database can be queried through a web interface to compare specific genomic regions. The software used to create, update and query the database is freely available as a toolbox named ArrayGene.

Conclusion: The software developed here allows researchers to create updated custom databases by using public or proprietary information on genes for any organism. ArrayGene allows easy comparisons of gene coverage between microarray platforms for any region of the genome. The comparison presented here reveals that the commercial microarray Sentrix, which is based on the MEEBO public oligoset, showed the best mouse genome coverage currently available. We also suggest the creation of guidelines to standardize the minimum set of information that vendors should provide to allow researchers to accurately evaluate the advantages and disadvantages of using a given platform.
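The coverage measure defined in the abstract can be illustrated with a minimal Python sketch. This is not ArrayGene code: the function names, the probe-to-gene dictionary layout, and the toy identifiers are all hypothetical, and the reference set stands in for the Entrez Genes that have a unique position in the UCSC database.

```python
from collections import defaultdict


def gene_coverage(reference_genes, probe_to_gene):
    """Percentage of reference genes targeted by at least one probe.

    reference_genes: iterable of gene IDs (assumed: genes with a unique
        genomic position).
    probe_to_gene: dict mapping probe ID -> gene ID, or None when the
        probe could not be annotated.
    """
    reference = set(reference_genes)
    if not reference:
        raise ValueError("reference gene set is empty")
    # Count each gene once, no matter how many probes target it, and
    # ignore probes annotated to genes outside the reference set.
    covered = {gene for gene in probe_to_gene.values() if gene in reference}
    return 100.0 * len(covered) / len(reference)


def redundant_genes(probe_to_gene):
    """Return genes targeted by more than one probe, with their probes."""
    probes_by_gene = defaultdict(list)
    for probe, gene in probe_to_gene.items():
        if gene is not None:
            probes_by_gene[gene].append(probe)
    return {g: p for g, p in probes_by_gene.items() if len(p) > 1}


# Hypothetical toy data: four reference genes and one small platform.
reference_genes = {101, 102, 103, 104}
platform_a = {"p1": 101, "p2": 101, "p3": 102, "p4": None, "p5": 999}

print(gene_coverage(reference_genes, platform_a))  # 50.0
print(redundant_genes(platform_a))                 # {101: ['p1', 'p2']}
```

Counting distinct genes rather than probes keeps redundant probes from inflating the estimate, which is why redundancy is reported separately in this sketch.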


Related Articles

  • SNPinProbe_1.0: A database for filtering out probes in the Affymetrix GeneChip Human Exon 1.0 ST array potentially affected by SNPs. Shiwei Duan; Wei Zhang; Bleibel, Wasim Kamel; Cox, Nancy Jean; Dolan, M. Eileen // Bioinformation;2008, Vol. 2 Issue 10, p469 

    The Affymetrix GeneChip® Human Exon 1.0 ST array (exon array) is designed to measure both gene-level and exon-level expression in human samples. This exon array contains ~1.4 million probesets consisting of ~5.4 million probes and profiles over 17,000 well-annotated gene transcripts in the...

  • An analysis of intra array repeats: the good, the bad and the non informative. Elbez, Yedid; Farkash-Amar, Shlomit; Simon, Itamar // BMC Genomics;2006, Vol. 7, p136 

    Background: On most common microarray platforms many genes are represented by multiple probes. Although this is quite common no one has systematically explored the concordance between probes mapped to the same gene. Results: Here we present an analysis of all the cases of multiple probe sets...

  • Comparative physical mapping of targeted regions of the rat genome. Summers, Tyrone J.; Thomas, James W.; Lee-Lin, Shih-Queen; Maduro, Valerie V.B.; Idol, Jacquelyn R.; Green, Eric D. // Mammalian Genome;Jul2001, Vol. 12 Issue 7, p508 

    The comparative mapping and sequencing of vertebrate genomes is now a key priority for the Human Genome Project. In addition to finishing the human genome sequence and generating a `working draft' of the mouse genome sequence, significant attention is rapidly turning to the analysis of other...

  • Choosing blindly but wisely: differentially private solicitation of DNA datasets for disease marker discovery. Yongan Zhao; Xiaofeng Wang; Xiaoqian Jiang; Lucila Ohno-Machado; Haixu Tang // Journal of the American Medical Informatics Association;Jan2015, Vol. 22 Issue 1, p100 

    Objective To propose a new approach to privacy preserving data selection, which helps the data users access human genomic datasets efficiently without undermining patients' privacy. Methods Our idea is to let each data owner publish a set of differentially-private pilot data, on which a data...

  • A LAD-based method for selecting short oligo probes for genotyping applications. Kwangsoo Kim; Hong Ryoo // OR Spectrum;Apr2008, Vol. 30 Issue 2, p249 

    Specializing a general framework of logical analysis of data for efficiently handling large-scale genomic data, we develop in this paper a probe design method for selecting short oligo probes for genotyping applications. When tested on genomic sequences obtained from the National Center of...

  • Finding Alu in primate genomes with AF-1. Shankar, Ravi; Kataria, Bhavesh; Mukerji, Mitali // Bioinformation;2009, Vol. 3 Issue 7, p287 

    Repetitive sequences occupy more than 40% of the human genome which is much larger compared to the 2% occupied by the coding DNA. Amongst these Alu elements are the second largest class of repeats, occupying nearly 10% of the whole genome. Alus have been implicated in many genomic processes,...

  • Bioinformatic screening of human ESTs for differentially expressed genes in normal and tumor tissues. Aouacheria, Abdel; Navratil, Vincent; Barthelaix, Audrey; Mouchiroud, Dominique; Gautier, Christian // BMC Genomics;2006, Vol. 7, p94 

    Background: Owing to the explosion of information generated by human genomics, analysis of publicly available databases can help identify potential candidate genes relevant to the cancerous phenotype. The aim of this study was to scan for such genes by whole-genome in silico subtraction using...

  • angaGEDUCI: Anopheles gambiae gene expression database with integrated comparative algorithms for identifying conserved DNA motifs in promoter sequences. Sumudu N Dissanayake; Osvaldo Marinotti; Jose Marcos C Ribeiro; Anthony A James // BMC Genomics;2006, Vol. 7, p1 

    Background: The completed sequence of the Anopheles gambiae genome has enabled genomewide analyses of gene expression and regulation in this principal vector of human malaria. These investigations have created a demand for efficient methods of cataloguing and analyzing the large quantities of...

  • Rice pseudomolecule-anchored cross-species DNA sequence alignments indicate regional genomic variation in expressed sequence conservation. Armstead, Ian; Lin Huang; King, Julie; Ougham, Helen; Thomas, Howard; King, Ian // BMC Genomics;2007, Vol. 8, p283 

    Background: Various methods have been developed to explore inter-genomic relationships among plant species. Here, we present a sequence similarity analysis based upon comparison of transcript-assembly and methylation-filtered databases from five plant species and physically anchored rice coding...

