A weighted average difference method for detecting differentially expressed genes from microarray data

Kadota, Koji; Nakai, Yuji; Shimizu, Kentaro
January 2008
Algorithms for Molecular Biology;2008, Vol. 3, Special section p1
Academic Journal
Background: Identification of differentially expressed genes (DEGs) under different experimental conditions is an important task in many microarray studies. However, choosing which method to use for a particular application is problematic because its performance depends on the evaluation metric, the dataset, and so on. In addition, when using the Affymetrix GeneChip® system, researchers must select a preprocessing algorithm from a number of competing algorithms, such as MAS, RMA, and DFW, for obtaining expression-level measurements. To achieve optimal performance for detecting DEGs, a suitable combination of gene selection method and preprocessing algorithm needs to be selected for a given probe-level dataset.
Results: We introduce a new fold-change (FC)-based method, the weighted average difference method (WAD), for ranking DEGs. It uses the average difference and relative average signal intensity so that genes that are highly expressed on average across the different conditions are ranked highly. The idea is based on our observation that known or potential marker genes (or proteins) tend to have high expression levels. We compared WAD with seven other methods: average difference (AD), FC, rank products (RP), moderated t statistic (modT), significance analysis of microarrays (samT), shrinkage t statistic (shrinkT), and intensity-based moderated t statistic (ibmT). The evaluation was performed using a total of 38 different binary (two-class) probe-level datasets: two artificial "spike-in" datasets and 36 real experimental datasets. The results indicate that WAD outperforms the other methods when sensitivity and specificity are considered simultaneously: the area under the receiver operating characteristic curve for WAD was highest on average for the 38 datasets. The gene ranking for WAD was also the most consistent when subsets of top-ranked genes produced from data preprocessed with three different algorithms (MAS, RMA, and DFW) were compared. Overall, WAD performed the best for MAS-preprocessed data, and FC-based methods (AD, WAD, FC, or RP) performed well for RMA- and DFW-preprocessed data.
Conclusion: WAD is a promising alternative to existing methods for ranking DEGs with two classes. Its high performance should increase researchers' confidence in microarray analyses.
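
The abstract describes WAD only in words: an average difference between two conditions, weighted by relative average signal intensity. The Python sketch below illustrates that idea under stated assumptions and is not taken from the article itself. It assumes log-scale expression values and min-max scaling of each gene's average intensity to obtain the weight; the names `wad_scores` and `rank_genes` and the toy data are hypothetical.

```python
from statistics import mean


def wad_scores(group_a, group_b):
    """Return one WAD-style score per gene (illustrative sketch).

    group_a, group_b: one inner list per gene, holding that gene's
    log-scale expression values for the samples of each condition.
    """
    mean_a = [mean(values) for values in group_a]
    mean_b = [mean(values) for values in group_b]
    # Average difference (AD) between the two conditions.
    ad = [b - a for a, b in zip(mean_a, mean_b)]
    # Average signal intensity of each gene across both conditions.
    avg = [(a + b) / 2 for a, b in zip(mean_a, mean_b)]
    lo, hi = min(avg), max(avg)
    span = hi - lo
    # Relative intensity in [0, 1]; min-max scaling is an assumption here.
    weights = [(x - lo) / span if span > 0 else 1.0 for x in avg]
    return [d * w for d, w in zip(ad, weights)]


def rank_genes(scores):
    """Gene indices ordered from largest to smallest absolute score."""
    return sorted(range(len(scores)), key=lambda i: abs(scores[i]), reverse=True)


# Hypothetical toy data: genes 0 and 1 share the same average difference,
# but gene 0 is expressed at a much higher level.
group_a = [[10.0, 10.2], [4.0, 4.1], [7.0, 7.1]]
group_b = [[12.0, 12.2], [6.0, 6.1], [7.0, 7.2]]

scores = wad_scores(group_a, group_b)
ranking = rank_genes(scores)
print(scores)
print(ranking)
```

In this toy example the weighting separates two genes that plain AD would tie: the highly expressed gene is ranked first, while the gene with the lowest average intensity receives a weight of zero under the assumed scaling.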


Related Articles

  • Integrative pathway analysis of genome-wide association studies and gene expression data in prostate cancer. Peilin Jia; Yang Liu; Zhongming Zhao // BMC Systems Biology;2012, Vol. 6 Issue Suppl 3, p1 

    Background: Pathway analysis of large-scale omics data assists us with the examination of the cumulative effects of multiple functionally related genes, which are difficult to detect using the traditional single gene/marker analysis. So far, most of the genomic studies have been conducted in a...

  • Microarray analysis after RNA amplification can detect pronounced differences in gene expression using limma. Diboun, Ilhem; Wernisch, Lorenz; Orengo, Christine Anne; Koltzenburg, Martin // BMC Genomics;2006, Vol. 7, p252 

    Background: RNA amplification is necessary for profiling gene expression from small tissue samples. Previous studies have shown that the T7 based amplification techniques are reproducible but may distort the true abundance of targets. However, the consequences of such distortions on the ability...

  • Using a Genetic Algorithm and a Perceptron for Feature Selection and Supervised Class Learning in DNA Microarray Data. Michal Karzynski; Álvaro Mateos; Javier Herrero; Joaquín Dopazo // Artificial Intelligence Review;Oct2003, Vol. 20 Issue 1/2, p39 

    Class prediction and feature selection is key in the context of diagnostic applications of DNA microarrays. Microarray data is noisy and typically composed of a low number of samples and a large number of genes. Perceptrons can constitute an efficient tool for accurate classification of...

  • Relationship between gene co-expression and probe localization on microarray slides. Kluger, Yuval; Haiyuan Yu; Jiang Qian; Gerstein, Mark // BMC Genomics;2003, Vol. 4, p49 

    Background: Microarray technology allows simultaneous measurement of thousands of genes in a single experiment. This is a potentially useful tool for evaluating co-expression of genes and extraction of useful functional and chromosomal structural information about genes. Results: In this work we...

  • A class of models for analyzing GeneChip gene expression analysis array data. Wenhong Fan; Pritchard, Joel I.; James M. Olson; Khalid, Najma; Lue Ping Zhao // BMC Genomics;2005, Vol. 6, p1 

    Background: Various analytical methods exist that first quantify gene expression and then analyze differentially expressed genes from Affymetrix GeneChip® gene expression analysis array data. These methods differ in the choice of probe measure (quantification of probe hybridization),...

  • Assessing probe-specific dye and slide biases in two-color microarray data. Ruixiao Lu; Geun-Cheol Lee; Shultz, Michael; Dardick, Chris; Kihong Jung; Phetsom, Jirapa; Yi Jia; Rice, Robert H.; Goldberg, Zelanna; Schnable, Patrick S.; Ronald, Pamela; Rocke, David M. // BMC Bioinformatics;2008, Vol. 9, Special section p1 

    Background: A primary reason for using two-color microarrays is that the use of two samples labeled with different dyes on the same slide, that bind to probes on the same spot, is supposed to adjust for many factors that introduce noise and errors into the analysis. Most users assume that any...

  • Cross-platform comparison of microarray data using order restricted inference. Klinglmueller, Florian; Tuechler, Thomas; Posch, Martin // Bioinformatics;Apr2011, Vol. 27 Issue 7, p953 

    Motivation: Titration experiments measuring the gene expression from two different tissues, along with total RNA mixtures of the pure samples, are frequently used for quality evaluation of microarray technologies. Such a design implies that the true mRNA expression of each gene, is either...

  • A Hybrid Both Filter and Wrapper Feature Selection Method for Microarray Classification. Li-Yeh Chuang; Chao-Hsuan Ke; Cheng-Hong Yang // International MultiConference of Engineers & Computer Scientists;2008, p146 

    Gene expression data is widely used in disease analysis and cancer diagnosis. However, since gene expression data could contain thousands of genes simultaneously, successful microarray classification is rather difficult. Feature selection is an important pre-treatment for any classification...

  • Dimension reduction with redundant gene elimination for tumor classification. Xue-Qiang Zeng; Guo-Zheng Li; Yang, Jack Y.; Yang, Mary Qu; Geng-Feng Wu // BMC Bioinformatics;2008 Supplement 6, Vol. 9, Special section p1 

    Background: Analysis of gene expression data for tumor classification is an important application of bioinformatics methods. But it is hard to analyse gene expression data from DNA microarray experiments by commonly used classifiers, because there are only a few observations but with thousands...

