Double feature selection and cluster analyses in mining of microarray data from cotton

Alabady, Magdy S.; Eunseog Youn; Wilkins, Thea A.
January 2008
BMC Genomics;2008, Vol. 9, Special section p1
Academic Journal
Background: Cotton fiber is a single-celled seed trichome of major biological and economic importance. In recent years, genomic approaches such as microarray-based expression profiling have been used to study fiber growth and development and to understand the developmental mechanisms of fiber at the molecular level. The vast volume of microarray expression data generated requires sophisticated data mining to extract novel information that addresses fundamental questions of biological interest. One way to approach microarray data mining is to increase the number of dimensions or levels in the analysis, for example by comparing independent studies of different genotypes. However, adding dimensions also creates the challenge of finding new ways to analyze multi-dimensional microarray data.

Results: Mining of independent microarray studies from Pima and Upland (TM1) cotton using double feature selection and cluster analyses identified species-specific and stage-specific gene transcripts that argue in favor of discrete genetic mechanisms governing the developmental programming of cotton fiber morphogenesis in these two cultivated species. Double feature selection analysis identified the highest number of differentially expressed genes that distinguish the transcriptomes of developing Pima and TM1 fibers. These results were based on the finding that differences in fibers harvested between 17 and 24 days post-anthesis (dpa) represent the greatest expressional distance between the two species. This powerful selection method identified a subset of genes expressed during primary (PCW) and secondary (SCW) cell wall biogenesis in Pima fibers whose expression pattern is generally reversed in TM1 at the same developmental stage. Cluster and functional analyses revealed that this subset of genes is primarily regulated during the transition stage that overlaps the termination of PCW and the onset of SCW biogenesis, suggesting that these particular genes play a major role in the genetic mechanism that underlies the phenotypic differences in fiber traits between Pima and TM1.

Conclusion: The novel application of double feature selection analysis led to the discovery of species- and stage-specific gene expression patterns that are biologically relevant to the genetic programs underlying the differences in fiber phenotype between Pima and TM1. These results promise to have a profound impact on ongoing efforts to improve cotton fiber traits.
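The abstract does not spell out the double feature selection algorithm, so the following is only a rough sketch of the general idea of a two-pass selection, not the authors' method. It assumes the first pass picks the developmental stage at which the two species are furthest apart (here by Euclidean distance, an assumption) and the second pass ranks genes by their expression difference at that stage. The gene IDs, the expression values, and the function names are all invented for illustration.

```python
import math

# Toy log-expression values (invented for illustration, not data from the study).
# Keys are hypothetical gene IDs; each list holds one value per stage in STAGES.
STAGES = ["PCW", "transition", "SCW"]
PIMA = {"geneA": [2.0, 5.0, 3.0], "geneB": [1.0, 1.5, 4.0], "geneC": [3.0, 3.1, 3.0]}
TM1 = {"geneA": [2.1, 1.0, 3.2], "geneB": [1.2, 4.5, 3.8], "geneC": [3.0, 3.0, 3.1]}


def stage_distances(a, b):
    """Euclidean distance between the two expression profiles at each stage."""
    genes = sorted(a)
    return [
        math.sqrt(sum((a[g][i] - b[g][i]) ** 2 for g in genes))
        for i in range(len(STAGES))
    ]


def rank_genes_at_stage(a, b, stage_index):
    """Rank genes by absolute expression difference at one stage, largest first."""
    diffs = {g: abs(a[g][stage_index] - b[g][stage_index]) for g in a}
    return sorted(diffs, key=diffs.get, reverse=True)


# Pass 1: find the stage with the greatest distance between the two species.
distances = stage_distances(PIMA, TM1)
most_divergent = max(range(len(STAGES)), key=lambda i: distances[i])

# Pass 2: rank genes by how strongly they differ at that stage.
ranked = rank_genes_at_stage(PIMA, TM1, most_divergent)
print(STAGES[most_divergent], ranked)
```

With these toy values the transition stage is selected, loosely mirroring the abstract's observation that the largest expressional distance falls in the window between PCW termination and SCW onset. A real analysis would work on normalized, replicated array data and would use a statistically grounded selection criterion.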


Related Articles

  • Size matters: network inference tackles the genome scale. Hayete, Boris; Gardner, Timothy S; Collins, James J // Molecular Systems Biology;2007, Vol. 3 Issue 1, p77 

    The article studies the inference of molecular-level regulation, with frequent focus on transcriptional regulatory networks. The model archaeon Halobacterium NRC-1 was used to show that, at least for a small genome, it is possible to determine a sizable portion of the transcriptional regulatory...

  • Tiling Arrays Undergo Commercial Rebirth. Tolchin, Elizabeth // Genomics & Proteomics;Oct2005, Vol. 5 Issue 8, p25 

    Reports on the reemergence of DNA tiling arrays as an instrumental tool for the genome-wide identification and characterization of functional elements. Inherent bias of other methods; Understanding of transcriptional activity in the genome; Mapping of promoters in the human genome; ENCODE...

  • Data quality in genomics and microarrays. Hanlee Ji; Davis, Ronald W. // Nature Biotechnology;Sep2006, Vol. 24 Issue 9, p1112 

    The article reports on the significance of objective quality control indices in facilitating clinical implementation of DNA microarrays for transcriptional profiling and in genomics. The development of the Microarray Quality Control and the External RNA Controls Consortium projects set the basis...

  • Knowledge-based analysis of microarrays for the discovery of transcriptional regulation relationships. Junhee Seok; Kaushal, Amit; Davis, Ronald W.; Wenzhong Xiao // BMC Bioinformatics;2010 Supplement 1, Vol. 11, Special section p1 

    Background: The large amount of high-throughput genomic data has facilitated the discovery of the regulatory relationships between transcription factors and their target genes. While early methods for discovery of transcriptional regulation relationships from microarray data often focused on the...

  • Construction, Visualisation, and Clustering of Transcription Networks from Microarray Expression Data. Freeman, Tom C.; Goldovsky, Leon; Brosch, Markus; Van Dongen, Stijn; Mazière, Pierre; Grocock, Russell J.; Freilich, Shiri; Thornton, Janet; Enright, Anton J. // PLoS Computational Biology;Oct2007, Vol. 3 Issue 10, pe206 

    Network analysis transcends conventional pairwise approaches to data analysis as the context of components in a network graph can be taken into account. Such approaches are increasingly being applied to genomics data, where functional linkages are used to connect genes or proteins. However,...

  • Complex Transcription Mechanisms in Mammalian Genomes - The Transcriptome of FANTOM3. Katayama, Shintaro; Hayashizaki, Yoshihide // Current Genomics;Dec2005, Vol. 6 Issue 8, p619 

    Systematic analysis of a biological system requires elucidation of its components. However, genome sequencing is only the first step; any analysis of transcription control and further functional genomics require the identification of all transcribed transcripts. FANTOM is the international...

  • Differential analysis for high density tiling microarray data. Ghosh, Srinka; Hirsch, Heather A.; Sekinger, Edward A.; Kapranov, Philipp; Struhl, Kevin; Gingeras, Thomas R. // BMC Bioinformatics;2007 Supplement 2, Vol. 8, p359 

    Background: High density oligonucleotide tiling arrays are an effective and powerful platform for conducting unbiased genome-wide studies. The ab initio probe selection method employed in tiling arrays is unbiased, and thus ensures consistent sampling across coding and non-coding regions of the...

  • Genevestigator Transcriptome Meta-Analysis and Biomarker Search Using Rice and Barley Gene Expression Databases. Zimmermann, Philip; Laule, Oliver; Schmitz, Josy; Hruz, Tomas; Bleuler, Stefan; Gruissem, Wilhelm // Molecular Plant (Oxford University Press / USA);Sep2008, Vol. 1 Issue 5, p851 

    The wide-spread use of microarray technologies to study plant transcriptomes has led to important discoveries and to an accumulation of profiling data covering a wide range of different tissues, developmental stages, perturbations, and genotypes. Querying a large number of microarray experiments...

  • The noncoding universe. Jarvis, Kester; Robertson, Miranda // BMC Biology;2011, Vol. 9 Issue 1, p52 

    The article reflects on extensive transcription of RNA. It states that rare transcripts detected by microarray analysis may be missed by RNA-seq. It also notes that extensive transcription of RNA from regions of the genome which do not code for proteins and where there is no...

