A bootstrap based analysis pipeline for efficient classification of phylogenetically related animal miRNAs

Yong Huang; Xun Gu
January 2007
BMC Genomics;2007, Vol. 8, p66
Academic Journal
Background: Phylogenetically related miRNAs (miRNA families) convey important information about the function and evolution of miRNAs. Because of the special sequence features of miRNAs, pair-wise sequence identity between miRNA precursors alone is often inadequate for unequivocally judging the phylogenetic relationships between miRNAs. Most current methods for miRNA classification rely heavily on manual inspection and lack measurements of the reliability of the results.

Results: In this study, we designed an analysis pipeline, the Phylogeny-Bootstrap-Cluster (PBC) pipeline, to identify miRNA families based on branch stability in the bootstrap trees derived from overlapping genome-wide miRNA sequence sets. We tested the PBC analysis pipeline with the miRNAs from six animal species: H. sapiens, M. musculus, G. gallus, D. rerio, D. melanogaster, and C. elegans. The resulting classification was compared with the miRNA families defined in miRBase. The two classifications were largely consistent.

Conclusion: The PBC analysis pipeline is an efficient method for classifying large numbers of heterogeneous miRNA sequences. It requires minimal human involvement and provides measurements of the reliability of the classification results.
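
The general idea of grouping sequences by branch stability across bootstrap trees can be illustrated with a minimal Python sketch. This is not the authors' PBC implementation, which the abstract does not detail; it assumes each bootstrap tree has already been reduced to a collection of clades (sets of leaf names), and the sequence names, the `threshold` parameter, and both function names are hypothetical.

```python
from collections import Counter


def clade_support(bootstrap_trees):
    """Return the fraction of bootstrap replicates containing each clade."""
    counts = Counter()
    for clades in bootstrap_trees:
        # set() guards against a clade being listed twice in one replicate
        counts.update(set(clades))
    n = len(bootstrap_trees)
    return {clade: c / n for clade, c in counts.items()}


def stable_families(bootstrap_trees, threshold=0.9):
    """Report the largest clades whose support meets the threshold.

    Assumption: a "family" is a maximal clade (not contained in another
    stable clade) with at least two members. This is a simplification
    chosen for illustration only.
    """
    support = clade_support(bootstrap_trees)
    stable = [c for c, s in support.items() if s >= threshold and len(c) > 1]
    maximal = [c for c in stable if not any(c < other for other in stable)]
    return sorted((sorted(c), support[c]) for c in maximal)


# Toy data: four bootstrap replicates over hypothetical sequences.
toy_trees = [
    [frozenset({"a1", "a2"}), frozenset({"a1", "a2", "a3"}), frozenset({"b1", "b2"})],
    [frozenset({"a1", "a3"}), frozenset({"a1", "a2", "a3"}), frozenset({"b1", "b2"})],
    [frozenset({"a1", "a2"}), frozenset({"a1", "a2", "a3"}), frozenset({"b1", "b2"})],
    [frozenset({"a2", "a3"}), frozenset({"a1", "a2", "a3"}), frozenset({"b2", "c1"})],
]

for members, support in stable_families(toy_trees, threshold=0.7):
    print(members, support)
```

Raising `threshold` trades coverage for confidence: in the toy data, the three-member group appears in every replicate, while the two-member group appears in only three of four and drops out at a stricter cutoff.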


Related Articles

  • Evolution of genes and genomes on the Drosophila phylogeny. Drosophila // Nature;11/8/2007, Vol. 450 Issue 7167, p203 

    Comparative analysis of multiple genomes in a phylogenetic framework dramatically improves the precision and sensitivity of evolutionary inference, producing more robust results than single-genome analyses can provide. The genomes of 12 Drosophila species, ten of which are presented here for the...

  • Insights into the evolution of the snail superfamily from metazoan wide molecular phylogenies and expression data in annelids. Kerner, Pierre; Hung, Johanne; Béhague, Julien; Le Gouar, Martine; Balavoine, Guillaume; Vervoort, Michel // BMC Evolutionary Biology;2009, Vol. 9, Special section p1 

    Background: An important issue concerning the evolution of duplicated genes is to understand why paralogous genes are retained in a genome even though the most likely fate for a redundant duplicated gene is nonfunctionalization and thereby its elimination. Here we study a complex superfamily...

  • Impact of duplicate gene copies on phylogenetic analysis and divergence time estimates in butterflies. Pohl, Nélida; Sison-Mangus, Marilou P.; Yee, Emily N.; Liswi, Saif W.; Briscoe, Adriana D. // BMC Evolutionary Biology;2009, Vol. 9, Special section p1 

    Background: The increase in availability of genomic sequences for a wide range of organisms has revealed gene duplication to be a relatively common event. Encounters with duplicate gene copies have consequently become almost inevitable in the context of collecting gene sequences for inferring...

  • The Evolutionary History of Protein Domains Viewed by Species Phylogeny. Song Yang; Bourne, Philip E. // PLoS ONE;2009, Vol. 4 Issue 12, p1 

    Background: Protein structural domains are evolutionary units whose relationships can be detected over long evolutionary distances. The evolutionary history of protein domains, including the origin of protein domains, the identification of domain loss, transfer, duplication and combination with...

  • The ITS1-5.8S-ITS2 Sequence Region in the Musaceae: Structure, Diversity and Use in Molecular Phylogeny. Hřibová, Eva; Čížková, Jana; Christelová, Pavla; Taudien, Stefan; de Langhe, Edmond; Doležel, Jaroslav // PLoS ONE;2011, Vol. 6 Issue 3, p1 

    Genes coding for 45S ribosomal RNA are organized in tandem arrays of up to several thousand copies and contain 18S, 5.8S and 26S rRNA units separated by internal transcribed spacers ITS1 and ITS2. While the rRNA units are evolutionary conserved, ITS show high level of interspecific divergence...

  • Discovery of functional elements in 12 Drosophila genomes using evolutionary signatures. Stark, Alexander; Lin, Michael F.; Kheradpour, Pouya; Pedersen, Jakob S.; Parts, Leopold; Carlson, Joseph W.; Crosby, Madeline A.; Rasmussen, Matthew D.; Roy, Sushmita; Deoras, Ameya N.; Ruby, J. Graham; Brennecke, Julius; Hodges, Emily; Hinrichs, Angie S.; Caspi, Anat; Paten, Benedict; Seung-Won Park; Han, Mira V.; Maeder, Morgan L.; Polansky, Benjamin J. // Nature;11/8/2007, Vol. 450 Issue 7167, p219 

    Sequencing of multiple related species followed by comparative genomics analysis constitutes a powerful approach for the systematic understanding of any genome. Here, we use the genomes of 12 Drosophila species for the de novo discovery of functional elements in the fly. Each type of functional...

  • Analysis of complete mitochondrial genomes from extinct and extant rhinoceroses reveals lack of phylogenetic resolution. Willerslev, Eske; Gilbert, M. Thomas P.; Binladen, Jonas; Ho, Simon Y. W.; Campos, Paula F.; Ratan, Aakrosh; Tomsho, Lynn P.; da Fonseca, Rute R.; Sher, Andrei; Kuznetsova, Tatanya V.; Nowak-Kemp, Malgosia; Roth, Terri L.; Miller, Webb; Schuster, Stephan C. // BMC Evolutionary Biology;2009, Vol. 9, Special section p1 

    Background: The scientific literature contains many examples where DNA sequence analyses have been used to provide definitive answers to phylogenetic problems that traditional (non-DNA based) approaches alone have failed to resolve. One notable example concerns the rhinoceroses, a group for...

  • An Atlas of the Speed of Copy Number Changes in Animal Gene Families and Its Implications. Deng Pan; Liqing Zhang // PLoS ONE;2009, Vol. 4 Issue 10, p1 

    The notion that gene duplications generate new genes and functions is commonly accepted in evolutionary biology. However, this assumption rests more on theoretical speculation than on proof from genome-wide studies. Here, we generated an atlas of the rate of copy number changes (CNCs) in all...

  • Genomics: Building a giant in tiny steps. Muers, Mary // Nature Reviews Genetics;Feb2010, Vol. 11 Issue 2, p91 

    The article presents a study that demonstrates de novo genome assembly. It mentions the <100bp short reads of the throughput and efficiency of next-generation sequencing platforms. The authors are noted to have experimented with the DNA of a female giant panda and found out that the genome will be...

