FRAGS: estimation of coding sequence substitution rates from fragmentary data

Swart, Estienne C.; Hide, Winston A.; Seoighe, Cathal
January 2004
BMC Bioinformatics;2004, Vol. 5, p8
Academic Journal
Background: Rates of substitution in protein-coding sequences can provide important insights into evolutionary processes that are of biomedical and theoretical interest. Increased availability of coding sequence data has enabled researchers to estimate more accurately the coding sequence divergence of pairs of organisms. However, the use of different data sources, alignment protocols and methods to estimate substitution rates leads to widely varying estimates of key parameters that define the coding sequence divergence of orthologous genes. Although complete genome sequence data are not available for all organisms, fragmentary sequence data can provide accurate estimates of substitution rates, provided that an appropriate and consistent methodology is used and that differences in the estimates obtainable from different data sources are taken into account.

Results: We have developed FRAGS, an application framework that uses existing, freely available software components to construct in-frame alignments and estimate coding substitution rates from fragmentary sequence data. Coding sequence substitution estimates for human and chimpanzee sequences, generated by FRAGS, reveal that methodological differences can give rise to significantly different estimates of important substitution parameters. The estimated substitution rates were also used to infer upper bounds on the amount of sequencing error in the datasets that we have analysed.

Conclusion: We have developed a system that performs robust estimation of substitution rates for orthologous sequences from a pair of organisms. Our system can be used when fragmentary genomic or transcript data are available from one of the organisms and the other is a completely sequenced genome within the Ensembl database. As well as estimating substitution statistics, our system enables the user to manage and query alignment and substitution data.


Related Articles

  • DNA bar coding of the Bryde's Whale Balaenoptera edeni Anderson (Cetacea: Balaenopteridae) washed ashore along Kerala coast, India. Bijukumar, A.; Jijith, S. S.; Suresh Kumar, U.; George, S. // Journal of Threatened Taxa;Mar2012, Vol. 4 Issue 3, p2436

    Three whales washed ashore along Kerala coast of southwest India were identified as Bryde's Whale Balaenoptera edeni Anderson based on sequencing of mitochondrial cytochrome c oxidase subunit 1 and cytochrome b genes. The results of mtDNA sequencing in the present study confirm the presence of...

  • Common variants of FUT2 are associated with plasma vitamin B12 levels. Hazra, Aditi; Kraft, Peter; Selhub, Jacob; Giovannucci, Edward L.; Thomas, Gilles; Hoover, Robert N.; Chanock, Stephen J.; Hunter, David J. // Nature Genetics;Oct2008, Vol. 40 Issue 10, p1160 

    We identified a strong association (P = 5.36 × 10^−17) between rs492602 in FUT2 and plasma vitamin B12 levels in a genome-wide scan (n = 1,658) and an independent replication sample (n = 1,059) from the Nurses' Health Study. Women homozygous for the rs492602[G] allele had higher B12...

  • Geneticists study chimp-human divergence. Check, Erika // Nature;3/18/2004, Vol. 428 Issue 6980, p242 

    Reports on research findings on the chimpanzee genome sequence which revealed differences between chimpanzees and humans, presented at a workshop in San Diego, California. Differences in the proteases found in chimpanzees and humans; DNA section found in chimpanzees which is absent in the human...

  • Structural variations, the regulatory landscape of the genome and their alteration in human disease. Spielmann, Malte; Mundlos, Stefan // BioEssays;Jun2013, Vol. 35 Issue 6, p533 

    High-throughput genomic technologies are revolutionizing human genetics. So far the focus has been on the 1.5% of the genome, which is coding, in spite of the fact that the great majority of genomic variants fall outside the coding regions. Recent efforts to annotate the non-coding sequence show...

  • Classical and "Quantum-like" Views of the Genetic Code. Négadi, Tidjani // NeuroQuantology;Dec2011, Vol. 9 Issue 4, p603 

    The article discusses various papers published in this issue, including one by Miloje Rakocevic on the genetic code, one by Tidjani Négadi on the physico-mathematical structure of the genetic code, and one by Diego Lucio Rapoport on Klein bottle topology.

  • Maternal and paternal lineages in Albania and the genetic structure of Indo-European populations. Belledi, Michele; Poloni, Estella S; Casalotti, Rosa; Conterio, Franco; Mikerezi, Ilia; Tagliavini, James; Excoffier, Laurent // European Journal of Human Genetics;Jul2000, Vol. 8 Issue 7, p480 

    Mitochondrial DNA HV1 sequences and Y chromosome haplotypes (DYS19 STR and YAP) were characterised in an Albanian sample and compared with those of several other Indo-European populations from the European continent. No significant difference was observed between Albanians and most other...

  • DNA sequencing. Schlegel, Rolf H. J. // Encyclopedic Dictionary of Plant Breeding & Related Subjects;2003, p136

    An encyclopedia entry for the term "DNA sequencing" used in terms of plant breeding is presented.

  • Replication-Associated Mutational Pressure (RMP) Governs Strand-Biased Compositional Asymmetry (SCA) and Gene Organization in Animal Mitochondrial Genomes. Qiang Lin; Peng Cui; Feng Ding; Songnian Hu; Jun Yu // Current Genomics;Mar2012, Vol. 13 Issue 1, p28 

    The nucleotide composition of the light (L-) and heavy (H-) strands of animal mitochondrial genomes is known to exhibit strand-biased compositional asymmetry (SCA). One of the possibilities is the existence of a replication-associated mutational pressure (RMP) that may introduce characteristic...

  • Overestimation of nonsynonymous/synonymous rate ratio by reverse-translation of aligned amino acid sequences. Suzuki, Yoshiyuki; Nishihara, Hidenori // Genes & Genetic Systems;2011, Vol. 86 Issue 2, p123 

    In the analysis of protein-coding nucleotide sequences, the ratio of the number of nonsynonymous substitutions to that of synonymous substitutions (dN/dS) is used as an indicator for the direction and magnitude of natural selection operating at the amino acid sequence level. The dS and dN values...

