Genome-wide identification of coding and non-coding conserved sequence tags in human and mouse genomes

Mignone, Flavio; Anselmo, Anna; Donvito, Giacinto; Maggi, Giorgio P.; Grillo, Giorgio; Pesole, Graziano
January 2008
BMC Genomics;2008, Vol. 9, Special section p1
Academic Journal
Background: The accurate detection of genes and the identification of functional regions are still open issues in the annotation of genomic sequences. The problem affects new genomes but also those of very well studied organisms such as human and mouse where, despite great efforts, the inventory of genes and regulatory regions is far from complete. Comparative genomics is an effective approach to address this problem. Unfortunately, it is limited by the computational requirements of genome-wide comparisons and by the difficulty of discriminating between conserved coding and non-coding sequences. This discrimination is often based on, and thus dependent on, the availability of annotated proteins.

Results: In this paper we present the results of a comprehensive comparison of the human and mouse genomes, performed with a new high-throughput grid-based system that allows the rapid detection of conserved sequences and an accurate assessment of their coding potential. By detecting clusters of coding conserved sequences, the system can also accurately identify potential gene loci. Following this analysis, we created a collection of human-mouse conserved sequence tags and carefully compared our results to reliable annotations in order to benchmark the reliability of our classifications. Strikingly, we were able to detect several potential gene loci that are supported by EST sequences but do not correspond to any genes annotated so far.

Conclusion: Here we present a new system that allows comprehensive comparison of genomes to detect conserved coding and non-coding sequences and to identify potential gene loci. Our system does not require the availability of any annotated sequence and is thus suitable for the analysis of new or poorly annotated genomes.



