Bayesian classification for promoter prediction in human DNA sequences

Bercher, J.-F.; Jardin, P.; Duriez, B.
December 2006
AIP Conference Proceedings;2006, Vol. 872 Issue 1, p235
Academic Journal
Many computational methods are now available for data retrieval and analysis of genomic sequences, but some functional sites remain difficult to characterize. In this work, we examine the problem of promoter localization in human DNA sequences. Promoters are regulatory regions that govern the expression of genes, and their prediction is reputed to be difficult, so the issue remains open. We present the Chaos Game Representation (CGR) of DNA sequences, which has many interesting properties, and the notion of ‘genomic signature’, which has proved relevant in phylogeny applications. Based on this notion, we develop a (naïve) Bayesian classifier, evaluate its performance, and show that its adaptive implementation makes it possible to reveal or assess core-promoter positions along a DNA sequence. © 2006 American Institute of Physics
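The ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the corner assignment of the CGR, the word length `k`, the pseudocount, the equal class priors, and all function names are assumptions made for illustration. The sketch computes CGR points, builds a k-mer frequency table as a stand-in for a genomic signature, and scores sliding windows with a naive Bayes log-likelihood ratio.

```python
import math
from collections import Counter
from itertools import product

# Corner assignment is a convention; this particular one is an assumption.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Chaos Game Representation: start at the centre of the unit square and
    move halfway towards the corner of each successive nucleotide."""
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def signature(seq, k=3, pseudocount=1.0):
    """Smoothed k-mer frequency table, used here as a simple genomic signature."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[w] for w in kmers) + pseudocount * len(kmers)
    return {w: (counts[w] + pseudocount) / total for w in kmers}


def log_likelihood(seq, sig, k=3):
    """Naive assumption: the k-mers of a window are independent given the class."""
    seq = seq.upper()
    words = (seq[i:i + k] for i in range(len(seq) - k + 1))
    return sum(math.log(sig[w]) for w in words if w in sig)


def score(window, sig_promoter, sig_background, k=3, log_prior_ratio=0.0):
    """Log posterior ratio; a positive value favours the promoter class."""
    return (log_likelihood(window, sig_promoter, k)
            - log_likelihood(window, sig_background, k)
            + log_prior_ratio)


def scan(seq, sig_promoter, sig_background, width, step=1, k=3):
    """Score every window of the given width along the sequence."""
    return [(i, score(seq[i:i + width], sig_promoter, sig_background, k))
            for i in range(0, len(seq) - width + 1, step)]
```

In use, the two signatures would be estimated from labelled promoter and non-promoter training sequences, and peaks in the output of `scan` would mark candidate promoter positions. The window width and decision threshold are left to the user, since the abstract does not state them.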


Related Articles

  • The pattern of amplification and differentiation of Ty1-copia and Ty3-gypsy retrotransposons in Brassicaceae species. Fujimoto, Ryo; Takuno, Shohei; Sasaki, Taku; Nishio, Takeshi // Genes & Genetic Systems;2008, Vol. 83 Issue 1, p13 

    One of the causes of genome size expansion is considered to be amplification of retrotransposons. We determined nucleotide sequences of 24 PCR products for each of six retrotransposons in Brassica rapa and Brassica oleracea. Phylogenetic trees of these sequences showed species-specific clades....

  • FPGA Acceleration of the phylogenetic likelihood function for Bayesian MCMC inference methods. Zierke, Stephanie; Bakos, Jason D. // BMC Bioinformatics;2010, Vol. 11, p184 

    Background: Maximum likelihood (ML)-based phylogenetic inference has become a popular method for estimating the evolutionary relationships among species based on genomic sequence data. This method is used in applications such as RAxML, GARLI, MrBayes, PAML, and PAUP. The Phylogenetic Likelihood Function...

  • PstI repeat: a family of short interspersed nucleotide element (SINE)-like sequences in the genomes of cattle, goat, and buffalo. Sheikh, Faruk G; Mukhopadhyay, Sudit S; Gupta, Prabhakar // Genome;Feb2002, Vol. 45 Issue 1, p44 

    The PstI family of elements are short, highly repetitive DNA sequences interspersed throughout the genome of the Bovidae. We have cloned and sequenced some members of the PstI family from cattle, goat, and buffalo. These elements are approximately 500 bp, have a copy number of 2 × 10^5...

  • Cloning and sequencing of columbid circovirus (CoCV), a new circovirus from pigeons. Mankertz, A.; Hattermann, K.; Ehlers, B.; Soike, D. // Archives of Virology;Dec2000, Vol. 145 Issue 12, p2469 

    Summary. The complete nucleotide sequence of columbid circovirus (CoCV) isolated from pigeons is described. CoCV was amplified using a consensus primer PCR approach directed against conserved sequences within the rep genes of vertebrate circoviruses. The genome of CoCV is circular and 2037 nt in...

  • Comparing Segmentation Methods for Genome Annotation Based on RNA-Seq Data. Cleynen, Alice; Dudoit, Sandrine; Robin, Stéphane // Journal of Agricultural, Biological & Environmental Statistics;Mar2014, Vol. 19 Issue 1, p101 

    Transcriptome sequencing (RNA-Seq) yields massive data sets, containing a wealth of information on the expression of a genome. While numerous methods have been developed for the analysis of differential gene expression, little has been attempted for the localization of transcribed regions, that...

  • Prediction of Protein Function in the Absence of Significant Sequence Similarity. Dobson, Paul D.; Yu-dong Cai; Stapley, Benjamin J.; Doig, Andrew J. // Current Medicinal Chemistry;Aug2004, Vol. 11 Issue 16, p2135 

    Tremendous progress in DNA sequencing has yielded the genomes of a host of important organisms. The utilisation of these resources requires understanding of the function of each gene. Standard methods of functional assignment involve sequence alignment to a gene of known function; however such...

  • Sequencing, Analysis, and Annotation of Expressed Sequence Tags for Camelus dromedarius. Al-Swailem, Abdulaziz M.; Shehata, Maher M.; Abu-Duhier, Faisel M.; Al-Yamani, Essam J.; Al-Busadah, Khalid A.; Al-Arawi, Mohammed S.; Al-Khider, Ali Y.; Al-Muhaimeed, Abdullah N.; Al-Qahtani, Fahad H.; Manee, Manee M.; Al-Shomrani, Badr M.; Al-Qhtani, Saad M.; Al-Harthi, Amer S.; Akdemir, Kadir C.; Inan, Mehmet S.; Otu, Hasan H. // PLoS ONE;2010, Vol. 5 Issue 5, p1 

    Despite its economical, cultural, and biological importance, there has not been a large scale sequencing project to date for Camelus dromedarius. With the goal of sequencing complete DNA of the organism, we first established and sequenced camel EST libraries, generating 70,272 reads. Following...

  • Species tree inference in the age of genomics. Whelan, Nathan V. // Trends in Evolutionary Biology;2011, Vol. 3, p23 

    Species trees are an essential tool in conservation and evolutionary biology. In phylogenomics, not only is data choice (e.g. using unlinked orthologs rather than paralogs) an important systematic consideration, but the choice of phylogenetic algorithm is also important. Since individual gene...

  • A k-mer scheme to predict piRNAs and characterize locust piRNAs. Zhang, Yi; Wang, Xianhui; Kang, Le // Bioinformatics;Mar2011, Vol. 27 Issue 6, p771 

    Motivation: Identifying piwi-interacting RNAs (piRNAs) of non-model organisms is a difficult and unsolved problem because piRNAs lack conservative secondary structure motifs and sequence homology in different species. Results: In this article, a k-mer scheme is proposed to identify piRNA...

