A Composite Method Based on Formal Grammar and DNA Structural Features in Detecting Human Polymerase II Promoter Region

Datta, Sutapa; Mukhopadhyay, Subhasis
February 2013
PLoS ONE;Feb2013, Vol. 8 Issue 2, p1
Academic Journal
An important step in understanding gene regulation is to identify the promoter regions where transcription factor binding takes place. Predicting a promoter region de novo has long been a theoretical goal for many researchers. A number of in silico methods exist for de novo promoter prediction, but most still suffer from various shortcomings, a major one being the selection of appropriate features that distinguish promoter regions from non-promoters. In this communication, we propose a new composite method that predicts promoter sequences based on the interrelationship between structural profiles of DNA and primary sequence elements of the promoter regions. We show that a Context Free Grammar (CFG) can formalize the relationships between different primary sequence features, and, by utilizing the CFG, we demonstrate that an efficient parser can be constructed for extracting these relationships from DNA sequences to distinguish true promoter sequences from non-promoter sequences. Along with the CFG, we extract structural features of the promoter region to improve the efficiency of our prediction system. Extensive experiments performed on different datasets reveal that our method is effective in predicting promoter sequences on a genome-wide scale and performs satisfactorily compared with other promoter prediction techniques.
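As a rough illustration of how a CFG parser can test a DNA sequence for an arrangement of primary sequence elements, here is a minimal sketch using the standard CYK recognition algorithm. The grammar is a hypothetical toy, not the authors' grammar (which this abstract does not give): it accepts any sequence containing a TATA box, then a gap of at least one nucleotide, then a CA dinucleotide standing in for an initiator-like element. All rule names (`PROM`, `BOX`, `INR`, `GAP`, `TAIL`) and the function `cyk_accepts` are invented for the example, and the structural-feature part of the method is not shown.

```python
from itertools import product

# Hypothetical toy grammar in Chomsky normal form (NOT the paper's grammar).
# Terminal rules: which nonterminals can produce each single nucleotide.
TERMINAL_RULES = {
    "A": {"N", "GAP", "A_"},
    "C": {"N", "GAP", "C_"},
    "G": {"N", "GAP"},
    "T": {"N", "GAP", "T_"},
}

# Binary rules: (left, right) -> nonterminals that can produce that pair.
BINARY_RULES = {
    ("T_", "A_"): {"TA"},      # TA   -> T A
    ("TA", "TA"): {"BOX"},     # BOX  -> TA TA        (the motif TATA)
    ("C_", "A_"): {"INR"},     # INR  -> C A          (initiator-like stand-in)
    ("N", "GAP"): {"GAP"},     # GAP  -> N GAP        (one or more nucleotides)
    ("GAP", "INR"): {"TAIL"},  # TAIL -> GAP INR
    ("BOX", "TAIL"): {"PROM"}, # PROM -> BOX TAIL
    ("N", "PROM"): {"PROM"},   # PROM -> N PROM       (arbitrary upstream flank)
    ("PROM", "N"): {"PROM"},   # PROM -> PROM N       (arbitrary downstream flank)
}


def cyk_accepts(seq, start="PROM"):
    """Return True if the toy grammar derives seq from the start symbol."""
    seq = seq.upper()
    n = len(seq)
    # Empty input and characters outside A/C/G/T are rejected outright.
    if n == 0 or any(ch not in TERMINAL_RULES for ch in seq):
        return False

    # table[i][j] holds the nonterminals that derive seq[i : j + 1].
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(seq):
        table[i][i] = set(TERMINAL_RULES[ch])

    # Fill in longer spans by combining two adjacent shorter spans.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            for k in range(i, j):
                for left, right in product(table[i][k], table[k + 1][j]):
                    table[i][j] |= BINARY_RULES.get((left, right), set())

    return start in table[0][n - 1]


if __name__ == "__main__":
    for s in ["GGTATAAGGCATT", "TATAGCA", "TATACA", "GGGGCCCC"]:
        print(s, cyk_accepts(s))
```

CYK runs in time cubic in the sequence length, so a sketch like this suits short windows rather than whole genomes; a genome-wide scan would need a more efficient parser or a sliding window, and a realistic grammar would encode many more elements and spacing constraints than this one.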


Related Articles

  • An ABRE Promoter Sequence is Involved in Osmotic Stress-Responsive Expression of the DREB2A Gene, Which Encodes a Transcription Factor Regulating Drought-Inducible Genes in Arabidopsis. Kim, June-Sik; Mizoi, Junya; Yoshida, Takuya; Fujita, Yasunari; Nakajima, Jun; Ohori, Teppei; Todaka, Daisuke; Nakashima, Kazuo; Hirayama, Takashi; Shinozaki, Kazuo; Yamaguchi-Shinozaki, Kazuko // Plant & Cell Physiology;Dec2011, Vol. 52 Issue 12, p2136 

    In plants, osmotic stress-responsive transcriptional regulation depends mainly on two major classes of cis-acting elements found in the promoter regions of stress-inducible genes: ABA-responsive elements (ABREs) and dehydration-responsive elements (DREs). ABRE has been shown to perceive...

  • Tuning gene expression with nucleosome-disfavoring sequences. Palpant, Timothy; Lieb, Jason // Nature Genetics;Jul2012, Vol. 44 Issue 7, p735 

    A new study shows that alteration of poly(dA:dT) tracts in promoters offers a broadly applicable genetic mechanism for predictably tuning gene expression with high resolution. By systematically manipulating these tracts in a controlled yeast system, the authors demonstrate quantitative...

  • Interplay of Rad51 with NF-κB Pathway Stimulates Expression of HIV-1. Kaminski, Rafal; Wollebo, Hassen S.; Datta, Prasun K.; White, Martyn K.; Amini, Shohreh; Khalili, Kamel // PLoS ONE;May2014, Vol. 9 Issue 5, p1 

    Transcription from the HIV-1 promoter is controlled by a series of ubiquitous and inducible cellular proteins with the ability to enter the nucleus and interact with the promoter. A DNA sequence spanning nucleotides −120 to −80, which supports the association of the inducible...

  • Transcriptional Regulation of Two Conceptus Interferon Tau Genes Expressed in Japanese Black Cattle during Peri-Implantation Period. Sakurai, Toshihiro; Nakagawa, So; Kim, Min-Su; Bai, Hanako; Bai, Rulan; Li, Junyou; Min, Kwan-Sik; Ideta, Atsushi; Aoyagi, Yoshito; Imakawa, Kazuhiko // PLoS ONE;Nov2013, Vol. 8 Issue 11, p1 

    Interferon tau (IFNT), produced by the mononuclear trophectoderm, signals the process of maternal recognition of pregnancy in ruminants. However, its expression in vivo and its transcriptional regulation are not yet well characterized. Objectives of this study were to determine conceptus IFNT...

  • Bayesian Centroid Estimation for Motif Discovery. Carvalho, Luis // PLoS ONE;Dec2013, Vol. 8 Issue 12, p1 

    Biological sequences may contain patterns that signal important biomolecular functions; a classical example is regulation of gene expression by transcription factors that bind to specific patterns in genomic promoter regions. In motif discovery we are given a set of sequences that share a common...

  • Transcriptional Regulation of the HMGA1 Gene by Octamer-Binding Proteins Oct-1 and Oct-2. Chiefari, Eusebio; Arcidiacono, Biagio; Possidente, Katiuscia; Iiritano, Stefania; Ventura, Valeria; Pandolfo, Rosantony; Brunetti, Francesco Saverio; Greco, Manfredi; Foti, Daniela; Brunetti, Antonio // PLoS ONE;Dec2013, Vol. 8 Issue 12, p1 

    The High-Mobility Group AT-Hook 1 (HMGA1) protein is an architectural transcription factor that binds to AT-rich sequences in the promoter region of DNA and functions as a specific cofactor for gene activation. Previously, we demonstrated that HMGA1 is a key regulator of the insulin receptor...

  • Transcription Regulation of Human oct-1 Gene Requires Involvement of Two Promoters. Zhenilo, S. V.; Deyev, I. E.; Serov, S. M.; Polanovsky, O. L. // Russian Journal of Genetics;Feb2003, Vol. 39 Issue 2, p216 

    Transcription initiation of human Oct-1 transcription factor-encoding gene involves two promoters, 1U and 1L, located at a substantial distance (about 100 kb) apart. The structure of these promoters and the adjacent sequences is different. Specifically, the 1U sequence is GC-rich, while the 1L...

  • Microsatellite Tandem Repeats Are Abundant in Human Promoters and Are Associated with Regulatory Elements. Sawaya, Sterling; Bagshaw, Andrew; Buschiazzo, Emmanuel; Kumar, Pankaj; Chowdhury, Shantanu; Black, Michael A.; Gemmell, Neil // PLoS ONE;Feb2013, Vol. 8 Issue 2, p1 

    Tandem repeats are genomic elements that are prone to changes in repeat number and are thus often polymorphic. These sequences are found at a high density at the start of human genes, in the gene’s promoter. Increasing empirical evidence suggests that length variation in these tandem...

  • A Generalized Topological Entropy for Analyzing the Complexity of DNA Sequences. Jin, Shuilin; Tan, Renjie; Jiang, Qinghua; Xu, Li; Peng, Jiajie; Wang, Yong; Wang, Yadong // PLoS ONE;Feb2014, Vol. 9 Issue 2, p1 

    Topological entropy is one of the most difficult entropies to be used to analyze the DNA sequences, due to the finite sample and high-dimensionality problems. In order to overcome these problems, a generalized topological entropy is introduced. The relationship between the topological entropy...

