Predicting the probability of H3K4me3 occupation at a base pair from the genome sequence context

Ha, Misook; Hong, Soondo; Li, Wen-Hsiung
May 2013
Bioinformatics; May 2013, Vol. 29, Issue 9, p. 1199
Academic Journal
Motivation: Histone modifications regulate chromatin structure and gene expression. Although nucleosome formation is known to be affected by primary DNA sequence composition, no sequence signature has been identified for histone modifications. It is known that dense H3K4me3 nucleosome sites are accompanied by a low density of other nucleosomes and are associated with gene activation. This observation suggests a different sequence composition of H3K4me3 from other nucleosomes.

Approach: To understand the relationship between genome sequence and chromatin structure, we studied DNA sequences at histone modification sites in various human cell types. We found sequence specificity for H3K4me3, but not for other histone modifications. Using the sequence specificities of H3 and H3K4me3 nucleosomes, we developed a model that computes the probability of H3K4me3 occupation at each base pair from the genome sequence context.

Results: A comparison of our predictions with experimental data suggests a high performance of our method, revealing a strong association between H3K4me3 and specific genomic DNA context. The high probability of H3K4me3 occupation occurs at transcription start and termination sites, exon boundaries and binding sites of transcription regulators involved in chromatin modification activities, including histone acetylases and enhancer- and insulator-associated factors. Thus, the human genome sequence contains signatures for chromatin modifications essential for gene regulation and development. Our method may be applied to find new sequence elements functioning by chromatin modulation.

Availability: Software and supplementary data are available at Bioinformatics online.

Contact: misook.ha@samsung.com or wli@uchicago.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
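The abstract does not specify how the model is built, so the following is only an illustrative sketch of one way a per-base probability could be derived from sequence context. It is not the authors' method. It assumes a naive log-odds scheme over dinucleotides: k-mer frequencies are compared between hypothetical H3K4me3-bound and generic H3-bound training sequences, the log-odds are summed over a window around each base, and the sum is mapped to a probability with a logistic function. The function names, the window size, the pseudocount and the prior are all assumptions made for illustration.

```python
import math


def kmer_log_odds(fg_seqs, bg_seqs, k=2, pseudo=1.0):
    """Log-odds of each k-mer in foreground vs. background sequences.

    fg_seqs: hypothetical H3K4me3-bound sequences (assumption).
    bg_seqs: hypothetical generic H3-bound sequences (assumption).
    """
    def count(seqs):
        counts, total = {}, 0
        for s in seqs:
            for i in range(len(s) - k + 1):
                kmer = s[i:i + k]
                counts[kmer] = counts.get(kmer, 0) + 1
                total += 1
        return counts, total

    fg, fg_total = count(fg_seqs)
    bg, bg_total = count(bg_seqs)
    n_kmers = 4 ** k  # size of the DNA k-mer alphabet, used for smoothing
    scores = {}
    for kmer in set(fg) | set(bg):
        p_fg = (fg.get(kmer, 0) + pseudo) / (fg_total + pseudo * n_kmers)
        p_bg = (bg.get(kmer, 0) + pseudo) / (bg_total + pseudo * n_kmers)
        scores[kmer] = math.log(p_fg / p_bg)
    return scores


def occupancy_probability(seq, scores, k=2, window=4, prior=0.5):
    """Per-base probability from summed k-mer log-odds in a local window.

    Treats k-mers as independent evidence (a simplification, since
    overlapping k-mers are correlated) and applies Bayes' rule in
    log-odds form: posterior = sigmoid(sum of log-odds + prior logit).
    """
    prior_logit = math.log(prior / (1.0 - prior))
    probs = []
    for pos in range(len(seq)):
        start = max(0, pos - window)
        end = min(len(seq), pos + window + 1)
        llr = sum(scores.get(seq[i:i + k], 0.0)
                  for i in range(start, end - k + 1))
        probs.append(1.0 / (1.0 + math.exp(-(llr + prior_logit))))
    return probs


if __name__ == "__main__":
    # Toy training data, invented purely for the demonstration.
    fg = ["CGCGCGCG", "GCGCGCGC"]
    bg = ["ATATATAT", "TATATATA"]
    scores = kmer_log_odds(fg, bg)
    for base, p in zip("CGCGATAT", occupancy_probability("CGCGATAT", scores)):
        print(base, round(p, 3))
```

A real model would need far larger training sets, longer context windows matched to nucleosome size, and validation against experimental occupancy data, as the abstract describes for the published method.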


