Prediction of mono- and di-nucleotide-specific DNA-binding sites in proteins using neural networks

Andrabi, Munazah; Mizuguchi, Kenji; Sarai, Akinori; Ahmad, Shandar
January 2009
BMC Structural Biology;2009, Vol. 9, Special section p1
Academic Journal
Background: DNA recognition by proteins is one of the most important processes in living systems. Therefore, understanding the recognition process in general, and identifying mutual recognition sites in proteins and DNA in particular, carries great significance. The sequence and structural dependence of DNA-binding sites in proteins has led to the development of successful machine learning methods for their prediction. However, all existing machine learning methods predict DNA-binding sites, irrespective of their target sequence and hence, none of them is helpful in identifying specific protein-DNA contacts. In this work, we formulate the problem of predicting specific DNA-binding sites in terms of contacts between the residue environments of proteins and the identity of a mononucleotide or a dinucleotide step in DNA. The aim of this work is to take a protein sequence or structural features as inputs and predict for each amino acid residue if it binds to DNA at locations identified by one of the four possible mononucleotides or one of the 10 unique dinucleotide steps. Contact predictions are made at various levels of resolution viz. in terms of side chain, backbone and major or minor groove atoms of DNA.

Results: Significant differences in residue preferences for specific contacts are observed, which combined with other features, lead to promising levels of prediction. In general, PSSM-based predictions, supported by secondary structure and solvent accessibility, achieve a good predictability of ~70-80%, measured by the area under the curve (AUC) of ROC graphs. The major and minor groove contact predictions stood out in terms of their poor predictability from sequences or PSSM, which was very strongly (>20 percentage points) compensated by the addition of secondary structure and solvent accessibility information, revealing a predominant role of local protein structure in the major/minor groove DNA recognition. Following a detailed analysis of results, a web server to predict mononucleotide and dinucleotide-step contacts using PSSM was developed and made available at http://sdcpred.netasa.org/ or http://tardis.nibio.go.jp/netasa/sdcpred/.

Conclusion: Most residue-nucleotide contacts can be predicted with high accuracy using only sequence and evolutionary information. Major and minor groove contacts, however, depend profoundly on the local structure. Overall, this study takes us a step closer to the ultimate goal of predicting mutual recognition sites in protein and DNA sequences.
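
The "10 unique dinucleotide steps" mentioned in the abstract follow from double-stranded DNA: each of the 16 possible dinucleotides is equivalent to its reverse complement on the opposite strand, which leaves 10 distinct classes. The Python sketch below illustrates that counting. The function names, the choice of the lexicographically smaller string as the class representative, and the per-residue label dictionary are all assumptions made for illustration; the abstract does not describe how the authors encoded their prediction targets.

```python
from itertools import product

# Watson-Crick base pairing.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the sequence read 5'->3' on the opposite strand."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def canonical_step(step):
    """Map a dinucleotide to one representative of its strand-equivalent pair.

    Assumption: the lexicographically smaller string is the representative.
    """
    return min(step, reverse_complement(step))


def unique_dinucleotide_steps():
    """List the distinct dinucleotide steps of double-stranded DNA."""
    return sorted({canonical_step(a + b) for a, b in product("ACGT", repeat=2)})


def residue_labels(contacted_steps):
    """Hypothetical binary target vector for one residue: one entry per step class."""
    contacted = {canonical_step(step) for step in contacted_steps}
    return {step: int(step in contacted) for step in unique_dinucleotide_steps()}


if __name__ == "__main__":
    print(unique_dinucleotide_steps())
    print(residue_labels(["TG", "GG"]))
```

Under this assumed encoding, a residue contacting the steps TG and GG would be labelled positive for the CA and CC classes, since TG and CA (and likewise GG and CC) describe the same step read from opposite strands.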


Related Articles

  • C/EBPα:AP-1 leucine zipper heterodimers bind novel DNA elements, activate the PU.1 promoter and direct monocyte lineage commitment more potently than C/EBPα homodimers or AP-1. Cai, D. H.; Wang, D.; Keefer, J.; Yeamans, C.; Hensley, K.; Friedman, A. D. // Oncogene;4/24/2008, Vol. 27 Issue 19, p2772 

    The basic-region leucine zipper (BR-LZ or bZIP) transcription factors dimerize via their LZ domains to position the adjacent BRs for DNA binding. Members of the C/EBP, AP-1 and CREB/ATF bZIP subfamilies form homodimeric or heterodimeric complexes with other members of the same subset and...

  • Correction: Data-Driven Prediction and Design of bZIP Coiled-Coil Interactions.  // PLoS Computational Biology;Apr2015, Vol. 11 Issue 4, p1 

    A correction to the article "Data-Driven Prediction and Design of bZIP Coiled-Coil Interactions" that was published in the April 15, 2015 issue is presented.

  • Leucine Zipper.  // Encyclopedic Reference of Molecular Pharmacology;2004, p550 

    A definition of the term "leucine zipper" is presented. It refers to a structural motif present in a large class of transcription factors. These dimeric proteins contain two extended alpha helices that grip the DNA molecule much like a pair of scissors at adjacent major grooves. Some...

  • Leucine Zipper.  // Encyclopedic Reference of Cancer;2001, p497 

    A definition of the term "leucine zipper" is presented. This term refers to an amphipathic α-helical motif in which leucine residues are present every seven amino acids. This forms a hydrophobic stretch on the α-helix surface. Hydrophobic interactions mediate the dimerization of two...

  • Calphostin C-induced apoptosis is mediated by a tissue transglutaminase-dependent mechanism involving the DLK/JNK signaling pathway. Robitaille, K.; Daviau, A.; Lachance, G.; Couture, J.-P.; Blouin, R. // Cell Death & Differentiation;Sep2008, Vol. 15 Issue 9, p1522 

    A role for tissue transglutaminase (TG2) and its substrate dual leucine zipper-bearing kinase (DLK), an upstream component of the c-Jun N-terminal kinase (JNK) signaling pathway, has been previously suggested in the apoptotic response induced by calphostin C. In the current study, we directly...

  • A Transcription Factor with a Leucine-Zipper Motif Involved in Light-Dependent Inhibition of Expression of the puf Operon in the Photosynthetic Bacterium Rhodobacter sphaeroides. Shimada, Hiroshi; Wada, Tadashi; Handa, Hiroshi; Ohta, Hiroyuki; Mizoguchi, Hiroshi; Nishimura, Kohji; Masuda, Tatsuru; Shioi, Yuzo; Takamiya, Ken-ichiro // Plant & Cell Physiology;Jun1996, Vol. 37 Issue 4, p515 

    In the purple nonsulfur photosynthetic bacterium Rhodobacter sphaeroides the synthesis of components of the photosystem is regulated in response to oxygen tension and light intensity. We have purified and cloned a trans-acting protein (SPB) that binds to the promoter region of the puf operon,...

  • Structure of the RPA trimerization core and its role in the multistep DNA-binding mechanism of RPA. Bochkareva, Elena; Korolev, Sergey; Lees-Miller, Susan P.; Bochkarev, Alexey // EMBO Journal;4/1/2002, Vol. 21 Issue 7, p1855 

    The human single-stranded DNA-binding protein, replication protein A (RPA) binds DNA in at least two different modes: initial [8-10 nucleotides (nt)] and stable (∼30 nt). Switching from 8 to 30 nt mode is associated with a large conformational change. Here we report the 2.8 Å structure...

  • Extracellular Targeting of Synthetic Therapeutic Nucleic Acid Formulations. Philipp, Alexander; Meyer, Martin; Wagner, Ernst // Current Gene Therapy;Oct2008, Vol. 8 Issue 5, p324 

    Success of nucleic acid based therapies often depends on target-cell specific delivery of genetic materials such as plasmid DNA, antisense oligonucleotides or small interfering RNA. Such extracellular targeting strategies include the incorporation of hydrophilic shielding domains into nucleic...

  • Poly(ADP-ribose) Polymerase-1 Inhibits Strand-Displacement Synthesis of DNA Catalyzed by DNA Polymerase β. Sukhanova, M.V.; Khodyreva, S.N.; Lavrik, O.I. // Biochemistry (00062979);May2004, Vol. 69 Issue 5, p558 

    Poly(ADP-ribose) polymerase-1 (PARP-1), a eucaryotic nuclear DNA-binding protein that is activated by breaks in DNA chains, may be involved in the base excision repair (BER) because DNAs containing single-stranded gaps and breaks are intermediates of BER. The effect of PARP-1 on the DNA...

