OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system

Abstract

Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. Lastly, this database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
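
The abstract says that binding affinities are summarized as position weight matrices (PWMs) built from raw binding instances. As a minimal sketch of that general idea, and not the database's actual procedure, the Python below derives a log-odds PWM from a handful of aligned sites and scores candidate sequences against it. The example sites, the pseudocount of 1, the uniform background of 0.25, and the function names `build_pwm` and `score` are all assumptions made for illustration.

```python
import math
from collections import Counter

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM (one dict per position) from aligned, equal-length sites.

    Assumes uppercase A/C/G/T input and a uniform background frequency.
    """
    if not sites:
        raise ValueError("need at least one binding site")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")

    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for pos in range(length):
        counts = Counter(site[pos] for site in sites)
        column = {}
        for base in BASES:
            # Smoothed frequency of this base at this position.
            freq = (counts[base] + pseudocount) / total
            # Log-odds relative to the background frequency.
            column[base] = math.log2(freq / background)
        pwm.append(column)
    return pwm


def score(pwm, sequence):
    """Sum the per-position log-odds for a sequence as long as the PWM."""
    if len(sequence) != len(pwm):
        raise ValueError("sequence length must match PWM length")
    return sum(column[base] for column, base in zip(pwm, sequence))


# Hypothetical binding instances, invented for this example only.
sites = ["TTGACA", "TTGACT", "TTTACA", "TAGACA"]
pwm = build_pwm(sites)
print(score(pwm, "TTGACA"), score(pwm, "CCCGGG"))
```

A sequence that resembles the input sites scores higher than an unrelated one, which is the sense in which a PWM gives a per-basepair summary of binding preference.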

Authors:
AlQuraishi, Mohammed [1]; Tang, Shengdong [1]; Xia, Xide [1]
  1. Harvard Medical School, Boston, MA (United States)
Publication Date:
2015-11-19
Research Org.:
Stanford Univ., CA (United States)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1241035
Grant/Contract Number:
FG02-05ER64136
Resource Type:
Journal Article: Accepted Manuscript
Journal Name:
BMC Bioinformatics
Additional Journal Information:
Journal Volume: 16; Journal Issue: 1; Journal ID: ISSN 1471-2105
Publisher:
BioMed Central
Country of Publication:
United States
Language:
English
Subject:
protein-DNA; database; helix-turn-helix; transcription factors; structure; PWM

Citation Formats

AlQuraishi, Mohammed, Tang, Shengdong, and Xia, Xide. An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system. United States: N. p., 2015. Web. doi:10.1186/s12859-015-0819-2.
AlQuraishi, Mohammed, Tang, Shengdong, & Xia, Xide (2015). An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system. United States. doi:10.1186/s12859-015-0819-2.
AlQuraishi, Mohammed, Tang, Shengdong, and Xia, Xide. 2015. "An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system". United States. doi:10.1186/s12859-015-0819-2. https://www.osti.gov/servlets/purl/1241035.
@article{osti_1241035,
title = {An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system},
author = {AlQuraishi, Mohammed and Tang, Shengdong and Xia, Xide},
doi = {10.1186/s12859-015-0819-2},
journal = {BMC Bioinformatics},
number = 1,
volume = 16,
place = {United States},
year = {2015},
month = {11}
}

  • The tailed bacteriophage φ29 capsid is decorated with 55 fibers attached to quasi-3-fold symmetry positions. Each fiber is a homotrimer of gene product 8.5 (gp8.5) and consists of two major structural parts, a pseudohexagonal base and a protruding fibrous portion that is about 110 Å in length. The crystal structure of the C-terminal fibrous portion (residues 112-280) has been determined to a resolution of 1.6 Å. The structure is about 150 Å long and shows three distinct structural domains designated as head, neck, and stem. The stem region is a unique three-stranded helix-turn-helix supercoil that has not previously been described. When fitted into a cryoelectron microscope reconstruction of the virus, the head structure corresponded to a disconnected density at the distal end of the fiber and the neck structure was located in weak density connecting it to the fiber. Thin section studies of Bacillus subtilis cells infected with fibered or fiberless φ29 suggest that the fibers might enhance the attachment of the virions onto the host cell wall.
  • A simple and rapid method for the preparation of highly pure plasmid DNA has been developed. The DNA is directly captured from bacterial cell lysates by formation of a triple-helical structure between the plasmid dsDNA and a 20-base biotinylated oligonucleotide attached to streptavidin-coated magnetic beads and then eluted from the beads in pH 9 buffer solution. No phenol extraction, ethanol precipitation, RNase digestion, or CsCl gradient centrifugation is required. A general purpose cloning vector, pHJ19, was constructed for this application from pUC19 DNA by insertion of a 40-base sequence suitable for triple-helix formation. The approach was also found suitable for the purification of λ bacteriophage DNA. 32 refs., 6 figs., 1 tab.
  • The abundant Fis nucleoid protein selectively binds poorly related DNA sequences with high affinities to regulate diverse DNA reactions. Fis binds DNA primarily through DNA backbone contacts and selects target sites by reading conformational properties of DNA sequences, most prominently intrinsic minor groove widths. High-affinity binding requires Fis-stabilized DNA conformational changes that vary depending on DNA sequence. In order to better understand the molecular basis for high affinity site recognition, we analyzed the effects of DNA sequence within and flanking the core Fis binding site on binding affinity and DNA structure. X-ray crystal structures of Fis-DNA complexes containing variable sequences in the noncontacted center of the binding site or variations within the major groove interfaces show that the DNA can adapt to the Fis dimer surface asymmetrically. We show that the presence and position of pyrimidine-purine base steps within the major groove interfaces affect both local DNA bending and minor groove compression to modulate affinities and lifetimes of Fis-DNA complexes. Sequences flanking the core binding site also modulate complex affinities, lifetimes, and the degree of local and global Fis-induced DNA bending. In particular, a G immediately upstream of the 15 bp core sequence inhibits binding and bending, and A-tracts within the flanking base pairs increase both complex lifetimes and global DNA curvatures. Taken together, our observations support a revised DNA motif specifying high-affinity Fis binding and highlight the range of conformations that Fis-bound DNA can adopt. Lastly, the affinities and DNA conformations of individual Fis-DNA complexes are likely to be tailored to their context-specific biological functions.
  • The genome of the diurnal cyanobacterium Cyanothece sp. PCC 51142 has recently been sequenced and observed to contain 35 pentapeptide repeat proteins (PRPs). These proteins, while present throughout the prokaryotic and eukaryotic kingdoms, are most abundant in cyanobacteria. The sheer number of PRPs in cyanobacteria coupled with their predicted location in all the cyanobacteria cellular compartments argues for important, yet unknown, physiological and biochemical functions. To gain insights into the biochemical function of PRPs in cyanobacteria, the first crystal structure of a PRP from Cyanothece has been determined at 2.1 Å resolution. The native protein, annotated Rfr32 for repeated five-residue, is a 167-residue protein with an N-terminal 29-residue signal peptide. The signal peptide was replaced with a 43-residue tag that was invisible in the electron density maps of two different crystal forms from which essentially identical structures were solved. The structure is dominated by 21 tandem pentapeptide repeats that fold into a right-handed quadrilateral β-helix, or Rfr-fold, reminiscent of a “square” tower with four distinct faces. Four consecutive pentapeptide repeats define a “floor” of the tower with a single repeat occupying a face. The Rfr-fold contains five complete, stacked, ascending floors (coils) that complete a revolution every 20 residues with a ~4.8 Å rise along the helix axis. The main chain backbone of the floors is held together with a narrow parallel β-sheet on one face and stacked parallel β-bridges (single-residue β-sheets) on the other three faces. The regular shape of the tower is maintained by two distinct types of four-residue turns labeled pseudo type II and pseudo type IV β-turns. The interior of the Rfr-fold is primarily hydrophobic, with all side chains of the i and i-2 residues inserted into the center of the β-helix to form aligned, stacked columns of hydrophobic side chains. The i-1, i+1, and i+2 residues are generally polar or charged and these side chains all point away from the Rfr-core to give the β-helix a predominantly hydrophilic surface. Two short, anti-parallel β-helices, bridged with a disulfide bond, sit atop the C-terminus of the Rfr-fold, perhaps preventing edge-to-edge aggregation at the C-terminus of the Rfr-fold. The circular dichroism spectrum of Rfr32 is dominated by β-turn and parallel β-sheet features. The structure of Rfr32 is compared with the only other PRP structure, the mycobacterial fluoroquinolone resistance protein MfpA from Mycobacterium tuberculosis, and the general features of the amino acid sequences of the 35 Cyanothece PRPs are discussed.