Dynamic and Structural Modeling of the Specificity in Protein-DNA Interactions Guided by Binding Assay and Structure Data


How transcription factors (TFs) recognize their DNA sequences is often investigated in complementary ways, by high-throughput protein binding assays and by structural biology experiments. The former quantify the specificity of TF binding across numerous DNA sequences, often represented as a position weight matrix (PWM). The latter provide mechanistic insights into the interactions via protein-DNA complex structures. However, these two types of data are not readily integrated. Here, we propose and test a new modeling method that combines the PWM with complex structure data. Building on pre-tuned coarse-grained models for proteins and DNA, we model specific protein-DNA interactions with an orientation-dependent potential function, PWMcos, which enables us to perform molecular dynamics simulations at unprecedentedly large scales. We show that the PWMcos model reproduces subtle specificity in protein-DNA recognition. During the target search in genomic sequences, a TF moves on a highly rugged landscape and occasionally flips on DNA, depending on the sequence. The TATA-binding protein exhibits two remarkably distinct binding modes, whose frequencies differ between TATA-containing and TATA-less promoters. PWMcos is general and can be applied to any protein-DNA interaction for which a PWM and complex structure data are available.
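To illustrate how a PWM can be read as a sequence-dependent energy, the following is a minimal sketch under stated assumptions. It uses a generic log-odds conversion with a uniform background and is not the PWMcos potential itself, which additionally depends on orientation and on the complex structure. The toy matrix, the pseudocount, and the names `pwm_to_energy`, `score`, and `best_window` are all hypothetical and chosen only for illustration.

```python
import math

BASES = "ACGT"


def pwm_to_energy(pwm, background=0.25, kT=1.0, pseudo=1e-3):
    """Convert per-position base probabilities into energies in units of kT.

    Assumption: a generic log-odds rule, E = -kT * ln(p / background).
    Favored bases get negative (attractive) energies.
    """
    energies = []
    for column in pwm:
        # Add a small pseudocount so zero probabilities do not give log(0).
        total = sum(column[b] + pseudo for b in BASES)
        energies.append(
            {b: -kT * math.log(((column[b] + pseudo) / total) / background)
             for b in BASES}
        )
    return energies


def score(energies, site):
    """Sum the per-position energies for one candidate site."""
    return sum(col[base] for col, base in zip(energies, site))


def best_window(energies, sequence):
    """Slide along the sequence and return (energy, start) of the lowest-energy site."""
    width = len(energies)
    windows = [
        (score(energies, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    return min(windows)


# Toy 4-position matrix favoring the motif TATA (illustrative values only).
TOY_PWM = [
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
]

energies = pwm_to_energy(TOY_PWM)
energy, start = best_window(energies, "GCGCTATAGCGC")
print(f"lowest-energy site starts at index {start}, energy {energy:.2f} kT")
```

Scanning a sequence this way gives a one-dimensional energy profile along the DNA. It captures sequence preference only, whereas the structure-based simulations described above also resolve how the protein is positioned and oriented on the DNA.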

Journal of Chemical Theory and Computation