Heterogeneity of transcription factor binding specificity models within and across cell lines

  1. Sridhar Hannenhalli2,3
  1. 1 University of Maryland, College Park, MD;
  2. 2 University of Maryland, College Park, MD
  1. * Corresponding author; email: sridhar{at}umiacs.umd.edu

Abstract

Complex gene expression patterns are mediated by the binding of transcription factors (TFs) to specific genomic loci. The in vivo occupancy of a TF is, in large part, determined by the TF's DNA binding interaction partners, motivating genomic context-based models of TF occupancy. However, approaches thus far have assumed a uniform TF binding model to explain genome-wide, cell type-specific binding sites. Consequently, the cell type heterogeneity of TF occupancy models, and the extent to which the binding rules underlying a TF's occupancy are shared across cell types, have not been investigated. Here, we develop an ensemble-based approach (TRISECT) to identify the heterogeneous binding rules for cell type-specific TF occupancy and to analyze the inter-cell type sharing of such rules. Comprehensive analysis of 23 TFs, each with ChIP-seq data in 4-12 different cell types, shows that by explicitly capturing the heterogeneity of binding rules, TRISECT accurately identifies in vivo TF occupancy. Importantly, many of the binding rules derived from individual cell types are shared across cell types and reveal distinct yet functionally coherent putative target genes in different cell types. Closer inspection of the predicted cell type-specific interaction partners provides insights into the context-specific functional landscape of a TF. Together, our novel ensemble-based approach reveals, for the first time, a widespread heterogeneity of binding rules, comprising the interaction partners within a cell type, many of which nevertheless transcend cell types. Notably, the putative targets of shared binding rules in different cell types, while distinct, exhibit significant functional coherence.

  • Received September 4, 2015.
  • Accepted June 16, 2016.

This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.

ACCEPTED MANUSCRIPT