Profiling the quantitative occupancy of myriad transcription factors across conditions by modeling chromatin accessibility data

  1. Alexander J. Hartemink (affiliations 1, 2, 3, 11)
  1. Computational Biology & Bioinformatics Graduate Program, Duke University, Durham, North Carolina 27708, USA;
  2. Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA;
  3. Department of Computer Science, Duke University, Durham, North Carolina 27708, USA;
  4. Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA;
  5. Department of Pediatrics, Duke University Medical Center, Durham, North Carolina 27710, USA;
  6. Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, Massachusetts 02215, USA;
  7. Department of Biostatistics and Bioinformatics, Durham, North Carolina 27710, USA;
  8. Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, North Carolina 27710, USA;
  9. Department of Biomedical Engineering, Duke University, Durham, North Carolina 27708, USA;
  10. Department of Statistical Science, Duke University, Durham, North Carolina 27708, USA;
  11. Department of Biology, Duke University, Durham, North Carolina 27708, USA
  • Corresponding author: amink{at}cs.duke.edu
    Abstract

    Over a thousand different transcription factors (TFs) bind with varying occupancy across the human genome. Chromatin immunoprecipitation (ChIP) can assay occupancy genome-wide, but only one TF at a time, limiting our ability to comprehensively observe the TF occupancy landscape, let alone quantify how it changes across conditions. We developed TF occupancy profiler (TOP), a Bayesian hierarchical regression framework, to profile genome-wide quantitative occupancy of numerous TFs using data from a single chromatin accessibility experiment (DNase- or ATAC-seq). TOP is supervised, and its hierarchical structure allows it to predict the occupancy of any sequence-specific TF, even those never assayed with ChIP. We used TOP to profile the quantitative occupancy of hundreds of sequence-specific TFs at sites throughout the genome and examined how their occupancies changed in multiple contexts: in approximately 200 human cell types, through 12 h of exposure to different hormones, and across the genetic backgrounds of 70 individuals. TOP enables cost-effective exploration of quantitative changes in the landscape of TF binding.

    Footnotes

    • Received September 30, 2020.
    • Accepted May 6, 2022.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
