Integrating and mining the chromatin landscape of cell type specificity using Self-Organizing Maps

  1. Barbara Wold (2,7)

  1. University of California, Irvine;
  2. California Institute of Technology;
  3. University of California, Los Angeles;
  4. MIT;
  5. Pennsylvania State University;
  6. HudsonAlpha Institute for Biotechnology

  * Corresponding author; email: woldb{at}caltech.edu

Abstract

We tested whether Self-Organizing Maps (SOMs) could be used to effectively integrate, visualize, and mine diverse genomics data types, including complex chromatin signatures. A fine-grained SOM was trained on 72 ChIP-seq histone modification and DNase-seq datasets from six biologically diverse cell lines studied by the ENCODE Project Consortium. We mined the resulting SOM to identify chromatin signatures related to sequence-specific transcription factor occupancy, sequence motif enrichment, and biological functions. To highlight clusters enriched for specific functions such as transcriptional promoters or enhancers, we overlaid onto the map additional datasets not used during training, such as ChIP-seq, RNA-seq, CAGE, and information on cis-acting regulatory modules from the literature. We used the SOM to parse known transcriptional enhancers according to the cell type-specific chromatin signature, and we further corroborated this pattern on the map by EP300 (also known as p300) occupancy. New candidate cell type-specific enhancers were identified for multiple ENCODE cell types in this way, along with new candidates for ubiquitous enhancer activity. An interactive web interface was developed to allow users to visualize and custom-mine the ENCODE SOM. We conclude that large SOMs trained on chromatin data from multiple cell types provide a powerful way to identify complex relationships in genomic data at user-selected levels of granularity.
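
For readers unfamiliar with the technique, the general idea of training a SOM and then assigning items to map units can be sketched in a few lines of Python. This is a toy illustration only, not the authors' implementation: the function names (`train_som`, `best_matching_unit`), the map size, the linear decay schedules, the Gaussian neighbourhood, and the synthetic input vectors are all assumptions made for the example. In the setting described above, each input vector would presumably hold one genomic segment's signal across the training datasets.

```python
import math
import random


def best_matching_unit(x, weights):
    """Return (row, col) of the unit whose weight vector is closest to x."""
    best, best_d = (0, 0), float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((a - b) ** 2 for a, b in zip(x, w))  # squared Euclidean distance
            if d < best_d:
                best, best_d = (r, c), d
    return best


def train_som(data, rows, cols, n_iter=2000, lr0=0.5, seed=0):
    """Train a toy rectangular SOM; returns weights[r][c] as a list of floats.

    All hyperparameters here are illustrative assumptions.
    """
    rng = random.Random(seed)
    sigma0 = max(rows, cols) / 2.0
    # Initialise each unit from a randomly chosen training vector.
    weights = [[list(rng.choice(data)) for _ in range(cols)] for _ in range(rows)]
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood radius
        x = rng.choice(data)
        br, bc = best_matching_unit(x, weights)
        for r in range(rows):
            for c in range(cols):
                # Gaussian neighbourhood on the map grid, centred on the BMU.
                h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2.0 * sigma ** 2))
                w = weights[r][c]
                for i, xi in enumerate(x):
                    w[i] += lr * h * (xi - w[i])
    return weights


if __name__ == "__main__":
    # Synthetic stand-in for per-segment signal vectors (two artificial groups).
    demo_rng = random.Random(42)
    low = [[demo_rng.gauss(0.0, 0.1) for _ in range(6)] for _ in range(30)]
    high = [[demo_rng.gauss(5.0, 0.1) for _ in range(6)] for _ in range(30)]
    som = train_som(low + high, rows=4, cols=5, n_iter=500)
    print(best_matching_unit(low[0], som), best_matching_unit(high[0], som))
```

Once a map is trained, data not used in training can be "overlaid" in the same spirit by looking up which unit each training vector maps to and tallying an external annotation per unit; the sketch above covers only the training and lookup steps.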

  • Received March 29, 2013.
  • Accepted October 7, 2013.

This manuscript is Open Access.

This article, published in Genome Research, is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported), as described at http://creativecommons.org/licenses/by-nc/3.0/.

Genome Res. gr.158261.113. Published by Cold Spring Harbor Laboratory Press.
