An integrated 3-Dimensional Genome Modeling Engine for data-driven simulation of spatial genome organization

  Dariusz Plewczynski¹,²,⁶

  1. Centre of New Technologies, Warsaw University, 02-097 Warsaw, Poland
  2. Centre for Innovative Research, Medical University of Bialystok, 15-089 Białystok, Poland
  3. I-BioStat, Hasselt University, BE3590 Hasselt, Belgium
  4. The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut 06032, USA
  5. Department of Genetics and Genome Sciences, UConn Health, Farmington, Connecticut 06032, USA
  6. Faculty of Pharmacy, Medical University of Warsaw, 02-097 Warsaw, Poland

  Corresponding authors: yijun.ruan{at}jax.org, d.plewczynski{at}cent.uw.edu.pl

  ⁷ These authors contributed equally to this work.

Abstract

ChIA-PET is a high-throughput mapping technology that reveals long-range chromatin interactions and provides insights into the basic principles of spatial genome organization and gene regulation mediated by specific protein factors. Recently, we showed that a single ChIA-PET experiment provides information at all genomic scales of interest, from the high-resolution locations of binding sites and enriched chromatin interactions mediated by specific protein factors, to the low resolution of nonenriched interactions that reflect topological neighborhoods of higher-order chromosome folding. This multilevel nature of ChIA-PET data offers an opportunity to use multiscale 3D models to study structural-functional relationships at multiple length scales, but doing so requires a structural modeling platform. Here, we report the development of 3D-GNOME (3-Dimensional Genome Modeling Engine), a complete computational pipeline for 3D simulation using ChIA-PET data. 3D-GNOME consists of three integrated components: a graph-distance-based heat map normalization tool, a 3D modeling platform, and an interactive 3D visualization tool. Using ChIA-PET and Hi-C data derived from human B-lymphocytes, we demonstrate the effectiveness of 3D-GNOME in building 3D genome models at multiple levels, including the entire genome, individual chromosomes, and specific segments at megabase (Mb) and kilobase (kb) resolutions of single average and ensemble structures. Further incorporating CTCF-motif orientation and high-resolution looping patterns into the 3D simulation increased the reliability of the predicted, biologically plausible topological structures.

Footnotes

  • Received February 4, 2016.
  • Accepted October 20, 2016.

This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
