Epigenomic translocation of H3K4me3 broad domains over oncogenes following hijacking of super-enhancers

Chromosomal translocations are important drivers of haematological malignancies whereby proto-oncogenes are activated by juxtaposition with enhancers, often called enhancer hijacking. We analyzed the epigenomic consequences of rearrangements between the super-enhancers of the immunoglobulin heavy locus (IGH) and proto-oncogene CCND1 that are common in B cell malignancies. By integrating BLUEPRINT epigenomic data with DNA breakpoint detection, we characterized the normal chromatin landscape of the human IGH locus and its dynamics after pathological genomic rearrangement. We detected an H3K4me3 broad domain (BD) within the IGH locus of healthy B cells that was absent in samples with IGH-CCND1 translocations. The appearance of H3K4me3-BD over CCND1 in the latter was associated with overexpression and extensive chromatin accessibility of its gene body. We observed similar cancer-specific H3K4me3-BDs associated with hijacking of super-enhancers of other common oncogenes in B cell (MAF, MYC, and FGFR3/NSD2) and T cell malignancies (LMO2, TLX3, and TAL1). Our analysis suggests that H3K4me3-BDs can be created by super-enhancers and supports the new concept of epigenomic translocation, in which the relocation of H3K4me3-BDs from cell identity genes to oncogenes accompanies the translocation of super-enhancers.


Genome-wide H3K4me3-BD and super-enhancer co-occurrence
A total of 59,097 (15,443 non-overlapping) H3K4me3-BDs were detected genome-wide using BLUEPRINT chromatin state data divided into the following haematopoietic cell types: haematopoietic stem cells (HSC), B cells, T cells, myeloid cells, MCL, CLL and MM. H3K4me3-BDs were classified into categories based on epigenomic specificity using their overlaps and active chromatin background (Supplemental Fig. S2). A total of 31, 302, 144 and 1186 H3K4me3-BDs were exclusively detected in HSC, B cells, T cells and myeloid cells, respectively. Because only a small fraction of H3K4me3-BDs were exclusive to a single cell type, the vast majority appear to remain stable during haematopoiesis.
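The overlap-based classification described above can be illustrated with a minimal sketch. It assumes that a domain counts as exclusive to a cell type when it overlaps no domain called in any other cell type; the active chromatin background used in the actual classification (Supplemental Fig. S2) is not modelled here. The function names and the toy coordinates are hypothetical and serve illustration only.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def exclusive_domains(domains_by_cell_type):
    """Per cell type, keep the domains that overlap no domain of any other cell type.

    Assumption: exclusivity is judged by overlap alone, which simplifies
    the classification described in the text.
    """
    exclusive = {cell_type: [] for cell_type in domains_by_cell_type}
    for cell_type, domains in domains_by_cell_type.items():
        # Pool the domains of every other cell type for comparison.
        others = [
            d
            for other, ds in domains_by_cell_type.items()
            if other != cell_type
            for d in ds
        ]
        for dom in domains:
            if not any(overlaps(dom, o) for o in others):
                exclusive[cell_type].append(dom)
    return exclusive


# Toy, made-up coordinates (not taken from the BLUEPRINT data).
toy = {
    "B": [("chr14", 100, 900), ("chr1", 0, 500)],
    "T": [("chr1", 400, 800), ("chr10", 50, 300)],
    "M": [("chr1", 450, 600)],
}
result = exclusive_domains(toy)
```

In this toy input the chr1 domains overlap across all three cell types and are therefore treated as shared, whereas the chr14 and chr10 domains are reported as exclusive to B cells and T cells, respectively. The pairwise comparison is quadratic in the number of domains; for genome-scale data an interval tree or a sorted sweep would be the more practical choice.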
We confirm that cell-type exclusive H3K4me3-BDs are associated with cell-identity genes (Supplemental Fig. S3).
B-cell exclusive H3K4me3-BDs identified genes important in B-cell biology (for example the immunoglobulin genes IGH, IGK and IGL, or PAX5, MS4A1 and CD22). Similarly, T-cell identity genes were enriched within T-cell exclusive H3K4me3-BDs (for instance CD2, GATA3, BCL11B or CD28) and myeloid identity genes within myeloid exclusive H3K4me3-BDs (for instance TAL1, CEBPA, IL18 or IL1B). In comparison, H3K4me3-BDs shared between B cells, T cells and myeloid cells were associated with genes involved in basic cell functions such as RNA processing or epigenetic modification, so they likely reflect a housekeeping background rather than haematopoiesis (Supplemental Table S4).
We observed a significantly higher co-occurrence of H3K4me3-BDs with super-enhancers than with promoters (Supplemental Fig. S4). This enrichment increased with H3K4me3-BD size and was highly significant within a 100 kb proximity across all tested healthy and malignant cell types. Co-occurrence with super-enhancers within 100 kb was stronger for H3K4me3-BDs exclusive to B cells and myeloid cells (Supplemental Fig. S4D,F). T-cell exclusive H3K4me3-BDs showed broader enrichment, with super-enhancers present more frequently within a 2 Mb proximity (Supplemental Fig. S4E). This may reflect the small number of T-cell exclusive H3K4me3-BDs, but it could also point to a T-cell specific mechanism involving longer-range genomic interactions.
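The proximity criterion used above (a super-enhancer within 100 kb or 2 Mb of an H3K4me3-BD) can be sketched as follows. This is a simplified illustration that only computes the raw fraction of domains with a super-enhancer inside a given window; it does not reproduce the enrichment statistics or the promoter comparison of Supplemental Fig. S4. The function names and coordinates are hypothetical.

```python
def gap_bp(a, b):
    """Gap in bp between two (chrom, start, end) intervals.

    Returns 0 when the intervals overlap or touch, and None when they lie
    on different chromosomes.
    """
    if a[0] != b[0]:
        return None
    return max(0, max(a[1], b[1]) - min(a[2], b[2]))


def fraction_near(domains, enhancers, window):
    """Fraction of domains with at least one enhancer within `window` bp."""
    if not domains:
        return 0.0
    hits = 0
    for dom in domains:
        for enh in enhancers:
            gap = gap_bp(dom, enh)
            if gap is not None and gap <= window:
                hits += 1
                break  # one nearby enhancer is enough for this domain
    return hits / len(domains)


# Toy, made-up coordinates (not taken from the study).
domains = [
    ("chr1", 1_000_000, 1_010_000),
    ("chr1", 5_000_000, 5_004_000),
    ("chr2", 100_000, 105_000),
]
enhancers = [
    ("chr1", 1_050_000, 1_080_000),
    ("chr1", 6_500_000, 6_520_000),
]
near_100kb = fraction_near(domains, enhancers, 100_000)
near_2mb = fraction_near(domains, enhancers, 2_000_000)
```

With these toy intervals, one of three domains has a super-enhancer within 100 kb and two of three within 2 Mb, which mirrors how widening the window can reveal longer-range co-occurrence of the kind described for T-cell exclusive domains.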

Supplemental Tables
Supplemental Table S1: Selected BLUEPRINT chromatin states and their characterization.

Columns: Original # 1; Description; Histone marks (high signal); Figure colour
State 9

Abbreviations: HSC - Haematopoietic stem cells, B - B cells, T - T cells, M - Myeloid cells, MCL - Mantle cell lymphoma, CLL - Chronic lymphocytic leukemia, MM - Multiple myeloma. Overlapping H3K4me3-BDs can be recognized by identical labels in the "PART" column.
Supplemental Table S4 is attached as a separate file.

Supplemental Figures
Supplemental Figure S1: Immunoglobulin structure, genes and physiological genomic rearrangements.

Abbreviations: C - Constant region, J - Joining region, D - Diversity region, V - Variable region, RAG - Recombination-activating gene, AID - Activation-induced cytidine deaminase.

Table S3) represents the selected ChIP-seq chromatin states (Supplemental Table S1) and the lower line shows DNase I hypersensitivity sites for the non-variable region of the IGH locus. Vertical black lines in sample U266 mark breakpoint positions for the region, which includes the Eα1 super-enhancer that inserts ~12 kb upstream of CCND1 (11q13). (Fig. 3). Abbreviations: BD - Broad domain.

Table S1) First line of each sample shows the selected ChIP-seq chromatin states (Supplemental Table S1) and other lines display peaks of individual histone marks. Numbers in coloured squares (red denotes high expression and blue low expression; the two shades of blue for MYC expression denote different levels of low expression determined in our previous study 1) show gene expression detected by RNA-seq and displayed as Log2 normalised counts (Log2 NC).

Supplemental Figure S6: Chromatin landscape of the MYEOV locus (11q13) in healthy human B cells and five cell lines derived from B-cell haematological malignancies. Selected ChIP-seq chromatin states (Supplemental
*Sample contains a chromosomal translocation involving the displayed region, described in detail in Supplemental Tables S3 and S5.