Glucocorticoid receptor quaternary structure drives chromatin occupancy and transcriptional outcome
- 1Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892-5055, USA;
- 2Institute of Biomedicine, University of Eastern Finland, Kuopio, FI-70211 Kuopio, Finland;
- 3IFIBYNE, UBA-CONICET, Universidad de Buenos Aires, Facultad de Ciencias Exactas y Naturales, Buenos Aires, C1428EGA, Argentina
4These authors contributed equally to this work.
Abstract
Most transcription factors, including nuclear receptors, are widely modeled as binding regulatory elements as monomers, homodimers, or heterodimers. Recent findings in live cells show that the glucocorticoid receptor NR3C1 (also known as GR) forms tetramers on enhancers, owing to an allosteric alteration induced by DNA binding, and suggest that higher oligomerization states are important for the gene regulatory responses of GR. By using a variant (GRtetra) that mimics this allosteric transition, we performed genome-wide studies using a GR knockout cell line with reintroduced wild-type GR or reintroduced GRtetra. GRtetra acts as a super receptor by binding to response elements not accessible to the wild-type receptor and both induces and represses more genes than GRwt. These results argue that DNA binding induces a structural transition to the tetrameric state, forming a transient higher-order structure that drives both the activating and repressive actions of glucocorticoids.
The multimeric structure adopted by transcription factor complexes at their genomic sites of interaction is a longstanding problem in transcription biology. Information bearing on this question is often inferred from indirect data sources. Many transcription factors, including nuclear receptors, are found in crystal structures as either homodimers or heterodimers. These structures in turn suggest preliminary models for how the proteins might interact with regulatory sites in the genome. Genomic methods, such as digital genomic footprinting (DGF) (Hesselberth et al. 2009) or chromatin immunoprecipitation (ChIP)-exo (Rhee and Pugh 2011), extend these models based on the thesis that the “footprint” left on the DNA (either DNA protection for DGF, or protein complex boundaries in the case of ChIP-exo) is informative as to the structure of the bound complex. From a rigorous viewpoint, these static methods address the nature of the DNA interaction domains, not necessarily the actual status of the full protein complex. Finally, many studies invoke the presence or absence of a DNA motif for a given factor (or binding partner) in a ChIP coupled with deep sequencing (ChIP-seq) peak as evidence for the composition of the complex.
In contrast, methods that measure directly the size of a complex in live cells can address the actual multimeric status of a given complex in real time. The bound TP53 protein (also known as p53) has been described as a tetramer, based on both static methods (Emamzadah et al. 2011; Rhee and Pugh 2011) and live cell approaches (Gaglia et al. 2013). More recently, the DNA-bound STAT3 factor was also confirmed to be a tetramer, based on pair correlation of molecular brightness (pCOMB), another live cell technique (Hinde et al. 2016). These findings illustrate the inadequacy of static cell data sets for an accurate and complete understanding of molecular status for regulatory complexes.
When the glucocorticoid receptor response element (GRE) was elucidated in the 1980s (Huang et al. 1981), it was postulated that the receptor likely binds DNA as a dimer owing to the palindromic nature of the GRE consensus motif (Evans 1988). The further characterization of the crystal structure of the DNA-binding domain (DBD) fragment appeared to confirm this theory (Luisi et al. 1991). This line of research continued in the 1990s using the A465T mutation within DBDs of the mouse glucocorticoid receptor NR3C1 (also known as GR), known as the GRdim mutant (Heck et al. 1994; Reichardt et al. 1998), and led to the development of the dissociated model of glucocorticoid action (Clark and Belvisi 2012). This model describes two modes of GR binding that direct distinct transcriptional responses, direct dimeric binding for induction or tethered monomeric binding for repression, and theoretically allow the development of selective glucocorticoids that favor one form of receptor binding, thus eliciting therapeutically favorable repressive responses. Further mechanisms for GR action have proliferated in recent years, including composite GREs, negative GREs, competitive GREs, squelching, etc., models that propose a variety of binding modes (Ratman et al. 2013). These concepts are invariably based on static, population-based methods, as well as inferential approaches derived from motif profiling.
We recently examined GR's oligomeric state at response elements in live cells, using an advanced fluorescence microscopy technique termed Number and Brightness (Digman et al. 2008). We reported that receptor binding on an activating GRE induces tetrameric quaternary structures, suggesting that tetramers are the final active form of GR (Presman et al. 2016). Among the nuclear receptor family, only the retinoid X receptor (RXR) has been postulated to exist as a full-length tetrameric complex (Kersten et al. 1995). This state of the RXR has generally been considered a repressive or inactive form, with ligand-induced disruption to activating dimers. However, one case has been reported in which the RXR tetramer was invoked as an activator (Mangelsdorf et al. 1991).
To unveil the relationship between GR's oligomerization state and transcriptional outcome, we have examined chromatin binding and receptor-dependent gene regulation for the receptor in two GR null cell systems. A null mouse embryonic fibroblast (MEF) system was described previously (Presman et al. 2014). A second GR knockout (GRKO) was established by eliminating all endogenous copies of the receptor gene in the 3617 mouse mammary cell line (Voss et al. 2011). We used these systems in conjunction with a mutation in the GR DBD (P481R) that mimics a GR structural transition induced by DNA binding (van Tilborg et al. 2000). We showed previously (Presman et al. 2016) that this mutation (referred to herein as GRtetra) induces a constitutive tetrameric state for all activated receptors in the nucleus.
Results
Generation of a GRKO/GR tetrameric cell system
Our group has developed numerous cell lines from murine C127 mammary adenocarcinoma cells to study GR action (McNally et al. 2000). By using CRISPR/Cas9, we first knocked out the stably integrated rat GFP-GR gene in the 3617 cell model. We used this parental cell line to make a GRKO system in which we could reintroduce either wild-type GR or a forced tetrameric GR mutant (GRtetra; more details later), both GFP-tagged, to study receptor binding and gene response (for details, see Methods) (Supplemental Fig. S1A). Western blots show no detectable GR levels in the knockout cell line (GRKO) (Supplemental Fig. S1B). We confirmed the lack of hormone response in the GRKO cells via RNA-seq, comparing the GRKO and parental lines before and after 2 h of 100 nM dexamethasone (Dex) (Supplemental Fig. S1C; Supplemental Table S1). ChIP-seq, with either anti-GR or anti-GFP antibodies, confirms the lack of GR binding in the GRKO line compared with the other cell lines with reintroduced GR (Supplemental Fig. S1D,E). As shown in the GRwt cell line, there is a good correlation in binding between the two antibodies (Supplemental Fig. S1F). We used the anti-GFP antibody herein because of a better signal-to-noise ratio compared with anti-GR (Supplemental Fig. S1G).
Tetrameric GR mutant occupies more genomic sites than GRwt and efficiently increases chromatin accessibility
Although it is widely accepted that GR's transcriptional output depends heavily on its dimeric/monomeric status (Vandewalle et al. 2018), recent live-cell imaging evidence suggests that GR forms higher oligomerization states, most likely a tetramer, after engaging with chromatin in vivo (for a summary, see Supplemental Fig. S1A; Presman and Hager 2017). We reasoned that shifting the equilibrium of the receptor toward tetramers should affect its ability to bind to chromatin. To test this hypothesis, we performed ChIP-seq in the presence or absence of Dex (1 h, 100 nM) in our engineered cells expressing GRwt or the GR-P481R (GRtetra) receptor (cell line characterization in Supplemental Figs. S1B, S2A–F). This point mutation mimics the DNA-bound conformation of the receptor (van Tilborg et al. 2000), likely changing the conformation of GR to constitutively expose transcriptional activation surfaces normally available only when bound to DNA. More recently, we have shown that liganded GRtetra forms tetramers throughout the nucleus in live cells (Presman et al. 2016). Western blots from crosslinked and sonicated material did not show any evidence of differences in detectability or stability between GRwt and GRtetra under the ChIP protocol (Supplemental Fig. S2B).
By using a relatively stringent peak calling protocol (see Supplemental Methods), we obtained 5923 binding sites for GFP-GRwt (cluster [C] 1) using an anti-GFP antibody, 2124 binding sites (C2) using anti-GR, and 11,003 sites for GFP-GRtetra (C3) using anti-GFP (Supplemental Fig. S1D,E,G). For a guide to all clusters described in this work, please refer to Supplemental Figure S2G. ChIP peaks were aligned on their GR peak summits and sorted to obtain a heat map with the most highly occupied sites at the top of each cluster (Fig. 1A). Binding profile comparisons between reintroduced GFP-GRwt, GFP-GRtetra, and endogenous GR in 3134 cells (John et al. 2011) reveal three distinct binding clusters (clusters C4–C6) (Fig. 1A). GRtetra binding spans all three clusters, whereas GRwt significantly occupies only C4 (Supplemental Fig. S3A,B). This clearly indicates that the mutant binds to more sites than GRwt. In addition, signal intensity suggests that GRtetra binds more strongly than GRwt (Supplemental Fig. S3A). C5 represents sites that are not significantly occupied by the stably integrated GFP-GRwt; however, these enhancers remain engaged by the endogenous GR in 3134 cells (Fig. 1A; Supplemental Fig. S3C). One possibility is that the GFP tag on the reintroduced receptors inhibits binding compared with the endogenous GR. Another possibility is that the chromatin landscape partially changed during the GRKO generation, so that some of the GR sites that require other initiators are no longer accessible. Nevertheless, GFP-tagged GRtetra overcomes this negative effect and binds to additional sites (C6) that even endogenous GR cannot bind (Fig. 1A; Supplemental Fig. S3A–C). Representative Genome Browser track examples are shown in Supplemental Figure S3D,F.
Chromatin analysis of tetrameric GR mutant. (A) Comparison of GRwt and GRtetra binding reveals three clusters, C4–C6: C4, shared by GRwt and GRtetra; C5, shared by endogenous GR in 3134 cells and GRtetra; C6, GRtetra-specific sites. Heat maps represent ChIP-seq, ATAC-seq, and MNase-seq data as indicated. Each heat map represents ±1 kb around the center of the GR peak. Binding intensity scale is noted below on a linear scale. Heat maps are sorted based on GRtetra binding intensity. All heat maps are normalized to a total of 10 million reads and further to local tag density. (B) De novo motif analysis. The percentage of sites with a motif, P-value of enrichment, and position-weight matrix (PWM) are shown. A full list of enriched de novo motifs is shown in Supplemental Table S2. (C) Cumulative distribution function (CDF) between C4–C6 binding sites and JUN (AP-1) peak in 3134 cells. Each cluster is color-coded with median distance shown for each cluster. Gray dashed line depicts median.
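The cluster definitions in panel A can be read as a simple decision rule on receptor occupancy. The following is a minimal sketch of that rule, assuming each GRtetra-centered site has already been reduced to three yes/no occupancy calls; the actual clustering procedure is described in the Supplemental Methods and may differ, and the function name and example calls are hypothetical.

```python
from collections import Counter

def assign_cluster(wt_bound, endogenous_bound, tetra_bound):
    """Return 'C4', 'C5', 'C6', or None for one site (assumed boolean calls)."""
    if not tetra_bound:
        return None      # C4-C6 are all defined relative to GRtetra peaks
    if wt_bound:
        return "C4"      # shared by GRwt and GRtetra
    if endogenous_bound:
        return "C5"      # endogenous GR in 3134 cells and GRtetra, not GFP-GRwt
    return "C6"          # GRtetra-specific

# Hypothetical occupancy calls: (GRwt, endogenous GR, GRtetra)
sites = [
    (True, True, True),
    (False, True, True),
    (False, False, True),
    (True, False, True),
]
counts = Counter(assign_cluster(*s) for s in sites)
print(counts)
```

In this reading, GRwt occupancy takes precedence over endogenous GR occupancy, which matches the caption's description of C4 as the cluster shared by GRwt and GRtetra.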
The vast majority (>90%) of GR binding sites occur within distal intergenic and intronic genomic regions (Supplemental Fig. S4A), consistent with previous reports on GR genomic distribution (So et al. 2007). Hence, GRtetra enriches at similar genomic loci as GRwt. De novo motif analyses of GRtetra binding sites show that the C4–C6 sites contain GREs (Fig. 1B; Supplemental Table S2), with highest frequency at the GRtetra-specific C6 and lowest at C4 sites shared with GRwt. There is <1% enrichment of half GREs at C6, indicating that GR tetramerization drives the receptor binding to full GREs. Consistent with the important role the AP-1 complex (JUN-FOS) has on GR binding in this cell line (Biddie et al. 2011), de novo motif analyses show enrichment of the JUN (AP-1) motif at all clusters to varying degrees (Fig. 1B). Although C4 sites show JUN binding (Fig. 1A), frequently have the AP-1 motif (Fig. 1B), and are close to an AP-1 peak in 3134 cells (median 130 bp) (Fig. 1C), there is no JUN binding at C6 sites, and the median distance between the GR peak and JUN peak is 10 kb. This analysis suggests that GRtetra is less dependent on AP-1, resembling a “pioneer-like” factor.
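The distance comparison behind Figure 1C amounts to finding, for every GR site, the nearest JUN (AP-1) peak and then summarizing those distances per cluster by their median and cumulative distribution. The sketch below illustrates that calculation under simplifying assumptions: all coordinates are hypothetical, lie on a single chromosome, and are reduced to peak centers; it is not the pipeline used in this study.

```python
from bisect import bisect_left
from statistics import median

def nearest_distance(pos, sorted_peaks):
    """Distance (bp) from pos to the closest coordinate in sorted_peaks."""
    i = bisect_left(sorted_peaks, pos)
    # Only the neighbors on either side of the insertion point can be closest.
    candidates = sorted_peaks[max(i - 1, 0):i + 1]
    return min(abs(pos - p) for p in candidates)

def empirical_cdf(distances, threshold):
    """Fraction of sites whose nearest peak lies within threshold bp."""
    return sum(d <= threshold for d in distances) / len(distances)

# Hypothetical peak centers (bp) on one chromosome
ap1_peaks = sorted([1_000, 5_000, 50_000])
gr_sites = [1_100, 4_900, 20_000, 49_000]

dists = [nearest_distance(s, ap1_peaks) for s in gr_sites]
print("median distance:", median(dists))
print("fraction within 1 kb:", empirical_cdf(dists, 1_000))
```

Evaluating `empirical_cdf` over a range of thresholds for each cluster would trace out one curve per cluster, with the median marking the point at which the curve crosses 0.5.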
The mouse genome harbors millions of putative GREs; however, GR binds only a small subset of them in a highly tissue-specific manner. Chromatin landscape appears to be a major determinant in defining GR access to its response elements (Voss and Hager 2014). The majority of cell-specific GR-enhancers appear to be primed or “preprogramed” for GR binding, as measured by DNase I hypersensitivity assays. Another subset of enhancers is initially closed and actively opened by GR (i.e., de novo sites), showing context-dependent pioneer activity of this receptor (John et al. 2011; Johnson et al. 2018). To assess the chromatin environment of GRtetra binding sites, we measured (1) chromatin accessibility by the assay for transposase-accessible chromatin (ATAC)-seq and (2) the binding profile of the chromatin remodeler SMARCA4 (also known as BRG1), an important GR cofactor (John et al. 2011). In both cases, the data were sorted in the same order as the GR ChIP data. In all clusters, GRtetra more effectively increases chromatin accessibility and SMARCA4 recruitment after Dex treatment, especially in C5 and C6 (Fig. 1A; Supplemental Fig. S4B,C). The SMARCA4 ChIP-seq and ATAC-seq data parallel the SMARCA4 ChIP-seq and DNase-seq data from the 3134 cells (Fig. 1A; Supplemental Fig. S3C), except in the additional GRtetra-specific sites (C6). There is no binding of GR, SMARCA4, JUN, or EP300 (also known as p300) at C6 sites in the presence or absence of hormone in 3134 cells (Fig. 1A; Supplemental Fig. S3C), and these sites are not accessible before hormone in GRtetra-expressing cells (Fig. 1A; Supplemental Fig. S4B). Furthermore, nucleosome mapping in the 3134 cell line by MNase-seq (Johnson et al. 2018) shows high nucleosome presence around C6 sites before hormone (Fig. 1A; Supplemental Fig. S3C). 
On the other hand, C4 sites are mainly, but not completely, depleted of nucleosomes: re-sorting C4 sites based on nucleosome occupancy reveals that GRwt can also penetrate closed, nucleosomal chromatin (Supplemental Fig. S4D).
Taken together, we conclude that GRtetra functions as a more effective pioneer factor than GRwt. Forcing the receptor into a tetrameric conformation appears to drive the protein into an optimal conformation. Not only does GRtetra bind better to sites already accessible to GRwt, but also it penetrates and recruits SMARCA4 to sites that are SMARCA4 free, AP-1 free, nucleosomal, not accessible to nucleases (ATAC-seq), and enriched for GRE motifs.
An obligate tetrameric GR induces and represses more genes than the wild-type receptor
To determine if more efficient binding of GRtetra influences Dex-regulated gene expression, we performed RNA-seq in each cell line (GRwt, GRtetra) as done with the GRKO and parental cells (Supplemental Table S1). We included only genes that are annotated in the RefSeq database and eliminated any genes with duplicate gene symbols that may be indicative of alternative TSS or similar variants. Since we sequenced GRwt and GRtetra cell lines independently, technical variability in library preparation and/or sequencing runs does not allow us to directly compare baseline expression levels between cell lines; however, other no-treatment RNA replicates, prepared and sequenced at the same time, do show basal expression similarity between GRwt and GRtetra cell lines (Supplemental Fig. S4E,F and Supplemental Methods). Moreover, results show that hormone treatment both induces (Fig. 2A) and represses (Fig. 2B) more target genes in GRtetra cells compared with GRwt-expressing cells (Supplemental Table S1). Although the shared Dex up-regulated genes between GRwt and GRtetra are similarly induced (Fig. 2C), common Dex-repressed genes are repressed to a much greater extent in the GRtetra cells (Fig. 2D). The much stronger repressive effect of GRtetra contrasts with the model that GR-mediated transcriptional repression acts through monomeric modes of receptor interaction with chromatin (Lim et al. 2015). Representative examples of each category are shown as bar graphs (Fig. 2E).
Gene expression analysis of tetrameric GR mutant. (A,B) Venn diagrams of up-regulated or down-regulated genes from RNA-seq data after 2-h Dex treatment. (C,D) Box plots represent the log2 fold change of the 109 (up-regulated) or 18 (down-regulated) GRwt and GRtetra shared genes. (E) Examples of Dex-regulated genes from each subset. Data shown as fold change of RNA-seq RPKM values for GRKO, GRwt, and GRtetra cells. For each cell line, the nontreated sample was used as reference point. (F,G) C4–C6 sites were associated to the nearest Dex up-regulated gene (F) or Dex down-regulated gene (G) based on linear proximity. Box plots represent log2 fold change (Dex/NT) of the Dex-regulated gene in GRwt- and GRtetra-expressing cells that are associated with GR binding sites in C4 (left), C5 (middle), or C6 (right). All box plot comparisons are normalized to total of 10 million reads. (H,I) Shared and GRtetra uniquely Dex up-regulated (H) or Dex down-regulated (I) genes were associated to the closest GR binding site in one of the C4–C6 clusters. Bar graph represents the percentage of regulated genes associated closest with GR binding site of the three clusters. P-values are calculated using χ2 test.
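The χ2 comparison in panels H and I can be illustrated with a contingency table of gene counts. The sketch below assumes a 2×3 layout (shared versus GRtetra-unique genes in rows; C4, C5, and C6 as the closest cluster in columns), which is one plausible arrangement and not necessarily the one used for the figure; the counts are hypothetical. For a 2×3 table the test has two degrees of freedom, for which the χ2 survival function has the closed form exp(−x/2), so only the standard library is needed.

```python
import math

def chi2_2x3(table):
    """Pearson chi-square statistic and P-value for a 2x3 contingency table."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i in range(2):
        for j in range(3):
            expected = row_tot[i] * col_tot[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # (2 - 1) * (3 - 1) = 2 degrees of freedom: P = exp(-stat / 2)
    return stat, math.exp(-stat / 2)

# Hypothetical counts of genes whose closest GR site falls in C4, C5, C6
table = [
    [20, 10, 10],   # shared Dex-regulated genes
    [10, 10, 20],   # GRtetra-unique Dex-regulated genes
]
stat, p_value = chi2_2x3(table)
print(f"chi2 = {stat:.3f}, P = {p_value:.4f}")
```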
We next analyzed the correlation between (1) the closest Dex-regulated genes to receptor binding sites for each cluster in GRwt or GRtetra cells (peak-centric analysis) (Fig. 2F,G) and (2) the clusters with closest GR binding sites to a regulated gene (gene-centric analysis) (Fig. 2H,I); for details, see Supplemental Methods. Within the C4 cluster, although up-regulation of genes does not differ between the GRtetra- and GRwt-expressing cells (Fig. 2F), GRtetra appears to down-regulate genes more effectively (Fig. 2G). In contrast, within the C5 cluster, GRtetra both up- and down-regulates genes better. The GRtetra-exclusive cluster C6 is also close to many Dex up-regulated genes that are uniquely expressed in GRtetra cells (Fig. 2F). Representative examples are shown as Genome Browser tracks for C4 sites and C6 sites (Supplemental Fig. S5A–D). From the gene-centric perspective, looking at up-regulated genes only, even though C4 peaks associate more frequently with target genes, C5 and C6 are significantly more associated with tetra-unique-regulated genes than with shared-regulated genes (Fig. 2H). Conversely, the association significantly decreases for C4 sites. These differences are not observed with down-regulated genes (Fig. 2I). Moreover, C4 sites are closest to the regulated gene TSS, whereas C6 are the furthest away (Supplemental Fig. S5E–H). Thus, C6 sites do not cluster “near” tetra uniquely regulated genes, suggesting that GRtetra sites may be part of chromatin loops to regulate these genes. Indeed, it has been shown that stimulus-dependent biological processes are more likely to interact with distal rather than proximal binding sites (Heidari et al. 2014). Taken together, this strongly suggests that the ability of GRtetra to bind to new genomic sites leads to transcriptional regulation of additional target genes. Of note, the GRtetra C6 unique sites are weakly associated with down-regulated genes (Fig. 2G), consistent with a previous report claiming many glucocorticoid down-regulated genes lack nearby GR binding sites (Reddy et al. 2009). It is also plausible that newly generated GRtetra sites influence more gene induction than repression through penetration of closed chromatin sites.
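The peak-centric and gene-centric analyses are two directions of the same nearest-neighbor assignment by linear proximity. A minimal sketch of both directions follows, assuming one chromosome, a single TSS coordinate per gene, and hypothetical gene symbols and positions; the exact procedure is given in the Supplemental Methods.

```python
def peak_centric(peaks, genes):
    """For each peak (position, cluster), return the nearest regulated gene.

    genes is a list of (symbol, tss_position). The result maps each peak
    position to (gene symbol, absolute distance to its TSS in bp).
    """
    out = {}
    for pos, _cluster in peaks:
        sym, tss = min(genes, key=lambda g: abs(g[1] - pos))
        out[pos] = (sym, abs(tss - pos))
    return out

def gene_centric(genes, peaks):
    """For each regulated gene, return the cluster of its closest GR peak."""
    return {
        sym: min(peaks, key=lambda p: abs(p[0] - tss))[1]
        for sym, tss in genes
    }

# Hypothetical Dex-regulated genes and GR peaks on one chromosome
genes = [("geneA", 10_000), ("geneB", 200_000)]
peaks = [(12_000, "C4"), (150_000, "C6"), (400_000, "C5")]

by_peak = peak_centric(peaks, genes)
by_gene = gene_centric(genes, peaks)
print(by_peak)
print(by_gene)
```

The two views need not agree: in this toy example the C5 peak is assigned to geneB in the peak-centric view, whereas the gene-centric view assigns geneB to its closer C6 peak.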
In conclusion, GRtetra is less dependent on JUN, and its higher oligomerization state generates a “super receptor” that can engage in novel GREs unavailable to wild-type receptors. GRtetra has increased transcriptional effects, especially in hormone-repressed genes, and can more effectively act as a pioneer factor than GRwt.
New tetrameric GR binding sites are marked with H3K4me1
To investigate if particular histone marks enable GRtetra to target only a certain subset of GREs, we measured enhancer activation status by ChIP-seq of histone 3 lysine 27 acetylation (H3K27ac) and histone 3 lysine 4 monomethylation (H3K4me1). These two histone modifications can be used to map active and poised enhancers, respectively (Creyghton et al. 2010). Comparing H3K27ac and H3K4me1 levels in GRtetra cells and GRwt cells shows that practically all GR binding sites shared by both cell lines (C4) have the classical bimodal distribution of H3K27ac after Dex treatment, which is only observed in a subset of sites in the nontreated conditions (Fig. 3A,B). Re-sorting the C4 cluster based on chromatin preaccessibility revealed that the subset that displays a bimodal distribution of H3K27ac and H3K4me1 before hormone treatment consists largely of the preaccessible sites (top 25%) (Supplemental Fig. S6A,B). C4 sites with low or absent preaccessibility (de novo sites) are marked with H3K4me1 before hormone exposure, indicating that GR binding activates these poised enhancers (bottom 25%) (Supplemental Fig. S6A,B). In comparison, the Dex-induced bimodal H3K27ac distribution in C5–C6 sites is only observed in GRtetra cells (Fig. 3A,B), which also correlates with SMARCA4 recruitment (Fig. 1A). These observations strongly suggest that these enhancers are activated by GRtetra. In addition, if we split the C4 peaks according to their association with their closest up- or down-regulated genes as in Figure 2, F and G, a clear opposite pattern emerges, wherein H3K27ac increases at sites associated with up-regulated genes and decreases at sites associated with down-regulated genes (Supplemental Fig. S6C). The correlation between down-regulation and decreased H3K27ac is consistent with a direct repressive activity of both GRwt and GRtetra.
Nevertheless, the C5–C6 peaks correlate well only with up-regulation of GRtetra-dependent genes, suggesting that GRtetra repressive action may involve indirect or even direct long-distance action.
Active and poised enhancer marks at GRtetra binding sites. (A) ChIP-seq data of H3K27ac and H3K4me1 at clusters C4–C6. GR binding is represented as in Figure 1A. A two-color binding intensity scale is used for the histone modification data on a linear scale. (B) Aggregate plots represent histone modification changes for each cluster; color indicates treatment and GR type. (C) Aggregate plots show the comparison of GRtetra H3K4me1 enrichment in untreated cells at C6 sites to GR binding sites present in other cell lines not shared with GRtetra cells. AtT-20 (pituitary), mouse embryonic fibroblast (MEF), and 3T3-L1 differentiated toward white (WAT) or brown (BAT) adipose tissue. Peaks used in the analyses harbored a GRE at the center of the peak.
H3K4me1 was prominently enriched at the center of C5–C6 sites, marking the regions prior to hormone exposure (Fig. 3A,B). There was little change to H3K4me1 after Dex treatment in the GRwt cells; however, the mark was decreased in GRtetra-expressing cells. Thus, the histone mark data agree with the receptor binding data in which C5–C6 sites are activated only in GRtetra cells. H3K27ac is enriched at the center of C6 sites before hormone exposure (Fig. 3A,B); however, a similar presence is observed in the input sample (Supplemental Fig. S7A), suggesting that it is owing to higher nucleosome occupancy rather than more of the histone modification. Indeed, both C5 and C6 sites contain nucleosomes in 3134 cells before hormone treatment, mirroring the H3K4me1 data (Supplemental Fig. S3C, MNase). H3K4me1 is enriched significantly more than the input sample at the C6 sites (Supplemental Fig. S7B), suggesting that GRtetra can bind H3K4me1-marked binding sites.
To assess the role of H3K4me1 marking GRtetra-specific sites more closely, we compared H3K4me1 levels at other GR and GRE-containing binding sites. For this purpose, we downloaded all the available mouse GR ChIP-seq data sets from different cell types from the NCBI Gene Expression Omnibus (GEO). To call peaks from each data set, we used either an available nontreated sample or an input sample from the same data set or from the same laboratory (for details, see Supplemental Methods; Supplemental Table S3). We compared each data set to GRtetra binding data to identify GR peaks from each cell type that are not present in the GRtetra-expressing cells. We further filtered the GR peaks to retain only peaks that harbor GRE sequences. In addition, we created a control GRE data set consisting of 10 iterations of 1000 random GREs. All the data sets harbor GREs at the center of the site (Supplemental Fig. S7C). None of the GR binding sites in other cell types display enrichment of H3K4me1 in the GRtetra-expressing cells (Fig. 3C), including the random GREs (Supplemental Fig. S7B). This suggests that GRtetra is not targeted to random GR binding sites occupied in other cell types but to H3K4me1-marked sites.
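The random-GRE control (10 iterations of 1000 random GREs) can be sketched as repeated sampling from a pool of GRE motif matches, followed by averaging a signal over the draws. The example below is illustrative only: the pool of GRE identifiers, the per-site signal values, the use of sampling without replacement, and the fixed seed are all assumptions not stated in the text.

```python
import random

def random_gre_control(all_gres, n_iter=10, n_sites=1000, seed=0):
    """Draw n_iter independent sets of n_sites GREs (without replacement)."""
    rng = random.Random(seed)   # fixed seed so the draws are reproducible
    return [rng.sample(all_gres, n_sites) for _ in range(n_iter)]

def mean_signal(samples, signal):
    """Average a per-site signal within each draw, then across draws."""
    per_iter = [sum(signal[s] for s in draw) / len(draw) for draw in samples]
    return sum(per_iter) / len(per_iter)

# Hypothetical pool of GRE motif matches, identified by integer index
all_gres = list(range(50_000))
# Hypothetical flat H3K4me1 signal at every GRE (placeholder values)
signal = {g: 1.0 for g in all_gres}

samples = random_gre_control(all_gres)
print(len(samples), "draws of", len(samples[0]), "GREs")
print("mean control signal:", mean_signal(samples, signal))
```

Averaging across several independent draws reduces the chance that a single unrepresentative sample of GREs determines the control profile.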
Tetrameric GR mutant transcends tissue-specific barriers
As stated above, GR binding is highly tissue specific, with little overlap between cell types (Grbesa and Hakim 2017). We hypothesized that the new GRtetra sites might normally be occupied by GR in other tissues; thus, we used previously published mouse GR ChIP-seq data sets from different cell types to assess the overlap between the new GRtetra binding sites and these other GR data sets (see Supplemental Methods). The heat map in Figure 4A was sorted based on the tag density of the other cell line data with GRtetra-absent sites on the top part and GRtetra binding sites on the bottom part of the heat map (Supplemental Fig. S7D, cartoon guide). Different degrees of overlap can be observed at the GRtetra binding sites between different GR data sets (Fig. 4A, bottom part of the heat map). Of the GRtetra unique sites (C6), 66% (1198/1808 sites) overlap in at least one of the seven GR ChIP-seq cell line data sets (Fig. 4B), with GREs being the most significant motif (Supplemental Fig. S8A–G). As an example, 32% (584/1808) of GRtetra unique sites overlapped with GR data in the liver (C7) (Fig. 4C,D), with GRE being the most common motif (Fig. 4E). In addition to GREs, the thyroid hormone receptor beta (THRB) motif was also present at the shared C7 sites. Because THRB is an important transcription factor in the liver (Goldstein and Hager 2015), this suggests that the GR sites where GRtetra binds can be functionally relevant.
GRtetra unique binding sites are occupied by the receptor in other cell types. (A) Heat map comparison of all GRtetra binding sites (1) to endogenous GR binding sites in other cell types (2–10). Data set information is shown on the top of the heat maps. Each cell type was separated into its own heat map. Top heat maps show cell-type–specific binding sites not occupied by GRtetra. The peak number is shown on the left. Lower heat maps show all GRtetra binding sites and the overlap with other cell types. The peak number overlapping between the GRtetra (C4–C6 sites) and the other cell type is shown on the right of the lower heat map. The number of peaks shared between C6 (GRtetra specific) and the other cell type is indicated with an arrow on the bottom of the heat map. Heat maps are sorted by GR ChIP-seq binding intensity in the other cell line. (B) Pie chart showing C6 sites unique or shared with at least one published GR ChIP-seq data set. De novo motif analysis of the most enriched motif is shown on the right. Full list of enriched de novo motifs is shown in Supplemental Table S2. (C,D) Heat map (C) and aggregate plots (D) of C6 sites compared with published mouse liver GR data set reveal two clusters, C7 (common to both GRtetra unique sites [C6] and GR binding sites in liver) and C8 (unique GRtetra sites not shared with GR data in liver). (E) De novo motif analysis of C7 sites. Full list of enriched de novo motifs is shown in Supplemental Table S2.
The remaining (34%) C6 sites do not overlap with the published GR ChIP-seq data sets we analyzed (Fig. 4B). However, the GRE motif is the most frequently occurring motif, suggesting these sites could also be occupied by GR in other, nonsequenced tissues. Altogether, we conclude GRtetra can penetrate cell-type–specific barriers to wild-type receptor binding.
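The overlap percentages reported for C6 follow from counting sites that intersect a peak in at least one published data set. The sketch below shows that counting step with a naive interval comparison, assuming half-open (start, end) intervals on a single chromosome and hypothetical coordinates; it then rechecks the reported percentages from the site counts given in the text.

```python
def overlaps(a, b):
    """True if half-open intervals a=(start, end) and b=(start, end) intersect."""
    return a[0] < b[1] and b[0] < a[1]

def shared_with_any(site, datasets):
    """True if site overlaps a peak in at least one data set."""
    return any(overlaps(site, peak) for peaks in datasets for peak in peaks)

def fraction_shared(sites, datasets):
    """Return (number of shared sites, fraction of all sites)."""
    n = sum(shared_with_any(s, datasets) for s in sites)
    return n, n / len(sites)

# Hypothetical GRtetra-specific sites and two published peak sets
c6_sites = [(100, 200), (300, 400), (1_000, 1_100)]
datasets = [[(150, 160)], [(390, 500), (2_000, 2_100)]]
n_shared, frac = fraction_shared(c6_sites, datasets)
print(n_shared, "of", len(c6_sites), "sites shared")

# Recheck of the percentages from the counts reported in the text
total_c6 = 1808
print(round(100 * 1198 / total_c6))               # shared with any data set
print(round(100 * 584 / total_c6))                # shared with liver (C7)
print(round(100 * (total_c6 - 1198) / total_c6))  # not shared
```

The pairwise comparison here is quadratic in the number of intervals; for genome-scale peak sets, a sorted sweep or an interval tree would be the more practical choice.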
The properties of tetrameric GR mutant are retained in another cell type
To evaluate the generality of our GRtetra results, we compared receptor binding in wild-type immortalized MEFs as well as GFP-GRwt or GFP-GRtetra expressed in GRKO MEF (MEF-GRKO) cells (Presman et al. 2014, 2016). Comparison of the three GRs in MEFs also shows that GRtetra binds to more sites, and binds them more strongly and more significantly, than does endogenous GR or GFP-GRwt (Fig. 5A,B; Supplemental Fig. S9A). All MEFwt and MEF-GRwt binding sites are occupied by MEF-GRtetra (C9); however, MEF-GRtetra binds to even more sites (C10). Representative Genome Browser track examples are shown in Supplemental Figure S9, B and C. In agreement with the adenocarcinoma cells, C10 (GRtetra-exclusive) sites contain higher enrichment of GREs compared with shared C9 (Fig. 5C; Supplemental Table S2). In addition, ATAC-seq data and H3K4me1 ChIP-seq from MEFs (Chronis et al. 2017) indicated that C10 sites are less accessible than C9 sites before hormone exposure, and H3K4me1 marks more clearly the center of the C10 sites than the C9 sites (Fig. 5A,D,E). Thus, GRtetra can also target closed chromatin sites in MEFs. Finally, we compared C10 and H3K4me1 MEF data to GR binding sites in other cell types (Fig. 5F; Supplemental Fig. S9D,F). The majority (60%) of the C10 sites are present in at least one of the GR ChIP-seq data sets, with GREs being the most significantly enriched motif (Fig. 5F; Supplemental Fig. S10). None of the GR binding sites in other cell types display high levels of H3K4me1 in the MEF-GRtetra–expressing cells (Supplemental Fig. S9D,E). Thus, all the properties of GRtetra (pioneer factor-like action, targeting to H3K4me1-marked GRE sites, and penetration of the cell-type–specific barrier) are retained in two different cell types.
Properties of GRtetra are retained in MEFs. (A) Comparison of endogenous GR in MEFs, reintroduced GRwt, and GRtetra binding in GRKO MEFs reveals two clusters, C9 (shared by MEF-GRwt and MEF-GRtetra) and C10 (MEF-GRtetra–specific sites). Each heat map represents ±1 kb around the center of the GR peak. Binding intensity scale is noted below on a linear scale. A two-color binding intensity scale is used for the histone modification data. Heat maps are sorted based on MEF-GRtetra GFP ChIP-seq binding intensity. (B) Aggregate plots of +Dex GR binding (anti-GFP, or anti-GR for wild-type MEFs) for each cluster; color indicates cell type with endogenous GR, GRKO, or MEFs expressing GRwt or GRtetra. (C) De novo motif analysis as in Figure 1B. Full list of enriched de novo motifs is shown in Supplemental Table S2. (D) Aggregate plots of ATAC-seq tag density in untreated MEFs for each GR binding cluster. (E) Aggregate plots of H3K4me1 ChIP-seq and input tag density in untreated MEFs for each GR binding cluster. (F) Pie chart showing C10 sites unique or shared with at least one published GR ChIP-seq data set. De novo motif analysis of the most enriched motif shown on the right.
Discussion
Almost all eukaryote transcription factor classes are currently modeled as binding to regulatory sites as monomers, homodimers, or heterodimers. There are a few well-described exceptions to this rule. The activated heat shock factor (HSF) binds sites as a trimer (Rabindran et al. 1993). In plants, MADS domain transcription factors form functional tetramers (Espinosa-Soto et al. 2014). When signal transducer and activator of transcription (STAT) family members are activated by JAK-mediated tyrosine phosphorylation, they form phosphorylated homodimers that translocate to the nucleus and bind GAS motifs (Zhao et al. 2013). Early biochemical studies showed the formation of STAT tetramers on repeated motifs with a spacing of 11–12 bp (Vinkemeier et al. 1998). Recent evidence supporting the conversion of STAT3 from dimers to tetramers during DNA binding in living cells was presented by Gaus and colleagues (Hinde et al. 2016), using the pCOMB technique.
The TP53 tumor-suppressor protein has been extensively characterized as a tetrameric binding factor (McLure and Lee 1998). The tetrameric protein binds a consensus sequence consisting of two consecutive palindromic half-sites. Tetramerization has been proposed to result either from an increase in dimer concentration (Bode and Dong 2004) or from a DNA-damage-induced mechanism (Gaglia et al. 2013). As with the STAT factors, tetramerization of TP53 has been observed by imaging methods in live cells (Gaglia et al. 2013).
For the large nuclear receptor family, a tetrameric state has been described for only one member, RXR. The nuclear receptors contain significant proportions of disordered structure; thus, detailed structural information has been available only for the DBD or LBD portions of the molecules. It is clear, however, that the unliganded RXR protein can exist as a complete tetramer independent of DNA binding (Kersten et al. 1995; Zhang et al. 2011), in contrast to the STAT factors and the TP53 protein. Also, in contrast with STATs and TP53, the tetrameric form of RXR has generally been considered a repressive form, with ligand binding driving dissociation to activating dimers.
GR oligomeric status has been a matter of continuous debate (Sacta et al. 2016; Presman and Hager 2017) and is presumed to be of considerable pharmacological relevance. This discussion has centered mostly on the possible involvement of homodimer and monomer binding (Busillo and Cidlowski 2013; Starick et al. 2015), as well as tethering (Langlais et al. 2012; Ratman et al. 2013) and possible cobinding of monomeric GR with other transcription factors (Cohen and Steger 2017). By using the technique of number and brightness analysis, we reported that GR adopts a tetrameric configuration upon binding to a specific GRE in living cells (Presman et al. 2016). The apparent influence of DNA binding on the oligomeric state suggests a parallel with the findings of an allosteric GR structural transition for the bound receptor (Meijsing et al. 2009; Watson et al. 2013). By using a mutation that mimics the DNA-induced allostery (van Tilborg et al. 2000), we showed the conversion of all activated receptors in the nucleus to the tetrameric state (Presman et al. 2016). This observation facilitated an examination of the binding and transactivation activity for this receptor form.
Introduction of the constitutive tetrameric receptor in two distinct GRKO cell lines reveals an unprecedented active form of the receptor. This receptor not only binds more robustly to all sites occupied by GRwt but also invades a large number of sites unavailable to the wild-type GR (Figs. 1, 5). Chromatin analysis shows that these new sites are resistant to nuclease digestion and lack active histone marks, a pattern diagnostic of inactive enhancers (Figs. 1, 3, 5). These features suggest the “tetra-specific” binding sites may be either tissue-selective enhancers not available in the mammary knockout cells or random sites unrelated to normal GR function.
A careful examination of overlap between the “tetra-specific” sites and sites bound by GR in several other cell types supports the first hypothesis. The majority of new sites represent GR binding elements active in alternate cell types (1198/1808 for the mammary KO and 868/1517 for the MEF KO) (Figs. 1, 5). Because almost all tissue types in the body express GR, it is likely that many of the remaining sites correspond to enhancers active in tissues not represented in the current analysis.
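As a worked check of the overlap counts quoted above, the fractions can be computed directly. The short sketch below uses only the counts given in the text; the helper name `overlap_fraction` and the dictionary labels are illustrative.

```python
def overlap_fraction(shared: int, total: int) -> float:
    """Fraction of tetra-specific sites also bound by GR in another cell type."""
    if total <= 0:
        raise ValueError("total must be a positive site count")
    return shared / total


# Counts as quoted in the text: (sites found in other cell types, all tetra-specific sites).
counts = {"mammary KO": (1198, 1808), "MEF KO": (868, 1517)}

percentages = {
    name: round(100 * overlap_fraction(shared, total), 1)
    for name, (shared, total) in counts.items()
}

for name, pct in percentages.items():
    print(f"{name}: {pct}% of tetra-specific sites are bound by GR elsewhere")
```

With these counts the overlap comes to roughly 66% for the mammary knockout cells and roughly 57% for the MEF knockout cells.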
GRtetra is also a more active regulator of gene expression, in both activation and repression (Fig. 2). GR transrepression is often modeled on the basis of tethering, wherein the receptor negatively modulates the activity of a DNA-bound interacting partner (Glass and Saijo 2010). Traditional examples include the negative modulation of JUN-FOS (i.e., AP-1) and NFKB1 (also known as NFκB), key factors in inflammatory responses (Cain and Cidlowski 2017). Given the strong repressive activity of the GRtetra receptor, we propose that transrepression as previously defined cannot entirely explain GR repressive action. In fact, both JUN-FOS and NFKB1 pathways are most likely modulated by GR in a more complex manner that involves, at least in part, an oligomeric receptor interacting directly with the chromatin landscape (Weikum et al. 2017; Hudson et al. 2018).
Of the hundreds of thousands of potential binding elements across the genome, GR only binds a very small subset of sites in a highly tissue-specific manner (Grbesa and Hakim 2017). The mechanisms behind GR binding selectivity are likely complex, but multiple lines of evidence point to chromatin landscape as a major contributing factor (John et al. 2008, 2011; Love et al. 2017; D'Ippolito et al. 2018; Johnson et al. 2018). A recent study examined in detail the relationship between nucleosome position, GR binding potential, and remodeler action (Johnson et al. 2018). Essentially all binding events involve the action of the SWI/SNF remodeling complex, either prerecruited by other factors or recruited by the receptor itself. Furthermore, at receptor-dependent SMARCA4 recruitment sites, the receptor clearly binds to a pre-existing nucleosome.
The findings presented herein support the concept of DNA as an allosteric effector of GR action (Meijsing et al. 2009; Watson et al. 2013). However, a primary effect of DNA binding is to convert the receptor to a tetrameric state. Whether DNA-induced tetramerization occurs widely or only at a subset of GREs remains an open question. Nevertheless, the distinct enrichment of GRE motifs at GRtetra-enhanced sites (C6) may indicate higher oligomerization states are the general rule rather than the exception. Integrating both current and previous observations, we suggest a general model described in Figure 6. The wild-type receptor can bind both closed nucleosomal and preaccessible sites at active enhancers in a given cell type but cannot invade tissue-restricted enhancers. When converted constitutively to the tetrameric state via the P481R mutation, the receptor gains the ability to recruit activities necessary for activating repressed elements. This “pioneering” action may simply be a normal activity of the wild-type receptor, enhanced by the higher concentration, or possibly longer duration, of the tetrameric state. Further exploration of the dynamic behavior of the tetrameric receptor in live cells may shed light on this issue. It will also be important to examine the activity of the tetrameric form in forming long-range interactions within the nucleus, as the tetramer may present a second binding domain able to interact in trans with distant binding sites (Presman et al. 2016).
Model of tetrameric GR action on chromatin. Binding of liganded GR to chromatin induces a transition from a dimeric to tetrameric state (1). GRwt can bind to closed nucleosomal sites (2) and to preaccessible sites (3). At both classes of sites, GRwt can recruit chromatin remodelers and other cofactors to increase chromatin accessibility and influence gene expression. AP-1 (or other initiating factors) maintains preaccessible chromatin sites before receptor binding. GRwt is incapable of binding to inaccessible GREs (4). Liganded GRtetra (GR-P481R) is constitutively tetrameric (5). Liganded GRtetra binds to the same sites as GRwt. The binding of receptor represents transient states (6) with relatively brief residence times in live cells. In addition, GRtetra can penetrate GREs marked with H3K4me1 that are inaccessible to GRwt (7). These GRtetra-specific sites can influence gene expression.
In conclusion, we reiterate that the putative oligomeric status for transcription factors at their binding sites is almost always inferred from structural studies on purified proteins, deduced binding motifs, or genomic footprints. The unusual DNA binding state for GR was discovered using a very specialized form of correlation spectroscopy called number and brightness. It is likely that many transcription factors originally characterized as monomers or dimers present higher oligomeric states when evaluated by these new imaging methods. For example, the progesterone receptor, considered to be dimeric, forms tetramers in a ligand-dependent, yet DNA-independent manner (Presman et al. 2016). These rapid advances in live cell characterization of complex size and structure at specific binding elements open new opportunities for understanding the actual status of enhancer activating factors. Further application of these methods is likely to provide critical mechanistic insights in this area of gene regulation.
Methods
Cell culture and generation of cell lines by CRISPR/Cas9
All cell lines were grown as previously described (Presman et al. 2016). For details, see Supplemental Methods. To knock out the stably integrated rat GFP-GR gene and the endogenous Nr3c1 gene, 3617 cells (McNally et al. 2000) were transfected with pX330 CRISPR/Cas9 plasmid (Addgene 42230) containing guide RNA sequences to induce random frameshift mutations at those loci. For further details, see Supplemental Methods. We reintroduced the GFP-tagged wild-type GR and the GFP-tagged mouse GR P481R mutant into the GT(Rosa)26Sor locus. GFP-integrated cells were selected for similar levels of GFP expression and size uniformity.
RNA isolation and RNA-seq data analysis
GR mutant-expressing cells were left untreated or treated with 100 nM of Dex for 2 h before RNA isolation. RNA isolations were performed using the PureLink RNA kit (Thermo Fisher Scientific 12183018A) per the manufacturer's instructions. RNA-seq libraries were generated from rRNA-depleted (Illumina RS-122-2301) total-RNA samples, using Illumina stranded total RNA (Illumina 20020596) according to the manufacturer's instructions. We sequenced at least two biological replicates of each cell line for the untreated condition and three replicates each for the hormone treatment condition using an Illumina HiSeq 2500 with paired-end reads. RNA-seq alignment to the mouse mm10 genome was performed by TopHat2 (2.0.8) (Kim et al. 2013). All RNA-seq biological replicates correlated well with each other (Supplemental Table S4). Subsequent downstream analysis was performed using HOMER (Heinz et al. 2010) and DESeq2 (Love et al. 2014).
ChIP-seq and ATAC-seq
GR mutant-expressing cells were left untreated or treated with 100 nM of Dex (Sigma-Aldrich) for 1 h. For ChIP, after cross-linking with paraformaldehyde and cell collection, the chromatin was sonicated (Bioruptor, Diagenode) to an average DNA length of 200–500 bp. For immunoprecipitation, 600 µg of chromatin was incubated with antibody (for details, see Supplemental Methods) coupled onto Dynabeads magnetic beads (Thermo Fisher Scientific) with rotation overnight at 4°C. After stringent washes, the antibody-bound chromatin fragments were eluted, the cross-linking was reversed, and the remaining proteins were digested. DNA was extracted from the samples with phenol-chloroform extraction and ethanol precipitation. ChIP-seq libraries were generated using a TruSeq ChIP sample prep kit (Illumina IP-202-1012) according to the manufacturer's instructions. For ATAC, the cells were detached from the flasks using 5 mL of Accutase (Thermo Fisher Scientific) by incubating 5 min at RT. After nuclei isolation, ATAC was performed according to the published protocol (Buenrostro et al. 2015). Size selection was performed using SPRIselect (Beckman Coulter) to remove <150-bp and >800-bp fragments according to the manufacturer's instructions. Size selection was verified using 5% TBE PAGE gels (Bio-Rad).
ChIP- and ATAC-seq data analysis
Biological duplicate ChIP samples were sequenced using Illumina NextSeq 500 with single-end reads, whereas biological duplicate ATAC samples were sequenced using Illumina HiSeq 4000 with paired-end reads. The data were aligned to the mouse reference mm10 genome using Bowtie 2 (Langmead and Salzberg 2012). All ChIP-seq and ATAC-seq biological replicates correlated well with each other (Supplemental Table S4). Subsequent downstream analysis was performed using HOMER (Heinz et al. 2010). Peaks in each data set were called using findPeaks with -style factor for transcription factors and -style histone for histone modifications. DESeq2 (Love et al. 2014), run through getDifferentialPeaksReplicates.pl, was used to isolate differential binding peaks (FDR < 0.05, FC > 3) between the GR mutants.
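The differential binding cutoffs stated above (FDR < 0.05, FC > 3) amount to a two-condition filter on per-peak statistics. The following is a minimal sketch of that filter, not a reimplementation of HOMER or DESeq2. It assumes that FC refers to a linear fold change of one receptor form over the other and that both inequalities are strict; the `PeakStat` record, its field names, and the toy values are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PeakStat:
    peak_id: str
    fold_change: float  # assumed: linear fold change, target over background
    fdr: float          # multiple-testing-adjusted p-value


def differential_peaks(
    stats: List[PeakStat], min_fold: float = 3.0, max_fdr: float = 0.05
) -> List[PeakStat]:
    """Keep peaks that pass both the fold-change and the FDR cutoff."""
    return [s for s in stats if s.fdr < max_fdr and s.fold_change > min_fold]


# Toy values, invented for illustration only.
toy_stats = [
    PeakStat("peak_a", 4.2, 0.001),  # passes both cutoffs
    PeakStat("peak_b", 3.5, 0.20),   # fails the FDR cutoff
    PeakStat("peak_c", 1.8, 0.01),   # fails the fold-change cutoff
    PeakStat("peak_d", 3.0, 0.01),   # fails: fold change is not strictly above 3
]

selected_ids = [s.peak_id for s in differential_peaks(toy_stats)]
print(selected_ids)
```

Requiring both conditions guards against calling peaks that change strongly but inconsistently across replicates, as well as peaks that change reproducibly but only marginally.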
Data access
Raw data and processed data from this study have been submitted to the NCBI Gene Expression Omnibus (GEO; https://www.ncbi.nlm.nih.gov/geo/) under accession number GSE108634. Accession numbers for all previously published data used in this study can be found in Supplemental Table S3.
Acknowledgments
We thank Ido Goldstein for critical reading of the manuscript and the National Cancer Institute Advanced Technology Program Sequencing Facility for sequencing services. The research used the NIH high-performance computing systems (Biowulf). Research was supported by the Intramural Research Program of the CCR, NCI, NIH. V.P. was supported, in part, by the Academy of Finland, the University of Eastern Finland strategic funding, and the Sigrid Jusélius Foundation. D.M.P. was supported, in part, by CONICET.
Author contributions: V.P. participated in the planning of the project and performed genome-wide experiments and bioinformatic analyses. T.A.J. participated in the planning of the project, performed genome-wide experiments, and generated cell lines. D.M.P. initiated the project and assisted with experimentation and design. D.M.P. and G.L.H. directed the project. All authors prepared the manuscript.
Footnotes
- [Supplemental material is available for this article.]
- Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.244814.118.
- Received October 3, 2018.
- Accepted April 9, 2019.
This is a work of the US Government.