Zelda overcomes the high intrinsic nucleosome barrier at enhancers during Drosophila zygotic genome activation

The Drosophila genome activator Vielfaltig (Vfl), also known as Zelda (Zld), is thought to prime enhancers for activation by patterning transcription factors (TFs). Such priming is accompanied by increased chromatin accessibility, but the mechanisms by which this occurs are poorly understood. Here, we analyze the effect of Zld on genome-wide nucleosome occupancy and binding of the patterning TF Dorsal (Dl). Our results show that early enhancers are characterized by an intrinsically high nucleosome barrier. Zld tackles this nucleosome barrier through local depletion of nucleosomes with the effect being dependent on the number and position of Zld motifs. Without Zld, Dl binding decreases at enhancers and redistributes to open regions devoid of enhancer activity. We propose that Zld primes enhancers by lowering the high nucleosome barrier just enough to assist TFs in accessing their binding motifs and promoting spatially controlled enhancer activation if the right patterning TFs are present. We envision that genome activators in general will utilize this mechanism to activate the zygotic genome in a robust and precise manner.

Zld peaks, and Dl peaks within each group. The y-axis represents the frequency of peaks belonging to a certain genomic annotation. The expected Dl occurrence for all Dl peaks within each annotation is shown as a gray shadow. The expected percentage of each annotation across the genome is denoted with a black bar. Significantly different annotations between Dl groups are denoted by an asterisk (* p<0.05, ** p<0.001, *** p<0.0001, hypergeometric test). "Promoter" is -500bp to +150bp of a TSS. (B) The maternal/zygotic contribution of genes assigned to each Dl group is based on the classification of Chen et al. (Chen et al., 2013), with "Z" indicating actively transcribed zygotic genes (pre-MBT + MBT active), "MZ" indicating genes with both maternal and zygotic contribution (MBT maternal), and "None" indicating genes not zygotically expressed (N/A + MBT poised).
Supplemental Fig. 3. Meta-profiles of Dl peaks that are >1kb away from a TSS are shown for the three Dl-peak groups bound by Zld, as well as Dl peaks that do not co-localize with Zld binding as control. wt Zld binding is shown in blue, wt Dl binding as a solid brown line, and zld- Dl binding as a dashed brown line. The normalized reads were aligned at the Dl summit, and average reads within 2kb distance are shown.
Supplemental Fig. 4. Heatmaps of nucleosome occupancy at 3 Dl groups in wt and zld- embryos.
MNase reads comparing wt and zld- embryos are shown for the 3 Dl groups, aligned at the Dl summit and ranked by wt Dl summit reads from high to low. The read coverage is on a linear scale ranging from minimum (zero reads) to maximum (read value at the 99th percentile among all displayed bases). The x-axis indicates the distance from the Dl peak summit (bp). Note that within Group I, there is a significant increase in nucleosome occupancy in zld- compared to wt. For Groups II and III, the overall nucleosome occupancy is comparable between wt and zld-, indicating that Zld does not have a significant influence on the nucleosome occupancy at these regions.
Fig. 8. As a normalization control, Dl ChIP and input data were normalized to the mean of total reads, then the differential Dl binding between wt and zld- was analyzed by DESeq as described in the main text, and MNase meta-profiles as well as the predicted nucleosome model were plotted for each Dl group >1kb away from a TSS. This normalization method yielded properties of Dl-bound regions and MNase profiles very similar to those generated by the Z-score transformation method used in the main text. (A) MA plot of differential Dl binding in zld- versus wt embryos. The x-axis represents the mean of normalized Dl reads per peak; the y-axis represents the log2 fold-change of normalized reads per peak between the genotypes. Significantly decreased peaks (Group I, red), not significantly changed peaks (Group II, blue) and significantly increased peaks (Group III, green) were identified by DESeq with FDR<0.1. (B) MNase meta-profiles (wt in blue, zld- in red, predicted nucleosome occupancy model in grey) of Dl peaks that are >1kb away from a TSS are shown for the three Dl-peak groups defined from (A), as well as Dl peaks that do not co-localize with Zld binding as control. The normalized MNase reads and model were aligned at the Dl summit, and average reads (average probability for the model) within 1kb distance are shown.

Genomic annotations
Zld and Dl group peaks were each assigned an exclusive genomic annotation based on FlyBase Dmel_Release_5.57 with the following assignment hierarchy: 1) if the peak summit is within a single annotated transcript, it is assigned to the annotations of that transcript; 2) if the peak region has multiple annotations, the peak is assigned to one annotation in the following hierarchical order: promoter (-500bp to +150bp of a TSS), CDS, 5'UTR, 3'UTR and intron; 3) if the peak does not fall into a transcript, it is annotated as intergenic. A peak was considered "near a TSS" if the peak boundary is within 1kb of a TSS.
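The hierarchy above can be illustrated with a minimal Python sketch. It is not the code used in this study: the function names, the strand-aware promoter test, and the reading of "near a TSS" as an interval check are assumptions made for illustration, and only the priority order and the distance thresholds are taken from the text.

```python
# Priority order from the text; all names below are hypothetical.
HIERARCHY = ["promoter", "CDS", "5'UTR", "3'UTR", "intron"]


def in_promoter(pos, tss, strand, upstream=500, downstream=150):
    """True if pos lies within -500 bp to +150 bp of a TSS (strand-aware)."""
    offset = (pos - tss) if strand == "+" else (tss - pos)
    return -upstream <= offset <= downstream


def assign_annotation(overlapping):
    """Pick one exclusive annotation from the set of labels a peak overlaps."""
    for label in HIERARCHY:
        if label in overlapping:
            return label
    # No transcript feature overlaps the peak.
    return "intergenic"


def near_tss(peak_start, peak_end, tss, window=1000):
    """Assumed reading of 'near a TSS': a TSS within 1 kb of either peak boundary."""
    return peak_start - window <= tss <= peak_end + window
```

For example, a peak overlapping both an intron and a CDS would be labeled "CDS" under this scheme, and a peak overlapping no transcript feature would be labeled "intergenic".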
Assigning Dl peaks to genes
Dl peaks (from peak summit) were assigned to the nearest TSS based on FlyBase Dmel_Release_5.57. Genes that were assigned to multiple Dl-peak groups were excluded from further analysis.
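As a rough sketch of this assignment, the Python below maps each peak summit to the gene with the closest TSS and then drops genes that end up in more than one Dl-peak group. The data layout (a sorted, non-empty list of TSS positions on one chromosome with a parallel list of gene names) and the function names are assumptions for illustration, and ties are resolved toward the lower coordinate, which the text does not specify.

```python
from bisect import bisect_left
from collections import defaultdict


def nearest_gene(summit, tss_positions, tss_genes):
    """Return the gene whose TSS is closest to the summit.

    tss_positions must be sorted ascending and non-empty; tss_genes is parallel.
    """
    i = bisect_left(tss_positions, summit)
    # Only the TSSs flanking the summit can be the nearest one.
    candidates = [j for j in (i - 1, i) if 0 <= j < len(tss_positions)]
    best = min(candidates, key=lambda j: abs(tss_positions[j] - summit))
    return tss_genes[best]


def genes_per_group(peaks, tss_positions, tss_genes):
    """peaks: iterable of (summit, group). Genes hit by >1 group are excluded."""
    groups_of_gene = defaultdict(set)
    for summit, group in peaks:
        gene = nearest_gene(summit, tss_positions, tss_genes)
        groups_of_gene[gene].add(group)
    result = defaultdict(set)
    for gene, groups in groups_of_gene.items():
        if len(groups) == 1:
            result[next(iter(groups))].add(gene)
    return dict(result)
```

A real analysis would repeat this per chromosome; the sketch keeps a single coordinate axis for brevity.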

Maternal/zygotic contribution of genes associated with Dl peaks
The maternal/zygotic contribution of a gene was determined according to Chen et al. (Chen et al., 2013), with "Z" indicating actively transcribed zygotic genes during 1-3h, "MZ" indicating genes with both maternal and zygotic contribution, and "None" indicating genes not zygotically expressed (N/A + poised).

Random region control and G-C frequency calculation
A total of 1000 random regions of 800bp length were selected across the genome, with the criteria that they are >1kb away from a TSS and that their G-C content is not significantly different (p=0.71, t-test) from that of non-TSS Dl peaks (>1kb away from a TSS) within 400bp of Dl summits. The G-C frequency in Supplemental Fig. 5A within 1kb of the alignment center was calculated with a 75bp sliding window for non-TSS Dl-peak groups and the aforementioned 1000 random regions. In Supplemental Fig. 5C, Student's t-test was performed on the predicted nucleosome model within 75bp of the alignment center, comparing regions centered at Dl summits with the aforementioned 1000 random regions.
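The sliding-window G-C calculation can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the 1bp step are assumptions (the text specifies the 75bp window but not the step size), and the input is assumed to be a plain DNA string spanning the region around the alignment center.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq, window=75, step=1):
    """G-C fraction for each full window along seq; step size is an assumption."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]
```

Averaging the per-window values position by position across all regions in a group would give a meta-profile comparable in form to the one described for Supplemental Fig. 5A.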