Acute depletion of METTL3 implicates N6-methyladenosine in alternative intron/exon inclusion in the nascent transcriptome

RNA N6-methyladenosine (m6A) modification plays important roles in multiple aspects of RNA regulation. m6A is installed cotranscriptionally by the METTL3/14 complex, but its direct roles in RNA processing remain unclear. Here, we investigate the presence of m6A in nascent RNA of mouse embryonic stem cells. We find that around 10% of m6A peaks are located in alternative introns/exons, often close to 5′ splice sites. m6A peaks significantly overlap with RBM15 RNA binding sites and the histone modification H3K36me3. Acute depletion of METTL3 disrupts inclusion of alternative introns/exons in the nascent transcriptome, particularly at 5′ splice sites that are proximal to m6A peaks. For terminal or variable-length exons, m6A peaks are generally located on or immediately downstream from a 5′ splice site that is suppressed in the presence of m6A and upstream of a 5′ splice site that is promoted in the presence of m6A. Genes with the most immediate effects on splicing include several components of the m6A pathway, suggesting an autoregulatory function. Collectively, our findings demonstrate crosstalk between the m6A machinery and the regulation of RNA splicing.

The centre of m 6 A peak stands for the peak summit called by MACS2 and the flanking strand-specific 500nt regions were taken for calculating the signal and making plot. The color key is shown aside.
(E) Boxplots showing the distribution of m 6 A peak intensities for each confidence group, calculated as averages over 4 replicates (2 SySy + 2 Abcam). (B) Pie charts showing m 6 A peak distributions from two previous studies with MeRIP (Batista et al. 2014;Geula et al. 2015) and one study with m 6 A-CLIP in mESCs (Ke et al. 2017).

Spen intron2
Xist E-repeat (B) UCSC genome browser tracks showing the profile of m 6 A in Xist RNA, aligned with the two replicates of RBM15 binding. RT stops (i.e. the 5'-end of each sequencing read) were shown for the irCLIP-seq tracks. Landmark A-and E-repeats of Xist are also indicated.
(D) Zoomed-in view of strand-specific RBM15 binding on loci of example genes: Spen intron 2 (top), Xist E-repeat (middle), and Ythdc1 intron 11 (bottom). RT stops are shown in the tracks. Direction of RNA transcription is also indicated. (A) RBM15 binding (CITS) metaprofiles and heatmaps for exonic m 6 A peaks of 3 different confidence groups (red, blue, and grey for Cfg1, Cfg2, and Cfg3 respectively). The color key is shown for all heatmaps. 0.5kb strand-specific flanking regions each side of m 6 A peak summit are included for the plot.
(B) Same as (A) but for intergenic m 6 A peaks.   Figure S6. H3K36me3 correlates with RBM15 binding and RNA m 6 A modification, related to Figure 3.
(A) H3K36me3 metaprofiles and heatmaps for m 6 A peaks from confidence groups Cfg1 (red), Cfg2 (blue) and Cfg3 (grey), ordered vertically (from top to bottom are Cfg1 to Cfg3) in the heatmaps. Exonic (left) and intronic m 6 A peaks (right) are separated. Color keys are shown as bellow. 5kb flanking regions each side of the m 6 A peak summit are included for the plot.
(B) Metagene profile of H3K36me3 at the intronic m 6 A sites as well as flanking 5 kb regions. The intronic m 6 A sites are more than 5 kb away toward the annotated transcription terminal site of MaxORF_LongestNcRNA transcript by comprehensive GENCODE_vM24 annotation.
(C-E) Metaprofiles and heatmaps of H3K36me3 for Cfg1 (C), Cfg2 (D), and Cfg3 (E) m 6 A peaks with strong (red and green) and weak (grey) RBM15 binding. Red and green curves denote exonic and intronic m 6 A peaks respectively. 5kb regions flanking each m 6 A peak summit are included for the heatmaps, with color keys of H3K36me3 shown below. (B) Heatmap showing the m 6 A peak signals from two METTL3_FKBP12 F36V lines, C3 and H5. Both MeRIP and input samples are shown. The centre of m 6 A peak stands for the peak summit called from SySy ChrMeRIP-seq by MACS2 and the flanking strand-specific 500nt regions were taken for calculating the signal and making plot. The color key is shown on the right. (A) Volcano plots show the differentially expressed genes (red dots) between control and dTAG-13 treatment samples from 3 independent replicates of METTL3_FKBP12 F36V clone C3 (left) and METTL3_FKBP12 F36V H5 clone H5 (right).
(B) Overlap of differentially expressed genes observed in the two clones. (C,D) Schematic (C) and scatter plot (D) showing the correlation between m 6 A intensity and the effect size of splicing changes (deltaPSI) calculated from dTAG METTL3 4sU-seq experiment.
The intensity was calculated from 1000nt upstream of 5'SS (left: SySy 2 replicates; right: Abcam 2 replicates). Fitted lines (blue) and correlation coefficients are indicated. The dot denoting Ythdc1 intron11 is also marked by red.  (B) The splicing direction for the m 6 A-bearing short form upon acute depletion of METTL3 (red and blue lines denote METTL3_FKBP12 F36V clone C3 and H5 respectively from this study) or conditional Ythdc1 knockout (black line, GSE133585). "Ctrl" indicates no treatment in the corresponding experiments, while "KO" means treatment for knockout. Coordinates of alternative 5' ss are shown above each panel and gene names are labelled at the bottom of each panel.   (C) Schematic about how to calculate the splicing ratio of intron 3 and 2 as -log(Intron3/Intron2,e).
(D) Boxplots showing the splicing ratio as (C). ChrRNA-seq datasets in which components of m 6 A writer complex were perturbed (Nesterova et al. 2019). Samples were included only if at least 2 reads span each junction of intron2 and intron3.   (B) Splicing of Wtap intron6 having m 6 A modification. The pink dashed box indicates the intron having m 6 A modification and arrowheads indicate the "partial" intron retention. The sample names are indicated by red. 4sU-seq for METTL3 dTAG study, and nuclear RNA for Ythdc1 cKO study.

Cell culture
All mouse embryonic stem cells (mESCs), except E14 mESCs, were grown in feeder-dependent conditions on gelatinized plates at 37 o C in a 5% CO2 incubator. Mitomycin C-inactivated mouse fibroblasts were used as feeders. mESCs medium consists of Dulbecco's Modified Eagle Medium (DMEM, ThermoFisher) supplemented with 10% foetal calf serum (Seralab), 2 mM L-glutamine (ThermoFisher), 1× non-essential amino acids (ThermoFisher), 50 μM β-mercaptoethanol (ThermoFisher), 50 g/mL penicillin/streptomycin (ThermoFisher), and 1 mL of Leukemia Inhibitory Factor (LIF)-conditioned medium made in-house. E14 cells were grown feeder-free in the same conditions and medium as above. 4sU-RNA immunoprecipitation 4sU-RNA was generated and isolated essentially as described in (Rabani et al. 2011;Pintacuda et al. 2017). In detail, 4-thiouridine (4sU, Sigma-Aldrich, T4509) was dissolved in sterile PBS and stored at -20ºC. 4sU was thawed just before use and added to the cells in the growing media at a concentration of 500 μM. METTL3-FKBP12 F36V expressing cells (C3 and H5 clones) were treated with or without dTAG-13 (100 nM) for 3 hours, then exposed to 4sU-suplemented medium for 30 minutes. Cell culture medium was rapidly removed from cells and 5 mL of TRIzol reagent (Life Technologies) was added. Total RNA was isolated, and any potential DNA contamination was removed using Ambion DNA-free DNase Treatment kit (Life Technologies) according to the manufacturer's instructions. For each μg of total RNA, 2 μL of Biotin-HPDP (Pierce, 50 mg EZ-Link Biotin-HPDP), previously dissolved in DMF at a concentration of 1 mg/mL, and 1 μL of 10x  (Pollard et al. 2010). For calculating conservation and GC contents, control region for this intronic m 6 A methylation was chosen by randomly selecting the size-matched region from the same intron. For the relative intron length and position, random intron from the same transcript was chosen. Plots were generated by R package ggplot2. Scripts used for this analysis were attached.

Calibrated MeRIP-seq analysis
The raw reads from m 6 A immunoprecipitation were mapped to mouse (mm10) and Drosophila (dm6) concatenated genome by STAR (v2.5.2b) (Dobin et al. 2013). The resulted alignment files (BAM) were then split into mm10 and dm6 alignment files according to the chromosome name.
The read counts mapping to the dm6 were counted (Li et al. 2009) and used for subsequent normalization (Supplemental Table S1). The raw reads from input samples were mapped to the mm10 genome by STAR (v2.5.2b) (Dobin et al. 2013). Alignment files for m 6 A IP and input samples were also normalized in a conventional manner by 10 million mapped reads. The strandspecific bigwig files were generated and visualized by IGV (Robinson et al. 2011).

irCLIP-seq analysis
The irCLIP adaptors (sequence) from single-end sequencing reads were first cut by cutadapt (v1.12) (Martin 2011), and then collapsed by CTK toolkit to remove PCR duplication (Shah et al. 2017). The barcodes with UMI-containing 12 letters were moved to FASTQ header lines. The resulted single-end fastq reads, longer than 15 nt, were mapped to the mm10 genome by STAR (v2.5.2b) (Dobin et al. 2013) with key parameters (--outSAMattributes All --outFilterMultimapNmax 1 --outFilterMismatchNmax 2 --alignEndsType EndToEnd --seedSearchStartLmax 15 --outWigType bedGraph read1_5p). This setting is because RT stops were generated by reverse transcriptase around the UV crosslinked sites or crosslinking induced truncation sties (CITS), and RT-stops were main output when analysing the irCLIP-seq reads. The single nucleotide was considered as a CITS if there are no less than 3 stops (>=3) from independent strand-specific reads. In addition, crosslinking induced mutation sites (CIMS), particularly (deletion site) were also calculated. The single nucleotide was considered as a CIMS if this position has more than 2 deletions among total covered reads and the mutation rate is [0.01, 0.5], in order to avoid the potential contamination from the indels. Given that most of the RNA-binding proteins has 5-mer consensus motif, called CITSs or CIMSs from biological replicate 1 were kept if they were also called and within 5 nucleotides distance in biological replicate 2 (closestBed -a CITS_Rep1.bed -b CITS_Rep2.bed -s -d | awk '$13<=5'). The distribution of CITS and CIMS were analysed by the RNAmpp pipeline, described as above. Motifs around these CITSs or CIMSs were searched by Homer with key parameter (findMotifsGenome.pl CITS.bed mm10 MotifOutput -rna).
RBM15 binding, m 6 A peak, and H3K36me3 modification RBM15 CITS (>=3) called in 2 biological replicates were intersected with m 6 A peaks from different confidence groups. Due to the strand-specificity, the closest distance between RBM15 binding and m 6 A peak will be minus if RBM15 CITS located upstream of m 6 A peak (closestBed -a RBM15_CITS.bed -b m6A_peaks.bed -s -D a) (Quinlan and Hall 2010). Strong RBM15 binding group was defined by no more than 1000 nt distance between RBM15 binding and m 6 A peak. The density of RBM15 binding sites (CTIS>=3) 1000 bp flanking exonic m 6 A peaks, intronic m 6 A peaks, and intergenic m 6 A peaks were plotted. BigWig files for H3K36me3 ChIP-seq data retrieved from ENCODE was generated in 10 bp bin size. The metagene profile and heatmap for RBM15 binding group and other groups were generated by deepTools (v3.4.3) (Ramirez et al. 2016). The matrix generating from negative strand bigWig files are combined with the one from positive strand, and with a brief modification to average two replicates, then subjected to plotHeatmap tool (Ramirez et al. 2016).
Differentially expressed genes analysis