Ranking noncanonical 5′ splice site usage by genome-wide RNA-seq analysis and splicing reporter assays
Abstract
Most human pathogenic mutations in 5′ splice sites (5′ss) affect the canonical GT at positions +1 and +2, creating noncanonical dinucleotides. On the other hand, noncanonical dinucleotides are observed under physiological conditions in ∼1% of all human 5′ss. Examining the conditions under which noncanonical 5′ss are used is therefore a challenging task in understanding pathogenic mutation mechanisms. In this work, we systematically examined noncanonical 5′ splice site selection, both experimentally, using splicing competition reporters, and by analyzing a large RNA-seq dataset of 54 fibroblast samples from 27 subjects containing a total of 2.4 billion gapped reads covering 269,375 exon junctions. From both approaches, we consistently derived a noncanonical 5′ss usage ranking of GC > TT > AT > GA > GG > CT. In our competition splicing reporter assay, noncanonical splicing was strictly dependent on the presence of upstream or downstream splicing regulatory elements (SREs), and changes in SREs could be compensated by co-variation of U1 snRNA complementarity in the competing 5′ss. In particular, we could confirm splicing at different positions (i.e., -1, +1, +5) of a splice site for all noncanonical dinucleotides “weaker” than GC. In our comprehensive RNA-seq dataset analysis, noncanonical 5′ss were preferentially detected in weakly used exon junctions of highly expressed genes. Among high-confidence splice sites, they were 10-fold overrepresented in clusters with a neighboring, more frequently used 5′ss. Conversely, these more frequently used neighbors contained only the dinucleotides GT, GC, and TT, in accordance with the above ranking.
- Received February 8, 2018.
- Accepted October 20, 2018.
- Published by Cold Spring Harbor Laboratory Press
This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.











