
lincRNA sequence composition and conservation. (A) A/U content of lincRNAs and ancRNAs, compared to that of mRNA 5′ UTRs, 3′ UTRs, and coding regions, and that of intergenic regions. Box and whisker plots indicate the median, interquartile range (IQR) between 25th and 75th percentiles (box), and 1.5 × IQR (whisker). (B) A/U content of lincRNAs antisense to abundant 22G-RNAs (≥5 RPKM) and those antisense to less abundant or no 22G-RNAs (<5 RPKM); otherwise, as in A. (C) The fraction of mRNAs containing annotated repeat elements. (D) The fraction of lincRNAs containing annotated repeat elements. (E) The fraction of residues aligned in multiple-genome alignments for the indicated mRNA and lincRNA regions. Control exons were generated by random selection of a length-matched region from intergenic space of the same chromosome; within this control region, exons were assigned to the same relative positions as in the authentic lincRNA locus. Annotated repeats were removed from the control exons, lincRNA exons, and lincRNA introns prior to analysis. (F) Conservation of lincRNA and mRNA introns and exons. Shown are cumulative distributions of mean phastCons scores derived from the six-way whole-genome alignments (Siepel et al. 2005). Control exons were as in E. (G) Relationship between mapping to 22G-RNAs and sequence conservation. lincRNAs were assigned to three groups based on the abundance (RPKM) of antisense-mapping 22G-RNAs. Shown are cumulative distributions of mean phastCons scores (Siepel et al. 2005) for each group. (H) Lengths of conserved regions within exons. For each exon that had an average phastCons score > 0, the maximum length of regions exceeding a phastCons score of 0.5 was measured. For CDS exons, 1000 length-matched exons were randomly selected from coding regions.
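
The per-sequence quantities plotted in A and H can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the pipeline used for the figure: the function names (`au_content`, `longest_conserved_run`, `max_conserved_length`) are hypothetical, sequences are assumed to be plain strings, and phastCons scores are assumed to be supplied as one value per exon position with no missing entries. How unaligned or unscored positions were handled in the original analysis is not stated in the legend.

```python
from typing import Optional, Sequence


def au_content(seq: str) -> float:
    """Fraction of residues that are A or U (T is counted as U for DNA input)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(seq.count(base) for base in "ATU") / len(seq)


def longest_conserved_run(scores: Sequence[float], threshold: float = 0.5) -> int:
    """Length of the longest stretch of consecutive positions scoring above threshold."""
    best = current = 0
    for score in scores:
        # Extend the current run while the score exceeds the threshold, else reset it.
        current = current + 1 if score > threshold else 0
        best = max(best, current)
    return best


def max_conserved_length(
    scores: Sequence[float], threshold: float = 0.5
) -> Optional[int]:
    """Panel H style measurement for one exon.

    Returns None when the exon's mean score is not greater than 0
    (such exons are excluded from panel H); otherwise returns the
    length of the longest run of positions scoring above the threshold.
    """
    if not scores or sum(scores) / len(scores) <= 0:
        return None
    return longest_conserved_run(scores, threshold)
```

Applied to each exon in turn, `max_conserved_length` would give the distribution of lengths summarized in H, and `au_content` the values summarized as box plots in A and B.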
