Multidimensional RNA structural features, from primary to tertiary levels, associated with sequencing biases. (A) Synthetic spike-in RNAs were processed using the standard hexamer-based VAHTS Universal V8 RNA-seq library preparation protocol. Sequencing counts were obtained for each spike-in and used for downstream grouping. (B) Three-dimensional plot of the aggregated sequencing counts for each category across the 65,536 unique spike-in RNAs. Each RNA template is categorized by its GC-content and minimum free energy (MFE), with the data organized and centralized based on these two parameters. (C) UMAP of the one-hot encoded spike-in sequences reveals 16 major clusters among the 65,536 spike-ins. (D, E) Visualization of all spike-in RNAs, color-coded by GC-content cluster (D) and by MFE cluster (E). The nine GC-content categories and 120 MFE levels span the full range of the 65,536 spike-in sequences.
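
As an illustration of the grouping and encoding described in panels B to D, the following is a minimal sketch, not the authors' pipeline. It assumes the 65,536 spike-ins differ at eight fully randomized positions (4^8 = 65,536, which would also give nine GC-content categories, 0 to 8 G/C bases); the caption does not state this design explicitly. The names `K`, `gc_count`, and `one_hot` are hypothetical. MFE calculation and UMAP are omitted because they require external tools.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGU"
K = 8  # ASSUMPTION: eight randomized positions, since 4**8 == 65,536


def gc_count(seq):
    """Number of G or C bases in the sequence."""
    return sum(base in "GC" for base in seq)


def one_hot(seq):
    """Flatten a sequence into a 4*len(seq) binary vector (A, C, G, U order)."""
    vec = []
    for base in seq:
        vec.extend(1 if base == letter else 0 for letter in ALPHABET)
    return vec


# Enumerate every possible K-mer over the RNA alphabet.
sequences = ["".join(p) for p in product(ALPHABET, repeat=K)]

# Group sequences by GC count; with K = 8 this yields the categories 0..8.
gc_bins = Counter(gc_count(s) for s in sequences)

# One-hot matrix (one row per sequence) that could be passed to a
# dimensionality-reduction method such as UMAP.
encoded = [one_hot(s) for s in sequences]
```

Under this assumption each sequence becomes a 32-dimensional binary vector, and the GC-count bins are unequal in size because far more sequences have an intermediate GC count than an extreme one.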
