Table 1.

Data sets used for method evaluation, with their corresponding mRNA source and target property.

| Data set | Target | Category | No. of mRNAs | Seq length |
| --- | --- | --- | --- | --- |
| MLOS flu vaccines (Sanofi-Aventis) | Expression | Regression | 543 | 1698–1704 |
| mRFP expression (Nieuwkoop et al. 2023) | Expression | Regression | 1459 | 678–678 |
| Fungal expression (Wint et al. 2022) | Expression | Regression | 7056 | 150–3000 |
| E. coli proteins (Ding et al. 2022) | Expression | Classification | 6348 | 171–3000 |
| Tc-riboswitches (Groher et al. 2019) | Switching factor | Regression | 355 | 67–73 |
| mRNA stability (Diez et al. 2022) | Stability | Regression | 41,123 | 30–1497 |
| SARS-CoV-2 vaccine degradation (Wayment-Steele et al. 2022) | Degradation | Regression | 2400 | 81–81 |

[i] Each data set was split into training, validation, and test sets in a 0.70:0.15:0.15 ratio. All methods were optimized on the same data split.
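
The footnote gives the split ratio but not how the split was drawn. As an illustration only, the following is a minimal sketch that assumes a single random shuffle with a fixed seed; the function name `split_dataset`, the `seed` parameter, and the use of integer indices as stand-ins for mRNA records are hypothetical and not taken from the source.

```python
import random


def split_dataset(records, ratios=(0.7, 0.15, 0.15), seed=0):
    """Shuffle records once and cut them into train/validation/test lists.

    Assumption: a plain random split. The source does not state whether the
    split was random, stratified, or grouped by sequence similarity.
    """
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    items = list(records)
    # A private, seeded generator makes the partition reproducible.
    random.Random(seed).shuffle(items)
    n_train = round(ratios[0] * len(items))
    n_val = round(ratios[1] * len(items))
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test


# Hypothetical usage: 2400 record indices, matching the size of the smallest
# fixed-length data set in Table 1.
train, val, test = split_dataset(range(2400))
print(len(train), len(val), len(test))
```

Because the shuffle is seeded, calling `split_dataset` again with the same inputs returns the same partition, which is one simple way to make sure every method is trained and evaluated on an identical split.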