Table 1.
Data sets used for method evaluation, with the mRNA source and target property of each.
| Data set | Target | Category | No. of mRNAs | Sequence length (nt) |
|---|---|---|---|---|
| MLOS flu vaccines (Sanofi-Aventis) | Expression | Regression | 543 | 1698–1704 |
| mRFP expression (Nieuwkoop et al. 2023) | Expression | Regression | 1459 | 678–678 |
| Fungal expression (Wint et al. 2022) | Expression | Regression | 7056 | 150–3000 |
| E. coli proteins (Ding et al. 2022) | Expression | Classification | 6348 | 171–3000 |
| Tc-riboswitches (Groher et al. 2019) | Switching factor | Regression | 355 | 67–73 |
| mRNA stability (Diez et al. 2022) | Stability | Regression | 41,123 | 30–1497 |
| SARS-CoV-2 vaccine degradation (Wayment-Steele et al. 2022) | Degradation | Regression | 2400 | 81–81 |

Each data set was split into training, validation, and test sets in a 0.7:0.15:0.15 ratio. All methods were optimized on the same data split.
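The 0.7:0.15:0.15 split described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the source does not say whether the split was random, stratified, or seeded, so the shuffling, the fixed seed, and the function name `split_indices` are all assumptions. Fixing the seed is one simple way to give every method the same split.

```python
import random


def split_indices(n_items, ratios=(0.7, 0.15, 0.15), seed=0):
    """Shuffle indices 0..n_items-1 and cut them into train/validation/test.

    Assumption: a seeded random shuffle; the paper does not specify how
    the split was drawn.
    """
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")

    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # same seed -> same split for every method

    n_train = int(ratios[0] * n_items)
    n_val = int(ratios[1] * n_items)

    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Example with the size of the smallest expression data set in Table 1 (543 mRNAs).
train_idx, val_idx, test_idx = split_indices(543)
print(len(train_idx), len(val_idx), len(test_idx))
```

Because the set sizes are rounded down, any leftover items fall into the test set, so the three parts always cover the full data set without overlap.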











