EnDeep4mC predicts DNA N4-methylcytosine sites using a dual-adaptive feature encoding framework in deep ensembles

Table 1. Performance comparison between the EnDeep4mC ensemble model and each base model, trained with fivefold cross-validation on data sets from six species

| Data set | Algorithm | ACC | SN | SP | MCC | AUC | F1-score |
|---|---|---|---|---|---|---|---|
| C. elegans | CNN | 0.9252 | 0.9559 | 0.8946 | 0.8529 | 0.9812 | 0.9278 |
| | Bi-LSTM | 0.9231 | 0.9514 | 0.8948 | 0.8461 | 0.9792 | 0.9246 |
| | Transformer | 0.9296 | 0.9589 | 0.9003 | 0.8590 | 0.9829 | 0.9308 |
| | EnDeep4mC | 0.9571 | 0.9594 | 0.9548 | 0.9142 | 0.9914 | 0.9572 |
| D. melanogaster | CNN | 0.9191 | 0.9530 | 0.8851 | 0.8401 | 0.9758 | 0.9216 |
| | Bi-LSTM | 0.9169 | 0.9502 | 0.8836 | 0.8379 | 0.9738 | 0.9205 |
| | Transformer | 0.9246 | 0.9540 | 0.8951 | 0.8512 | 0.9778 | 0.9270 |
| | EnDeep4mC | 0.9412 | 0.9508 | 0.9316 | 0.8826 | 0.9842 | 0.9418 |
| A. thaliana | CNN | 0.8727 | 0.8933 | 0.8522 | 0.7489 | 0.9453 | 0.8758 |
| | Bi-LSTM | 0.8702 | 0.8781 | 0.8623 | 0.7423 | 0.9423 | 0.8727 |
| | Transformer | 0.8789 | 0.8962 | 0.8617 | 0.7587 | 0.9491 | 0.8811 |
| | EnDeep4mC | 0.9133 | 0.9205 | 0.9061 | 0.8267 | 0.9697 | 0.9133 |
| E. coli | CNN | 0.9619 | 0.9513 | 0.9725 | 0.9438 | 0.9933 | 0.9718 |
| | Bi-LSTM | 0.9644 | 0.9437 | 0.9850 | 0.9251 | 0.9972 | 0.9623 |
| | Transformer | 0.9675 | 0.9613 | 0.9737 | 0.9376 | 0.9947 | 0.9686 |
| | EnDeep4mC | 0.9973 | 0.9969 | 0.9976 | 0.9945 | 0.9999 | 0.9973 |
| G. subterraneus | CNN | 0.8525 | 0.8725 | 0.8325 | 0.7102 | 0.9334 | 0.8520 |
| | Bi-LSTM | 0.8551 | 0.8545 | 0.8557 | 0.7142 | 0.9309 | 0.8578 |
| | Transformer | 0.8514 | 0.8527 | 0.8500 | 0.7024 | 0.9282 | 0.8513 |
| | EnDeep4mC | 0.9349 | 0.9323 | 0.9375 | 0.8698 | 0.9786 | 0.9347 |
| G. pickeringii | CNN | 0.9075 | 0.9437 | 0.8712 | 0.8233 | 0.9722 | 0.9138 |
| | Bi-LSTM | 0.9180 | 0.9384 | 0.8976 | 0.8345 | 0.9742 | 0.9186 |
| | Transformer | 0.9107 | 0.9346 | 0.8868 | 0.8241 | 0.9709 | 0.9127 |
| | EnDeep4mC | 0.9729 | 0.9740 | 0.9718 | 0.9457 | 0.9937 | 0.9729 |
  • Bold values indicate the best performance achieved within each comparison group.
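
For readers unfamiliar with the column abbreviations, the sketch below shows how such metrics are commonly computed for a binary classifier. It assumes the standard definitions of accuracy (ACC), sensitivity (SN), specificity (SP), Matthews correlation coefficient (MCC), F1-score, and area under the ROC curve (AUC); the table itself does not spell out the formulas, so this is an assumption rather than the authors' evaluation code. The function names and the example counts are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Compute ACC, SN, SP, MCC and F1 from confusion-matrix counts.

    Assumes the usual textbook definitions; undefined ratios fall back to 0.0.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    acc = ratio(tp + tn, tp + tn + fp + fn)
    sn = ratio(tp, tp + fn)            # sensitivity (recall on positives)
    sp = ratio(tn, tn + fp)            # specificity (recall on negatives)
    f1 = ratio(2 * tp, 2 * tp + fp + fn)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = ratio(tp * tn - fp * fn, mcc_den)
    return {"ACC": acc, "SN": sn, "SP": sp, "MCC": mcc, "F1": f1}


def roc_auc(labels, scores):
    """AUC as the probability that a positive outscores a negative.

    Ties count as half a win. This pairwise form is quadratic in the
    number of samples, which is fine for a small illustration.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical counts and scores, for illustration only.
example = binary_metrics(tp=90, tn=80, fp=20, fn=10)
example_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

Unlike the other five metrics, AUC cannot be derived from a single confusion matrix, because it depends on the predicted scores across all decision thresholds.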

Genome Res. 36: 589-599
