Parameter-efficient fine-tuning on large protein language models improves signal peptide prediction

Table 2.

Benchmark MCC2 results for SignalP 6.0, full fine-tuning of ESM2-3B, and PEFT-SP models using different PEFT methods with the ESM2-3B backbone

| SP types | SignalP 6.0 | Fine-tuning ESM2-3B | Prompt tuning ESM2-3B | Adapter tuning ESM2-3B | LoRA ESM2-3B |
|---|---|---|---|---|---|
| Archaea Sec/SPI | 0.793 | 0.771 | 0.777 | **0.825** | 0.783 |
| Archaea Sec/SPII | 0.825 | **0.864** | 0.509 | 0.783 | 0.730 |
| Archaea Sec/SPIII^a | 0.426 | 0.724 | 0.500 | 0.351 | **0.798** |
| Archaea Tat/SPI | 0.563 | 0.564 | **0.653** | 0.538 | 0.579 |
| Archaea Tat/SPII^a | 0.718 | 0.792 | 0.182 | 0.660 | **0.850** |
| Eukarya Sec/SPI | 0.958 | 0.948 | 0.954 | 0.954 | **0.960** |
| Negative Sec/SPI | 0.804 | 0.813 | 0.723 | **0.820** | 0.809 |
| Negative Sec/SPII | 0.929 | 0.946 | 0.886 | **0.950** | 0.945 |
| Negative Sec/SPIII^a | 0.902 | **0.982** | 0.970 | 0.899 | 0.919 |
| Negative Tat/SPI | **0.962** | 0.902 | 0.853 | 0.899 | 0.961 |
| Negative Tat/SPII^a | 0.486 | 0.358 | 0.325 | 0.405 | **0.520** |
| Positive Sec/SPI | 0.770 | 0.810 | 0.746 | 0.814 | **0.848** |
| Positive Sec/SPII | 0.882 | 0.908 | 0.833 | 0.911 | **0.939** |
| Positive Sec/SPIII^a | 0.902 | **1.000** | 0.951 | 0.969 | **1.000** |
| Positive Tat/SPI | 0.799 | 0.746 | 0.590 | 0.752 | **0.850** |
| Positive Tat/SPII^a | 0.786 | 0.603 | 0.148 | 0.669 | 0.783 |
| Mean (MCC2) | 0.781 | 0.796 | 0.663 | 0.762 | **0.830** |
  • Bold values indicate the highest value for each SP type among all methods.

  • ^a SP types with limited training samples.
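The table's metric is the Matthews correlation coefficient (in the SignalP benchmarking convention, MCC2 typically denotes MCC computed with sequences of the other SP classes included in the negative set). A minimal sketch of the core MCC computation from confusion-matrix counts (the function name `mcc` and the toy counts are illustrative, not from the paper):

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0  # define MCC = 0 when a margin is empty

# Toy example: 6 true positives, 3 true negatives, 1 false positive, 2 false negatives.
print(round(mcc(6, 3, 1, 2), 3))  # → 0.478
```

MCC ranges from −1 to 1 and, unlike accuracy, stays informative on the heavily imbalanced SP-type classes in this benchmark.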

Genome Res. 34: 1445–1454