Parameter-efficient fine-tuning on large protein language models improves signal peptide prediction

Shuai Zeng; Duolin Wang; Lei Jiang; Dong Xu

doi:10.1101/gr.279132.124

Method

- Department of Electrical Engineering and Computer Science, Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, Missouri 65211, USA

Published July 26, 2024. Vol 34 Issue 9, pp. 1445-1454. https://doi.org/10.1101/gr.279132.124

Abstract

Signal peptides (SPs) play a crucial role in protein translocation in cells. The development of large protein language models (PLMs) and prompt-based learning provides a new opportunity for SP prediction, especially for categories with limited annotated data. We present PEFT-SP, a parameter-efficient fine-tuning (PEFT) framework for SP prediction that makes effective use of pretrained PLMs. We integrated low-rank adaptation (LoRA) into ESM-2 models to better leverage the evolutionary knowledge of protein sequences captured by PLMs. Experiments show that PEFT-SP using LoRA improves on state-of-the-art results, with a maximum Matthews correlation coefficient (MCC) gain of 87.3% for SPs with small training samples and an overall MCC gain of 6.1%. We also employed two other PEFT methods, prompt tuning and adapter tuning, in ESM-2 for SP prediction. Further experiments show that PEFT-SP using adapter tuning can also improve on the state-of-the-art results, with an MCC gain of up to 28.1% for SPs with small training samples and an overall MCC gain of 3.8%. LoRA requires fewer computing resources and less memory than adapter tuning during training, making it possible to adapt larger and more powerful protein models for SP prediction.
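
As a rough illustration of the low-rank adaptation idea mentioned in the abstract, the following is a minimal sketch in plain Python, not the PEFT-SP implementation. It assumes the standard LoRA formulation, in which a frozen weight matrix W is augmented with a trainable update B·A of small rank. The class name `LoRALinear`, the helper `lora_param_counts`, the `alpha` scaling, and all sizes are hypothetical choices made for this example only.

```python
import random


def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


class LoRALinear:
    """Frozen weight W (d_out x d_in) plus a trainable low-rank update B @ A.

    Hypothetical illustration: only A (rank x d_in) and B (d_out x rank)
    would be updated during fine-tuning; W stays fixed.
    """

    def __init__(self, W, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(W), len(W[0])
        self.W = W  # pretrained weight, kept frozen
        # A starts with small random values, B starts at zero, so the
        # adapted layer initially reproduces the pretrained layer exactly.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank  # assumed scaling convention

    def forward(self, x):
        base = matvec(self.W, x)                    # W x
        update = matvec(self.B, matvec(self.A, x))  # B (A x)
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_params(self):
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


def lora_param_counts(d_out, d_in, rank):
    """Return (full fine-tuning params, LoRA params) for one weight matrix."""
    return d_out * d_in, rank * (d_out + d_in)


if __name__ == "__main__":
    identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    layer = LoRALinear(identity, rank=2)
    print(layer.forward([1.0, 2.0, 3.0, 4.0]))
    print(lora_param_counts(1280, 1280, 8))  # illustrative sizes only
```

Because the number of trainable values grows with rank × (d_out + d_in) rather than d_out × d_in, the update is far smaller than the matrix it adapts, which is consistent with the abstract's remark that LoRA needs fewer computing resources during training.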
