scSHEFT enables multiomics label transfer from scRNA-seq to scATAC-seq through dual alignment

  1. Min Li1
  1. School of Computer Science and Engineering, University, Changsha, Hunan, 410083, China;
  2. Bioinformatics Institute, Agency for Science, Technology and Research (A*STAR), 138673, Singapore;
  3. Immunology Translational Research Program, Department of Microbiology and Immunology, Yong Loo Lin School of Medicine, National University of Singapore (NUS), 117545, Singapore
  • Corresponding author: limin{at}mail.csu.edu.cn
  • Abstract

    With the growing abundance of single-cell multiomics data, labels are increasingly transferred from well-annotated scRNA-seq data to less-annotated omics data, such as scATAC-seq. This approach leverages the gene expression profiles available in scRNA-seq to help annotate common cell types, and even novel cell types, in other omics data. However, the heterogeneous features of scRNA-seq and scATAC-seq make it difficult to distinguish cell types, which hinders the discovery of novel types. In this study, we propose a new label transfer tool, scSHEFT, which takes gene expression count data, peak count data, and Gene Activity Scores as joint inputs to bridge the gap between heterogeneous features. Specifically, we transform scATAC-seq data into Gene Activity Scores based on prior knowledge to harmonize the heterogeneous features. Because this feature transformation results in information loss, we introduce the raw ATAC-seq embeddings to preserve the original information. To balance interomics alignment against intraomics heterogeneity, we propose a dual alignment strategy: scSHEFT employs an anchor-based approach to align interomics anchor pairs and a contrastive-based strategy to preserve cellular heterogeneity within each omics layer. Benchmarking scSHEFT against 11 state-of-the-art methods across seven data sets demonstrates its superiority in handling data sets of varying scale and technical noise.
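
    Two ideas named in the abstract, Gene Activity Scores and interomics anchor pairs, can be illustrated with a small sketch. This is not the scSHEFT implementation: the function names, the binary peak-to-gene matrix, the toy data, and the choice of mutual nearest neighbors under cosine similarity as the anchor rule are all assumptions made for illustration.

```python
import numpy as np

def gene_activity(peak_counts, peak_to_gene):
    """Aggregate peak counts into per-gene scores.

    peak_counts:  cells x peaks count matrix (scATAC-seq).
    peak_to_gene: peaks x genes 0/1 matrix standing in for prior knowledge
                  (assumed here: 1 if a peak overlaps the gene or its promoter).
    """
    return peak_counts @ peak_to_gene

def mutual_nearest_anchors(rna, atac):
    """Return (rna_idx, atac_idx) pairs that are each other's nearest
    neighbor by cosine similarity. The anchor rule is an assumption."""
    eps = 1e-12  # guards against division by zero for all-zero cells
    rna_n = rna / np.maximum(np.linalg.norm(rna, axis=1, keepdims=True), eps)
    atac_n = atac / np.maximum(np.linalg.norm(atac, axis=1, keepdims=True), eps)
    sim = rna_n @ atac_n.T              # RNA cells x ATAC cells
    best_atac = sim.argmax(axis=1)      # best ATAC match for each RNA cell
    best_rna = sim.argmax(axis=0)       # best RNA match for each ATAC cell
    return [(i, int(j)) for i, j in enumerate(best_atac) if best_rna[j] == i]

# Toy data (hypothetical): 2 ATAC cells x 3 peaks, 3 peaks x 2 genes.
peak_counts = np.array([[2.0, 0.0, 1.0],
                        [0.0, 3.0, 0.0]])
peak_to_gene = np.array([[1.0, 0.0],
                         [0.0, 1.0],
                         [1.0, 0.0]])
gas = gene_activity(peak_counts, peak_to_gene)

# 2 RNA cells x 2 genes, in the same gene order as the activity scores.
rna_expr = np.array([[5.0, 0.0],
                     [0.0, 4.0]])
anchors = mutual_nearest_anchors(rna_expr, gas)
```

    In this toy setting each RNA cell pairs with the ATAC cell whose gene activity profile points in the same direction. A real pipeline would normalize the counts and typically search for anchors in a learned embedding rather than in raw gene space.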

    Footnotes

    • Received January 8, 2025.
    • Accepted November 14, 2025.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
