Highly accurate reference and method selection for universal cross-dataset cell type annotation with CAMUS

  Shihua Zhang (3, 4)

  1. Fudan University, Academy of Mathematics and Systems Science, Chinese Academy of Sciences
  2. Fudan University
  3. Academy of Mathematics and Systems Sciences, Chinese Academy of Sciences, University of Chinese Academy of Sciences

  * Corresponding author; email: zsh{at}amss.ac.cn
  • Abstract

    Cell type annotation is a critical task in single-cell data analysis. Various reference-based methods provide rapid annotation for diverse single-cell data. However, the question of how to select the optimal reference and method is often overlooked. To this end, we present a cross-dataset cell-type annotation methodology with a universal reference data and method selection strategy (CAMUS) to achieve highly accurate and efficient annotations. We demonstrate the advantages of CAMUS through comprehensive analyses of 672 pairs of cross-species scRNA-seq datasets. Annotations based on references selected by CAMUS achieved substantial accuracy gains (25.0-124.7%) over random selection strategies across five reference-based methods. CAMUS identified the best reference-method pair among 3360 pairs with an accuracy of 49.1%. Moreover, CAMUS selected the best method with high accuracy on 80 scST datasets (82.5%) and five scATAC-seq datasets (100.0%), illustrating its universal applicability. In addition, we combined the CAMUS score with other metrics to predict annotation accuracy, providing direct guidance on whether to accept the current annotation results.
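
    The abstract reports two kinds of figures: a relative accuracy gain over random selection, and the rate at which the top-scored candidate is also the most accurate one. The abstract does not define either quantity, so the sketch below only illustrates one plausible reading. The function names, the formulas, and the toy numbers are all assumptions made for illustration; they are not the CAMUS implementation and not data from the paper.

```python
from statistics import mean


def relative_gain(selected_acc, baseline_accs):
    """Percent gain of the selected candidate over the mean baseline accuracy.

    Assumption: "random selection" is approximated by the mean accuracy
    over all candidate references.
    """
    baseline = mean(baseline_accs)
    return 100.0 * (selected_acc - baseline) / baseline


def top1_selection_rate(tasks):
    """Percent of tasks in which the highest-scored candidate is also the most accurate.

    Each task is a (scores, accuracies) pair of dicts keyed by candidate name.
    The scores stand in for any selection score; they are hypothetical here.
    """
    hits = 0
    for scores, accuracies in tasks:
        picked = max(scores, key=scores.get)        # candidate the score selects
        if accuracies[picked] == max(accuracies.values()):
            hits += 1
    return 100.0 * hits / len(tasks)


# Toy values, invented purely for illustration.
tasks = [
    ({"ref_A": 0.9, "ref_B": 0.4}, {"ref_A": 0.80, "ref_B": 0.50}),  # score picks the best
    ({"ref_A": 0.3, "ref_B": 0.7}, {"ref_A": 0.75, "ref_B": 0.60}),  # score picks the worse one
]

gain = relative_gain(0.80, [0.80, 0.50])
rate = top1_selection_rate(tasks)
print(f"relative gain: {gain:.1f}%, top-1 selection rate: {rate:.1f}%")
```

    Under this reading, a figure such as 49.1% would be the top-1 selection rate over all evaluated tasks, and the 25.0-124.7% range would be the spread of relative gains across the five methods.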

    • Received April 20, 2025.
    • Accepted September 25, 2025.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.

    ACCEPTED MANUSCRIPT

    Genome Res. gr.280821.125. Published by Cold Spring Harbor Laboratory Press.
