Highly accurate reference and method selection for universal cross–data set cell type annotation with CAMUS
- Qunlun Shen1,2,
- Shuqin Zhang1,3 and
- Shihua Zhang2,4,5
- 1School of Mathematical Sciences, Fudan University, Shanghai 200433, China;
- 2State Key Laboratory of Mathematical Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China;
- 3Center for Applied Mathematics, Research Institute of Intelligent Complex Systems, and Shanghai Key Laboratory for Contemporary Applied Mathematics, Fudan University, Shanghai 200433, China;
- 4School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China;
- 5Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou 310024, China
Abstract
Cell type annotation is an essential task in single-cell data analysis. Various reference-based methods provide rapid annotation for diverse single-cell data. However, the selection of optimal references and methods is often overlooked. To this end, we present CAMUS, a cross–data set cell type annotation methodology with a universal reference data and method selection strategy, to achieve highly accurate and efficient annotations. We demonstrate the advantages of CAMUS through comprehensive analyses of 672 pairs of cross-species scRNA-seq data sets. Annotations with references selected by CAMUS achieve substantial accuracy gains (25.0%–124.7%) over random selection strategies across five reference-based methods. CAMUS achieves high accuracy (49.1%) in choosing the best reference–method pair among 3360 pairs. Moreover, CAMUS shows high accuracy in selecting the best methods on the 80 scST data sets (82.5%) and five scATAC-seq data sets (100.0%), illustrating its universal applicability. In addition, we combine the CAMUS score with other metrics to predict annotation accuracy, providing direct guidance on whether to accept the current annotation results.
Footnotes
- [Supplemental material is available for this article.]
- Article published online before print. Article, supplemental material, and publication date are at https://www.genome.org/cgi/doi/10.1101/gr.280821.125.
- Received April 20, 2025.
- Accepted September 25, 2025.
This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.











