Abstract

Cell type annotation is an essential task in single-cell data analysis. Various reference-based methods provide rapid annotation for diverse single-cell data. However, the selection of optimal references and methods is often overlooked. To this end, we present CAMUS, a cross–data set cell type annotation methodology with a universal reference data and method selection strategy, to achieve highly accurate and efficient annotations. We demonstrate the advantages of CAMUS through comprehensive analyses of 672 pairs of cross-species scRNA-seq data sets. Annotations using references selected by CAMUS achieve substantial accuracy gains (25.0%–124.7%) over random selection across five reference-based methods. CAMUS selects the best reference–method pair among 3360 pairs with 49.1% accuracy. Moreover, CAMUS shows high accuracy in selecting the best method on 80 scST data sets (82.5%) and five scATAC-seq data sets (100.0%), illustrating its universal applicability. In addition, we combine the CAMUS score with other metrics to predict annotation accuracy, providing direct guidance on whether to accept the current annotation results.
