
Comprehensive assessment of the accuracy and stability of four consensus strategies. (A) Consensus clustering results of the four consensus strategies on seven baseline algorithms, sorted in ascending order of average ARI. (B) Accuracy of the four consensus strategies for different numbers of baseline algorithms. We randomly selected one to six baseline algorithms for the mouse brain data, DLPFC sample 151672, and the human breast data, and one to five baseline algorithms for the mouse olfactory bulb data, and performed consensus clustering 20 times for each scenario. (C) Average ARI of the four consensus strategies in the “single method” and “all methods” situations. The size of each dot represents the mean ARI over 10 repeated experiments; the color of each dot indicates stability. (D) Comparison of F1 scores across cell types for the consensus strategies (top four panels) and the baseline methods (bottom seven panels).
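The evaluation protocol in panel (B) can be illustrated with a minimal sketch. It computes the adjusted Rand index (ARI) from the contingency counts of two labelings, then repeats a random draw of k baseline labelings and scores the resulting consensus against the reference annotation. The function names `adjusted_rand_index` and `evaluate_subsets`, the `consensus_fn` argument, and the fixed seed are assumptions made for illustration; the four consensus strategies themselves are not specified here and must be supplied by the caller.

```python
from collections import Counter
from math import comb
import random


def adjusted_rand_index(labels_true, labels_pred):
    """ARI between two labelings of the same spots (standard pair-counting form)."""
    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two spots: the labelings agree trivially

    # Pairs of spots that share a cluster in both, the first, or the second labeling.
    both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    in_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    in_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = in_true * in_pred / total_pairs
    maximum = 0.5 * (in_true + in_pred)
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings have a single cluster
    return (both - expected) / (maximum - expected)


def evaluate_subsets(baselines, truth, consensus_fn, k, repeats=20, seed=0):
    """Draw k baseline labelings at random, build a consensus, and score it by ARI.

    baselines    : dict mapping a method name to its list of cluster labels
    consensus_fn : hypothetical callable that merges a list of labelings into one
    """
    rng = random.Random(seed)
    names = sorted(baselines)
    scores = []
    for _ in range(repeats):
        chosen = rng.sample(names, k)
        consensus = consensus_fn([baselines[name] for name in chosen])
        scores.append(adjusted_rand_index(truth, consensus))
    return scores
```

Calling `evaluate_subsets` once per value of k (one to six, or one to five, depending on the dataset) with `repeats=20` would yield the per-scenario ARI distributions of the kind summarized in panel (B). The spread of the returned scores is one possible proxy for stability, although the exact stability measure used in the figure is not stated in the caption.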











