Differential expression results for the PBMC3K and mouse brain data. (A) Number of statistically significant genes, out of the 2000 most variable genes, in the PBMC3K data; the best results are those closest to the result for the unaltered original data. (B) Number of statistically significant genes, out of the 2000 most variable genes, in the PBMC3K data when the original uncorrected count data are tested but the clusters are calculated on the corrected count data. (C) Percentage of the genes that are statistically significant after correction in the PBMC3K data and are also found to be significant in the uncorrected data, for the cluster-only model. The color indicates which count matrix was used for clustering. For uncorrected counts, the uncorrected count matrix was used for model testing, but the cell type clusters were calculated on the corrected count matrix. A lower percentage indicates that batch correction alters the gene expression profiles of the clusters. (D–F) The same as A–C, respectively, but for the mouse brain data.
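
As an illustration of the quantity plotted in panels C and F, the overlap percentage can be sketched as follows. This is a minimal sketch, not the authors' code: the function name `percent_retained` and the gene lists are hypothetical, and it assumes the denominator is the set of genes significant after correction, as the caption's wording suggests.

```python
def percent_retained(corrected_sig, uncorrected_sig):
    """Percentage of genes significant after correction that are
    also significant in the uncorrected data (assumed definition)."""
    corrected = set(corrected_sig)
    if not corrected:
        # No significant genes after correction: percentage is undefined.
        return float("nan")
    shared = corrected & set(uncorrected_sig)
    return 100.0 * len(shared) / len(corrected)


# Hypothetical gene lists, for illustration only.
corrected_sig = ["CD3E", "MS4A1", "NKG7", "LYZ"]
uncorrected_sig = ["CD3E", "MS4A1", "GNLY"]

print(percent_retained(corrected_sig, uncorrected_sig))  # 50.0
```

Under this definition, a value near 100 would mean that nearly all genes called significant after correction are also called significant without it.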
