Comparison of clustering performance of PCA versus VAE
| Metric | Method | Human data | Canine data |
|---|---|---|---|
| Pseudo F statistic | PCA | 50,315.56 | 151.89 |
| | VAE | **56,409.29** | **276.03** |
| Silhouette coefficient | PCA | 0.69 | **0.12** |
| | VAE | **0.77** | 0.07 |
| Davies–Bouldin index | PCA | 0.48 | 3.87 |
| | VAE | **0.29** | **3.39** |
PCA and VAE parameters have been fitted to human and canine SNP data sets of 839,629 and 198,473 SNP positions, respectively. Clustering metrics have been computed on seven self-reported human ancestry groups and 16 canine clades composed of 144 distinct canine breeds. The 2D latent coordinates of the samples have been standardized. Bold values indicate the better-performing method for each metric and data type (higher is better for Pseudo F and Silhouette; lower is better for Davies–Bouldin).
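As a rough illustration of how the three metrics in this table can be obtained from labeled 2D coordinates, the following is a minimal standard-library sketch, not the authors' implementation. It assumes that the Pseudo F statistic refers to the Calinski–Harabasz index and that distances are Euclidean. All function names and the toy coordinates are hypothetical. In practice, scikit-learn provides equivalent functions (`calinski_harabasz_score`, `silhouette_score`, `davies_bouldin_score` in `sklearn.metrics`).

```python
import math
from collections import defaultdict
from statistics import fmean, pstdev


def standardize(points):
    """Scale each coordinate to zero mean and unit (population) variance."""
    cols = list(zip(*points))
    mus = [fmean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [tuple((x - m) / s for x, m, s in zip(p, mus, sds)) for p in points]


def _groups(points, labels):
    """Collect the points belonging to each label."""
    groups = defaultdict(list)
    for p, lab in zip(points, labels):
        groups[lab].append(p)
    return groups


def _centroid(pts):
    return tuple(fmean(c) for c in zip(*pts))


def pseudo_f(points, labels):
    """Calinski-Harabasz index (assumed meaning of 'Pseudo F'); higher is better."""
    groups = _groups(points, labels)
    n, k = len(points), len(groups)
    overall = _centroid(points)
    between = sum(len(pts) * math.dist(_centroid(pts), overall) ** 2
                  for pts in groups.values())
    within = sum(math.dist(p, _centroid(pts)) ** 2
                 for pts in groups.values() for p in pts)
    return (between / (k - 1)) / (within / (n - k))


def silhouette(points, labels):
    """Mean silhouette coefficient over all samples; higher is better."""
    groups = _groups(points, labels)
    scores = []
    for p, lab in zip(points, labels):
        own = groups[lab]
        if len(own) == 1:          # convention: singleton clusters score 0
            scores.append(0.0)
            continue
        # Mean distance to the other members of the same group.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # Mean distance to the nearest other group.
        b = min(fmean(math.dist(p, q) for q in pts)
                for other, pts in groups.items() if other != lab)
        scores.append((b - a) / max(a, b))
    return fmean(scores)


def davies_bouldin(points, labels):
    """Davies-Bouldin index; lower is better."""
    groups = _groups(points, labels)
    cents = {lab: _centroid(pts) for lab, pts in groups.items()}
    scatter = {lab: fmean(math.dist(p, cents[lab]) for p in pts)
               for lab, pts in groups.items()}
    worst = [max((scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
                 for j in groups if j != i)
             for i in groups]
    return fmean(worst)


# Hypothetical toy coordinates: two compact groups, standardized as in the caption.
coords = standardize([(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)])
good = [0, 0, 0, 1, 1, 1]    # labels that match the geometry
mixed = [0, 1, 0, 1, 0, 1]   # labels that cut across the geometry

for name, labels in (("good", good), ("mixed", mixed)):
    print(name, pseudo_f(coords, labels), silhouette(coords, labels),
          davies_bouldin(coords, labels))
```

On this toy input, the labeling that matches the geometry should score higher on Pseudo F and Silhouette and lower on Davies–Bouldin than the mixed labeling, which is the direction of "better" used in the table.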











