
Comparison of embedding-based similarity between G6PD-containing proteins. (A) Domain-level embedding similarity computed by DCTdomain; (B) whole-protein embedding similarity computed by DCTglobal. Domain-level embeddings are better at capturing similarity between protein pairs that share only local similarity: the distribution of similarity scores for such pairs (blue) shifts toward that of global homologs (orange) when domain-level embeddings (A) are used instead of whole-protein embeddings (B).











