Table 1.
Protein benchmarks for homology/similarity detection
| Benchmark<sup>a</sup> | # Pairs (homologs) | Homolog definition | Example protein [domain architecture] |
|---|---|---|---|
| pfam-max50 | 10,450 (5228) | Identical domain architecture; <50 aa between domains | Q9VFJ2 [PF03946, PF00298] P53875 [PF03946, PF00298] |
| pfam-nomax50 | 71,988 (36,278) | Identical domain architecture; no constraint on the number of amino acids between domains | Q15149 [PF03501, CL0188, CL0188, PF00681] Q9QXS1 [PF03501, CL0188, CL0188, PF00681] |
| pfam-local | 15,273 (7602) | Share some domains, but not all | P40791 [PF00319, PF12347] Q8VWM8 [PF00319, PF01486] |
| gene3d-nomax50 | 58,163 (29,109) | Same as pfam-nomax50 but based on CATH domains | P52917 [1.20.58.280, 3.40.50.300] Q9ZNT0 [1.20.58.280, 3.40.50.300] |
| supfam-nomax50 | 49,365 (24,708) | Same as pfam-nomax50 but based on SCOP domains | Q9T0N8 [56176, 55103] P46681 [56176, 55103] |
<sup>a</sup>Benchmark names (pfam-max50, gene3d-nomax50, and so on) indicate the domain database used to define homologs. The second column lists the total number of pairs in each benchmark, with the number of homologous pairs in parentheses. All benchmarks consist of full-length proteins. The third column gives each benchmark's definition of homology, and the last column shows the domain architectures of an example protein pair.
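
The homolog definitions in the third column can be read as simple predicates over domain architectures. The following Python sketch illustrates one possible reading and is not the procedure used to build the benchmarks. The function names, the representation of an architecture as an ordered list of domain identifiers, the treatment of domain order as significant, and the representation of inter-domain gaps as a list of residue counts are all assumptions made for illustration.

```python
from typing import Sequence


def same_architecture(a: Sequence[str], b: Sequence[str]) -> bool:
    """Identical domain architecture.

    Assumption: "identical" means the same domains in the same order.
    """
    return list(a) == list(b)


def gaps_below(gaps: Sequence[int], limit: int = 50) -> bool:
    """max50-style constraint: every inter-domain gap is shorter than `limit` aa.

    Assumption: `gaps` holds the number of residues between consecutive domains.
    """
    return all(g < limit for g in gaps)


def shares_some_not_all(a: Sequence[str], b: Sequence[str]) -> bool:
    """pfam-local-style definition: at least one shared domain, but not all."""
    shared = set(a) & set(b)
    return bool(shared) and set(a) != set(b)


# Example architectures taken from the last column of Table 1.
pfam_max50_pair = (["PF03946", "PF00298"], ["PF03946", "PF00298"])
pfam_local_pair = (["PF00319", "PF12347"], ["PF00319", "PF01486"])

print(same_architecture(*pfam_max50_pair))    # identical architectures
print(shares_some_not_all(*pfam_local_pair))  # one shared domain, one differing
```

Under this reading, a pfam-max50 pair would satisfy both `same_architecture` and `gaps_below` for each protein, whereas a nomax50 pair would only need `same_architecture`.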











