AUC and total runtime for each method (listed from least to most accurate) on the pfam-local benchmark with 15,273 protein pairs
| | UBLAST | USEARCH | FASTA | phmmer | BLAST | CS-BLAST | HHsearch | DCTdomain^a |
|---|---|---|---|---|---|---|---|---|
| AUC | 0.840 | 0.906 | 0.906 | 0.924 | 0.951 | 0.952 | 0.971 | 0.972 |
| Time | 237 sec | 156 sec | 749 sec | 993 sec | 468 sec | 50 min | 5.7 h | 6.6 sec/47 min |

^a Our program reports both DCTglobal and DCTdomain scores, and the reported time covers computing both. Using DCTglobal scores alone resulted in very low accuracy (AUC of only 0.665; see Figure 5). PROST is not included in this table because its AUC is also very low. Given the DCT fingerprints, DCTdomain is extremely fast, taking only a few seconds to compare all the pairs; it is still relatively fast (47 min) even when DCT fingerprint generation (ESM-2 embedding and RecCut) is included for all 13,407 proteins in those pairs. All programs were run on the same Linux computer using one CPU and GPU (embedding was done on the GPU).
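The AUC values in the table summarize how well each method's scores separate related from unrelated protein pairs. As a minimal sketch of that metric (not the evaluation code used for the benchmark), the snippet below computes AUC as the probability that a randomly chosen positive pair scores higher than a randomly chosen negative pair, with ties counted as one half. It assumes each pair carries a similarity score and a binary label (1 for a related pair, 0 otherwise); the function name `auc_from_scores` and the toy data are hypothetical.

```python
def auc_from_scores(scores, labels):
    """Rank-based AUC: fraction of (positive, negative) pairs ranked correctly.

    scores: similarity scores, one per protein pair (higher = more similar).
    labels: 1 for a related pair, 0 for an unrelated pair (assumed encoding).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative pair")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Toy data (illustrative only, not from the benchmark).
toy_scores = [0.9, 0.4, 0.5, 0.3]
toy_labels = [1, 1, 0, 0]
toy_auc = auc_from_scores(toy_scores, toy_labels)
print(f"AUC = {toy_auc:.3f}")
```

The double loop is quadratic in the number of pairs, which keeps the sketch easy to read; a sort-based rank-sum formulation gives the same value more efficiently on large benchmarks.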











