Protein domain embeddings for fast and accurate similarity search



Figure 3.

Comparison of the performance of every ESM-2 checkpoint (except t48), by layer, at determining whether sequence pairs from SCOPe v2.08 belong to the same fold or to different folds. Checkpoint t30 has the highest-performing layers, particularly layers 15 and 21, which we use as the two layers for generating the DCT fingerprints.
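
To make the idea of a two-layer DCT fingerprint concrete, the following is a minimal sketch and not the published implementation. It assumes each layer's output is a residues-by-dimensions matrix, applies a plain 2D DCT-II, and keeps only a small low-frequency block so that sequences of different lengths map to vectors of the same size. The function names, the `keep_rows` and `keep_cols` parameters, and the choice to truncate rather than normalize or quantize are all assumptions made for illustration; the caption itself specifies only that layers 15 and 21 of checkpoint t30 are used.

```python
import math


def dct2_1d(x):
    """Unnormalized DCT-II of a sequence of floats."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(n)
    ]


def dct2_2d(matrix):
    """Separable 2D DCT-II: transform every row, then every column."""
    rows = [dct2_1d(row) for row in matrix]
    cols = [dct2_1d(col) for col in zip(*rows)]
    # cols is indexed [dimension][row frequency]; transpose back.
    return [list(r) for r in zip(*cols)]


def layer_fingerprint(embedding, keep_rows, keep_cols):
    """Flatten the low-frequency top-left block of the 2D DCT.

    `embedding` is a (residues x dimensions) list of lists.
    `keep_rows` and `keep_cols` are hypothetical truncation sizes.
    """
    coeffs = dct2_2d(embedding)
    return [coeffs[r][c] for r in range(keep_rows) for c in range(keep_cols)]


def two_layer_fingerprint(layer_a, layer_b, keep_rows=3, keep_cols=4):
    """Concatenate the fingerprints of two layers (e.g. layers 15 and 21)."""
    return (
        layer_fingerprint(layer_a, keep_rows, keep_cols)
        + layer_fingerprint(layer_b, keep_rows, keep_cols)
    )
```

Under these assumptions, two domains of different lengths yield fingerprints of identical length (`2 * keep_rows * keep_cols`), so they can be compared directly with a vector distance.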

This Article

  1. Genome Res. 34: 1434-1444
