Supplemental_Data.zip contains the training and test data needed to reproduce the results in "Ultra high diversity factorizable libraries for efficient therapeutic discovery". A complementary file, "Supplemental_Code.zip", contains the corresponding code. Please visit https://github.com/gifford-lab/FactorizableLibrary for additional information.

Contents:
data - contains training and test data splits
calibration.txt - contains the sequences used to estimate the standard deviation of the models
encoding.pkl - contains the mapping between amino acids and vectors
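
As an illustration of how encoding.pkl might be used, the following Python sketch loads a pickled mapping and converts a sequence into a list of vectors. It rests on assumptions not stated in this README: that the pickle holds a dict keyed by single-letter amino acid codes, and that each value is a vector. The function names load_encoding and encode_sequence are hypothetical and are not taken from Supplemental_Code.zip. The toy mapping below only stands in for the real file.

```python
import pickle
from pathlib import Path


def load_encoding(path):
    """Load a pickled encoding. Only unpickle files from a source you trust."""
    with Path(path).open("rb") as handle:
        return pickle.load(handle)


def encode_sequence(sequence, encoding):
    """Return one vector per residue.

    Assumes `encoding` is a dict mapping single-letter residues to vectors;
    the actual structure of encoding.pkl may differ.
    """
    missing = sorted(set(sequence) - set(encoding))
    if missing:
        raise KeyError(f"residues missing from encoding: {missing}")
    return [encoding[residue] for residue in sequence]


# Toy stand-in for the real mapping (hypothetical values, two residues only).
toy_encoding = {"A": [1.0, 0.0], "C": [0.0, 1.0]}
encoded = encode_sequence("ACA", toy_encoding)
```

With the archive extracted, load_encoding would be pointed at wherever encoding.pkl was unpacked, and its result passed to encode_sequence in place of toy_encoding.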