Transporter domains associated with molecular size and partition coefficient. (A) The frequencies of common transporter Pfam domains in BGCs that synthesize metabolites >1000 Da (left) and <1000 Da (right). Green bars indicate domains whose frequencies differed significantly between the two classes (Fisher's exact test; Q < 0.05). (B) Precision-recall curves for two-layer decision trees and LASSO logistic regression models classifying BGCs producing metabolites >1000 Da using Pfam transporter, CATH transporter, and Pfam biosynthetic gene features. (C) The distribution of molecular weights of metabolites synthesized by BGCs with at least one NBD-binding ABC transporter domain, at least one MFS domain, and the ABC2_membrane_3 transmembrane domain. (D) Predicted partition coefficients (log P) for metabolites synthesized by BGCs that contain at least one variant of two different ABC transporter transmembrane domains.
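
The per-domain comparison in panel A can be illustrated with a minimal sketch. It assumes that Q denotes a Benjamini-Hochberg-adjusted p-value, which the caption does not state, and it uses made-up counts and placeholder domain names purely for illustration; the function names and the totals `n_large` and `n_small` are hypothetical and are not taken from the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q


# Hypothetical data: number of BGCs containing each domain in each size class.
n_large, n_small = 120, 400  # assumed totals for >1000 Da and <1000 Da BGCs
counts = {
    "domain_A": (60, 80),    # (present in >1000 Da, present in <1000 Da)
    "domain_B": (30, 100),
    "domain_C": (5, 70),
}

pvals = [
    fisher_exact_two_sided(k_large, n_large - k_large, k_small, n_small - k_small)
    for k_large, k_small in counts.values()
]
qvals = dict(zip(counts, benjamini_hochberg(pvals)))
significant = [name for name, q in qvals.items() if q < 0.05]
```

Each domain gets its own 2x2 table (present or absent, by size class), and the adjustment is applied across all domains tested, so a domain is flagged only if it survives the multiple-testing correction.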
