
Here, we illustrate the “large family bias” problem. The solid line shows the likelihood of gene families in which every species has the same size k, written (k(k(k(kk)))), as a function of k, for k from 1 to 50. The dashed line shows the average likelihood of gene families evolved from a common ancestor whose family size equals k, where the average is taken over 100 random samples for each value of k. The likelihoods of large gene families are consistently and substantially smaller than those of small families.











