Distribution of duplicate genes in 20-kb distance bins. (A) Fraction of duplicate gene pairs relative to all pairs at this distance; (B) mean log(Expect) as a measure of similarity. The insets show data for 2-kb distance bins. The solid lines are linear regressions (duplicate fraction: R² = 0.70, log(Expect): R² = 0.66 for 0.5–5 Mb). The functional form seems to change at ∼30 kb and at ∼200 kb.

