Figure 2.

Using corrected distances improves statistical efficiency by 10%–15%. We benchmarked two baselines (NJ applied to the Hamming distance and to its weighted version) against the corrected versions that we propose, using a finer grid {90, 95, 100, …, 145, 150} for the number of characters in the simulations. The best-performing method, NJ applied to the weighted Hamming distance with correction, needs 10%–15% fewer characters to achieve similar performance on RF and triplets correct. Each entry is the average over 250 repetitions.
