Building eBACs
| Number of | Average (reads) | Std. Dev. (reads) |
|---|---|---|
| Distinct WGS in overlaps (a) | 2342 | 2011 |
| WGS passing rarity (b) | 1431 | 352 |
| WGS passing overlap qual. (b) | 2276 | 1947 |
| WGS passing both (b) | 1417 | 351 |
| WGS binned with BAC (c) | 1314 | 310 |
| WGS binned + mates | 1757 | 390 |
| WGS in Phrap contigs | 1675 | 368 |

- (a) Distinct WGS in all overlaps produced by the overlapper with 95% identity and 100 k-mer copies allowed.
- (b) Filtering done in Binner based only on overlapper information. The repeat heuristic limits k-mer copies to 12 (three times the coverage). The overlap quality heuristic requires 3 × span/(3 + span − score) ≥ 35, where score is the banded alignment score, and 2 × span/(2 + span − score) would approximate the average distance between discrepancies were there only substitutions (indels have added penalties).
- (c) Beyond the k-mer repeat and quality heuristics, only the top six (i.e., coverage × 1.5) WGS overlaps from each end of a BAC read are examined, and they are kept only if strictly better by the quality heuristic than the top discarded overlap.
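
The overlap quality heuristic and the top-six rule in the footnotes can be illustrated with a minimal Python sketch. This is not the Binner's actual code: the `Overlap` record, the function names, and the parameter names are hypothetical. Two assumptions are made: the score never exceeds the span (so the denominator stays positive), and "the top discarded overlap" means the best-ranked overlap outside the top six for that BAC-read end.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Overlap:
    """Hypothetical record for one WGS overlap at one end of a BAC read."""
    wgs_id: str
    span: int   # length of the overlapping region, in bases
    score: int  # banded alignment score (assumed to be at most span)


def overlap_quality(span: int, score: int, pseudo: int = 3) -> float:
    """Quality heuristic from the footnote: pseudo * span / (pseudo + span - score)."""
    return pseudo * span / (pseudo + span - score)


def passes_quality(ov: Overlap, threshold: float = 35.0) -> bool:
    """An overlap passes when its quality is at least the threshold (35 in the footnote)."""
    return overlap_quality(ov.span, ov.score) >= threshold


def select_top_overlaps(overlaps: list[Overlap], top_n: int = 6) -> list[Overlap]:
    """Keep at most top_n overlaps for one BAC-read end.

    Assumed reading of the rule: rank by quality, examine the top_n, and keep
    only those strictly better than the best overlap left outside the top_n.
    """
    ranked = sorted(
        overlaps,
        key=lambda ov: overlap_quality(ov.span, ov.score),
        reverse=True,
    )
    examined, discarded = ranked[:top_n], ranked[top_n:]
    if not discarded:
        return examined
    cutoff = overlap_quality(discarded[0].span, discarded[0].score)
    return [ov for ov in examined if overlap_quality(ov.span, ov.score) > cutoff]
```

Under these assumptions, an overlap with a span of 500 and a score of 470 has a quality of about 45.5 and passes, while a score of 440 gives about 23.8 and fails. A tie between the sixth and seventh overlaps removes the sixth, because the comparison is strict.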