Estimation of Errors in “Raw” DNA Sequences: A Validation Study

Table 3. Comparison of Predicted and Actual Error Rates for Six Different Sequencing Projects

Project  Quality score:        4–12      13–22      23–32      33–42      43–51
A        aligned bases      119,246     75,293     70,391    144,876     73,234
         expected errors     20,256      2,064        172         37          1
         actual errors       16,784      1,758        127         17          1
B        aligned bases      182,034    137,940    181,998    399,690    140,176
         expected errors     29,953      3,704        410        102          3
         actual errors       26,038      2,536        287         35          0
C        aligned bases      139,345    131,419    151,197    292,070     68,529
         expected errors     22,277      3,411        357         74          2
         actual errors       16,670      1,513        194         26          3
D        aligned bases      103,898     68,995     68,613    153,730    111,752
         expected errors     16,880      1,919        168         38          3
         actual errors       14,495      1,924        146         59          2
E        aligned bases      378,755    217,438    167,968    392,717    144,313
         expected errors     63,947      6,336        418         95          4
         actual errors       55,968      6,516        355         67          5
F        aligned bases      359,809    136,688     98,840     64,035      5,130
         expected errors     66,938      4,079        256         23          0
         actual errors       57,971      3,856        332         33          1
All      aligned bases    1,283,087    767,773    739,007  1,447,118    543,134
         expected errors    220,252     21,513      1,781        370         13
         actual errors      187,926     18,103      1,441        237         12
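
The comparison in the "All" rows can be summarized numerically. The sketch below is illustrative only: it assumes the quality scores follow the phred-style convention Q = -10 log10(P), where P is the per-base error probability. The table itself does not state that formula, and the function and variable names are hypothetical. For each quality bin it computes the ratio of actual to expected errors and the quality score implied by each error count.

```python
import math

# "All" rows of Table 3, one entry per quality-score bin.
bins = ["4-12", "13-22", "23-32", "33-42", "43-51"]
aligned = [1_283_087, 767_773, 739_007, 1_447_118, 543_134]
expected = [220_252, 21_513, 1_781, 370, 13]
actual = [187_926, 18_103, 1_441, 237, 12]


def implied_quality(errors, bases):
    """Score implied by an error count, assuming Q = -10 * log10(errors / bases)."""
    return -10 * math.log10(errors / bases)


# One tuple per bin: (bin label, actual/expected ratio, Q from expected, Q from actual).
rows = []
for label, n, exp, act in zip(bins, aligned, expected, actual):
    rows.append((label, act / exp, implied_quality(exp, n), implied_quality(act, n)))

for label, ratio, q_exp, q_act in rows:
    print(f"{label:>6}  actual/expected = {ratio:.2f}  "
          f"Q(expected) = {q_exp:.1f}  Q(actual) = {q_act:.1f}")
```

A ratio below 1 in a bin means fewer errors were observed than the quality scores predicted for that bin. Individual projects can be checked the same way by substituting their rows, although bins with an actual error count of zero (for example, Project B at 43–51) have no finite implied score.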

Genome Res. 8: 251-259