Sequence Quality as a Function of Read Lengths
| Organisms sequenced | Percent fidelity of sequence (no. of ambiguous bases) within 100-base intervals from first readable base | | | | | | | Useful data range (bases) |
| | 1–100 | 101–200 | 201–300 | 301–400 | 401–500 | 501–600 | 601–700 | |
| E. coli apaG gene | 99 (0) | 99 (1) | 100 (0) | 99 (1) | 100 (0) | 99 (0) | 95 (2) | 700 |
| E. coli htpG gene | 99 (1) | 99 (1) | 99 (0) | 98 (2) | 95 (5) | 87 (5) | 64 (14) | 560 |
| E. coli ldhA gene | 97 (2) | 100 (0) | 100 (0) | 98 (2) | 98 (2) | 97 (3) | 91 (5) | 740 |
| S. pneumoniae pspA gene | 98 (0) | 100 (0) | 100 (0) | 100 (0) | 100 (0) | 99 (1) | 91 (5) | 790 |
| U. urealyticum #1 | 96 (4) | 100 (0) | 100 (0) | 100 (0) | 99 (1) | 96 (3) | 90 (3) | 655 |
| U. urealyticum #2 | 93 (3) | 98 (1) | 100 (0) | 100 (0) | 99 (1) | 99 (1) | 89 (6) | 689 |
| U. urealyticum #3 | 93 (3) | 98 (1) | 100 (0) | 100 (0) | 100 (0) | 99 (0) | 77 (8) | 674 |
| U. urealyticum #4 | 96 (3) | 99 (1) | 99 (1) | 97 (2) | 92 (8) | 96 (3) | 68 (8) | 650 |
| M. fermentans[ii] | 100 (0) | 0[ii] | 0[ii] | 0[ii] | 0[ii] | 1 | 4 | 725 |
| Average | 96.8 (1.6) | 99.1 (0.6) | 99.8 (0.1) | 99.1 (0.6) | 97.8 (1.8) | 96.5 (1.9) | 83 (6.0) | 712 ± 18 |
[i] Each unedited sequence, determined from a genomic DNA template, was aligned and compared with its corresponding GenBank database sequence. The fidelity of each sequence is listed as the percentage agreement within a 100-bp interval. The number in parentheses after each percentage is the number of ambiguous bases (N) in that interval. The useful data range is the length of sequence that would be employed after human editing of the computer-generated base-calls.
[ii] The M. fermentans initiation factor database sequence overlapped the new sequence for only 150 bases, which limited the comparison to that stretch.
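
The per-interval figures described in footnote [i] can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function name `interval_fidelity` is hypothetical, the two sequences are assumed to be already aligned and of equal length (gaps written as `-`), and an ambiguous base (N) is assumed to count as a disagreement as well as being tallied separately.

```python
def interval_fidelity(read, reference, window=100):
    """Return a list of (percent_agreement, ambiguous_count), one per window.

    Assumptions (not taken from the source): `read` and `reference` are
    already aligned, equal-length strings, and an 'N' in the read counts
    as a mismatch in addition to being counted as ambiguous.
    """
    if len(read) != len(reference):
        raise ValueError("aligned sequences must have equal length")

    results = []
    for start in range(0, len(read), window):
        read_part = read[start:start + window].upper()
        ref_part = reference[start:start + window].upper()
        # Count positions where the unedited base-call agrees with the reference.
        matches = sum(1 for a, b in zip(read_part, ref_part) if a == b)
        # Count ambiguous base-calls (N) in this interval of the read.
        ambiguous = read_part.count("N")
        results.append((100.0 * matches / len(read_part), ambiguous))
    return results


# Toy example: a 200-base reference, with one N and one miscalled base in the read.
reference = "ACGT" * 50
bases = list(reference)
bases[5] = "N"     # ambiguous call in the first interval
bases[150] = "A"   # reference has G here, so this is a miscall in the second interval
read = "".join(bases)

for percent, n_count in interval_fidelity(read, reference):
    print(f"{percent:.0f} ({n_count})")
```

Each printed line follows the table's "percent (ambiguous bases)" cell layout. A final interval shorter than `window` is scored against its own length, which is one possible convention among several.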