Table 1. Steps in the Atlas Assembly System
| Stage | Program | Action | Comment |
|---|---|---|---|
| 1. Data preparation | Data quality checks | Contamination (reads from other organisms) and mislabeled reads (i.e., from another BAC) are identified and corrected if possible. | |
| | trim-reads | Remove low-quality bases, so that only highest-quality sequence is used for finding overlaps between reads. | Trimmed reads are used only for finding overlaps; full sequences are used to assemble the consensus sequence. |
| 2. Analyze sequence redundancy | k-mer-counter | Build table of the frequency of oligonucleotides (k-mers). | Only WGS reads used to give most complete and random sampling of the genome. |
| 3. Compute read overlap graph | overlapper | Identify candidate overlaps based on shared rare k-mers. | End-to-end criterion scores alignment on entire overlapping portion of reads. |
| | | Stringently evaluate overlaps by banded alignment. | |
| | | Save overlap graph with stringency annotations. | |
| 4. eBAC assembly | | Coassemble WGS and skim reads that have been assigned to the same BAC to produce eBACs. | |
| | binner | Choose WGS reads with best overlaps to skim reads in a BAC; add read pair mates. | |
| | Phrap | Assemble WGS and skim reads. | |
| | split-scaffold | Split misjoined contigs. | |
| | split-scaffold | Build scaffolds with read pairs. | |
| 5. Build bactigs | | Find overlapping eBACs based on shared reads and more. | |
| | BLASTZ | Confirm overlap by aligning eBACs. | |
| | | Compare bactigs to other maps for verification. | |
| 6. Assembly of bactigs | rolling-phrap | Assemble reads in bactigs. | |
| | Phrap | Assemble contigs in bactigs. | |
| | split-scaffold | Split misjoins, build scaffolds. | |
| 7. Build superbactigs | | Link bactigs by read pairs and BAC skim read distribution. | |
| 8. Build ultrabactigs and map to chromosomes | | Link superbactigs by map and synteny data. | |
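
Stages 2 and 3 of the table (building a k-mer frequency table, then proposing candidate overlaps from shared rare k-mers) can be illustrated with a minimal Python sketch. This is not the Atlas code: the function names `count_kmers` and `candidate_overlaps`, the `max_count` cutoff used to define "rare", and the toy reads are all assumptions made for illustration. The sketch also ignores reverse complements and skips the banded-alignment check that the table lists as the next step.

```python
from collections import Counter, defaultdict
from itertools import combinations


def count_kmers(reads, k):
    """Return a frequency table of every k-mer across all reads (stage 2)."""
    counts = Counter()
    for seq in reads.values():
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def candidate_overlaps(reads, k, max_count):
    """Return pairs of read names that share at least one rare k-mer.

    A k-mer is treated as rare when its total count is <= max_count;
    this cutoff is an assumption of the sketch, not a value from the table.
    """
    counts = count_kmers(reads, k)
    # Map each rare k-mer to the set of reads that contain it.
    readers = defaultdict(set)
    for name, seq in reads.items():
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if counts[kmer] <= max_count:
                readers[kmer].add(name)
    # Any two reads that contain the same rare k-mer form a candidate pair.
    pairs = set()
    for names in readers.values():
        pairs.update(combinations(sorted(names), 2))
    return pairs


# Hypothetical toy reads: r1 and r2 share unique sequence,
# while r2 and r3 share only the highly repeated k-mer TTTT.
reads = {"r1": "ACGTACGGA", "r2": "ACGGATTTT", "r3": "TTTTTTTTT"}
print(candidate_overlaps(reads, k=4, max_count=2))
```

With the cutoff at 2, the repeated k-mer `TTTT` is ignored, so only the `r1`/`r2` pair is proposed. Raising the cutoff lets the repeat generate an extra candidate between `r2` and `r3`, the kind of spurious overlap that restricting candidates to rare k-mers is meant to suppress.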











