The Sanger Centre; The Washington University Genome Sequencing Center

Figure 1.

Strategy for determining the human genome sequence. Sequence-ready maps are constructed by screening for bacterial clones using a high density of STSs (15/Mb on average). Bacterial clones are assembled into contigs by comparative restriction fingerprint analysis and landmark content mapping. Contigs are extended and joined by generating new markers at their ends or by using region-specific probes generated from bridging YAC clones. A minimally overlapping subset of clones is selected for sequencing, after ensuring that for each clone all available restriction patterns, landmark content, and fluorescent in situ hybridization data are consistent. DNA fragments are derived from the bacterial clones and subcloned into bacteriophage M13 or plasmid vectors. The aim is to achieve, from 2500 reads, an average of approximately six- to sevenfold coverage in high-quality bases for each base of the bacterial clone insert. More reads may be added to facilitate finishing of difficult clones and to compensate for occasional higher-than-expected rates of failure (sequencing and gel failures, vector reads, and contaminating sequences). The reads can generally be assembled into 2–10 contigs representing >98% of the bacterial clone. This preliminary consensus, or unfinished, sequence is subjected to automatic and manual editing, and additional directed sequence reads are performed to close the remaining gaps and to resolve all ambiguities, thus providing the finished sequence. The entire sequence is checked and analyzed using a variety of computer tools and manually annotated. At all stages, raw data, analysis, and status information are publicly available via the internet. Full experimental details can be found at http://www.sanger.ac.uk/HGP/methods/ and http://www.genome.wustl.edu/ (see also Gregory et al. 1997; Leversha 1997; Dunham et al. 1998).
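The shotgun figures quoted in the caption (2500 reads, six- to sevenfold coverage, 2–10 contigs, >98% of the clone) can be checked against a simple back-of-the-envelope calculation. The sketch below uses the idealized Lander-Waterman model for randomly placed reads; it is an illustration, not the procedure used by the sequencing centers. The read length of 400 high-quality bases and the insert size of 150 kb are assumptions chosen for the example and are not stated in the caption, and the function name shotgun_estimates is hypothetical. The model also ignores the minimum overlap needed to join reads, cloning bias, and repeats, so real projects will deviate from these estimates.

```python
import math


def shotgun_estimates(n_reads, read_len, insert_len):
    """Idealized Lander-Waterman estimates for random shotgun reads.

    n_reads    -- number of reads (2500 in the caption)
    read_len   -- high-quality bases per read (ASSUMED, not from the caption)
    insert_len -- bacterial clone insert size in bases (ASSUMED)
    """
    # Average coverage: total high-quality bases divided by insert length.
    coverage = n_reads * read_len / insert_len
    # Expected number of contigs when any overlap, however small, joins reads.
    expected_contigs = n_reads * math.exp(-coverage)
    # Expected fraction of the insert covered by at least one read.
    fraction_covered = 1.0 - math.exp(-coverage)
    return coverage, expected_contigs, fraction_covered


if __name__ == "__main__":
    # Illustrative inputs only: 400-base reads and a 150-kb insert are assumptions.
    c, contigs, frac = shotgun_estimates(2500, 400, 150_000)
    print(f"coverage: {c:.2f}x")
    print(f"expected contigs: {contigs:.1f}")
    print(f"fraction covered: {frac:.4f}")
```

Under these assumed inputs the estimates fall within the ranges the caption reports, which suggests that the quoted read count, coverage, and contig numbers are mutually consistent under random sampling.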

Toward a Complete Human Genome Sequence
