RESOURCE

Single Nucleotide Polymorphism Markers for Genetic Mapping in Drosophila melanogaster

Published May 14, 2001. Vol 11 Issue 6, pp. 1100-1113. https://doi.org/10.1101/gr.178001

Abstract

For nearly a century, genetic analysis in Drosophila melanogaster has been a powerful tool for analyzing gene function, yet Drosophila lacks the molecular genetic mapping tools that recently have revolutionized human, mouse, and plant genetics. Here, we describe the systematic characterization of a dense set of molecular markers in Drosophila by using a sequence tagged site-based physical map of the genome. We identify 474 biallelic markers in standard laboratory strains of Drosophila that span the genome. Most of these markers are single nucleotide polymorphisms and sequences for these variants are provided in an accessible format. The average density of the new markers is one per 225 kb on the autosomes and one per megabase on the X chromosome. We include in this survey a set of P-element strains that provide additional use for high-resolution mapping. We show one application of the new markers in a simple set of crosses to map a mutation in the hedgehog gene to an interval of <1 Mb. This new map resource significantly increases the efficiency and resolution of recombination mapping and will be of immediate value to the Drosophila research community.


The development of genome-based tools for genetic mapping has made possible increasingly sophisticated genetic studies in many eukaryotes and has contributed to rapid increases in the rate of discovery of new genes and gene functions. In particular, dense maps of polymorphic markers are in use in humans (Wang et al. 1998; Cargill et al. 1999) and mice (Lindblad-Toh et al. 2000), and currently are being developed in many other vertebrates. Similar resources have been deployed in well-studied and genetically powerful model organisms, including Saccharomyces cerevisiae (Winzeler et al. 1998), Arabidopsis thaliana (Cho et al. 1999), and Caenorhabditis elegans (Koch et al. 2000). It is clear that a dense map of molecular markers is now an important tool for genetic analyses in any organism.

Traditional strategies for meiotic recombination mapping in Drosophila melanogaster rely on a chromosome carrying multiple dominant or recessive marker mutations with visible phenotypes. These visible phenotypes are often laborious to score and may interfere with the phenotype of the mutant of interest. Most importantly, because mutations with easily scored, viable phenotypes are relatively infrequent, the mapping resolution available using this approach is limited. A much higher degree of interstrain variation is available at the molecular level, and modern methods for scoring molecular variants offer the advantages of high throughput and automated scoring (Landegren et al. 1998). In addition, the alleles of such markers are co-dominant and usually phenotypically neutral. Although microsatellites exist in Drosophila, they occur infrequently and show relatively low rates of polymorphism (Schug et al. 1997). By far the most common types of molecular variation are single nucleotide polymorphisms and insertion/deletion polymorphisms, hereafter collectively referred to as single nucleotide polymorphisms (SNPs). The interstrain level of sequence polymorphism in Drosophila is relatively high (Begun and Aquadro 1993; Moriyama and Powell 1996). Thus, sufficient variation exists in Drosophila to develop a high-density map. Recently, a set of 69 SNP markers in a collection of strains used for quantitative trait loci (QTL) mapping purposes was described (Teeter et al. 2000).

We describe here the systematic discovery of a set of genome-wide SNP markers in a collection of commonly used laboratory strains of Drosophila, and we provide the sequences of these markers in an accessible format that allows immediate use by the research community. We also show the use of this resource by mapping a mutation in the hedgehog gene to a small interval.

RESULTS

Survey of Interstrain Polymorphism

We first evaluated several Drosophila strains for levels of sequence polymorphism. Many strains are in use in the Drosophila research community, and there are no widely used standard mapping strains. We therefore selected a few strains with the aim of identifying strains highly polymorphic relative to each other. The polymorphism rate in Drosophila varies among specific strain pairs, with the greatest variation observed between East African and other populations (Begun and Aquadro 1993). Therefore, strain selection might significantly affect the rate of SNP discovery. We also wanted to use a series of strains containing single mapped P-element insertions for fine-scale mapping (Cutforth and Rubin 1994; Spradling et al. 1995). Thus, another important consideration in our strain selection was the level of sequence polymorphism relative to these P-element strains.

Using genomic DNA from six wild-type strains (Barcelona, Capetown, Hikone, Pyrenees, w;iso2;iso3, and a P-element containing strain) as templates, we compared sequences from 24 third chromosome sequence tagged sites (STSs) from the P1-based physical map of the genome (Kimmerly et al. 1996). The results shown in Table 1 indicate that the rate of sequence variation between any two strains ranges from 2.1 per kilobase (1 polymorphism/476 bp) to 5.2 per kilobase (1 polymorphism/192 bp, Table 1A), in agreement with previous studies. Although the rate of sequence polymorphism varies depending on the strains compared, much of the observed polymorphism occurs in local clusters, with a single STS containing as many as five SNPs. For purposes of genetic mapping, one needs only a single polymorphism per point in the genome. Thus, for the purposes of this study, a better measure of the interstrain polymorphism rate is the percentage of STSs that contain at least one SNP (Table 1B). Although the absolute polymorphism rate observed varies nearly 2.5-fold, from 2.1 differences/kb to 5.2 differences/kb, the percentage of polymorphic STSs varies less than twofold, from 33% to 63%. In addition, the polymorphism rates relative to the P-element containing strain vary less, ranging from 33% to 50%. Because we observed a largely random distribution of sequence variation and similar levels of relative polymorphism among pairwise sets of strains, we chose to focus our genome-wide discovery efforts on three commonly used strains: Canton S, Oregon R, and w;iso2;iso3. We also selected a series of 17 strains that contain single P-element insertions at evenly spaced intervals across the genome.

Table 1.

Analysis of Sequence Variation among Six Wild-Type Strains


[i] Twenty-four STSs from the third chromosome were sequenced in the following strains: w;iso2;iso3 (ISO), Barcelona (BAR), Capetown (CAP), Hikone (HIK), Pyrenees (PYR), and a P-element strain (BEP). The total amount of sequence examined was 5.16 kb. Data are represented as (A) total number of polymorphisms per kilobase, and (B) percentage of STSs containing polymorphisms.
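The two measures reported in Table 1 can be illustrated with a minimal Python sketch. The function names and the per-STS counts below are hypothetical and are not data from this study; only the 24-STS, 5.16-kb survey size and the 2.1 and 5.2 per-kilobase rates come from the text. The sketch shows why clustered polymorphisms raise the per-kilobase rate without raising the fraction of STSs that are useful for mapping.

```python
from typing import Dict, List


def polymorphism_summary(snps_per_sts: List[int], total_bp: int) -> Dict[str, float]:
    """Summarize one pairwise strain comparison in the two ways used in Table 1."""
    if not snps_per_sts or total_bp <= 0:
        raise ValueError("need at least one STS and a positive sequence length")
    total_snps = sum(snps_per_sts)
    polymorphic_sts = sum(1 for n in snps_per_sts if n > 0)
    return {
        # (A) absolute rate: polymorphisms per kilobase of compared sequence
        "snps_per_kb": 1000.0 * total_snps / total_bp,
        # (B) mapping-relevant rate: share of STSs with at least one polymorphism
        "percent_polymorphic_sts": 100.0 * polymorphic_sts / len(snps_per_sts),
    }


def bp_per_polymorphism(snps_per_kb: float) -> float:
    """Convert a per-kilobase rate into the average spacing between polymorphisms."""
    return 1000.0 / snps_per_kb


# Hypothetical counts for 24 STSs covering 5160 bp (illustrative, not Table 1 data).
# The polymorphisms are clustered: 15 differences fall in only 8 of the 24 STSs.
example_counts = [5, 3, 2, 1, 1, 1, 1, 1] + [0] * 16
summary = polymorphism_summary(example_counts, 5160)
print(summary)
print(round(bp_per_polymorphism(2.1)), round(bp_per_polymorphism(5.2)))
```

In this made-up comparison the absolute rate is close to 3 differences per kilobase, yet only one third of the STSs carry a usable marker, which is the distinction drawn between Table 1A and Table 1B.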

Genome-Wide SNP Discovery

We set out to identify a set of SNP markers spanning the genome, with a goal of identifying at least one SNP every 500 kb. We used mapped STSs as a source of genomic sequences (Kimmerly et al. 1996). At the time this project was initiated, these STSs were the only mapped sequence elements with a genome-wide distribution of the required density. The cytological map positions of STSs were inferred from the P1 clones from which they were derived. We chose STSs evenly distributed across each cytological division for screening. We took two approaches to polymorphism identification. First, a set of 1016 STSs was amplified from isogenized versions of Canton S and Oregon R and then sequenced to identify candidate polymorphisms. Second, a partially overlapping set of ∼1050 STSs was screened for polymorphisms between w;iso2;iso3 and the 17 P-element strains by using denaturing high performance liquid chromatography (DHPLC; Underhill et al. 1997). DHPLC is a simple and robust screening tool and is also useful for genotyping SNPs in recombination mapping experiments (see Fig. 2 below; Spiegelman et al. 2000). Polymorphisms identified via DHPLC were verified and characterized by DNA sequencing.

Figure 2.

Recombination mapping of a recessive lethal mutation by using SNP markers. Chromosomes (bars) and molecular markers (vertical hatch marks) are shown. The mapping process occurs in two stages. (A) The mutation (asterisk) induced in the w;iso2;iso3 background (black bar) that has been mapped previously to a chromosome and balanced is mapped relative to a polymorphic mapping strain (open bar). Single flies heterozygous for the mutation-carrying chromosome and the mapping chromosome are crossed to flies homozygous for the parental w;iso2;iso3 strain to generate 96 recombinant flies. The four recombinant classes are represented. Each recombinant fly strain is assayed for a low-density set of markers that span the chromosome. These markers can be of any type, including SNPs or P-element insertions. We have typically tested six markers spaced at ∼10-Mb intervals on these 96 recombinants, for a total of 576 assays. Each recombinant also is assayed for presence or absence of the mutation by outcrossing. From the outcross data and initial marker data on this set of 96 recombinants, the mutation can be assigned to an interval of 10–20 Mb. (B) A higher density set of SNPs then is assayed on recombinants from A that break in the appropriate interval. An example of mapping a mutation in the hedgehog gene is shown. SNP markers are indicated by the STSs from which they are derived. Two P elements (Q1059 and Q1058) that were used to localize these mutations also are indicated. The chromosomal compositions of three recombinant (#6, #25, and #42) and two control (ISO and HET) flies are represented. Asterisks indicate chromosomes that carry the mutation as defined by outcrossing. The SNP markers shown were scored using DHPLC. The recombinants shown delimit the position of these mutations to between Dm1601 and Dm1655, a region of ∼984 kb. Subsequent complementation testing showed that these mutations are alleles of the hedgehog gene, which lies between Dm1601 and Dm1655.
(C) DHPLC scoring of SNPs. PCR products were amplified from recombinants and analyzed under partially denaturing conditions. Data are shown for Dm1655, a C/T dimorphism. Run time in minutes is shown on the X axis and ultraviolet absorbance on the Y axis. Dm1655 was analyzed from the following strains: w;iso2;iso3 (ISO), a w;iso2;iso3/Q1059 heterozygote (HET), and hedgehog recombinants 6 and 25 (#6 and #25). In this example, the sample from recombinant 25 shows a heteroduplex pattern and therefore is scored as a heterozygote. (DHPLC) denaturing high performance liquid chromatography.


Sequence data from a total of ∼1500 STSs from these two discovery approaches were assembled into a single data set. We identified polymorphisms both by visual inspection and automatically as high-quality discrepancies by using Phred quality scores in the Phrap assembly viewer Consed (Gordon et al. 1998). Table 2 displays the sequences of 109 STSs for which sequence was derived from all strains used in the genome-wide survey. In addition, we aligned each consensus STS sequence to the Release 1 version of the Drosophila genome sequence to identify the allele in the y;cn bw sp strain (Adams et al. 2000). These data define 279 polymorphisms, of which 225 are single nucleotide substitutions, 17 are small insertion/deletions, 8 are dinucleotide substitutions, and 29 are more complex substitutions. Among the single nucleotide substitutions, transitions outnumber transversions by 55% to 45%, and ∼22% of these substitutions occur at CpG dinucleotides. These data are similar to those observed in human and mouse genomes and may be evidence that similar mutagenic mechanisms are at work in Drosophila (Moriyama and Powell 1996), despite the fact that there is no deficit of CpG dinucleotides and little evidence for cytosine methylation that explains the bias seen in vertebrate genomes (Hacia et al. 1999). Note that the STSs analyzed here represent randomly sampled genomic sequence, so this analysis does not take into account effects of coding versus noncoding sequence. Approximately 40% of these SNPs create restriction site polymorphisms and thus are accessible to scoring by a simple restriction digestion (Table 2). In addition to the sequences provided in Table 2, the sequences for 365 STSs that define an additional ∼750 polymorphisms in pairwise subsets of the strains are available at http://www.fruitfly.org.
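As a hedged illustration of two classifications used in the paragraph above, the following Python sketch labels a substitution as a transition or transversion and checks whether a SNP changes the number of sites for a restriction enzyme. The function names and the small enzyme dictionary are assumptions introduced here; the recognition sequences are standard values for these enzymes rather than values taken from the paper. The two worked examples use flanking sequences and alleles from the Dm1729 and Dm3513 rows of Table 2.

```python
from typing import Dict, List

PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")

# Recognition sequences of a few palindromic enzymes (standard values supplied
# for illustration; they are not listed in the article text).
ENZYME_SITES: Dict[str, str] = {
    "AluI": "AGCT",
    "EcoRI": "GAATTC",
    "MseI": "TTAA",
    "RsaI": "GTAC",
    "SacI": "GAGCTC",
}


def substitution_class(allele_a: str, allele_b: str) -> str:
    """Return 'transition' (purine<->purine or pyrimidine<->pyrimidine) or 'transversion'."""
    pair = {allele_a.upper(), allele_b.upper()}
    if len(pair) != 2 or not pair <= (PURINES | PYRIMIDINES):
        raise ValueError("expected two different bases from A, C, G, T")
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def differential_enzymes(left_flank: str, allele_a: str, allele_b: str,
                         right_flank: str,
                         sites: Dict[str, str] = ENZYME_SITES) -> List[str]:
    """List enzymes whose site count differs between the two alleles.

    Only sites lying entirely within the supplied flanking sequence are seen,
    and only the given strand is searched, which is sufficient for palindromes.
    """
    seq_a = (left_flank + allele_a + right_flank).upper()
    seq_b = (left_flank + allele_b + right_flank).upper()
    return sorted(name for name, site in sites.items()
                  if seq_a.count(site) != seq_b.count(site))


# Dm1729: flank GAGCT-CTCGA, alleles T and C
dm1729 = differential_enzymes("GAGCT", "T", "C", "CTCGA")
# Dm3513, second SNP: flank GGACA-CTCAA, alleles G and A
dm3513 = differential_enzymes("GGACA", "G", "A", "CTCAA")
print(substitution_class("T", "C"), dm1729)
print(substitution_class("G", "A"), dm3513)
```

Both examples are transitions, and the enzymes recovered (SacI and AluI) agree with the RFLP column for those rows. With only about 5 bp of flank on each side, longer recognition sites that extend beyond the listed sequence would be missed by this check.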

Table 2.

List of SNP Sequences Identified

X CHROMOSOME
STS Cytological location Flanking sequence Nucleotide position Oregon R Canton S w;iso2;iso3 P strain 1 Q1028 P strain 2 Release 1 y;cn bw sp RFLP?
Dm35131F1–1F3 TCACT−GGAAC 78G(42)C(42)G(46)G(51)G
GGACA−CTCAA 90G(42)A(44)G(35)G(35)GAlul
ACCCA−ACACT 144C(42)A(44)C(56)C(56)C
ACAAA−ATGGG 174A(47)T(44)A(40)A(35)A
Dm17292B8–2B9 GAGCT−CTCGA 477T(56)C(56)T(40)T(42)TSacl
Dm01272F1–2F6 GTCGT−AGGGG 1447C(37)AGT(40)C(40)AGG(56)C(30)AGT(42)(A/C,22)AGG(33)AAGG
CGCTC−CGCTC 1450C(56)T(43)C(32)C(35)CAcil BsrBl
Dm01063A6 TCTGT−GAACG 122, 125G(37)TGA(37)A(56)TGG(42)G(45)TGA(34)A(35)TGG(34)ATGG
Q1028 3D–E Q1028 Q1034
Dm32474C7–4C8 AGTTT−CTGCG 1607T(42)C(56)C(24)C/T(22)C/T(11)T
CAGTC−GTGAA 1628C(37)G(37)G(25)G(28)C/G(23)C
GTGAA−ACGCC 1634C(51)T(44)T(25)C/T(19)T/C(37)C
GTTTT−CCGAA 1766C(56)T(56)T(44)T(42)T(20)C
Dm19544E1 ACAAT−ATCGA 149T(45)A(56)A(42)T(56)T(40)TTsp5091
Dm30015C1–5C2 AATTC−GGATC 695G(27)G(35)A(41)G/A(22)G(30)G
TCCAA–ACTTT 704T(37)T(41)G(50)T(30)T(33)T
ATTTT−ACAAT 888T(30)C(56)T(42)C(34)C(42)C
Q1034 6E Q1034 Q1030
Dm20067A2–7A5 CCCAG−ACATG 582T(56)C(45)T(30)T/C(26)T/C(17)TRsal
TACCT−CCCCC 618DEL(56)C(56)DEL(98)DEL(98)DEL(98)DEL
TCTTG−C 624C(37)C(46)C(31)A/C(26)A/C(33)C
C−CAATC 626G(40)A(45)G(20)G(42)G(42)G
ACTCG−CCAGC 647A(40)A(35)A(42)A/T(30)A/T(29)ATaql BspGl
Dm18497D10–7D16 GGTT−TCCCT 180A(30)A(51)T(28)T(28)T(28)A
Dm04268C1–8C2 CATAT−TA 1223A(51)A(51)A(42)G/A(24)G/A(28)A
TA−GTACA 1226–7T(51)G(42)T(51)G(42)T(44)G(42)T/C(17), G/A(17)T/C(18), G/A(18)TA
ACATA−GTACT 1234T(56)T(56)T(48)C/T(17)C/T(11)TNdel Rsal SnaBl
TACTT−ATGTTG 1241G(42)A(51)G(42)A(40)A(42)AMsel
ACAAC−TACAA 1352T(40)T(40)T(42)A/T(23)A/T(14)T
Dm31698D1–8D5 GCCGC−CACGA 1280T(45)T(40)T(44)C/T(13)C/T(19)TBsrBl
CCCGT−TGCTC 1373G(37)A(29)G/A(37)G/A(30)G/A(39)G
Dm19809B1–9B2 GGTGA−TTCAC 35A(48)C(40)C(40)C(40)C(40)CEcoRl Tsp5091
ATCTA−CAAAA 62A(51)A(43)T(40)A(40)A(45)T
GCAAA−CATAA 95, 98, 101A(51)TTA(51)AAG(56)C(45)TTC(51)AAA(56)C(35)TTC(40)AAA(40)C(29)TTC(34)AAA(25)C(29)TTC(27)AAA(27)CTTCAAAMsel Tsp5091
Q1030 10E Q1030 Q1035
Dm374610F7–10F8 TACAC−TCTCT 79–80G(51)C(51)A(51)A(51)G(40)C(40)A(45)A(46)A(35)A(35)AA
TCACA−CCGTA 137G(42)T(43)G(35)T(44)T(37)T
Dm202411B1–11B2 GAAC(A)9−(A)5GAT 172A(47)C(44)AC(56)C(42)C
Q1035 13E9 Q1035
Dm047813A8–13A9 TGGCA−TTGGC 70A(40)A(51)G(35)A(39)AMunl Tsp5091
AGACT−TAGTT 145A(45)A(45)C(35)A(37)ABfal
Dm349113D1–E4 CATTG−CCGCC 60T(46)T(46)C(46)T(39)TEcll
CACTG−ACGAG 170A(56)G(40)A(35)A(35)ABspGl
Dm191114B3–14B4 TGAAT−GAATG 1134A(46)C(42)A(40)C(37)G(43)T(40)G(21)T/C(21)GT
Dm379015B1–15B2 TGTGC−CTTAA 106C(44)A(45)A(37)C(50)CApaLl CviRl
ACTAT−CGCCT 221C(40)T(40)T(40)T(45)TAcil Ecil
Dm050916F5–16F8 GAGGG−CCGCT 58G(37)T(45)G(37)T(40)THaelll Siml
Q1035 17C
Dm184617A1–17A2 ATTTC−ACTTG 2150A(37)C(45)A(42)A(46)C
Dm275417B5–17C4 AGTAG−GTATG 402C(35)C(38)C(56)T(34)C
CATTT−CAGCG 423C(33)T(42)C(40)T(22)C
TGCGG−TTCTC 444G(42)T(38)G(40)T/G(34)GT
CAAAA−ACCGG 465A(34)A(34)T(56)T(26)T
CCGTT−GCTAA 512T(27)T(35)T(40)C(51)C
Dm050519E1–19E3 GCTGC−TGGCT 90A(42)A(42)A(30)A(30)ACviRl NIallI
CGGAG−ATATG 170C(44)T(56)C(56)C(37)CNdel
ATATA−TATTG 190A(56)A(42)A(42)A(42)A
Dm052920B1–20B2 GGCTC−GTGTC 837G(45)T(45)T(42)G(56)G
2nd Chromosome          
STS Cytological location Flanking sequence Relative nucleotide position Oregon R Canton S w;iso2;iso3 P strain 1 Q1037 P strain 2 Release 1 y;cn bw sp RFLP?
Dm044721D1–21D2 TCTCT−ACGGC 449C(37)C(40)C(42)T(17)C
TCTCT−GACCT 485, 487C(32)TT(32) *(98)T*(98) *(98)T*(98) *(98)T*(98)CTT
Dm037521E3–21F3 TACGG−AGTGT 32A(29)A(27)G(30)A(27)ABscGl
GTCGC−CTGGT 103C(32)C(33)A(21)C(37)C
Q1037 22D1–D2 Q1037 Q1040
Dm264123A1–23A3 ACTTG−CCCAG 32A(40)A(51)G(56)A(47)A(42)GSiml Haelll
TCCAT−AAGAA 107G(56)G(56)A(48)A(48)A(46)GNialll
GGCCA−CGGCT 174G(35)A(40)G(37)G(37)G(35)AAcil
Dm061123E1–23E4 ACAAA−CTTTA 133, 135A(56)AC(42)T(45)AG(40)T(51)AG(45)T(45)AG(45)T(51)AG(51)TAGAlul
ATCTT−TATAG 153C(48)C(46)T(56)C(46)C(46)C
ACTCG−TGTGT 169C(51)C(51)C(27)G(56)G(56)C
GCCAC−CAGTT 186T(45)G(45)C(51)C(51)T(34)G(34)T(40)G(45)T(51)G(51)CCCviRl Pstl
Dm223525A3 TATTA−AAATC 141NDNDG(48)A(42)A(37)AMsel
CATAT−GATAT 156–163DEL(98)DEL(98)DEL(98)TATAATAT(42)TATAATAT(42)TATAATATNdel
GTAAT−ATTGT 174A(40)T(35)T(56)A(40)A(42)A
ATTGG−TTG 233T(56)T(56)T(56)C(44)C(56)T
TTG−AGTAA 237–238G(56)A(56)G(56)A(56)G(56)A(56)T(42)T(40)T(42)T(40)GA
Dm220726D1–26D2 TCCGT−TCCAG 28G(51)G(51)G(40)A(45)A(45)GBspGl
TCTTC−GT 118G(56)T(56)T(42)G(44)G(43)G
GT−TAGTT 121G(51)A(56)A(37)G(35)G(40)G
CGTCC−AGACG 148T(51)C(51)C(40)C(46)C(46)TBfal
AGGTA−TCGTT 163G(40)G(45)G(40)C(40)C(45)GRsal
Dm301529A2 AACTG−CTTTT 23T(40)C(56)T(34)T(35)C(42)C
TGTTT−AAAGT 114A(56)G(56)A(44)A(56)G(56)GDral Msel
AACAC−CGGTT 147T(39)A(56)T(35)T(39)A(38)A
CAACC−GGTGC 171T(31)A(32)A/T(14)A(56)A(42)A
Dm027429A5–29B4 GGTAC−CGGTA 741G(29)G(35)G(38)T(42)G(37)Acil Thal
TTAAT−CAATT 805G(40)G(42)G(51)A(44)G(42)CviRl
GCCGA−AGGTG 832C(40)C(40)C(42)T(46)C(56)T
Q1040 29C1–C2 Q1040 Q1042  
Dm043430A3–30A6 TTTTT−AACAT 73–75A(35)A(35)A(35)T(35)A(35)A(34)A(21)A(27)A(27)T(21)T(27)T(29)T(4)T(4)T(4)TAA
TAGGC−AAAAA 113–114A(56)A(56)A(47)C(47)A(13)A(13)DEL(98)DEL(98)AC
Dm065231F3–32A1 AGCAA−CGAAA 742–743C(9)A(9)C(37)A(37)C(24)A(24)T(46)T(33)C(24)A(25)CABsbl
AGTTT−AACGA 821A(51)A(51)A(42)G(42)A(26)ADral Msel Pmel
TTATG−AATGC 900G(51)G(56)G(44)C(37)G(42)GCviRl
TACAA−TTGTC 929C(51)T(56)T(48)T(13)T(44)TTsp5091
Dm404832A1–32A2 CATAC−TTAGA 113T(56)A(56)A(42)A(35)T(42)A
GGTTG−CGATA 168C(46)G(42)G(37)G(35)C(35)G
Dm337433A1–33A2 GATAT−GCAGT 733T(51)G(40)G(44)G(42)T(44)GCviRl
TCCCA−CGGAT 758–759A(29)A(42)G(34)T(42)G(42)T(42)G(26)T(35)A(24)A(46)GT
TAAAA−GAAAT 769T(47)C(51)C(56)C(42)T(44)C
GCATT−ACACA 802G(40)A(51)A(56)A(33)G(33)AMsel
CACAG−GAA 809C(37)T(56)T(44)T(42)C(33)T
GAA−GTACG 813T(42)A(56)A(42)A(51)T(42)A
Dm211233A1–33A2 TAAAA−TTATG 67T(35)G(42)T(56)T(56)G(45)TTsp5091
AACAT−AATCG 104A(42)T(45)A(56)A(56)A(42)A
GAGAA−AGCGG 125A(56)G(51)A(42)A(46)G(42)ABsrBl
Dm024033A3–33A8 ATGTG−GTGCA 514C(24)C(24)T(56)C(37)C(47)T
CATAT−TTTGG 546DEL(27)DEL(52)DEL(48)GTAT(44)DEL(33)DELNdel
Dm247933D4 ATAGT−GGCAA 75A(17)A(40)A(40)A(45)T(40)
TTTTC−CATTG 110G(37)G(56)G(44)G(56)T(44)G
Dm363534C1–34C2 CCGCA−AATGC 1494C(27)C(27)T(40)C(44)C(42)T
GCTTG−ATAAA 1552A(27)G(14)A(42)A(56)G(42)A
AAAAT−TAAGA 1560G(33)T(13)G(56)G(44)T(29)GMsel Tsp5091
Dm039334D1–34D6 TAACA−TAATT 50A(42)A(42)T(37)A(33)A(37)AMsel Vspl
ACTGC−GTGAG 191–192T(46)C(33)A(42)A(42)A(48)A(56)A(29)A(29)T(38)C(38)TCBssSl CvlRl
Dm001235A1–35A2 AATTT−TAAAG 74T(42)G(23)T(42)G(51)T(44)TDral Msel
Q1042 35E1–E2 Q1042 Q1805
Dm004335F1–35F2 GATGG−CAGTC 60A(30)AA(35)A(40)T(37)A
Dm245136B3–36C1 TCCTT−AGACA 108G(40)A(45)G(44)A(56)A(32)GAflll Msel
AGAAT−AA 125T(51)C(45)T(40)C(56)C(56)T
AA−AGT 128, 130A(40)AT(40)T(45)AA(51)A(45)AT(56)T(56)AA(51)T(56)AA(40)AAT
AGT−AAT 134T(40)C(56)T(45)C(51)C(38)T
AAT−CCCAA 138–139, 141, 143, 145 *(98)G(56)T*(98)T*(98)TC(40) A(45)T(45)TA(45)TA(51)TT(56) *(98)G(56)T*(98)T*(98)TC(56) A(45)T(45)TA(51)TA(40)TT(40) A(56)T(56)TA(46)TA(38)TT(40) *GT*T*TC
TCAAA−GTGAA 157C(42)T(56)C(51)T(56)T(56)CTall
GTGAA−GTACT 163–164A(42)T(42) *(38)*(38)A(56)T(56) *(98)*(98) *(98)*(98)ATScal
Dm035339A3–39A7 CTGCT−CAGCT 26G(40)A(51)G(40)A(42)A(42)GCvlRl Patl
TTCCG−GTATC 41A(51)C(40)C(42)C(42)C(37)CAcil Thal
CTCGA−ACAGC 60C(47)T(45)C(40)C(42)C/T(21)C
CTGTG−TTTTC 85G(51)A(42)A(51)A(51)A(56)A
TGGGA−AAATC 212A(56)A(56)A(56)T(42)A(47)A
Dm062740A6–40B2 GCAAT−TCTGA 304A(33)T(56)T(25)T(51)T(40)TTsp5091
Dm097541C1–41C6 GCGTG−CGTGC 140–143 *** *(40) *** *(40)CGTG(51) *** *(51) *** *(40) *** *
Dm349542A1–42A2 CGGGG–CTTGT 83G(56)T(37)G(35)G(51)G/T(17)TSiml
CGAGA−GTCAG 132T(45)C(46)T(20)C(16)C(19)CAatll Tall
Dm260942A8–42A16 TTTTT−AAATT 77–78 *(43)A(40)T(47)A(40)T(56)A(56)T(56)T(56)T(56)A(23) *A
Dm210642B4–42C2 ACCGA−GCTAC 114A(56)G(45)G(39)G(25)A(22)AAlul
AGTTT−TAAGC 140C(51)C(51)T(56)C(56)C(56)CMsel
AGCAG−AGTCC 148G(56)G(56)A(34)G(34)G(32)G
TGCGA−GGAGA 160C(46)C(46)T(35)C(34)C(35)CBccl
GCTTT−ACAGG 176C(42)C(51)T(39)C(40)C(40)C
Q1805 43E9 Q1805      Q1049     
Dm073543E12–43F2 CCAAG−TGCGC 74C(51)T(40)C(40)T(34)T(34)TAlul
ATCTA−GACAT 97C(42)T(46)C(40)C(45)C(36)TPfl1108l
Dm074644B1 GTATC−GTTCT 140–145GCACGG(51) *** **T(56)GCACGG(33)GCACGG(40)GCACGG(51)GCACGGBscGl
Dm220944D1–44D2 CAATA−AAGTA 128A(45)G(51)A(26)G/A(10)G(29)G
TATCA−AACAT 155A(46)G(40)A(35)G/A(16)G(32)G
Dm072244D5–44E2 CACCG−TTACT 101, 104G(56)TTT(51) *(41)TTC(45)G(32)TTC(14)G/*(11)TTC/T(11) *(39)TTC(40) *TTCMspl PinAl
Dm384546B1–46B2 TTTTT−AATAC 159A(39)C(51)A(51)C(35)CDral Msel
Dm034047D6–47E2 GCGAA−TTTTA 205T(28)A(32)T(37)T(40)T(40)T
Dm008448C5–48D2 CTCAT−CAGCT 762T(56)T(51)T(56)C(35)T(49)T
TTATT−GAGAG 772C(51)C(56)A(44)C(42)C(47)CTaql
TAAGT−ATTTC 808A(51)G(51)G(40)G(40)G(40)ATsp509l
TGAGA−GCCGA 866C(51)C(51)C(40)C(42)T(40)C
Q1049 49D1–D3 Q1049 Q1047
Dm248050A3–50A4 CCTAA−GACGT 1032A(51)DEL(51)DEL(53)A(35)A(33)A
TATCC−ACAAC 1060C(37)T(37)T(51)C(44)C(42)C
Dm006450C23–50D1 TTTAC−TTTCT 119A(37)T(31)A(40)A(31)A(37)A
GCTAA−TTACT 140G(33)A(29)G(40)A(40)A(40)GTsp509l
TATTA−TACAT 154T(40)A(30)T(43)A(35)A(42)TMsel Vspl
TTTTT−AAAAA 164G(9)T(13)A(36)A(36)G(35)T(50)G(27)A(44)G(42)A(44)GTDral Msel
Dm370951E10–51E11 AGAAT−GAATC 83, 85, 87–90C(35)AG(35)AATAA(42) *(98)A*(98)A*** *(98) *A*A*** * C(31)AG(29)AATAA(35)C(35)AG(35)AATAA(37)CAGAATAA
Dm078851E5–51E8 AAAAA−CAACA 1012C(40)T(51)T(33)C(26)C(42)C
Dm266351E5–51E8 TCATT−TTTTT 156A(42)G(22)G(36)A(42)A(42)A
Dm075452B1–52B2 AATCA−ATATA 916C(33)T(51)T(37)T(44)T(46)C
Dm219254A2–54B1 AAATG−TGGTT 608T(56)G(56)T(42)G(37)G(51)T
ATTTC−GGAAT 639G(51)C(42)G(32)G(48)G(51)GBspEl Mspl
Dm242754C3–54C11 TCGCT−CCTAC 185T(42)A(53)T(36)T(45)T(40)T
Dm349054E8–54F2 AGTGT−TTTTT 169T(44) *(56)T(56)T(56)T(46)A
Dm085155A2–55B1 GAGTT−CGAGT 737G(56)A(42)A(35)A(40)A(17)A
Dm172055A2–55B1 ACTGG−GAAAA 251G(56)A(48)G(35)G(27)G(14)G
TAAGT−TGCCA 271C(40)C(45)C(21)G(35)G(38)C
Dm263155A2–55B1 TTTTA−TTGCG 161T(46)T(51)T(20)G(56)G/T(11)T
TGCCA−CATCG 179T(40)C(37)T(34)T(34)T/C(28)T
Q1047 57A3 Q1047
Dm060256B1–56B2 ATCCC−CCGGT 80T(44)G(40)T(23)T(44)TAcil
GGATT−CTC 95A(37)G(56)A(35)A(33)A
CTC−TTCT 99T(33)C(56)T(42)T(33)T
TTCT−GGCAG 104T(56)G(56)T(42)T(35)T
GGTGT−TTCCG 152A(56)G(56)A(50)A(40)A
CTGGG−CACCT 178C(40)A(56)C(42)C(40)CHaeill
AGTTC−GTTTG 221T(56)C(56)T(37)T(17)T
Dm080056B1–56B2 AAAA−TAAAA 1298T(42)T(40)A(51)T(45)A
TTTAT−AAGTG 1319T(31)A(42)T(45)T(33)TMsel
CTAGG−AATAT 1336, 1339A(29)TAC(25)A(43)TAC(17)T(40)TAA(51)A(12)TAC(24)TTAAMsel
AATAT−ATATA 1345G(32)C(35)G(51)G(29)G
ATAGT−GCAAC 1364A(10)A(29)G(45)A(15)G
Dm082457A4–57A6 GAGCT−GGCGC 64G(42)T(42)G(44)T(42)T
TATAT−CAGCT 198, 200A(51)TT(42) *(51)T*(51)A(51)TT(51) *(98)T*(98) *T*
Dm006357E2–57E8 TTAGC−TAAAC G(35)G(45)G(34)G
Dm227458A4 CCCCC−TTTTC 129–138TATTTTCCCG(42)GA*** *** **(56)AATTTTCTCG(56)GA*** *** **(30)TBscGl
Dm154458B10–58C4 GGTGA−GGTAT 99C(19)C(51)C(42)C/T(13)CBccl
GTGGT−AACGA 144A(33)A(56)G(56)G/A(17)A
Dm163958F2–58F7 TTCCA−CGGAT 87, 90A(40)ATC(39)G(35)ATT(56)A(35)ATC(37)A(40)ATC(40)AATCBspEl Mspl
ATGAT−GGTTT 175C(40)G(56)C(56)C(30)CDpnl Sau3Al Bccl
Dm210159A1–59A3 AGTAA−TAGAA 690T(11)T(45)G(42)A(37)TTsp509l
ACTTC−TTACT 811A(40)G(51)T(42)G(42)
CAGCC−CTGGA 858G(22)T(37)G(40)T(32)GAcll
Dm248759F2 TATCT−ATACT 49A(40)G(45)A(40)A(33)A
ATTCC−AACAA 71A(46)A(45)G(51)A(30)G
AGGCA−GCGAT 107T(51)C(39)T(42)T(42)TNiaill Sphl Thal
CAGAC−CCGCA 173A(40)G(35)G(40)G(35)G
CAGCA−ACGGA 182G(33)C(43)C(56)C(42)C
Dm032760B2–60B10 TATGTGTG−TGTGTGCT 100–101T(39)A(39)DEL(98)DEL(98)DEL(98)TA
Dm224760D10–60D11 TTCGG−GGGCG 158C(42)T(56)T(44)T(42)CAcll
3rd Chromosome
STS Cytological location Flanking sequence Relative nucleotide position Oregon R Canton S w;iso2;iso3 P strain 1 Q1050 P strain 2 Release 1 y;cn bw sp RFLP?
Dm246761C1–61C2 ATCTT−ACTGG 219T(45)T(46)T(29)A(22)AMsel
CTGTC−TTTTT 235 *(56)T(48)T(27)T(30)
Dm068861D1–61D2 AATTT−GGAGA 203A(51)C(56)ACA
AGAAT−TGGAA 252G(51)C(51)GCG
Q1050 62B4–62B5 Q1050 Q1052
Dm383562F1–62F2 CAACA−GCTAA 202G(40)G(42)G(42)A(44)A(42)G
Dm22006303 GCCGC−TTCCC 151C(45)T(45)C(29)T(30)T(42)C
TTCCC−CCCTC 157A(45)G(51)A(44)G(42)G(37)GAcll
Dm208665A6–65A10 TACGA−AAGCT 35T(33)C(37)C(47)T(38)TT
TATTT−AAA 85–86C(35)A(12) *(98)A(51) *(98)C(56)C(15)A/C(11)CACA
AAA−TTTAA 90–94TTT(51)AA(45) ***(98)A*(98) ***(98)A*(98)TTT(24)AA(28)TTTAATTTAA
TTAAT−TTATA 101–110ATATATTTAA(48) *T*TA***A*(98) *T*TA***A*(98)ATATATTTAA(33)ATATATTTAA     ATATATTTAA
TTATA−GTATT 116–128TATATAATATATA(56) *(98) *(98)TATATAATATATA(28)TATATAATATATA     TATATAATATATA
ATTAA−GTATA 136T(42)C(40)C(42)T(42)TTVspl Tall
Dm092967A1–67A2 TAATG−GTTT 508T(56)C(56)C(42)C(18)C(42)C
GTTT−GTGGT 513A(40)T(42)A(40)T(24)T(44)T
Dm220267C10–67D1 CATTT−CCGAA 880G(42)A(40)G(40)A(37)A(35)G
Q1052 68A1–68A2 Q1052 Q1060
Dm221568B1–68B2 ATTCA−CTAAT 593, 596G(40)TTC(38)G(42)TTC(38)T(56)TTT(40)G(35)TTC(28)G(42)TTC(40)GTTC
TTCCC−TATCC 607T(42)T(37)A(42)T(42)T(46)T
TTCAC−AATA 619A(45)A(37)G(30)A(42)A(42)A
ATAA−AAC 624G(47)G(42)A(44)G(42)G(44)G
AAC−TTCAC 628A(47)A(44)T(47)A(42)A(42)A
TCAAC−ATTTT 641T(45)T(40)C(42)T(38)T(40)T
GTTTT−GTGGG 753T(45)T(40)C(42)T(42)T(42)T
Dm254469D1 TACTT−CGCTG 642G(27)G(40)G(40)A(51)A(40)GHhal
Dm3239 71F1–71F2 AGAGC−GAACT 73C(40)C(37)T(35)T(35)C(40)TAlul
GTGTG−GAGAT 88G(46)G(51)A(30)A(42)G(40)G
AATTC−TCCTA 114–132 **T*** *A**G*** *** *G*(56) **T*** *A**G*** ***G*(44) AATTTTTAATGTTTCTAGG(33) AATTTTTAATGTTTCTAGG(37) **T*** *A**G*** ***G*(35) AATTTTTAATGTTTCTAGG Bfal Msel Tsp509l
Dm1809 72C1–72D6 AGGCG−AGGAT 714C(40)C(39)A(19)ACCHhal
Q1060 73A1–A4 Q1060 Q1056
Dm3626 74F4–75A2 AGTGG–AAAGC 1267A(30)A(56)G(56)A(33)A(56)A
ATGGG−GGTGG 1301A(22)T(51)T(56)A(42)T(47)T
Dm2220 76B6–76B10 GCAAT−CCGAT 124T(16)T(29)G(33)T(22)T(42)GTsp509l
Dm3089 76C3 CGATG−TTTAT 51 *(98) *(98)T(51) *(98) *(98)T
AATAT−TA 156, 158A(51)TG(51)A(45)TG(45)G(44)TA(44)A(40)TG(45)A(56)TG(56)GTA
TA−TATAT 161–167TATGTAG(45)TATGTAG(45)CTACGTA(56)TATGTAG(42)TATGTAG(37)CTACGTA
Q1056 83A7–A8 Q1056 Q1058
Dm1568 86A5 TTGCA−CAAGA 155A(51)A(38)C(27)A(33)A(30)A
AACTC−TCCAC 165C(56)A(46)C(31)A(35)C(42)A
Dm3326 87F13−88A1 GCTGG−AAATA 71–91 ACATGAACTTATTCAGCAACG(35) *** *** *** **** *** *** *(98) ACATGAACTTATTCAGCAACG(31) *** ****** *** ***(98) *** ****** *** **(98) ACATGAACTTATTCAGCAACG(31) BspGl
Dm3478 88A1 TCCGT−CAGCA 90G(37)G(35)A(37)G(37)G(40)ACviRl Rsal
Dm0337 89C2–89C5 TCCTG−TATAT 856C(51)G(56)G(31)C(40)C(42)G
Dm2709 89D2–89D4 CAGGG−GATGA 668A(15)G(56)A(42)A(48)A(44)A
GAGCT−TCCAT 677, 680, 683 C(27)CTG(46)TAT(39) G(30)CTC(33)TAC(42) A(42)CTG(38)TAT(38) C(56)CTG(42)TAT(56) C(42)CTG(46)TAT(42) ACTGTATSacl
ATGAT−CCGCG 692A(37)T(56)A(34)A(42)A(35)A
CCGCG−CCCAT 698A(37)G(35)A(30)A(37)A(31)AHaelll Sacll Siml
GCCGA−GAATG 710–711G(42)C(42)A(41)A(41)G(38)C(38)G(46)G(46)G(42)C(42)GC
GAATG–AGCGA 717, 720–722 A(41)GAA(56)A(56)G(56) C(27)GAC(38)T(38)C(38) A(42)GAA(46)A(42)G(44) A(42)GAA(56)A(56)G(51) A(42)GAA(56)A(56)G(46) A
AGCGA−CAGAG 731, 734, 737, 740 G(32)GAG(44)CAC(45)GTG(45)TGT(45) A(48)GAA(37)CAG(45)GTT(51)TGC(45) G(46)GAG(37)CAC(42)GTG(44)TGT(42) G(51)GAG(46)CAC(44)GTG(40)TGT(51) G(56)GAG(48)CAC(37)GTG(45)TGT(56) GGAGCACGTGTGTPmll Tall
Dm2494 89F4–90A2 GGATG−CATCC 503C(38)C(20)C(56)T(44)C(42)CBccl
TCGGG−TACTA 623C(37)A(51)C(39)A(40)C(35)C
GATTT−CAACT 659G(45)G(45)G(42)C(29)G(35)GCvlRl
ATGCT−AGCTA 678T(45)A(51)T(38)A(30)T(42)T
Dm3952 90D1–90E2 AATTC−TTTTT 717T(27)T(30)T(44) *(98) *(98) *
Q1059 91A1–A2 Q1058 Q1059
Dm3293 92D1–91D2 TTGTG−GACCC 94T(51)C(40)C(35)C(42)T(42)C
CTGAG−ACTGA 112G(56)C(42)C(38)C(42)G(35)C
GGAAT−AGAAT 183T(40)C(45)C(35)C(35)T(33)CTsp509l
Dm1705 92D1–92D2 GATAT−CCAAT 473C(51)C(42)T(42)T(42)C(28)CEcoRV
TCATA−AAG 518C(33)C(28)A(42)A(33)C(28)C
AAG−CGGAT 522C(42)C(56)A(37)A(44)C(42)C
TGCGG−AGATC 617T(45)G(51)G(26)G(17)G(32)G
Dm2772 94C1–94C5 CATTT−GGCGG 409C(45)C(40)T(40)T(42)C(42)C
Dm169094E6–94E12 AGTTT−GGATT 802G(56)G(56)T(56)G(33)G(9)G
TAAAT−TGGAA 872T(44)T(45)T(42)G(36)G(9)TTsp509l
AACAT–TCTGA 902–903A(44)T(48)A(51)T(51)A(40)T(45)T(42)A(42)T(9)A(9)TA
TCTTT−GCCTG 916T(56)C(40)C(40)C(44)C/T(9)T
GCCTG−TTATT 922T(56)A(56)A(45)A(40)T(9)T
TTATT−AAGTG 940–950 *** *** *A***(98) *** *** *A***(98) *** *** *A***(98)TGTTCTTATTG(37)TGTTCTTATTG(11) *** ***A*** Msel
AGTGA−ATTGT 957C(51)T(56)T(44)T(27)T(11)C
GTATT−TTCAA 966C(56)C(56)C(56)T(42)T(20)C
AATAA−TTAGT 996T(45)T(43)T(40)A(17)A(16)T
TTAGT−TATAT 1002A(51)A(51)A(40)T(42)T(33)A
TATAT−TAAAC 1008A(51)A(51)C(40)C(17)C(20)C
Dm159295D1–95D6 AGACA−TTCAC 911C(56)T(44)T(31)C(56)C(56)C
Dm066798E11–98F3 TTTAA−ATTTC 1032C(29)G(35)G(56)G(42)C(30)G
CTAGC−TTGAA 1042C(51)T(56)T(46)T(37)C(35)GAlul
GTCCG−CGA 1081A(29)A/G(21)A(26)G(33)A(14)G
CGA−TTTAG 1085–1100 **T*T***A***G(36)A**(45) GGTATGGCATTCTATG(37) GGTATGGCATTCTATG(48) GGTATGGCATTCTATG(42) **T*T***A***G(42)A**(98) GGTATGGCATTCTATG(48)
AAACT−TTTTT 1138, 1142T(56)ATTG(37)G(56)ATTA(51)G(44)ATTA(45)G(37)ATTA(37)T(22)ATTG(15)GATTA
CAGAA−ATTGA 1156A(56)T(51)T(30)T(29)A(40)TTsp509l Sspl
TTGGC−AAAAT 1170C(42)A(51)A(44)A(44)C(26)AHaelil Mscl
AAAAT−TATTT 1176, 1178A(38)CG(40)G(56)CA(56)G(50)CA(37)G(30)CA(35)A(32)CA(21)GCASnaBl Tall CvlRl Nsil
Q1058 96F10–F11 Q1058 Q1059
Dm165797C1–97C3 CATGG−CGCAT 968C(37)C(40)A(30)C(38)C(24)CAcil Haelil
AGTTT−TGTTT 1024G(56)T(56)G(56)G(37)G(37)T
GCCCC−CCATC 1049G(40)A(40)A(40)A(37)G(40)AAcll
Dm247399D1–99D2 GTCGC−GAATC 1029 *(19)T(40)T(30)T(40)C(40)CNrul Thal
ATTGG−CCGGA 1049–1049, 1052 C*(11)T(11)CAG(22)T(40)T(40)CAA(40)T(23)T(25)CAA(23)T(38)T(37)CAA(33)T(35)C(35)CAG(33)CTTCAADrdll
Dm2288100C1–100C4 CTGCT−CATTT 1040C(42)T(44)T(47)C(21)C(29)C
TTGGC−GTTCC 1110–1113, 1116 *** *GG*(98)GTTT(40)GGT(40)GTTT(42)GGT(38) *** *GG*(98) *** *GG*(98)ATTTGGTAcil
CAAAG−TGCGT 1139T(51)C(37)C(42)T(42)T(56)CAlul

[i] List of SNP sequences identified among CS, OreR, w;iso2;iso3, y;cn bw sp, and pairwise combinations of 17 single P-element insertion strains. Identified SNPs are listed according to the STS from which they were derived. For each SNP entry, the approximate cytological location and ∼10 bp of sequence flanking the polymorphic site are indicated. Cytological locations are derived from the P1-based genome physical map (Kimmerly et al. 1996) and confirmed by comparison of the STS sequence to a database of Release 1 genome sequences with associated cytological locations (Adams et al. 2000). “Nucleotide Position” refers to nucleotide positions within sequence assemblies and is included to provide relative, rather than absolute, positions for each polymorphism. The identity of the dimorphic base for each SNP is listed for each strain, and the corresponding Phred quality score of that base in the sequence trace is listed parenthetically. Sequences from the y;cn bw sp strain were derived from the Release 1 sequence, where available. Restriction endonucleases listed under “RFLP?” are those for which the recognition site is present in one allele and absent in the other. The location of P-element insertions (designated as Q1040, etc.) is indicated; SNP sequences are listed between each pairwise set of P-element strains that can be used for eye-color-based selection of recombinants in that interval. All polymorphisms between the w;iso2;iso3 and P-element strains were initially detected by Denaturing HPLC. For additional data, see http://www.fruitfly.org/SNP/index.html.
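The allele entries in Table 2 pair each base call with a parenthetical Phred quality score. The following Python sketch, offered under stated assumptions, splits a run of such entries into (allele, score) pairs and converts a score to an error probability using the standard Phred definition Q = −10 log10(P). The function names and the regular expression are hypothetical helpers written for this illustration; the example string is the allele run from the first Dm3513 row.

```python
import re
from typing import List, Optional, Tuple

# One call = allele text (bases, '*' for a deleted base, '/' for a mixed call),
# optionally followed by a Phred score in parentheses.
_CALL = re.compile(r"([A-Z*/]+)(?:\((\d+)\))?")


def parse_calls(cell_text: str) -> List[Tuple[str, Optional[int]]]:
    """Split text such as 'G(42)C(42)G' into [('G', 42), ('C', 42), ('G', None)]."""
    calls = []
    for match in _CALL.finditer(cell_text.replace(" ", "")):
        allele, quality = match.groups()
        calls.append((allele, int(quality) if quality is not None else None))
    return calls


def phred_to_error_probability(q: int) -> float:
    """Standard Phred relation: a score Q corresponds to an error probability of 10^(-Q/10)."""
    return 10.0 ** (-q / 10.0)


dm3513_calls = parse_calls("G(42)C(42)G(46)G(51)G")
mixed_call = parse_calls("C/T(22)")
print(dm3513_calls)
print(mixed_call, phred_to_error_probability(20), phred_to_error_probability(40))
```

A trailing allele without a score, such as the final G above, is returned with None; in Table 2 this form is used for the Release 1 allele. The parser does not separate restriction enzyme names that are run together with the last allele, so those would need to be trimmed first.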

The chromosome-wide distribution of the polymorphisms identified in this study is represented in Figure 1. The distribution is relatively even across the genome, with an average density of one SNP marker per 225 kb on the autosomes and one per megabase on the X chromosome. The lower density of SNPs discovered on the X chromosome is due primarily to a lower density of available STSs. The region flanking the centromere of chromosome 3, including cytological divisions 77 through 83, contains far fewer polymorphisms per kilobase than the rest of the genome. All available STSs in this region were screened in the w;iso2;iso3/P-element strain comparison, and additional comparison of nearly 100 kb of sequence revealed a lack of polymorphism between these strains (data not shown). This observation is consistent with previous studies indicating that levels of sequence polymorphism correlate with recombination rates in Drosophila (Begun and Aquadro 1992; Charlesworth 1996). However, we do not observe a similar reduction of variation near the centromere of chromosome 2.

Figure 1.

Distribution in the Drosophila genome of SNPs identified in this study. The euchromatic portion of the genome is represented by horizontal bars, with the extent of each cytological division representing the genomic extent as estimated by Sorsa (1988). The positions of SNP markers identified in this study are represented by vertical hatch marks: SNPs for which sequence has been determined in all strains are represented by red hatch marks; those that were identified in pairwise subsets of strains are indicated by blue or green. The distribution is relatively even throughout the genome, with the exception of cytological divisions 78–83 near the centromere of chromosome 3. Strain designations (e.g., Q1040) and a small downward arrow indicate the positions of P elements useful for fine-scale mapping.


Mapping with SNPs

Having identified a sufficiently dense set of SNPs, we designed a simple strategy to use these markers for mapping single gene mutations (Fig. 2). This strategy requires two to four generations and informative assays for 10–20 SNPs. In the first step, the mutation of interest is mapped relative to six markers that span the chromosome at 8–10 Mb intervals (Fig. 2A), by using 96 unselected, chromosome-wide recombinants. These markers can be SNPs, transposon insertions, or visible markers. Once the interval in which the mutation lies is defined, SNPs are tested on the appropriate recombinants to further refine the position of the gene (Fig. 2B,C). Using this strategy, we mapped two independent recessive lethal mutations isolated in a screen for suppressors of a human p21 overexpression phenotype. These mutations were placed between the STSs Dm1601 and Dm1655 on chromosome 3, an interval of <1 Mb (Fig. 2C), and subsequently were shown to be alleles of the hedgehog gene (Tabata and Kornberg 1994).
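
The interval-narrowing logic of this strategy can be illustrated with a minimal sketch. It assumes that each recombinant chromosome is scored as a haplotype, with "M" for the allele of the mutagenized chromosome and "P" for the allele of the mapping chromosome, and that double crossovers within an interval can be ignored. The function name, data layout, and marker names are hypothetical and are not taken from the study.

```python
def candidate_intervals(markers, recombinants):
    """Return the marker intervals consistent with every recombinant.

    markers: marker names in chromosomal order (hypothetical names).
    recombinants: list of (genotypes, carries_mutation) pairs, where
        genotypes has one character per marker: 'M' for the mutagenized
        chromosome, 'P' for the mapping chromosome.
    """
    candidates = set(range(len(markers) - 1))
    for genotypes, carries_mutation in recombinants:
        # The mutation must sit on a segment inherited from this parent.
        expected = "M" if carries_mutation else "P"
        for i in list(candidates):
            left, right = genotypes[i], genotypes[i + 1]
            # Exclude an interval only when both flanking markers come from
            # the parent that cannot explain the phenotype. Intervals that
            # contain the crossover (left != right) stay possible.
            if left == right and left != expected:
                candidates.discard(i)
    return [(markers[i], markers[i + 1]) for i in sorted(candidates)]


markers = ["snp1", "snp2", "snp3", "snp4"]
recombinants = [("MMPP", True), ("PPMM", True)]
print(candidate_intervals(markers, recombinants))
```

In this toy example the first recombinant excludes the rightmost interval and the second excludes the leftmost, leaving only the interval between snp2 and snp3. Adding more recombinants or denser markers within the remaining interval narrows it further.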

To achieve the high-resolution mapping suitable for positional cloning, we have used the P-element strains listed in Table 2 in a strategy to select many white (−) recombinants in a small region (Cutforth and Rubin 1994; Spradling et al. 1995; see Discussion).

Discussion

Determining the locations of mutations in the genome is a critical component of genetic analysis. We have identified a set of 474 SNP markers spanning the Drosophila genome and showed their use as a new and powerful resource for genetic mapping. The density of available SNPs is high, so the mapping resolution achievable with SNPs is much greater than with traditional, phenotypic markers. With the example of the mutations in hedgehog, we show the ease with which mutations can be localized to intervals of ∼1 Mb, and we routinely use this approach to map genes to intervals of <500 kb. If the mutation is mapped relative to a P-element insertion in the initial mapping, a chromosome that contains the P element linked to the mutation can be used for finer scale localization. For instance, a useful approach is to generate many recombinants between the P[w+] elements shown in Table 2, which can be selected easily using the white eye color phenotype (Cutforth and Rubin 1994; Spradling et al. 1995). In a P[w+] interval of ∼10 Mb, 500 selected recombinants result in one recombination event every 20 kb on average. Using this approach and a combination of the SNPs reported here and newly discovered SNPs in a selected interval, we have localized a single mutation to <25 kb, an interval in which DNA sequencing is a realistic method for identifying mutations (D.A. Ruddy and M.C. Ellis, unpubl.).
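
The resolution estimate for selected recombinants is simple arithmetic, sketched below under the assumption that each selected recombinant carries one crossover within the interval and that crossovers are spread uniformly across it. The function name is hypothetical.

```python
def mean_breakpoint_spacing_kb(interval_mb, n_recombinants):
    """Average spacing in kb between crossover breakpoints.

    Assumes one crossover per selected recombinant and a uniform
    distribution of crossovers across the interval.
    """
    return interval_mb * 1000 / n_recombinants


# The case described in the text: 500 recombinants in a ~10 Mb interval.
print(mean_breakpoint_spacing_kb(10, 500))  # 20.0
```

Because recombination is not uniform along the chromosome, the spacing actually achieved in a given region may differ from this average.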

Although the paucity of molecular variation and recombination events near the centromeres is currently a limitation for recombination mapping, new techniques based on male-specific recombination for mapping relative to P-element insertion sites may provide a solution (Chen et al. 1998). A commonly used but relatively low-resolution approach to mapping mutations is complementation testing with chromosomal deficiencies, which sometimes can be confounded by synthetic interactions. Existing large collections of P-element insertion mutants also can be used in complementation tests with new, unmapped mutations, an approach that sometimes will obviate the need for high-resolution recombination mapping (Spradling et al. 1999). However, the spectrum of genes that can be recovered by insertional mutagenesis is limited by insertion site biases. Point mutations are useful for many genetic analyses and are recoverable only by chemical mutagenesis. A combined approach that makes use of both recombination mapping with SNPs and analysis of transposon insertions in candidate genes is an efficient approach to positional cloning.

Another advantage of SNPs as markers is the ability to use standardized and scalable genotyping methods. We have found DHPLC to be a useful technology for SNP scoring, as it is a technically simple and robust technology that is inexpensive to operate and well suited to detection of heteroduplex DNA molecules derived from heterozygous individuals. Assay technologies that do not require specialized instrumentation, such as PCR followed by restriction site polymorphism analysis, can be applied to many of the polymorphisms identified in this study. This approach has been shown to be effective for SNP mapping in C. elegans (S. Wicks, pers. comm.). A vast array of technologies for genotyping SNPs has been developed, a partial sampling of which includes systems based on primer extension, oligonucleotide ligation, or nuclease assays (Landegren et al. 1998), and various microarray formats (Hirschhorn et al. 2000; Pastinen et al. 2000).
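
Whether a SNP can be scored by PCR followed by restriction digestion depends on whether an enzyme's recognition site is present in one allele and absent in the other. A minimal sketch of that check follows. The function names are hypothetical. The example assumes that the entry with flanking sequence CAAAG−TGCGT is read as a T/C polymorphism scored with AluI, whose recognition sequence is AGCT. Only recognition sites without degenerate bases are handled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def has_site(sequence, site):
    """True if the recognition site occurs on either strand of the sequence."""
    sequence, site = sequence.upper(), site.upper()
    reverse_complement = site.translate(COMPLEMENT)[::-1]
    return site in sequence or reverse_complement in sequence


def is_rflp(flank_left, flank_right, allele_a, allele_b, site):
    """True if the site is present in exactly one of the two alleles."""
    in_a = has_site(flank_left + allele_a + flank_right, site)
    in_b = has_site(flank_left + allele_b + flank_right, site)
    return in_a != in_b


# The C allele creates AGCT (CAAAGCTGCGT); the T allele does not.
print(is_rflp("CAAAG", "TGCGT", "T", "C", "AGCT"))  # True
```

In practice, the check would be run on the full amplified STS rather than on the short flanking sequence, since additional copies of the site elsewhere in the amplicon change the expected fragment pattern.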

The set of SNPs described here can be used in mapping crosses involving any strain. For example, a mutation in an unknown genetic background could be mapped in two sets of crosses to any two of the isogenic strains described here, using SNP markers that distinguish the two mapping strains. Because SNP markers are biallelic, essentially all SNPs will distinguish the unknown strain from one or the other mapping strain, so essentially all markers will be informative in one cross or the other. Furthermore, genotyping assays can be used to identify a subset of the SNPs presented here that differentiate two unknown strains. However, the development of standardized Drosophila mapping resources may benefit from the selection of standardized strains and a standard set of SNPs that can be scored using a widely available and easily accessible scoring technology.
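
The argument that a biallelic marker is informative in one cross or the other can be made concrete with a small sketch. The function name and the example alleles are hypothetical. The two mapping strains are assumed to carry different alleles at the marker, and the unknown strain is assumed to carry one of those two alleles.

```python
def informative_crosses(unknown_allele, mapping_alleles):
    """Return the mapping strains whose allele differs from the unknown strain.

    mapping_alleles: dict of strain name -> allele at this SNP.
    """
    return [strain for strain, allele in mapping_alleles.items()
            if allele != unknown_allele]


# A marker that distinguishes the two mapping strains (illustrative alleles).
marker = {"Canton S": "T", "Oregon R": "C"}
print(informative_crosses("T", marker))  # ['Oregon R']
print(informative_crosses("C", marker))  # ['Canton S']
```

Whichever allele the unknown strain carries, exactly one of the two crosses is informative for that marker. A third allele in the unknown strain, which is rare for SNPs, would make both crosses informative.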

With the complete genomic sequence now available, we can look toward rapid developments in genomics-based approaches to biological problems in Drosophila. For example, a higher density set of SNPs could be developed to enable even higher resolution mapping strategies. Additional large-scale SNP discovery is now very straightforward using the reference Drosophila genome sequence as a guide. SNPs will be useful in other mapping applications, including characterization of complex traits, QTL mapping, and loss-of-heterozygosity approaches to defining deletion end points. SNPs also may be useful as Drosophila strain identifiers. Finally, SNPs may have use in genomic sequence-based screening approaches, whereby randomly mutagenized chromosomes are screened for molecular lesions (Bentley et al. 2000).

METHODS

Drosophila Strains

Wild-type Drosophila strains were obtained from the Bloomington Stock Center. Canton S and Oregon R were obtained from the laboratory of G.M. Rubin; single P-element insertion strains were obtained from C. Goodman and G.M. Rubin and were derived from an enhancer trap screen performed in their laboratories. Canton S, Oregon R, and w;iso2;iso3 were made isogenic for the indicated chromosomes (X, 2, or 3) according to standard techniques, with the exception of Canton S and Oregon R isogenic chromosome 2 stocks, which were obtained from T. Laverty and G.M. Rubin. All P-element strains and newly isogenized wild-type strains are available from the Bloomington Stock Center.

Molecular Biology

Drosophila genomic DNA was isolated either from adult populations or single recombinant flies according to standard techniques. STSs were amplified from genomic DNA preparations using either standard or touchdown PCR (Don et al. 1991) to facilitate amplification of most STSs under a single set of conditions.

Identification of Sequence Variants

The STSs used in this study were developed by the Berkeley Drosophila Genome Project (BDGP; Kimmerly et al. 1996), and were selected based on map position inferred from the P1 clones and contigs with which they are associated. STS sequences, primer sequences, and PCR conditions are available on the BDGP Web site (http://www.fruitfly.org). STS sequences were compared among different strains by using the two approaches discussed below.

DNA sequencing

One thousand sixteen STSs were selected for sequencing in the Canton S and Oregon R strains based on the following criteria: primer annealing temperature of 58°C, PCR product length of >180 bp, and reliable amplification in STS content mapping experiments. STSs were PCR amplified with AmpliTaq Gold polymerase (Applied Biosystems) by using PCR conditions described in individual STS reports available at http://www.fruitfly.org. PCR products were treated with exonuclease I and shrimp alkaline phosphatase to degrade primers and free nucleotides (Werle et al. 1994). Treated products were sequenced using the PCR primers and BigDye terminator sequencing chemistry (Applied Biosystems) and analyzed on ABI 377 sequencing machines. Sequences were assembled using the Phred/Phrap/Consed package (Ewing and Green 1998; Gordon et al. 1998; http://genome.washington.edu). Candidate polymorphisms were detected both by inspection of traces and by automated detection of high-quality sequence discrepancies in Consed. High-quality sequence was obtained from 796 of the 1016 STSs, and 309 of these contained at least one polymorphism. The sequences in Canton S and Oregon R of 49 of these polymorphic STSs have been reported previously (Teeter et al. 2000).
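
For readers unfamiliar with Phred quality scores, the sketch below shows the standard relation between a score Q and the estimated base-call error probability, P = 10^(−Q/10), together with a simple rule for flagging a high-quality discrepancy between two strains. The function names and the cutoff of 30 are assumptions made for illustration; the quality threshold used in Consed for this study is not stated in the text.

```python
def phred_error_probability(q):
    """Estimated probability that a base call with Phred quality q is wrong."""
    return 10 ** (-q / 10)


def is_high_quality_discrepancy(call_a, call_b, min_quality=30):
    """Flag a candidate polymorphism at one aligned position.

    call_a, call_b: (base, phred_quality) for each strain.
    min_quality: hypothetical cutoff, not taken from the study.
    """
    (base_a, q_a), (base_b, q_b) = call_a, call_b
    return base_a != base_b and q_a >= min_quality and q_b >= min_quality


# A quality of 40 corresponds to about one expected error in 10,000 calls.
print(phred_error_probability(40))
print(is_high_quality_discrepancy(("G", 40), ("A", 40)))  # True
print(is_high_quality_discrepancy(("G", 40), ("A", 11)))  # False
```

Requiring high quality in both strains keeps low-confidence base calls from being reported as polymorphisms. Such calls would still be worth resolving by inspection of the traces.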

DHPLC

Approximately 1050 STSs between 180 and 200 bp in size were selected and amplified by PCR from the P-element and w;iso2;iso3 strains. Successful amplification was confirmed by gel electrophoresis. For each STS, the products amplified from the two strains were mixed in a 1:1 (v/v) ratio. The mixed products were denatured for 5 min at 95°C and reannealed slowly to create heteroduplex molecules. Presence or absence of heterozygosity was analyzed using DHPLC (Oefner and Underhill 1998) under the following conditions. Samples were run on a metal-free HPLC system (Varian Chromatography Systems) fitted with a column capable of high-resolution DNA separations; the Eclipse and Helix columns (Varian Chromatography Systems) or the DNASep column (Transgenomic, Inc.) were used. Chromatographic separations were performed using a uniform gradient (1.8% acetonitrile/min for DNASep columns, 4%/min for other columns), and all STSs were screened at 52°C, 54°C, 56°C, 58°C, and 60°C to determine the optimum temperature for heteroduplex detection. STSs positive for heterozygosity were DHPLC-analyzed in the homozygous strains and sequenced to identify the variant base.

Sequences from both approaches were assembled using Phred and Phrap and analyzed in Consed to generate the sequences represented in Table 2 and at http://www.fruitfly.org. All STSs containing SNPs were sequenced on both strands; in some cases, this resulted in single-stranded coverage of SNPs near the ends of STSs. The SNPs listed in Table 2 were sequenced in the strains that contain the two flanking P elements to identify those useful for fine-scale mapping.

Recombination Mapping

Alleles of the hedgehog gene were isolated in a screen for ethyl methanesulfonate-induced mutants that suppress a p21 overexpression phenotype. These mutations were induced in a w;iso2;iso3 strain, and balanced mutant stocks were established over TM3 or TM6 by standard methods. The mutations were localized initially by crossing to a mapping chromosome that contained six molecular polymorphisms at evenly spaced intervals. By analyzing 96 random recombinants for both (1) the allele at each marker, and (2) the presence or absence of the mutant gene, the position of the mutation along the chromosome can be established to within a 10–20-Mb interval of the chromosome between the molecular markers. SNP markers in the appropriate interval were amplified from each recombinant and analyzed using DHPLC to generate the haplotypes shown in Figure 2.

We thank Brett Milash for establishing an STS location database at Exelixis, Ross Buchholz and Wes Miyazaki for generating hedgehog recombinants, the Exelixis sequencing group for sequencing support, Bruce Kimmel and Mike Palazzolo for advice and discussions, and Gerald M. Rubin for support on NIH grant P50-HG00750. This work was funded in part by grant BG94-206 to E.M.R., and from Rhone-Poulenc to R.A.H.

The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 USC section 1734 solely to indicate this fact.

Notes

[3] Corresponding authors.

[4] E-MAIL ; FAX (510) 486-6798.

[5] E-MAIL ; FAX (650) 837-7220.

Notes

[6] Article published on-line before print: Genome Res., 10.1101/gr.178001.

[7] Article and publication are at www.genome.org/cgi/doi/10.1101/gr.178001.

REFERENCES

  1. M.D. Adams, S.E. Celniker, R.A. Holt, C.A. Evans, J.D. Gocayne, P.G. Amanatides, S.E. Scherer, P.W. Li, R.A. Hoskins, R.F. Galle (2000) The genome sequence of Drosophila melanogaster. Science 287:2185–2195.
  2. D.J. Begun, C.F. Aquadro (1992) Levels of naturally occurring DNA polymorphism correlate with recombination rates in D. melanogaster. Nature 356:519–520.
  3. D.J. Begun, C.F. Aquadro (1993) African and North American populations of Drosophila melanogaster are very different at the DNA level. Nature 365:548–550.
  4. A. Bentley, B. MacLennan, J. Calvo, C.R. Dearolf (2000) Targeted recovery of mutations in Drosophila. Genetics 156:1169–1173.
  5. M. Cargill, D. Altshuler, J. Ireland, P. Sklar, K. Ardlie, N. Patil, N. Shaw, C.R. Lane, E.P. Lim, N. Kalyanaraman (1999) Characterization of single-nucleotide polymorphisms in coding regions of human genes. Nat. Genet. 22:231–238.
  6. B. Charlesworth (1996) Background selection and patterns of genetic diversity in Drosophila melanogaster. Genet. Res. 68:131–149.
  7. B. Chen, T. Chu, E. Harms, J.P. Gergen, S. Strickland (1998) Mapping of Drosophila mutations using site-specific male recombination. Genetics 149:157–163.
  8. R.J. Cho, M. Mindrinos, D.R. Richards, R.J. Sapolsky, M. Anderson, E. Drenkard, J. Dewdney, T.L. Reuber, M. Stammers, N. Federspiel (1999) Genome-wide mapping with biallelic markers in Arabidopsis thaliana. Nat. Genet. 23:203–207.
  9. T. Cutforth, G.M. Rubin (1994) Mutations in Hsp83 and cdc37 impair signaling by the sevenless receptor tyrosine kinase in Drosophila. Cell 77:1027–1036.
  10. R.H. Don, P.T. Cox, B.J. Wainwright, K. Baker, J.S. Mattick (1991) “Touchdown” PCR to circumvent spurious priming during gene amplification. Nucleic Acids Res. 19:4008.
  11. B. Ewing, P. Green (1998) Base-calling of automated sequencer traces using Phred. II. Error probabilities. Genome Res. 8:186–194.
  12. D. Gordon, C. Abajian, P. Green (1998) Consed: A graphical tool for sequence finishing. Genome Res. 8:195–202.
  13. J.G. Hacia, J.B. Fan, O. Ryder, L. Jin, K. Edgemon, G. Ghandour, R.A. Mayer, B. Sun, L. Hsie, C.M. Robbins (1999) Determination of ancestral alleles for human single-nucleotide polymorphisms using high-density oligonucleotide arrays. Nat. Genet. 22:164–167.
  14. J.N. Hirschhorn, P. Sklar, K. Lindblad-Toh, Y.M. Lim, M. Ruiz-Gutierrez, S. Bolk, B. Langhorst, S. Schaffner, E. Winchester, E.S. Lander (2000) SBE-TAGS: An array-based method for efficient single-nucleotide polymorphism genotyping. Proc. Natl. Acad. Sci. 97:12164–12169.
  15. W. Kimmerly, K. Stultz, S. Lewis, K. Lewis, V. Lustre, R. Romero, J. Benke, D. Sun, G. Shirley, C. Martin (1996) A P1-based physical map of the Drosophila euchromatic genome. Genome Res. 6:414–430.
  16. R. Koch, H.G.A.M. van Luenen, M. van der Horst, K.L. Thijssen, R.H.A. Plasterk (2000) Single nucleotide polymorphisms in wild isolates of Caenorhabditis elegans. Genome Res. 10:1690–1696.
  17. U. Landegren, M. Nilsson, P.Y. Kwok (1998) Reading bits of genetic information: Methods for single-nucleotide polymorphism analysis. Genome Res. 8:769–776.
  18. K. Lindblad-Toh, E. Winchester, M.J. Daly, D.G. Wang, J.N. Hirschhorn, J.P. Laviolette, K. Ardlie, D.E. Reich, E. Robinson, P. Sklar (2000) Large-scale discovery and genotyping of single-nucleotide polymorphisms in the mouse. Nat. Genet. 24:381–386.
  19. E.N. Moriyama, J.R. Powell (1996) Intraspecific nuclear DNA variation in Drosophila. Mol. Biol. Evol. 13:261–277.
  20. P.J. Oefner, P.A. Underhill (1998) DNA mutation detection using denaturing high-performance liquid chromatography (DHPLC). In Current protocols in human genetics, ed. N.C. Dracopoli (Wiley, New York), Supplement 19:7.10.1–7.10.12.
  21. T. Pastinen, M. Raitio, K. Lindroos, P. Tainola, L. Peltonen, A.C. Syvanen (2000) A system for specific, high-throughput genotyping by allele-specific primer extension on microarrays. Genome Res. 10:1031–1042.
  22. M.D. Schug, T.F. Mackay, C.F. Aquadro (1997) Low mutation rates of microsatellite loci in Drosophila melanogaster. Nat. Genet. 15:99–102.
  23. V. Sorsa (1988) Chromosome maps of Drosophila, Vol. II. (CRC Press, Boca Raton, FL).
  24. J.I. Spiegelman, M.N. Mindrinos, C. Fankhauser, D. Richards, J. Lutes, J. Chory, P.J. Oefner (2000) Cloning of the Arabidopsis RSF1 gene using a mapping strategy based on high density DNA arrays and denaturing high performance liquid chromatography. Plant Cell 12:2485–2498.
  25. A.C. Spradling, D.M. Stern, I. Kiss, J. Roote, T. Laverty, G.M. Rubin (1995) Gene disruptions using P transposable elements: An integral component of the Drosophila genome project. Proc. Natl. Acad. Sci. 92:10824–10830.
  26. A.C. Spradling, D. Stern, A. Beaton, E.J. Rhem, T. Laverty, N. Mozden, S. Misra, G.M. Rubin (1999) The Berkeley Drosophila Genome Project gene disruption project: Single P-element insertions mutating 25% of vital Drosophila genes. Genetics 153:135–177.
  27. T. Tabata, T.B. Kornberg (1994) Hedgehog is a signaling protein with a key role in patterning Drosophila imaginal discs. Cell 76:89–102.
  28. K. Teeter, M. Naeemuddin, R. Gasperini, E. Zimmerman, K.P. White, R. Hoskins, G. Gibson (2000) Haplotype dimorphism in a SNP collection from Drosophila melanogaster. J. Exp. Zool. 288:63–75.
  29. P.A. Underhill, L. Jin, A.A. Lin, S.Q. Mehdi, T. Jenkins, D. Vollrath, R.W. Davis, L.L. Cavalli-Sforza, P.J. Oefner (1997) Detection of numerous Y chromosome biallelic polymorphisms by denaturing high-performance liquid chromatography. Genome Res. 7:996–1005.
  30. D.G. Wang, J.B. Fan, C.J. Siao, A. Berno, P. Young, R. Sapolsky, G. Ghandour, N. Perkins, E. Winchester, J. Spencer (1998) Large-scale identification, mapping, and genotyping of single-nucleotide polymorphisms in the human genome. Science 280:1077–1082.
  31. E. Werle, C. Schneider, M. Renner, M. Volker, W. Fiehn (1994) Convenient single-step, one tube purification of PCR products for direct sequencing. Nucleic Acids Res. 22:4354–4355.
  32. E.A. Winzeler, D.R. Richards, A.R. Conway, A.L. Goldstein, S. Kalman, M.J. McCullough, J.H. McCusker, D.A. Stevens, L. Wodicka, D.J. Lockhart (1998) Direct allelic variation scanning of the yeast genome. Science 281:1194–1197.