Y-Chromosomal SNPs in Finno–Ugric-Speaking Populations Analyzed by Minisequencing on Microarrays

  1. Mirja Raitio1,5,
  2. Katarina Lindroos1,4,5,
  3. Minna Laukkanen1,
  4. Tomi Pastinen1,
  5. Pertti Sistonen2,
  6. Antti Sajantila3, and
  7. Ann-Christine Syvänen1,4,6
  1. 1Department of Human Molecular Genetics, National Public Health Institute, Helsinki, Finland; 2Finnish Red Cross Blood Transfusion Service, Helsinki, Finland; 3Department of Forensic Medicine, University of Helsinki, Helsinki, Finland; 4Molecular Medicine, Department of Medical Sciences, Uppsala University, Uppsala, Sweden

Abstract

An increasing number of single nucleotide polymorphisms (SNPs) on the Y chromosome are being identified. To utilize the full potential of the SNP markers in population genetic studies, new genotyping methods with high throughput are required. We describe a microarray system based on the minisequencing single nucleotide primer extension principle for multiplex genotyping of Y-chromosomal SNP markers. The system was applied for screening a panel of 25 Y-chromosomal SNPs in a unique collection of samples representing five Finno–Ugric populations. The specific minisequencing reaction provides 5-fold to infinite discrimination between the Y-chromosomal genotypes, and the microarray format of the system allows parallel and simultaneous analysis of large numbers of SNPs and samples. In addition to the SNP markers, five Y-chromosomal microsatellite loci were typed. Altogether 10,000 genotypes were generated to assess the genetic diversity in these population samples. Six of the 25 SNP markers (M9, Tat, SRY10831, M17, M12, 92R7) were polymorphic in the analyzed populations, yielding six distinct SNP haplotypes. The microsatellite data were used to study the genetic structure of two major SNP haplotypes in the Finns and the Saami in more detail. We found that the most common haplotypes are shared between the Finns and the Saami, and that the SNP haplotypes show regional differences within the Finns and the Saami, which supports the hypothesis of two separate settlement waves to Finland.

The sequence variation on the Y chromosome has appeared to be low, and until recently only a modest number of single nucleotide polymorphisms (SNPs) were known (for references, see Table 1). By screening 18 kb of coding sequence in four genes in the nonrecombining parts of the Y chromosome in samples from the five continents, ∼100 new Y-chromosomal SNPs were recently discovered (Shen et al. 2000). The frequency of SNPs was found to be lower than that of autosomal SNPs (Shen et al. 2000), in accordance with the four-times-lower effective population size of the Y chromosome and a recent common ancestor for the Y chromosome (Hammer 1995; Thomson et al. 2000). The Y-chromosomal SNPs are useful biallelic markers for following paternal lineages as a complement to the widely analyzed sequence variation of the maternally inherited mitochondrial DNA (DiRienzo and Wilson 1991; Sajantila et al. 1995; Sigurðardóttir et al. 2000). Because SNPs in the nonrecombining parts of the Y chromosome can be considered as the results of unique events during evolution, they can be used in combination with the more rapidly evolving microsatellite markers to construct well-defined haplotypes to improve the power of the statistical analysis of the results (Jobling et al. 1997).

Table 1.

PCR and Minisequencing Primers for the Y-Chromosomal SNP Markers

A variety of techniques have been applied for discovering and genotyping Y-chromosomal SNPs; of these, heteroduplex analysis using denaturing high-performance liquid chromatography (DHPLC) has been the most successful approach (Underhill et al. 1997). DHPLC has proven to be particularly efficient for discovering new SNPs, but the technique is useful also for scoring previously known ones (Underhill et al. 1997; Shen et al. 2000). As the number of known SNPs on the Y chromosome increases, methods with higher throughput than those currently available will be required to utilize the potential of known SNP markers in population studies effectively. Microarray-based genotyping technology is a promising alternative for parallel and simultaneous analysis of many SNPs in large sample sets (Hacia 1999; Pastinen et al. 2000). In the present study we describe a microarray system based on the minisequencing single nucleotide primer extension principle (Syvänen et al. 1990; Pastinen et al. 1997) to facilitate rapid and reliable multiplex genotyping of Y-chromosomal SNPs. Using this newly developed genotyping system we screened a panel of 25 human Y-chromosomal SNPs in a unique collection of 300 samples representing five Finno–Ugric-speaking populations. The samples were from Finns originating from various geographical locations in Finland, three different Saami groups, Karelians from Russia, and Ob-Ugric Mansi and Khantyi speakers (Fig. 1). In addition to the Y-chromosomal SNPs, we typed five Y-chromosomal microsatellite loci in these samples. Altogether, 10,000 genotypes were generated to assess the genetic diversity in these population samples.

Figure 1.

The geographic origin of the sampled Finno–Ugric-speaking populations and the frequencies of Y-chromosomal haplotypes formed by SNPs in these populations. For haplotype designation and nomenclature as well as the population frequencies of the observed haplotypes, see Table 5.

RESULTS AND DISCUSSION

The Genotyping System

A method based on minisequencing single nucleotide primer extension (Syvänen 1999) in a microarray format was developed for multiplex genotyping of 25 Y-chromosomal SNPs. The 25 SNPs to be included in the panel were selected from the literature, and minisequencing primers for each SNP were designed to anneal to the DNA region immediately adjacent to the site of the SNP (Table 1). The primers were immobilized covalently on glass microscope slides as 5 × 5 arrays (Fig. 2). The genotyping procedure involves (1) multiplex PCR amplification of the Y-chromosomal DNA regions spanning the SNPs; (2) extension of the immobilized primers with labeled ddNTPs using a DNA polymerase with the multiplex PCR products as templates; (3) measurement of the incorporated label on the microarrays; and (4) interpretation of the results.

Figure 2.

Result from validation of the microarray-based genotyping system. Two mixtures of 25 synthetic oligonucleotide templates corresponding to one of the alleles at each SNP site were analyzed in minisequencing reactions on the microarrays to verify that both alleles of each SNP marker would be detected. (Left panel) The fluorescence image obtained from four detection reactions (A, C, G, T) with TAMRA-labeled ddNTPs. (Right panel) Order of the SNP markers on the array with the nucleotide variation at each site. The asterisk indicates those alleles that are detected from complementary DNA strands compared to earlier published literature.

Performing multiplex PCR amplification is a major rate-limiting step in all currently applied genotyping procedures. Amplification of more than 10 fragments reproducibly and successfully from multiple samples has proven to be difficult (Hacia et al. 1998; Pastinen et al. 2000). In the present study a touchdown PCR procedure (Don et al. 1991) was used to circumvent differences in amplification efficiency arising from differences in melting temperatures between the PCR primers in combination with universal 5′ sequences on the primers (Shuber et al. 1995). This procedure allowed us to unify the reaction kinetics of the primer annealing and amplify 24 Y-chromosomal SNPs as two sets of multiplex PCRs with 18 and 6 primer pairs, respectively. Although the success rate of the optimized multiplex PCRs was 100% for six of the SNPs, in 4% of the samples one or more SNP sites failed to amplify in the multiplex PCR (Table 2). These failures were recognized as absence of signal in the reactions on the microarrays, and these genotypes were obtained after reamplification of the first PCR product, followed by retyping using microarrays. The marker 92R7 was amplified and genotyped individually by minisequencing in the microtiter plate format (Syvänen 1997) because the SNP is located in one out of several repetitive segments (C. Tyler-Smith, pers. comm.), and our initial experiments showed that it was difficult to genotype on the arrays. The minisequencing reactions were optimized with respect to concentration of detection primers during their immobilization to the microarrays, and with respect to concentration of ddNTPs and DNA polymerase during the reactions on the microarrays. The high reaction temperature and short reaction time were found to be critical for avoiding false positive signals caused by template-independent extension of some of the primers. The protocol given in the Methods section is the result of these optimization experiments.

Table 2.

Success Rate of Multiplex PCRs

To validate that the microarray-based system would detect either allele of all 25 Y-chromosomal SNPs correctly, two mixtures of artificial single-stranded oligonucleotide templates representing one of the alleles at each site were analyzed. Figure 2 shows the fluorescence image from this analysis. Table 3 gives the corresponding numeric values measured by the fluorescence scanner and the calculated signal ratios defining the haploid Y-chromosomal genotypes. As can be seen, the 5-fold to infinite differences in signal ratios (R values) between the two genotypes provide unequivocal results at all sites. Unequivocal discrimination between genotypes is achieved also when multiplex PCR fragments amplified from the genomic DNA samples are analyzed. The range of R values for the monomorphic markers was 0.005–0.070 with an average of 0.03, or 0.98–1.0 with an average of 0.99, respectively. Table 4 illustrates the genotyping results when 5 or 10 samples of each genotype at the polymorphic Tat, M9, SRY10831, and M17 SNP sites were analyzed using both radioactive (33P) and fluorescence (TAMRA) detection. Both the detection systems perform well, with the largest difference in fluorescent and radioactive signal ratios for the marker M9. The microarray-based minisequencing system allows genotyping of diploid SNPs both using 33P as label as shown in a previous study (Pastinen et al. 1998) and using fluorescence based on our more recent work (K. Lindroos et al., unpubl.). The major advantage of the fluorescence detection system is the higher resolution of the fluorescence scanner, which allows measurement of an extremely large number of data points on a miniaturized area.

Table 3.

Results from Fluorescent Detection of Both Alleles of 25 Y-Chromosomal SNPs in Artificial Single Stranded Templates

Table 4.

Accuracy of Genotype Discrimination and Comparison of Radioactive and Fluorescent Detection in Minisequencing on Microarrays for Four Polymorphic Y-Chromosomal SNPs

Y-Chromosomal SNP and Haplotype Frequencies

Our analysis of the Finnish, Saami, Karelian, Mansi, and Khantyi samples for the 25 Y-chromosomal SNP markers included in the panel revealed that five of them (M9, Tat, SRY10831, M17, and 92R7) were polymorphic in all our population samples. The SNP M12 (DYS260), with a G → T transversion, was also polymorphic to some extent in the Kola-Saami samples (Fig. 3). Interestingly, the M12 T allele has also been found in other European populations as well as in subcontinent Indians (Underhill et al. 1997). The remaining 19 markers were monomorphic, which agrees well with the earlier published literature (Bianchi et al. 1997; Hammer et al. 1997; Karafet et al. 1999). The genotypes of the polymorphic markers were confirmed by solid-phase minisequencing in a microtiter format (data not shown; Syvänen 1997).

Figure 3.

Allele frequencies of six polymorphic Y-chromosomal SNP markers.

Thus, we found six polymorphic Y-chromosomal SNPs in our population samples, and six different haplotypes could be constructed from these (see Table 5 for the haplotype nomenclature). The most striking difference in haplotype frequencies between the populations is that the haplotype H26 is present at high frequency in the Ob-Ugric sample (55%, Fig. 1), whereas it is absent in other groups, with the exception of the Finns, where it was found once in the subgroup of Eastern Finns (Fig. 1; Table 5). On the other hand, haplotype H2, which occurs with high frequency in other population samples (15%–52%), is absent in our Ob-Ugric sample. Haplotype H16, which has earlier been reported to be present in several Northern Eurasian samples, was found at high frequency in all our samples, ranging from 38% in the Ob-Ugrics to 77% in the Finns. Figure 1 and Table 5 summarize the haplotype frequencies in the analyzed populations.

Table 5.

Nomenclature and Frequencies for Y-Chromosomal SNP Haplotypes

Many of the SNPs analyzed in our panel seem to be restricted to specific populations. The DYS199 C → T transition, for instance, has so far been shown to be polymorphic only among AmerIndians and in the Chukchan populations in Northeastern Siberia (Underhill et al. 1996; Karafet et al. 1997; Lell et al. 1997; Ruiz-Linarez et al. 1999). Only the C allele was identified in our sample collection. Similarly, the SNP M19, with a T → A transversion, has been found only in some South American populations (Underhill et al. 1997; Ruiz-Linarez et al. 1999). The Tat T → C transition is also geographically restricted and has so far been observed only in Northern Eurasian populations (Zerjal et al. 1997; Lahermo et al. 1999). In our samples the Finns had the highest frequency (78%) of the C allele, but this variant also appeared in all the other populations, including the Norwegians (which were used as a reference population) with a frequency of 12% (Fig. 3). In an earlier study the frequency in Norwegians was found to be 4% (Zerjal et al. 1997). However, the C → G substitution at the M9 locus appears to be an old mutation because it is frequent worldwide with the exception of Africa. Its absence in Africa might suggest that it occurred initially outside Africa during an early divergence out of Africa. The G allele has been shown to be prevalent in Eurasia, which is consistent with our data (Underhill et al. 1997). It is highly frequent in the Finns and in the Saami, and usually occurs on the same chromosome as the Tat C allele. Assuming that the order of mutation is correct, and that no back mutations have occurred at these sites, the two major haplogroups are divided by two mutation steps.

One of the major haplotypes (H2) carries a chromosome pool with the M9 C allele together with the Tat T allele, whereas the other major haplotype (H16) is the result of two subsequent mutations, wherein first the M9 C has been substituted with G after which the Tat T has been substituted with C. The latter has been shown to be the most frequent allele in Finland (Zerjal et al. 1997; Lahermo et al. 1999). This haplotype is highly frequent, especially in the Northern and Eastern parts of Finland, where its frequency reaches 93% and 84%, respectively (Fig. 1; Table 5). These regional differences between the Eastern and Northern parts compared to Western Finland support the earlier archeological and genetic evidence for two separate settlement waves to Finland (Kittles et al. 1998; Lahermo et al. 1999). Even though YAP+ is supposed to have the same time depth as M9, and is present in different European as well as Asian populations, it has not been found in the Finns and Saami (Lahermo et al. 1999; Altheide and Hammer 1997). This is consistent with our data, as the YAP+ and DYS271 G allele are associated, and only the A allele of DYS271 was found in our samples. In earlier studies the PN2 C → T transition was reported to occur after the SRY4064 G → A transition (Altheide and Hammer 1997) and also after the DYS271 A → G transition in a YAP+ haplotype lineage (Hammer et al. 1997). Our data also support this finding, because our samples only contained the ancestral allele of these three markers.

To further characterize the genetic hierarchy of the Finno–Ugric-speaking populations we applied an analysis of molecular variance (AMOVA) by grouping the populations according to linguistics into Finnish speakers, Saami speakers, and Ob-Ugric speakers. This division also corresponds to their geographical localities, and thus no further division according to geography could be made. According to the AMOVA, the vast majority (83%) of the sequence variation on the Y chromosome based on the analyzed SNPs occurs within different populations, whereas 10% occurs among the linguistic subgroups, and only 6.5% occurs between populations among linguistic subgroups. Earlier studies based on autosomal markers have also indicated that 80%–90% of the genetic variation in humans occurs within populations (Barbujani et al. 1997). Our study confirms this finding, although a large variation owing to geographical distances is not expected, since our study populations are from areas relatively close to each other. Obviously, more populations with larger numbers of individuals should be studied to further characterize the underlying demographic or cultural factors that have played a role in forming the genetic structure of these rather isolated populations.

Y-Chromosomal Microvariation in the Finnish and the Saami Subgroups

We further studied the genetic structure of the Finns and the Saami, who have proved to be genetically distinct based on mitochondrial DNA sequences (Sajantila et al. 1995) and autosomal markers (Sajantila and Pääbo 1995), but similar based on Y-chromosomal microsatellites (Lahermo et al. 1999). For this analysis, we genotyped five Y-chromosomal microsatellite markers, DYS389-I, DYS389-II, DYS391, DYS393, and DYS19, and assigned the microsatellite alleles to the SNP haplotypes in our population samples. We found that within the major haplotypes H2 and H16 in the Finns and Saami, the most frequent microsatellite allele was identical, with the exception of the DYS19 locus in haplotype H2 and the DYS389-I locus in haplotype H16 (Table 6). In addition, shared microsatellite haplotypes were observed within the haplotypes formed by the SNPs, but not between them. Therefore, the most common Y-chromosomal haplotypes are shared between the Finns and the Saami, indicating that there have been at least two founding Y-chromosomal lineages in these populations. This finding is in accordance with the archeological data that indicate a dual origin for the Finns, and also with the earlier Y-chromosomal data from the Finns and the Saami (Kittles et al. 1999; Lahermo et al. 1999; Semino et al. 2000). However, our data also indicate that within the Finns and the Saami there might be significant subpopulations with regional differences. This is not only seen in the variation of the haplogroup frequencies, but also in the genetic diversity values. For example, the nucleotide diversity in the eastern and northern Finns is strikingly low compared to that of other study populations and the western Finns in particular (Table 5).

Table 6.

Microsatellite Allele Frequencies within The Major Y-Chromosomal Haplotypes H2 and H16 in the Finns and Saami

METHODS

Population Samples

The Finnish samples originated from two sources. One part of the samples had been collected for a cross-sectional survey on coronary heart disease risk factors in Finland (Vartiainen et al. 1994). The second part of the samples was collected from healthy, unrelated male blood donors (Finnish Red Cross Blood Transfusion Service, Helsinki, Finland). The birthplace of each individual sample donor as well as that of his parents and grandparents were known. The Ob-Ugric and Kola Saami samples were collected by one of the authors (A. Sajantila) during anthropological field surveys to West Siberia in 1995 and the Kola Peninsula in 1996 (see Fig. 1). The collection of the Inari and Skolt Saami and Karelian samples has been described earlier (Lahermo et al. 1999); these were provided by M.-L. Savontaus (University of Turku, Finland). The samples from Norwegian males were provided by M. Stenersen (University of Oslo, Norway). DNA was extracted from the blood samples by standard procedures.

Preparation of Oligonucleotide Arrays

The primer arrays were prepared on microscope glass slides with Teflon lining forming 8 wells of 11 mm in diameter or 12 wells of 5 mm in diameter (Erie Scientific). The slides were prewashed with a 1% Alconox (Aldrich) solution, and rinsed several times with distilled water and ethanol. The minisequencing detection primers with 5′-amino groups (Table 1) were immobilized to isothiocyanate-activated glass surfaces essentially as described by Guo et al. (1994), except that 3-aminopropyltriethoxysilane (Sigma) was used for silanization instead of the methoxy-derivative. The detection primers were dissolved in 400 mM sodium carbonate buffer at pH 9.0 to a final concentration of 25 μM prior to spotting. Immediately after spotting, the slides were exposed to vaporized ammonia for 1 h, followed by three washes in distilled water. The arrays were stored at −70°C for up to 8 weeks.

A custom-built, modified industrial robot (Isel) with two TeleChem CPH-2 printing pins controlled by an MCM-310 operating system and NUMO-6.0 software (Merval) was used to print the oligonucleotide detection primers onto the coated slides. For radioactive detection, two adjacent spots 125–150 μm in diameter of each detection primer at a 400-μm distance from each other were printed to increase the signal intensities. The distance from the middle of the double spot to the next one was 1000 μm. The primer order was the same as for the fluorescent detection system, for which the detection primers were applied as spots 125–150 μm in diameter at a center-to-center distance of 250 μm (Fig. 2).

PCR Amplification of SNPs

The PCR primers were synthesized by Interactiva Biotechnologie GmbH (Ulm). Their sequences are listed in Table 1. DNA fragments spanning the SNP sites were amplified using a touchdown thermocycling profile in a Programmable Thermal Controller (MJ Tetrad Research). The parameters were: initial activation of the polymerase at 95°C for 11 min, then 95°C for 30 sec, and 65°C − °C per cycle for 4 min for 5 cycles; 95°C for 30 sec, 60°C − 0.5°C per cycle for 2 min, and 68°C for 2 min for 15 cycles; 95°C for 30 sec, 53°C for 30 sec, and 68°C for 2 min for 14 cycles; 68°C for 2 min. One multiplex PCR containing primers for the 18 markers DYS271 (M2), DYS199 (M3), SRY4064, SRY9138, SRY10831, SRY2627, PN3, M4, M6, M7, M8, M9, M11, M14, M17, M19, M20, and Tat, and a second multiplex PCR with the 6 markers PN1, PN2, M12, M13, M21, and M22 were performed using 10–50 ng of genomic DNA, 3.5 U of AmpliTaq Gold DNA polymerase (Perkin-Elmer), and 200 μM dNTPs in 100 μL of DNA polymerase buffer supplied with the enzyme (N808-0244). The primer concentrations ranged from 0.1 to 1.2 μM, and they had initially been adjusted to yield similar amounts of each DNA fragment and to minimize primer–dimer formation. For occasional reamplification of individual fragments 1 μL of a 1/100 dilution of the multiplex PCR product was used under the conditions given above. The marker 92R7 was amplified individually under the same conditions as the multiplex PCRs.
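The stepwise lowering of the annealing temperature in a touchdown stage can be illustrated with a short Python sketch. It is not part of the published protocol: the function name `touchdown_annealing_temps` is hypothetical, and it assumes that the first cycle of a stage runs at the stated starting temperature and that each later cycle is lowered by the stated step. Only the stages whose values are fully legible above (60°C − 0.5°C per cycle for 15 cycles, then a constant 53°C for 14 cycles) are used.

```python
def touchdown_annealing_temps(start_c, step_c, cycles):
    """Return the annealing temperature (deg C) for each cycle of one stage.

    Assumption: cycle 1 runs at start_c and every following cycle is
    step_c lower than the one before it.
    """
    return [start_c - step_c * i for i in range(cycles)]


# Stage with 60 deg C minus 0.5 deg C per cycle, 15 cycles.
stage2 = touchdown_annealing_temps(60.0, 0.5, 15)
# Stage with a constant 53 deg C for 14 cycles (step of 0 keeps it fixed).
stage3 = touchdown_annealing_temps(53.0, 0.0, 14)

print(stage2[0], stage2[-1])       # first and last annealing temperature
print(len(stage2) + len(stage3))   # cycles in these two stages
```

Under this assumption the last cycle of the 15-cycle stage anneals at 53°C, the same temperature that is then held constant in the final 14 cycles.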

SNP Genotyping on Oligonucleotide Arrays

The combined PCR products were precipitated by ethanol, followed by suspension into 80 μL of 50 mM Tris-HCl at pH 8.5 and 5 mM MgCl2 buffer containing 0.002 U/μL DNase (Ampliscribe T7 Transcription Kit, Epicentre Technologies) and 0.01 U/μL shrimp alkaline phosphatase (Boehringer Mannheim, Germany). The mixture was incubated at 37°C for 10 min, followed by inactivation of the enzymes at 95°C for 20 min. Then 20 μL of a buffer containing 500 mM Tris-HCl at pH 8.0, 250 mM EDTA, 1 M NaCl, and 1% Triton X-100 was added, and the mixture was preheated together with the arrays at 95°C for 1.5 min. Next 10 μL of this mixture was applied to the arrays to four reaction wells 5 mm in diameter. In case of the larger 11-mm wells, 20 μL of the mixture was used. The annealing reaction was allowed to proceed in a humid chamber at 37°C for 15 min. The arrays were briefly washed with a solution of 50 mM Tris-HCl at pH 8.0, 25 mM EDTA, 100 mM NaCl, and 0.1% Triton X-100.

Each minisequencing reaction mixture contained one of the four 33P-labeled ddNTPs (AP Biotech) at a 0.01 μM concentration or TAMRA-labeled ddATP, ddGTP, or ddTTP at a 0.05 μM concentration or ddCTP at a 0.2 μM concentration (NEN Life Science Products), together with the other three unlabeled ddNTPs at a concentration of 0.1 μM in 26 mM Tris-HCl at pH 9.5, 6.5 mM MgCl2, and 0.2% Triton X-100 buffer with 0.05 U/μL DynaSeq DNA polymerase (a kind gift from Finnzymes, Helsinki, Finland) or ThermoSequenase (AP Biotech). The reaction mixture and the arrays carrying the annealed templates were preheated to 68°C for 2 min. The reaction volume for the 5-mm wells was 10 μL, and for the 11-mm wells it was 20 μL. The reaction was allowed to proceed at 68°C for 5 min in a humid chamber, after which the slides were washed three times for 15 min at room temperature in a solution of 90 mM Na-citrate, 900 mM NaCl, and 0.05% N-lauroyl-sarcosine. The slides were further washed twice with dH2O for 15 min, once with 50 mM NaOH for 5 min, and finally with dH2O for 15 min.

Radioactive signals were detected after an overnight exposure to imaging plates (Fuji, Kanagawa, Japan) using a phosphorimager instrument (Fuji BAS 1500 Bioimaging Analyzer). The signal intensities were measured with the Tina 2.10 software (Raytest). Fluorescence signals were detected using an array scanner (ScanArray 4000, GSI Lumonics), and the signal intensities were measured with the QuantArray analysis software (GSI Lumonics). The signal ratio (R value) was calculated as the signal from the reaction for one allele divided by the total signal from the reactions for both alleles.
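The ratio calculation and the resulting haploid genotype call can be sketched in a few lines of Python. This is an illustration only, not the software used in the study: the function names, the cut-off values `low=0.2` and `high=0.8`, and the example signal intensities are assumptions chosen for the sketch, and were picked so that ratios near 0 or 1, as reported for the monomorphic markers, give a clear call.

```python
def signal_ratio(signal_allele1, signal_allele2):
    """Signal for allele 1 divided by the total signal for both alleles.

    Returns None when there is no signal at all, for example when the
    fragment failed to amplify.
    """
    total = signal_allele1 + signal_allele2
    if total <= 0:
        return None
    return signal_allele1 / total


def call_haploid_genotype(r, low=0.2, high=0.8):
    """Call a haploid genotype from a ratio; low/high are assumed cut-offs."""
    if r is None:
        return "no call"
    if r <= low:
        return "allele2"
    if r >= high:
        return "allele1"
    return "ambiguous"


# Hypothetical intensities, in arbitrary scanner units.
print(call_haploid_genotype(signal_ratio(30, 970)))   # ratio 0.03
print(call_haploid_genotype(signal_ratio(990, 10)))   # ratio 0.99
print(call_haploid_genotype(signal_ratio(0, 0)))      # no signal
```

Because the Y chromosome is haploid, only the two extreme classes are expected; an intermediate ratio is flagged here instead of being forced into a call.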

Minisequencing in a Microtiter Format

The SNP marker 92R7 was genotyped, and the genotypes of the polymorphic SNP markers M17, SRY10831, M12, Tat, and M9 were confirmed by solid-phase minisequencing in a microtiter plate (Syvänen 1997). For amplification of the individual fragments, the PCR primers were those given in Table 1, except that the 3′ PCR primers for markers 92R7 and SRY10831 were biotinylated in their 5′ end. The 92R7 and SRY10831 sites were amplified individually from 25 ng of genomic DNA using 1.75 U of AmpliTaq Gold DNA polymerase (Perkin-Elmer) with a 0.2 μM PCR primer concentration in a 50-μL volume, except that for the SRY10831 marker the biotinylated 3′ PCR primer was used at a 0.04 μM concentration. The PCR parameters were the same as for the multiplex PCR given above. The markers Tat, M9, M17, and M12 were amplified individually with both primers at a 0.2 μM concentration under the conditions given above. Of this first amplification product, 1 μL of a 1/100 dilution was subjected to a second amplification with a biotinylated universal primer 5′-GCGGTCCCAAAAGGGTCAGT to introduce a biotin residue to the PCR products for affinity capture for the minisequencing reaction. The concentration of the biotinylated universal primer was 0.04–0.08 μM, depending on the marker, and it was used together with one of the unbiotinylated primers used in the first amplification at a 0.2 μM concentration under the PCR conditions given above.

The minisequencing primers for the markers M12 and M17 were those given in Table 1. For 92R7, the minisequencing primer was 5′-ATGAACACAAAAGACGTAGAAG; for SRY10831 it was 5′-GTATCTGACTTTTTCACACAGT; for M9 it was 5′-GTCTAAATTAAAAGAAAAATAAAGAG; and for Tat it was T(15) TGAGTGTAGACTTGTGAATTCA from the complementary DNA strand.

Genotyping of Microsatellite Markers

The PCR primer sequences for the microsatellite markers DYS19, DYS389-I, DYS389-II, DYS391, and DYS393 were obtained from the Genome Database (http://www.gdb.org/). The forward primers were fluorescently labeled (DYS 19-TET, DYS 389-FAM, DYS 391-FAM, and DYS 393-TET). The microsatellite markers were amplified using 20 ng of genomic DNA with 1.2 U AmpliTaq Gold DNA polymerase and 200 μM of dNTPs in 100 μL of DNA polymerase buffer supplied with the enzyme (N808-0244). Two separate multiplex PCRs were performed. One of the reactions contained primers for the markers DYS19, DYS389-I, and DYS389-II at 0.3 μM and 0.2 μM concentrations (DYS389-I and DYS389-II are in one fragment). The cycling parameters were 95°C for 10 min; 30 cycles of 94°C for 45 sec, 55°C for 1 min, and 72°C for 1 min; and a final extension at 72°C for 5 min.

The other multiplex PCR mixture contained primers for the markers DYS391 and DYS393 at 0.3 μM and 0.1 μM concentrations. The cycling parameters were 95°C for 10 min; 30 cycles of 94°C for 1 min and 51°C for 1 min; and a final extension at 72°C for 5 min.

The PCR products were combined and run in an ABI 377 DNA Sequencer according to the manuals supplied with the sequencer (PE Biosystems). The allelic fragments were detected and sized by the GeneScan 3.1 and ABI-Genotyper 2.0 programs. The fragment sizes were converted into number of repeats using five previously sequenced samples (kindly provided by Manfred Kayser, Leipzig, Germany).
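One way such a size-to-repeat conversion can be done is sketched below in Python. This is a hedged illustration, not the procedure documented for this study: the function `repeats_from_size` is hypothetical, a repeat unit of 4 bp is assumed for the loci, and the reference fragment size and repeat count in the example are invented placeholders standing in for a sequenced control sample.

```python
def repeats_from_size(size_bp, ref_size_bp, ref_repeats, unit_bp=4):
    """Convert a measured fragment size into a repeat count.

    ref_size_bp and ref_repeats describe a sequenced control allele run
    on the same system; unit_bp is the assumed repeat unit length.
    The size difference to the control is rounded to whole repeat units.
    """
    offset_units = (size_bp - ref_size_bp) / unit_bp
    return ref_repeats + round(offset_units)


# Placeholder control: a 190.0-bp fragment known to carry 14 repeats.
print(repeats_from_size(198.2, 190.0, 14))   # two units longer
print(repeats_from_size(186.1, 190.0, 14))   # one unit shorter
```

Rounding to the nearest whole unit absorbs the small sizing deviations that are typical of gel-based fragment analysis, which is why a sequenced control run under the same conditions is needed as the anchor.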

Statistical Analysis

The population genetic analyses of the Y-chromosomal SNP, microsatellite, and haplotype data were performed using the ARLEQUIN (ver 2.0) software package (Schneider et al. 1997). The hierarchic distribution of Y-chromosome diversity was computed using the analysis of molecular variance (AMOVA) software.

Acknowledgments

We thank Auli Bengs, Minttu Hedman, Kirsti Höök, Liisa Kauppi, Minna Levander, and Päivi Tainola for assistance with laboratory work, and Paavo Niini for technical expertise with the arrayer. We are grateful to Michael Hammer, Chris Tyler-Smith, and Peter Underhill for unpublished sequences, and to Manfred Kayser for the control microsatellite sequences. We thank Finnzymes for providing the DynaSeq DNA polymerase free of charge. This work was funded by EC Biomed2, Contract no. BMH4-CT97-2013, the Instrumentarium Foundation, and the Academy of Finland (project no. 42183).

The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 USC section 1734 solely to indicate this fact.

Footnotes

  • 5 These authors contributed equally to this work.

  • 6 Corresponding author.

  • E-MAIL Ann-Christine.Syvanen{at}medsci.uu.se; FAX 46-18-6112519.

  • Article and publication are at www.genome.org/cgi/doi/10.1101/gr.156301 .

    • Received July 18, 2000.
    • Accepted December 29, 2000.

REFERENCES
