Integrated YAC Contig Map of the Prader–Willi/Angelman Region on Chromosome 15q11–q13 with Average STS Spacing of 35 kb

  1. Susan L. Christian1,
  2. Nehal K. Bhatt1,
  3. Scott A. Martin1,
  4. James S. Sutcliffe2,
  5. Takeo Kubota3,
  6. Bing Huang4,
  7. Apiwat Mutirangura5,
  8. A. Craig Chinault6,
  9. Arthur L. Beaudet6,7, and
  10. David H. Ledbetter1,8
  1. 1Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637 USA; 2Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, Tennessee 37232 USA; 3Department of Hygiene and Medical Genetics, Shinshu University, Nagano, Japan; 4Genzyme Genetics, Long Beach, California 90806 USA; 5Genetics Unit, Department of Anatomy, Chulalongkorn University, Bangkok, Thailand; 6Department of Molecular and Human Genetics and 7Howard Hughes Medical Institute, Baylor College of Medicine, Houston, Texas 77030 USA

Abstract

Prader–Willi syndrome (PWS) and Angelman syndrome (AS) are associated with parent-of-origin-specific abnormalities of chromosome 15q11–q13, most frequently a deletion of an ∼4-Mb region. Because of genomic imprinting, paternal deficiency of this region leads to PWS and maternal deficiency to AS. Additionally, this region is frequently involved in other chromosomal rearrangements including duplications, triplications, or supernumerary marker formation. A detailed physical map of this region is important for elucidating the genes and mechanisms involved in genomic imprinting, as well as for understanding the mechanism of recurrent chromosomal rearrangements. An initial YAC contig extended from D15S18 to D15S12 and comprised 23 YACs and 21 STSs, providing an average resolution of about one STS per 200 kb. To close two gaps in this contig, YAC screening was performed using two STSs that flank the gap between D15S18 and 254B5R and three STSs located distal to the GABRA5–149A9L gap. Additionally, we developed 11 new STSs, including seven polymorphic markers. Although several groups have developed whole-genome genetic and radiation hybrid maps, the depth of coverage for 15q11–q13 has been somewhat limited and discrepancies in marker order exist between the maps. To resolve the inconsistencies and to provide a more detailed map order of STSs in this region, we have constructed an integrated YAC STS-based physical map of chromosome 15q11–q13 containing 118 YACs and 118 STSs, including 38 STRs and 49 genes/ESTs. Using an estimate of 4 Mb for the size of this region, the map provides an average STS spacing of 35 kb. This map provides a valuable resource for identification of disease genes localized to this region as well as a framework for complete DNA sequencing.

Chromosome 15q11–q13 serves as an important paradigm for genomic imprinting in human disease as two distinct mental retardation disorders, Prader–Willi syndrome (PWS) and Angelman syndrome (AS), are associated with parent-of-origin-specific deficiencies of this region. PWS is caused by a paternal deficiency due to paternal deletions (∼70%), maternal uniparental disomy (∼25%), or imprinting center defects (<5%) (Reis et al. 1994; Buiting et al. 1995). AS is caused by a maternal deficiency due to maternal deletions (∼70%), paternal uniparental disomy (∼5%), imprinting defects (∼1%–2%), maternal loss-of-function mutations in the imprinted gene UBE3A (∼2%–4%) (Albrecht et al. 1997; Kishino et al. 1997; Matsuura et al. 1997; Rougeulle et al. 1997; Vu et al. 1997), or unknown defects (10%–20%). The identification of multiple genes or transcripts with paternal-specific expression patterns within this region (SNRPN, IPW, PAR-1, PAR-5, PAR-SN, and ZNF127) (Driscoll et al. 1992; Glenn et al. 1993; Sutcliffe et al. 1994; Wevrick et al. 1994; Ning et al. 1996) provides a number of candidate genes for PWS, although it is difficult at present to assess the role of individual genes in its pathophysiology. Additional studies are needed to identify all imprinted genes present on chromosome 15 and to explore their potential functional relationship to PWS. Only five additional genes (P, GABRB3, GABRA5, GABRG3, and NDN) have currently been identified for this ∼4-Mb region (Gardner et al. 1992; Rinchik et al. 1993; Greger et al. 1995; Glatt et al. 1997; MacDonald and Wevrick 1997; Sutcliffe et al. 1997), whereas a typical region of this size might be expected to have 100 or more genes.

Chromosome 15q11–q13 is also associated with an unusually high frequency and diversity of recurring rearrangements. The ∼4-Mb common deletion observed in PWS and AS occurs at a frequency of 1/10,000–1/20,000 live births. Additionally, >50% of supernumerary marker formation (∼1/2500 live births) involves rearrangement of the regions at the proximal and distal ends of 15q11–q13 (Huang et al. 1997). Other rearrangements observed less often include duplications, triplications, balanced reciprocal translocations, and jumping translocations. Evidence suggests the presence of at least four hot spots of chromosomal breakage in this region, and detailed physical maps are required to elucidate the mechanism(s) of rearrangement involved (Christian et al. 1995; Robinson et al. 1997).

One of the goals of the Human Genome Project is the construction of a physical map with sufficient ordered landmarks to provide an average resolution of 100 kb. Most landmarks are sequence-tagged sites (STSs) that can be assayed using PCR (Olson et al. 1989; Green and Olson 1990). An initial STS-based YAC contig was developed in 1993 with a density of about one STS per 200 kb, but this physical map contained two gaps (Mutirangura et al. 1993b). Several genome-wide mapping efforts have provided additional mapping information for 15q11–q13 (Adamson et al. 1995; Chumakov et al. 1995; Hudson et al. 1995; Sheffield et al. 1995; Dib et al. 1996; Schuler et al. 1996; Stewart et al. 1997). However, the simultaneous development of these maps with few framework markers in common has made integration of the data difficult. Additionally, analysis of the data from these multiple sources for the chromosome 15q11–q13 region has yielded inconsistencies in the ordering of markers, making detailed analysis of this region extremely difficult.

To resolve these inconsistencies and increase the STS density for the chromosome 15q11–q13 region, we have analyzed the available YACs and STSs to create a fully integrated YAC-based STS content map. We now report an integrated map containing 118 YACs and 118 STSs, including 38 short tandem repeats (STRs) and 49 genes/ESTs, which resolves the discrepancies present within the other maps. This map has an average STS spacing of 35 kb, thus surpassing the goal of one STS per 100 kb. Additionally, this map will be a valuable resource for elucidating the mechanism of rearrangement, for positional cloning of disease genes, and as a framework for the systematic sequencing of the region.

RESULTS

One of the greatest advantages of the whole genome mapping effort is the availability of a large number of new reagents (i.e., STSs and YACs) for fine mapping of specific regions of the human genome. To create the integrated YAC contig of 15q11–q13, we pooled STSs developed by our laboratory and others (see Table 1 for references) and YACs developed through whole-genome mapping efforts (Chumakov et al. 1995; Hudson et al. 1995).

Table 1.

Chromosome 15q11–q13 STSs

YAC Screening to Fill Gaps

The initial YAC contig of the PWS/AS region provided a one- to fourfold coverage of the ∼4-Mb 15q11–q13 region but contained two gaps (Mutirangura et al. 1993b). Prior to the development of the whole genome level YAC contigs, the CEPH Mark I and II YAC libraries were screened using markers D15S541 and D15S543, known to flank this region, to fill the gap between D15S18 and 254B5R (Fig. 1) (Christian et al. 1995). Screening the CEPH libraries with D15S541 identified YAC y705C2. Marker D15S543 amplifies two fragments of different sizes, one from chromosome 15 and the other from chromosome 16 (Christian et al. 1995). Screening the CEPH Mark I library with D15S543 identified YACs y55F10, y166G7, y254B5, y338B6, y368H3, and y409C4 from chromosome 15 and YACs y42D4, y64C11, y64E4, y391E3, and y474H1 from chromosome 16.

Figure 1.

Refined YAC STS content map of chromosome 15q11–q13. STSs are listed vertically with STRs italicized and underlined. The region is oriented toward the centromere on the upper left and toward the telomere on the lower right. An asterisk (*) indicates STSs that also represent either known genes or ESTs. Below the STSs are horizontal lines representing YACs with the YAC name indicated. (•) A positive PCR reaction between the STS and YAC indicated. An X is used to represent a negative PCR reaction. (▪) STSs developed from YAC ends. Above the STSs is a horizontal line indicating the genetic distances between STRs present on the Genethon genetic map. The right end of y254B5, used as a hybridization probe in Mutirangura et al. (1993b), is indicated with a circled R. The brackets indicate STSs that cannot be uniquely ordered.

To fill the gap between GABRA5 and 149A9L, an STS developed from the left end of YAC 149A9 was used to screen the CEPH Mark I YAC library. The YACs identified were y97F4, y128G7, y149A9, y162C1, y369B10, y406B2, y406C2, y469E3, and y483H8. Two additional STRs, D15S156 and D15S219, were identified that mapped between GABRB3 and D15S165 (Beckmann et al. 1993). Screening with D15S156 identified YACs y97F4, y128G7, y180F2, y369B10, y407B1, y483H8, and y493C4, and screening with D15S219 identified YACs y97F4, y128G7, y162C1, y483H8, and y493C4. The overlapping YACs identified for these three STSs placed D15S156 and D15S219 within the GABRA5–149A9L gap but did not close it. The gap was closed when YACs y897B10 and y781B9, localized to this region by radiation hybrid mapping (Hudson et al. 1995), were analyzed using STSs GABRA5 and WI-10300.

Development of STSs

Although most of the STSs used in construction of this map have been published previously (Table 1), 11 new STSs, including 7 polymorphic STRs, are described here for the first time. Three STRs (S1524, S1525, and S1526) were identified from YAC 307A12 in a region previously limited in STR coverage. D15S1526, located distal to D15S13, contains a tetranucleotide repeat of [CTAT]12. D15S1524 and S1525, located proximal to D15S13, are overlapping repeats where S1524 contains a [TG]16 repeat, whereas S1525 contains a [TC]17[CA]15–{111 bp}–[TG]16 repeat. Two additional STRs have been developed from cosmids isolated from YAC A229A2. D15S1364 contains a [CA]16 repeat developed from a cosmid that also contains D15S113, whereas D15S1365, located distal to D15S113, contains a tetranucleotide repeat of [GAAA]15GAGAAAA[GAAA]17. Two additional polymorphic regions were found within the sequence of the PAR-5 and PAR-7 transcripts (Sutcliffe et al. 1994).

Four nonpolymorphic STSs have also been developed. An STS for the left end of YAC 93C9 (D15S1531) was developed following publication of the earlier contig. Three additional STSs (D15S15, D15S16, and D15S17) were developed by end-sequencing clones originally isolated from an inv dup(15)(q13)-enriched phage library (Donlon et al. 1986; Tantravahi et al. 1989).

YAC–STS Map Construction

To construct this map, the initial YAC contig was used as the starting framework. The order of YACs had been established previously using interphase fluorescence in situ hybridization (FISH) (Kuwano et al. 1992) and confirmed in other fine-mapping experiments (Nakao et al. 1994; Sutcliffe et al. 1994, 1997; Christian et al. 1995; Huang et al. 1997). STSs present within the mega-YAC contigs (Chumakov et al. 1995; Hudson et al. 1995) were mapped by PCR analysis against all YACs in the framework map to integrate the three maps. Once a preliminary integrated map was established, 12 subregions were created to establish the precise order of markers. Each STS, including those present on the original contig, was analyzed against all YACs that mapped to that subregion. Table 1 provides a list of all STSs utilized in this map. Based on the positive and negative PCR results for each STS, the final map was constructed (Fig. 1). Several YACs identified in the CEPH library screenings, described above, either gave negative PCR results with all STSs analyzed (y55F10 and y493C4), or the clones did not grow (y406B2, y406C2, and y469E3) and do not appear on the contig.

Although the current map represents the data as accurately as possible, certain caveats in interpretation of map order need to be considered. The primary concern is the presence of false-positive and false-negative PCR results as a possible source of errors in building YAC contigs (Bouffard et al. 1997). To minimize this problem, any questionable PCR results, either positive or negative, were repeated to provide the most accurate data possible. However, interstitial deletions within individual YACs exist that could influence the ordering of STSs. Therefore, four YACs (y755A2, y785E7, y843F4, and y965G9), each containing two to five interstitial deletions that could not be mapped clearly, were excluded from the contig. The density of other YACs within this region, which showed consistent results, increased our confidence that these four YACs were of poor quality. Additionally, the map construction program SEGMAP segregates YACs with inconsistent data (Bouffard et al. 1997).
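The idea of flagging YACs with apparent interstitial deletions can be illustrated with a minimal sketch. It assumes a fixed STS order and a table of positive PCR results, and reports every STS that falls between a YAC's first and last positive marker but scored negative. The YAC and STS names are hypothetical placeholders rather than data from Figure 1, and the check is a simple illustration, not the SEGMAP algorithm or the procedure used to build the map.

```python
# Toy STS-content data: hypothetical names, not taken from Figure 1.
sts_order = ["STS_A", "STS_B", "STS_C", "STS_D", "STS_E"]
hits = {
    "yac1": {"STS_A", "STS_B", "STS_C"},
    "yac2": {"STS_B", "STS_C", "STS_D"},
    "yac3": {"STS_C", "STS_E"},  # negative for STS_D, which lies between its hits
}


def internal_gaps(order, positive):
    """Return STSs between a YAC's first and last positive STS that scored negative."""
    idx = [i for i, sts in enumerate(order) if sts in positive]
    if not idx:
        return []
    return [order[i] for i in range(idx[0], idx[-1] + 1) if order[i] not in positive]


def flag_inconsistent(order, hits):
    """Map each YAC with at least one internal gap to its list of missing STSs."""
    flagged = {}
    for yac, positive in hits.items():
        gaps = internal_gaps(order, positive)
        if gaps:
            flagged[yac] = gaps
    return flagged


print(flag_inconsistent(sts_order, hits))
```

A YAC flagged in this way may carry a genuine interstitial deletion, or the result may reflect a false-negative PCR or an incorrect STS order, so repeating the PCR remains the first step before a clone is excluded.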

Features of the Map

STRs

The current map contains 38 polymorphic markers extending across ∼4 Mb. Of these, eight represent markers present on the Genethon genetic map (Gyapay et al. 1994; Dib et al. 1996). The genetic distance between these markers is indicated above the YAC contig to integrate the genetic and physical maps (Fig. 1). The localization of 30 additional STRs in this region between the Genethon markers provides the resources for improved linkage analysis and disease gene identification.

ESTs

Of the 49 genes/ESTs present on the current map, 14 represent known genes, 4 represent ESTs for known genes, 1 represents an EST for a new gene, and the other 30 are ESTs that have not yet been characterized. The four ESTs identifying previously characterized genes are SGC35648 (ZNF127), SGC31492 (SNRPN), D15S903 (IPW), and WI-15852 (P). One EST, SGC30582, was identified in a BLAST search as the human homolog of the mouse necdin-encoding gene (Ndn), which has been identified recently as an imprinted gene in both human and mouse (MacDonald and Wevrick 1997; Sutcliffe et al. 1997). The 30 uncharacterized ESTs will provide candidate genes for studies showing linkage to a particular subregion within this area.

Nonrandom Distribution of ESTs

The region from SGC30582 to SNRPN, representing ∼1 Mb in physical distance, contains no genes or ESTs. In contrast, the region between SNRPN and UBE3A, representing ∼500 kb in physical distance, contains 22 genes/ESTs, of which 8 represent known genes. The other 14 ESTs represent two of these known genes and 12 new uncharacterized ESTs. The domain from SNRPN to UBE3A is known to be imprinted (Nakao et al. 1994; Sutcliffe et al. 1994; Wevrick et al. 1994; Ning et al. 1996; Albrecht et al. 1997), so it will be of interest to examine the imprinting status of these ESTs and determine the relationship, if any, of these genes to the phenotype of PWS.

Depth of Coverage

The initial YAC contig represented a minimal tiling path through the 15q11–q13 region with a one- to fourfold coverage (Mutirangura et al. 1993b). This updated YAC contig has increased the depth of coverage to 3- to 15-fold for most of the q11–q13 region. However, three regions with only one- to twofold coverage are still present on the current map. One lies at the most proximal end of the map, one lies between D15S1259 and 368H3L, and the third lies between D15S822 and D15S156. The most proximal region, including markers D15S1239, A002B45, and D15S912, contains the single YAC clone y931C4. Screening of the CEPH library with these most proximal markers failed to identify any additional YACs. It is possible that this region, which resides near the centromere of chromosome 15, may be under-represented in the YAC libraries. Screening of PAC, BAC, and cosmid libraries may be necessary to complete the physical map of this region. Weak coverage of the centromeric regions has also been observed for chromosomes 7 (Bouffard et al. 1997) and X (Nagaraja et al. 1997). Only the centromeric region of chromosome 10 appears to have been mapped successfully (Jackson et al. 1996).
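Depth of coverage at each STS can be approximated by counting the YACs that give a positive PCR result for that marker. The sketch below uses hypothetical YAC and STS names and an assumed threshold of twofold coverage to list weakly covered markers. It counts positive hits only and does not infer coverage across STSs that a YAC spans but failed to amplify.

```python
from collections import Counter

# Toy STS-content data: hypothetical names, not taken from Figure 1.
sts_order = ["STS_A", "STS_B", "STS_C", "STS_D", "STS_E"]
hits = {
    "yac1": {"STS_A", "STS_B"},
    "yac2": {"STS_B", "STS_C", "STS_D"},
    "yac3": {"STS_C", "STS_D"},
    "yac4": {"STS_C", "STS_D", "STS_E"},
    "yac5": {"STS_D", "STS_E"},
}


def depth_per_sts(order, hits):
    """Count the YACs that are PCR-positive for each STS, in map order."""
    counts = Counter(sts for positive in hits.values() for sts in positive)
    return {sts: counts.get(sts, 0) for sts in order}


def low_coverage(order, hits, max_depth=2):
    """Return the STSs covered by at most max_depth YACs (assumed threshold)."""
    depth = depth_per_sts(order, hits)
    return [sts for sts in order if depth[sts] <= max_depth]


print(depth_per_sts(sts_order, hits))
print(low_coverage(sts_order, hits))
```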

Two additional regions with poor coverage lie in the same locations as the gaps present in the initial map (Mutirangura et al. 1993b). The region encompassing STSs MN7, A006B10, and A008B26 is associated with one of the four hot spots for chromosome breakage, involved in approximately half of PWS/AS deletion patients and half of the small marker 15 chromosomes (Christian et al. 1995; Huang et al. 1997). This region also demonstrates high genetic recombination, as ovarian teratoma mapping showed a 2.7-cM distance between S541/S542 and S543 (Christian et al. 1995). The low depth of coverage between S822 and S156 is also a region with high recombination frequencies (Robinson and Lalande 1995). The problem of under-representation of genomic regions in the YAC libraries has also been observed for other chromosomes (Chumakov et al. 1992; Foote et al. 1992; Qin et al. 1996; Bouffard et al. 1997). The potential relationship between high recombination rates and under-representation in YAC libraries is unclear at present.

STS Density Determination

The region of common deletion observed in PWS and AS has been estimated to be ∼4 Mb by cytogenetic and FISH analysis (Kuwano et al. 1992). On the YAC contig this region extends from NIB1540 to P for class I patients or S543 to P for class II patients (Kuwano et al. 1992; Christian et al. 1995; S. Christian, unpubl.). Therefore, we estimate the physical distance of the region covered by this contig as ∼4 Mb. Using this distance, the current contig with 118 STSs would therefore provide an average spacing of 35 kb. However, some of these STSs cannot be uniquely ordered, as indicated by the brackets in Figure 1. Using the 83 uniquely ordered groups of STSs provides a larger average spacing of 48 kb between STSs; however, this value is still well within the goal of one STS per 100 kb.
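The spacing estimates follow from dividing the assumed region size by the number of landmarks. The short sketch below reproduces that arithmetic under the simplifying assumptions that the region is exactly 4 Mb and that landmarks are evenly distributed. Dividing 4000 kb by 118 gives roughly 34 kb, in line with the rounded value of 35 kb quoted above, and dividing by 83 gives roughly 48 kb. The function name is illustrative.

```python
def average_spacing_kb(region_mb, n_landmarks):
    """Average landmark spacing in kb, assuming evenly spread landmarks."""
    return region_mb * 1000 / n_landmarks


GOAL_KB = 100  # Human Genome Project target of one STS per 100 kb

all_sts = average_spacing_kb(4, 118)        # every STS on the map
ordered_groups = average_spacing_kb(4, 83)  # uniquely ordered STS groups only

print(f"all STSs: {all_sts:.1f} kb")
print(f"uniquely ordered groups: {ordered_groups:.1f} kb")
print("both within goal:", all_sts < GOAL_KB and ordered_groups < GOAL_KB)
```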

DISCUSSION

The primary impetus for construction of this YAC contig of chromosome 15q11–q13 was the difficulty in integrating the data from multiple sources into a single, cohesive physical map. By analyzing all YACs and STSs from these multiple sources, the current map integrates the genetic and physical maps, resolves the inconsistencies in the order of markers, and places new STRs and ESTs for this region onto a single map.

Integration of Maps and Resolution of Discrepancies

Cytogenetic/FISH Map

The common deletion region observed in PWS extends from 15q11 to q13 and contains the G-positive q12 band and the G-negative q13 band (Ledbetter et al. 1981). This region corresponds to the D15S543–P region on the physical map, which has been confirmed by both interphase FISH (Kuwano et al. 1992) and microsatellite analysis (Mutirangura et al. 1993a; Christian et al. 1995). The association of GC-rich regions containing CpG islands with G-negative chromosome bands has been well established (Saccone et al. 1993, 1996). Although it is difficult to definitively integrate the cytogenetic and physical maps, it would be interesting to speculate that the G-positive q12 region correlates with the gene-poor region extending from SGC30582 to SNRPN.

Genetic Map

The comprehensive genetic map of the human genome contains >5000 microsatellites, of which 8 are present in chromosome 15q11–q13 (Dib et al. 1996). For almost all markers, there was complete concordance in the order of Genethon markers with the order on the physical map. However, one discrepancy has been resolved by the physical mapping data. This contig places S122 proximal to S210, whereas the genetic map has the order reversed (Dib et al. 1996). Additional fine mapping using PACs and cosmids confirms that S122 is proximal to S210. D15S122 was found to lie within the UBE3A genomic region, whereas S210 is located within PAC 14I12 but distal to cosmid 24 (Sutcliffe et al. 1997; T. Kubota, unpubl.).

Additionally, Pàldi et al. (1995) analyzed six Genethon markers relative to the CEPH YACs in the initial contig (Mutirangura et al. 1993b). Although we confirmed the locations published for S128, S122, S210, and S986, the locations for S1035 and S975 differed. Analysis of S1035 placed it in the proximal YACs (yA124A3, yB148C8, and y931C4) rather than y254B5 and y264A1. A comparison of the sequences of S1035 and S542 indicates that these STRs overlap each other, confirming the more proximal location for S1035. D15S975 was found to map only to YACs y897B10 and y781B9, placing it within the GABRA5–149A9L gap of the earlier map.

Genetic vs. Physical Distances

The overall genetic distance of 14.4 cM for the region from S1035 to S156 is significantly larger than expected for a physical distance of ∼4 Mb (Dib et al. 1996). Two subregions, which account for a significantly higher genetic distance compared to the physical distance, include S542–S543 (2.7 cM) and S975–S156 (2.3 cM).
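The comparison can be made concrete by converting the quoted distances to a recombination rate. The sketch below uses only the figures given above (14.4 cM over ∼4 Mb, with 2.7 cM and 2.3 cM in the two subregions). The genome-wide average of roughly 1 cM per Mb is a commonly quoted rule of thumb and is an assumption here, not a value taken from this study.

```python
GENOME_AVERAGE_CM_PER_MB = 1.0  # assumed rule-of-thumb value, not from this study

total_cm = 14.4    # S1035 to S156, Genethon genetic map
physical_mb = 4.0  # estimated physical size of the region
subregions_cm = {"S542-S543": 2.7, "S975-S156": 2.3}

rate = total_cm / physical_mb               # cM per Mb across the region
fold = rate / GENOME_AVERAGE_CM_PER_MB      # relative to the assumed average
sub_fraction = sum(subregions_cm.values()) / total_cm

print(f"regional rate: {rate:.1f} cM/Mb ({fold:.1f}x the assumed average)")
print(f"share of genetic distance in the two subregions: {sub_fraction:.0%}")
```

Under these assumptions the region recombines at about 3.6 cM/Mb, and the two subregions together account for roughly one third of the total genetic distance.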

Radiation Hybrid Maps

The Whitehead/MIT radiation hybrid map has proven to be a valuable resource that has allowed integration of STSs from multiple genome-wide mapping efforts (Hudson et al. 1995). The general localization of STSs within the 15q11–q13 region on the radiation hybrid map has been accurate; however, additional fine mapping is required to order these markers precisely. Comparison of the Genethon markers between the genetic (Dib et al. 1996) and radiation hybrid maps (Hudson et al. 1995) indicates a discrepancy in the ordering of the markers S128-S122-S156. The radiation hybrid map places the markers in the order S156-S128-S122, whereas the genetic map places the markers in the order S128-S122-S156. Data presented here indicate the marker order as S128-S122-S156, consistent with the genetic map.
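One simple way to describe such a discrepancy is to list the marker pairs whose relative order differs between two maps. The sketch below applies this to the three markers discussed above, using the orders stated in the text. The function name is illustrative, and pairwise comparison is only one of several ways to score map agreement.

```python
from itertools import combinations


def discordant_pairs(reference, other):
    """Return marker pairs ordered one way in `reference` and the opposite way in `other`."""
    position = {marker: i for i, marker in enumerate(other)}
    shared = [marker for marker in reference if marker in position]
    return [(a, b) for a, b in combinations(shared, 2) if position[a] > position[b]]


genetic_map = ["S128", "S122", "S156"]       # Genethon order; also the order on this contig
radiation_hybrid = ["S156", "S128", "S122"]  # order on the radiation hybrid map

print(discordant_pairs(genetic_map, radiation_hybrid))
```

Both discordant pairs involve S156, which is consistent with a single misplaced marker on the radiation hybrid map rather than a rearrangement of the S128–S122 interval.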

The first 26 STSs on the most proximal contig for chromosome 15, WC 15.0 (Hudson et al. 1995), have been placed on this YAC contig. The Whitehead contig does not contain the most proximal region from D15S1239 to D15S128. Additionally, the WC 15.0 contig begins with marker GABRA5, located in the distal region of this map, and proceeds with markers S986-S122-S128. FISH experiments have confirmed the correct order of markers within this region as D15S18-S9-S11-S63-S13-S10-S113-GABRB3-S12 (Kuwano et al. 1992). Therefore, this map indicates an inversion of the ordering of the markers present in the most proximal portion of contig WC 15.0.

Disease Gene Identification

An important goal of the Human Genome Project is to map and characterize all human genes and determine how defects in these genes are associated with human disease. Three human diseases, PWS, AS, and oculocutaneous albinism type II (OCA2), have known associations with genes within chromosome 15q11–q13. PWS involves the deficiency of one or more paternally expressed genes in this region, AS is associated with deficiencies of UBE3A (Kishino et al. 1997; Matsuura et al. 1997), whereas OCA2 is associated with mutations of P (Rinchik et al. 1993). In addition, there are several human diseases that have shown linkage to this region, although the specific gene has not yet been identified. These include spastic paraplegia 6 (Fink et al. 1995), obsessive–compulsive disorder (Dykens et al. 1996), bipolar illness (Edenberg et al. 1997), and autism (Cook et al. 1997). The development of a single physical map for chromosome 15q11–q13 with ordered ESTs provides candidate genes for analysis and will allow earlier identification of disease genes localized to this region of the human genome.

METHODS

Chromosome 15 STSs

Five STRs (D15S1364, S1365, S1524, S1525, and S1526) were developed using methods described previously (Mutirangura et al. 1993a). An STS for the left end of YAC 93C9 was obtained using vectorette PCR (Riley et al. 1990; Green 1993). The clones D15S15, S16, and S17 (Tantravahi et al. 1989) were acquired from the ATCC and used to create STSs. All other STSs were identified from previously published papers. Table 1 shows the complete list of all STSs used to develop the map, the GDB amplimer file number to access the PCR primer sequences, GenBank accession numbers for sequence information, if available, and the reference for the STS.

YAC Clones

YAC screening of the CEPH library was performed at the Baylor College of Medicine Genome Center. Other sources of YACs mapping to chromosome 15q11–q13 were identified through YAC contigs developed by CEPH (Cohen et al. 1993; Chumakov et al. 1995) and the Whitehead Institute (Hudson et al. 1995). The YAC clones from the CEPH libraries were acquired from the Baylor College of Medicine Genome Center, the National Human Genome Research Institute (Bethesda, MD), and Research Genetics, Inc. (Huntsville, AL). YACs from the St. Louis libraries were acquired from the Baylor College of Medicine Genome Center.

PCR Analysis

PCR was performed in 10-μl reactions containing 1 μl of Perkin Elmer buffer I, 200 μM dNTP mix, 0.5 μM primers, 0.5 unit of AmpliTaq Gold (Perkin Elmer, Foster City, CA), and 5–25 ng of YAC DNA or 20–40 ng of genomic DNA control. The reactions were carried out in a Perkin Elmer 9600 thermocycler using a program of initial denaturation at 95°C for 4 min followed by 35 cycles of denaturation at 94°C for 30 sec, annealing at 50°C–55°C for 30 sec, and extension at 72°C for 30 sec, followed by a final extension at 72°C for 5 min. The authenticity of the PCR analysis was monitored using genomic DNA as a positive control. The reaction products were separated on a 2.0% agarose gel, stained with ethidium bromide, and visualized using a UVP GDS8000 gel documentation system.
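For readers setting up a comparable reaction, the stated conditions can be restated as a small calculation. The sketch below sums the programmed hold times of the cycling profile and converts the final concentrations into absolute amounts per 10-μl reaction. It assumes that the 0.5 μM primer concentration applies to each primer and ignores ramp times between temperatures, neither of which is specified above. The dictionary keys and function names are illustrative.

```python
# Cycling profile as described in the text; hold times in seconds.
program = {
    "initial_denaturation_s": 240,  # 95°C for 4 min
    "cycles": 35,
    "steps_s": [30, 30, 30],        # 94°C, 50-55°C, 72°C per cycle
    "final_extension_s": 300,       # 72°C for 5 min
}


def hold_time_minutes(p):
    """Total programmed hold time in minutes, excluding ramp times."""
    total_s = (p["initial_denaturation_s"]
               + p["cycles"] * sum(p["steps_s"])
               + p["final_extension_s"])
    return total_s / 60


def amount_pmol(conc_um, volume_ul):
    """Amount in pmol, since 1 uM x 1 ul = 1 pmol."""
    return conc_um * volume_ul


print(hold_time_minutes(program))  # programmed hold time in minutes
print(amount_pmol(0.5, 10))        # primer per reaction, assuming 0.5 uM each
print(amount_pmol(200, 10))        # dNTP mix per reaction
```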

Acknowledgments

We gratefully acknowledge Dr. Eric Green for critical reading of the manuscript.

The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 USC section 1734 solely to indicate this fact.

Footnotes

  • 8 Corresponding author.

  • E-MAIL dhl@genetics.uchicago.edu; FAX (773) 834-0505.

    • Received October 16, 1997.
    • Accepted January 5, 1998.
