Published online before print
November 21, 2007, 10.1101/gr.6927808 Genome Res. 18:60-66, 2008 ©2008 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/08 $5.00
Letter
Genomic copy number and expression variation within the C57BL/6J inbred mouse strain
Genetic Disease Research Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
The C57BL/6J strain is one of the most widely used animal models for biomedical research, and individual mice within the strain are often assumed to be genetically identical after more than 70 yr of inbreeding. Using a single nucleotide polymorphism (SNP) genotyping panel, we assessed whether copy number variations (CNVs) could be detected within the C57BL/6J strain by comparing relative allele frequencies in first-generation (F1) progeny of C57BL/6J mice. Sequencing, quantitative PCR, breeding, and array comparative genomic hybridization (CGH) together confirmed the presence of two CNVs. Both CNVs span genes encoded on chromosome 19, and quantitative RT-PCR demonstrated that they result in altered expression of the insulin-degrading enzyme (Ide) and fibroblast growth factor binding protein 3 (Fgfbp3) genes. Analysis of 39 different C57BL/6J breeders revealed that 64% of mice from the Jackson Laboratory colony were heterozygous for the CNV spanning Ide. Homozygotes with and without the duplication were present in concordance with Hardy-Weinberg equilibrium (13% and 23%, respectively), and analysis of archived samples from the C57BL/6J colony suggests that the duplication has rapidly reached a high frequency in the colony since 1994. The identification of two CNVs in the small portion of the genome screened demonstrates that individual mice of highly inbred strains are not isogenic and suggests other CNVs may be segregating within C57BL/6J as well as other carefully maintained inbred strains. These differences can influence interpretations of physiological, biomedical, and behavioral experiments and can be exploited to model CNVs apparent in the human genome.
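The Hardy-Weinberg statement above can be checked with a short calculation. The sketch below is illustrative only: the genotype counts (5, 25, 9) are an assumption obtained by rounding the reported 13%, 64%, and 23% of 39 breeders, and the chi-square goodness-of-fit test with 1 degree of freedom is a standard choice, not necessarily the test the authors applied.

```python
# Assumed counts, back-calculated from the reported percentages of 39 breeders.
n_dup_hom = 5      # homozygous with the duplication (~13%)
n_het = 25         # heterozygous (~64%)
n_single_hom = 9   # homozygous without the duplication (~23%)
n = n_dup_hom + n_het + n_single_hom

# Allele frequency of the duplication-bearing chromosome.
p = (2 * n_dup_hom + n_het) / (2 * n)
q = 1 - p

# Genotype counts expected under Hardy-Weinberg proportions.
expected = [p * p * n, 2 * p * q * n, q * q * n]
observed = [n_dup_hom, n_het, n_single_hom]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# 3.841 is the 5% critical value of chi-square with 1 degree of freedom.
CRITICAL_1DF_05 = 3.841
print(f"p(dup) = {p:.3f}, chi2 = {chi2:.2f}, "
      f"deviates from HWE at 5%: {chi2 > CRITICAL_1DF_05}")
```

Under these assumed counts the statistic stays below the 5% critical value, which agrees with the reported concordance, although a sample of 39 animals gives limited power to detect a departure.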
The recent identification of extensive genomic structural variation among normal, healthy individuals has highlighted the importance of such diversity in human evolution and variation (Eichler 2001
The ability to model these and other human diseases in mouse is extremely important in a broad range of biomedical fields and is made possible in part by the existence of well characterized inbred mouse strains whose importance as an experimental system has been well documented (Paigen 1995
It is clear that different inbred strains have polymorphisms in genomic copy number (Li et al. 2004
Identification of CNVs within C57BL/6J
An Illumina SNP genotyping platform was used to measure the relative ratio of parental alleles in the first generation (F1) progeny of two inbred strains, C57BL/6J and BALB/cJ. In a genome scan of 804 SNPs, a candidate CNV was identified at one SNP on chromosome 19 (refSNP ID rs30920120) where DNA samples from 158 F1 animals did not show identical heterozygosity but, instead, fell into two discrete genotype classes that differed in their signal intensity ratio. Some heterozygotes appeared to carry a single copy of the C57BL/6J allele (B6/BALB) while others appeared to carry a duplicated copy of the C57BL/6J allele (DupB6/BALB) (Fig. 1A).
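The idea of separating F1 heterozygotes by allelic signal ratio can be sketched as follows. This is a minimal illustration, not the BeadStudio analysis used in the study: the sample names and intensity values are hypothetical, and it assumes that signal scales linearly with allele copy number, which real genotyping platforms only approximate.

```python
# Expected fraction of total signal from the C57BL/6J allele
# for each genotype class, assuming signal is proportional to copy number.
EXPECTED_B6_FRACTION = {
    "B6/BALB": 1 / 2,     # one B6 copy, one BALB copy
    "DupB6/BALB": 2 / 3,  # two B6 copies, one BALB copy
}

def b6_fraction(b6_signal, balb_signal):
    """Fraction of the combined signal contributed by the B6 allele."""
    return b6_signal / (b6_signal + balb_signal)

def classify(b6_signal, balb_signal):
    """Assign the genotype class whose expected fraction is closest."""
    frac = b6_fraction(b6_signal, balb_signal)
    return min(EXPECTED_B6_FRACTION,
               key=lambda g: abs(EXPECTED_B6_FRACTION[g] - frac))

# Hypothetical intensities (B6 signal, BALB signal) for two F1 animals.
samples = {"F1-a": (1020.0, 980.0), "F1-b": (1350.0, 690.0)}
calls = {name: classify(*signals) for name, signals in samples.items()}
print(calls)
```

In practice the two classes would be defined from the observed clusters rather than from fixed theoretical fractions, since dye bias and probe efficiency shift the measured ratios.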
To further investigate a possible copy number variation at this locus, we developed a real-time PCR assay to quantitatively determine the relative ratio of the two parental alleles in heterozygous samples. Genotyping of samples included in the genome scan using this independent technique confirmed the presence of two genotype classes of heterozygote F1 animals (Fig. 1B). Additionally, sequencing across SNP rs30920120 in representative samples from each F1 genotype class revealed altered peak height ratios consistent with the ratios we observed on the Illumina platform and real-time PCR assays (Fig. 1C).
Because the initial population of heterozygote F1 animals screened on the Illumina platform included progeny that were bred as part of an ENU screen, we could not rule out that ENU treatment was introducing a high frequency of de novo duplications or deletions. Thus, we bred additional mice to confirm the presence of a CNV in a non-ENU-treated, wild-type colony. In total, 166 heterozygote F1 progeny from the wild-type colony were genotyped at rs30920120 using our real-time PCR SNP genotyping assay. These progeny were bred from C57BL/6J and BALB/cJ inbred mice obtained directly from the Jackson Laboratory. We observed both classes of F1 animals in crosses set up in reciprocal directions with respect to maternal and paternal strain, and the two classes did not segregate with the sex of the F1 offspring. However, not every breeding pair produced both genotype classes. A subset of C57BL/6J breeders produced both B6/BALB and DupB6/BALB F1 offspring in a 50:50 ratio, whereas other breeders produced offspring of only a single genotype class (Fig. 2A). The genotype of the F1 offspring produced by each C57BL/6J breeder was consistent when that breeder was rotated through multiple BALB/cJ breeders. However, the reciprocal was not true: individual BALB/cJ breeders produced different classes of F1 offspring that could vary from litter to litter as C57BL/6J breeders were rotated.
For example, a single BALB/cJ breeder produced 11 B6/BALB offspring with one C57BL/6J mating partner and then, in subsequent litters, produced a mix of offspring (six B6/BALB and six DupB6/BALB) with a second C57BL/6J mating partner. This result suggested that the C57BL/6J breeders were introducing the variability we observed in the F1 offspring and led us to hypothesize that a CNV was present within the C57BL/6J inbred colony and that some individual C57BL/6J mice were heterozygous with respect to their copy number at this SNP.
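The breeding example above can be framed as a simple segregation test. The sketch below asks how compatible each litter series is with a C57BL/6J parent that is heterozygous for the duplication and therefore expected to transmit it to half of its offspring. The exact two-sided binomial test is an illustrative choice made here and is not described in the source; only the offspring counts (11:0 and 6:6) come from the text.

```python
from math import comb

def two_sided_binomial_p(k, n, p=0.5):
    """Exact two-sided p-value: total probability of all outcomes
    that are no more likely than the observed count k."""
    pmf = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    # Small relative tolerance guards against floating-point ties.
    return sum(x for x in pmf if x <= pmf[k] * (1 + 1e-9))

# Number of DupB6/BALB offspring out of the total, per C57BL/6J partner.
first_partner = two_sided_binomial_p(k=0, n=11)   # 11 B6/BALB, 0 DupB6/BALB
second_partner = two_sided_binomial_p(k=6, n=12)  # 6 B6/BALB, 6 DupB6/BALB

print(f"first partner:  p = {first_partner:.5f}")
print(f"second partner: p = {second_partner:.5f}")
```

A very small p-value for the first partner argues against that animal being heterozygous, consistent with a breeder homozygous for a single copy, while the 6:6 split from the second partner fits the 50:50 expectation for a heterozygote.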
To confirm the presence of a CNV in our individual C57BL/6J breeders, we carried out array CGH analysis using DNA samples from 14 animals within our pedigree. The array CGH confirmed that the source of the CNV was indeed differences among individual C57BL/6J breeders (Fig. 2B). Analysis of two BALB/cJ animals, one that had produced only B6/BALB offspring and one that had produced a mix of B6/BALB and DupB6/BALB offspring, showed that both BALB/cJ breeders had the same genomic copy number in the region of chromosome 19 flanking SNP rs30920120. However, among seven different C57BL/6J animals included in the CGH analysis, three carried an extra copy of this region. In all three cases those animals had produced both F1 genotype classes in their offspring. Four C57BL/6J animals showed no increased copy number in the CGH analysis, indicating they were homozygous for a single copy of the locus and consistent with breeding results where all four produced a single genotype (B6/BALB) in their progeny. The CGH results pinpointed the boundaries of the C57BL/6J CNV to a 112-kb region of chromosome 19 (NCBI Build 36, chromosome 19 37.303–37.415 Mb) including the entire insulin-degrading enzyme locus (Ide) and the first exon of Kif11 (Fig. 2C). CGH analysis also identified a second CNV of 60 kb (NCBI Build 36, chromosome 19 36.977–37.037 Mb) containing fibroblast growth factor binding protein 3 (Fgfbp3) and 5' exons of Btaf1 RNA polymerase II (Btaf1) (Fig. 2C).
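As a quick consistency check, the reported CNV sizes follow directly from the Build 36 coordinates. The sketch below uses only the coordinates quoted above; the dictionary keys are labels chosen here for convenience, and the gap between the two CNVs is derived from the same numbers rather than stated in the text.

```python
# NCBI Build 36 coordinates on chromosome 19, in Mb, as quoted in the text.
cnv_coords_mb = {
    "Ide CNV": (37.303, 37.415),
    "Fgfbp3 CNV": (36.977, 37.037),
}

def size_kb(start_mb, end_mb):
    """Interval length in kb, rounded to the nearest kb."""
    return round((end_mb - start_mb) * 1000)

sizes = {name: size_kb(*coords) for name, coords in cnv_coords_mb.items()}

# Distance from the end of the Fgfbp3 CNV to the start of the Ide CNV.
gap_kb = size_kb(cnv_coords_mb["Fgfbp3 CNV"][1], cnv_coords_mb["Ide CNV"][0])

print(sizes, f"gap between CNVs: {gap_kb} kb")
```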
Linkage analysis
CNVs alter expression of duplicated genes
Because the entire coding regions of Ide and Fgfbp3 are included in the two CNVs identified, we used real-time RT-PCR to investigate whether their expression differed between B6/BALB and DupB6/BALB F1 animals. Indeed, DupB6/BALB F1 animals showed significantly increased expression of Ide in spleen and brain (1.7-fold increase) and of Fgfbp3 in spleen (1.6-fold increase; Fig. 3B). Interestingly, Fgfbp3 was expressed but not altered in brain, suggesting that a brain-specific regulatory element may be altered in the extra copy of the gene. We also looked at expression of four other genes in the regions that either overlap the breakpoints or lie between the CNVs. We did not detect any significant expression differences in brain or spleen between the genotype classes for these other four genes (Btaf1, March5, Cpeb3, Kif11), indicating that either the regulation of these genes is not affected by the rearrangement, expression was measured in an inappropriate tissue, or feedback mechanisms exist to correct for any changes in copy number.
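For context, the observed fold changes can be compared with a simple gene-dosage expectation. The sketch below assumes that the DupB6 chromosome carries exactly two copies of each gene and that expression scales linearly with copy number; both are simplifying assumptions made here, not claims from the source. Only the 1.7-fold and 1.6-fold values come from the text.

```python
# Gene copies per F1 genotype: copies on the C57BL/6J chromosome
# plus the single copy on the BALB/cJ chromosome (assumed).
copies = {
    "B6/BALB": 1 + 1,
    "DupB6/BALB": 2 + 1,
}

# Fold change expected if expression is proportional to copy number.
expected_fold = copies["DupB6/BALB"] / copies["B6/BALB"]

# Fold increases reported in the text for DupB6/BALB versus B6/BALB.
observed_fold = {"Ide (spleen and brain)": 1.7, "Fgfbp3 (spleen)": 1.6}

for gene, fold in observed_fold.items():
    print(f"{gene}: observed {fold:.1f}x, "
          f"linear-dosage expectation {expected_fold:.1f}x")
```

Both reported increases are close to the 1.5-fold value this simple model predicts, whereas the unchanged Fgfbp3 level in brain departs from it.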
High frequency in C57BL/6J stock
The concordance with Hardy–Weinberg equilibrium could suggest that this genomic rearrangement occurred a long time ago in the C57BL/6J colony. To address when this rearrangement arose, we analyzed samples from a closely related strain, C57BL/10, which was separated from C57BL/6 sometime prior to 1937 (Festing 1998
To identify possible CNVs within inbred mouse strains, we measured the relative ratio of parental alleles in the F1 progeny of two inbred strains, C57BL/6J and BALB/cJ. A CNV was identified on chromosome 19 and verified by independent techniques including quantitative PCR and sequencing. Subsequent pedigree analysis and array CGH identified a second CNV on chromosome 19 and confirmed that the two CNVs are present within the highly inbred C57BL/6J colony and alter expression of the Ide and Fgfbp3 genes.
The identification of the CNV spanning Ide in the small portion of the genome we initially screened for allelic ratios suggests that other CNVs are likely to be present within C57BL/6J as well as other carefully maintained inbred strains. The Illumina SNP genotyping technology we used for a small-scale screen of 804 informative SNPs distributed across the genome can detect CNVs of any length that span the SNPs assayed, and thus we may have detected the majority of very large CNVs. However, little is known about the average size of potential CNVs within the mouse genome. Until recently, studies were biased toward the identification of the largest CNVs because of the limited resolution of available techniques for detection. More recently, a comprehensive study in mouse utilized array CGH with an average oligo spacing of
To limit genetic drift in the Jackson Laboratory colonies, C57BL/6J along with other strains are rederived from a frozen embryo stock every five generations (Taft et al. 2006
The CNVs we identified cause duplication of the Ide and Fgfbp3 genes in a high proportion of C57BL/6J individuals and result in increased gene expression. The implications of these findings are important considering that Ide expression differences in engineered mice have been shown to have significant effects on disease phenotypes (Farris et al. 2003
Given the extensive use of C57BL/6J in research, knowledge of the precise content of its genome is extremely important. Over 90% of the mouse genome sequence is thought to be complete with NCBI Build 37.1; however, the extent of intra-strain variation is not likely to be represented by such efforts. With the increasing availability of techniques to rapidly and inexpensively characterize intra-strain variation, it is important for the research community to begin to identify existing CNVs in the most widely used strains where genetic drift has been limited by cryopreservation programs. This effort would allow the effects of CNVs in humans to be modeled on controlled genetic backgrounds. Additionally, if intra-strain CNVs are widespread within inbred colonies, they may complicate large-scale efforts in mouse to study complex diseases using recombinant inbred and congenic strains, which may indeed carry polymorphisms at loci outside the selected regions. While we cannot limit de novo events that will surely introduce minor variability into inbred strains, documentation of CNVs that are fixed in existing populations will provide details that will ultimately extend the utility of the genomes of mouse inbred strains.
Genome scan
Genotyping services were provided by the Center for Inherited Disease Research (CIDR, Baltimore, MD) using a commercially available medium-density mouse linkage panel (Illumina) to genotype 804 autosomal SNPs polymorphic between C57BL/6J and BALB/cJ on 158 first-generation offspring (F1s). Data were analyzed using BeadStudio software (Illumina) and Excel (Microsoft). A single SNP (gnf19.035.019 corresponding to refSNP ID rs30920120) showed evidence of CNV.
Mouse husbandry
Real-time PCR
CGH
FISH
Gene expression
Genotyping services were provided by the Center for Inherited Disease Research (CIDR). CIDR is fully funded through a federal contract from the National Institutes of Health to The Johns Hopkins University, contract number N01-HG-65403. We thank the NHGRI cytogenetics core for FISH services and Deanna Church at NCBI for genome sequence analysis. SSLP genotyping was performed by Ursula Harper and MaryPat Jones through the NHGRI Genomics Core. The Jackson Laboratory kindly provided archived DNA from the BSS/BSB backcross panels. This research was supported in part by the Intramural Research Program of the National Human Genome Research Institute, National Institutes of Health (USA). Thank you to Julie Segre, Leslie Biesecker, Pamela Schwartzberg, Eric Green, Lawrence Brody, and members of the Pavan Laboratory for helpful comments and discussions.
1 Corresponding author.
E-mail bpavan{at}mail.nih.gov; fax (301) 402-2170. Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.6927808
Bailey, D.W. 1982. How pure are inbred strains of mice? Immunol. Today 3: 210–214.
Bogue, M.A., Grubb, S.C., Maddatu, T.P., and Bult, C.J. 2007. Mouse Phenome Database (MPD). Nucleic Acids Res. 35: D643–D649. doi: 10.1093/nar/gkl1049.
Churchill, G.A., Airey, D.C., Allayee, H., Angel, J.M., Attie, A.D., Beatty, J., Beavis, W.D., Belknap, J.K., Bennett, B., Berrettini, W., et al. 2004. The Collaborative Cross, a community resource for the genetic analysis of complex traits. Nat. Genet. 36: 1133–1137.
Crabbe, J.C., Wahlsten, D., and Dudek, B.C. 1999. Genetics of mouse behavior: Interactions with laboratory environment. Science 284: 1670–1672.
de Fourmestraux, V., Neubauer, H., Poussin, C., Farmer, P., Falquet, L., Burcelin, R., Delorenzi, M., and Thorens, B. 2004. Transcript profiling suggests that differential metabolic adaptation of mice to a high fat diet is associated with changes in liver to muscle lipid fluxes. J. Biol. Chem. 279: 50743–50753.
Dutra, A.S., Mignot, E., and Puck, J.M. 1996. Gene localization and syntenic mapping by FISH in the dog. Cytogenet. Cell Genet. 74: 113–117.
Eichler, E.E. 2001. Recent duplication, domain accretion and the dynamic mutation of the human genome. Trends Genet. 17: 661–669.
Eichler, E.E. 2006. Widening the spectrum of human genetic variation. Nat. Genet. 38: 9–11.
Farris, W., Mansourian, S., Chang, Y., Lindsley, L., Eckman, E.A., Frosch, M.P., Eckman, C.B., Tanzi, R.E., Selkoe, D.J., and Guenette, S. 2003. Insulin-degrading enzyme regulates the levels of insulin, amyloid β-protein, and the β-amyloid precursor protein intracellular domain in vivo. Proc. Natl. Acad. Sci. 100: 4162–4167.
Festing, M.F.W. 1998. Inbred strains of mice, http://www.informatics.jax.org/external/festing/mouse/STRAINS.shtml.
Freeman, J.L., Perry, G.H., Feuk, L., Redon, R., McCarroll, S.A., Altshuler, D.M., Aburatani, H., Jones, K.W., Tyler-Smith, C., and Hurles, M.E. 2006. Copy number variation: New insights in genome diversity. Genome Res. 16: 949–961.
Graubert, T.A., Cahan, P., Edwin, D., Selzer, R.R., Richmond, T.A., Eis, P.S., Shannon, W.D., Li, X., McLeod, H.L., Cheverud, J.M., et al. 2007. A high-resolution map of segmental DNA copy number variation in the mouse genome. PLoS Genet. 3: e3. doi: 10.1371/journal.pgen.0030003.
Hamilton, B.A. and Frankel, W.N. 2001. Of mice and genome sequence. Cell 107: 13–16.
Iafrate, A.J., Feuk, L., Rivera, M.N., Listewnik, M.L., Donahoe, P.K., Qi, Y., and Lee, C. 2004. Detection of large-scale variation in the human genome. Nat. Genet. 36: 949–951.
International Mouse Knockout Consortium. 2007. A mouse for all reasons. Cell 128: 9–13.
JAX Notes. 1989. Profile: C57BL/6J. JAX Notes 438: http://jaxmice.jax.org/library/notes/438b.html.
Kile, B.T., Hentges, K.E., Clark, A.T., Nakamura, H., Salinger, A.P., Liu, B., Box, N., Stockton, D.W., Johnson, R.L., Behringer, R.R., et al. 2003. Functional genetic analysis of mouse chromosome 11. Nature 425: 81–86.
La Starza, R., Crescenzi, B., Pierini, V., Romoli, S., Gorello, P., Brandimarte, L., Matteucci, C., Kropp, M.G., Barba, G., Martelli, M.F., et al. 2007. A common 93-kb duplicated DNA sequence at 1q21.2 in acute lymphoblastic leukemia and Burkitt lymphoma. Cancer Genet. Cytogenet. 175: 73–76.
Lakshmi, B., Hall, I.M., Egan, C., Alexander, J., Leotta, A., Healy, J., Zender, L., Spector, M.S., Xue, W., Lowe, S.W., et al. 2006. Mouse genomic representational oligonucleotide microarray analysis: Detection of copy number variations in normal and tumor specimens. Proc. Natl. Acad. Sci. 103: 11234–11239.
Lee, J.A. and Lupski, J.R. 2006. Genomic rearrangements and gene copy-number alterations as a cause of nervous system disorders. Neuron 52: 103–121.
Leissring, M.A., Farris, W., Chang, A.Y., Walsh, D.M., Wu, X., Sun, X., Frosch, M.P., and Selkoe, D.J. 2003. Enhanced proteolysis of β-amyloid in APP transgenic mice prevents plaque formation, secondary pathology, and premature death. Neuron 40: 1087–1093.
Li, J., Jiang, T., Mao, J., Balmain, A., Peterson, L., Harris, C., Rao, P.H., Havlak, P., Gibbs, R., and Cai, W.W. 2004. Genomic segmental polymorphisms in inbred mouse strains. Nat. Genet. 36: 952–954.
Mouse Genome Sequencing Consortium. 2002. Initial sequencing and comparative analysis of the mouse genome. Nature 420: 520–562.
Mueller, J.C., Riemenschneider, M., Schoepfer-Wendels, A., Gohlke, H., Konta, L., Friedrich, P., Illig, T., Laws, S., Förstl, H., and Kurz, A. 2007. Weak independent association signals between IDE polymorphisms, Alzheimer's disease, and cognitive measures. Neurobiol. Aging 28: 727–734.
Olshen, A.B., Venkatraman, E.S., Lucito, R., and Wigler, M. 2004. Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics 5: 557–572.
Paigen, K. 1995. A miracle enough: The power of mice. Nat. Med. 1: 215–220.
Peters, L.L., Robledo, R.F., Bult, C.J., Churchill, G.A., Paigen, B.J., and Svenson, K.L. 2007. The mouse as a model for human biology: A resource guide for complex trait analysis. Nat. Rev. Genet. 8: 58–69.
Qiu, W.Q. and Folstein, M.F. 2006. Insulin, insulin-degrading enzyme and amyloid-beta peptide in Alzheimer's disease: Review and hypothesis. Neurobiol. Aging 27: 190–198.
Rowe, L.B., Nadeau, J.H., Turner, R., Frankel, W.N., Letts, V.A., Eppig, J.T., Ko, M.S.H., Thurston, S.J., and Birkenmeier, E.H. 1994. Maps from two interspecific backcross DNA panels available as a community genetic mapping resource. Mamm. Genome 5: 253–274.
Rubin, E.M. and Barsh, G.S. 1996. Biological insights through genomics: Mouse to man. J. Clin. Invest. 97: 275–280.
Scott, L.J., Mohlke, K.L., Bonnycastle, L.L., Willer, C.J., Li, Y., Duren, W.L., Erdos, M.R., Stringham, H.M., Chines, P.S., Jackson, A.U., et al. 2007. A genome-wide association study of type 2 diabetes in Finns detects multiple susceptibility variants. Science 316: 1341–1345.
Sebat, J., Lakshmi, B., Troge, J., Alexander, J., Young, J., Lundin, P., Månér, S., Massa, H., Walker, M., Chi, M., et al. 2004. Large-scale copy number polymorphism in the human genome. Science 305: 525–528.
Sharp, A.J., Cheng, Z., and Eichler, E.E. 2006a. Structural variation of the human genome. Annu. Rev. Genomics Hum. Genet. 7: 407–442.
Sharp, A.J., Hansen, S., Selzer, R.R., Cheng, Z., Regan, R., Hurst, J.A., Stewart, H., Price, S.M., Blair, E., Hennekam, R.C., et al. 2006b. Discovery of previously unidentified genomic disorders from the duplication architecture of the human genome. Nat. Genet. 38: 1038–1042.
Singer, J.B., Hill, A.E., Burrage, L.C., Olszens, K.R., Song, J., Justice, M., O'Brien, W.E., Conti, D.V., Witte, J.S., Lander, E.S., et al. 2004. Genetic dissection of complex traits with chromosome substitution strains of mice. Science 304: 445–448.
Snijders, A.M., Nowak, N.J., Huey, B., Fridlyand, J., Law, S., Conroy, J., Tokuyasu, T., Demir, K., Chiu, R., Mao, J.H., et al. 2005. Mapping segmental and sequence variations among laboratory mice using BAC array CGH. Genome Res. 15: 302–311.
Specht, C.G. and Schoepfer, R. 2001. Deletion of the alpha-synuclein locus in a subpopulation of C57BL/6J inbred mice. BMC Neurosci. 2: 11. doi: 10.1186/1471-2202-2-11.
Taft, R.A., Davisson, M., and Wiles, M.V. 2006. Know thy mouse. Trends Genet. 22: 649–653.
Vepsäläinen, S., Parkinson, M., Helisalmi, S., Mannermaa, A., Soininen, H., Tanzi, R., Bertram, L., and Hiltunen, M. 2007. Insulin degrading enzyme is genetically associated with Alzheimer's disease in the Finnish population. J. Med. Genet. 44: 606–608.
Wang, X., Le Roy, I., Nicodeme, E., Li, R., Wagner, R., Petros, C., Churchill, G.A., Harris, S., Darvasi, A., Kirilovsky, J., et al. 2003. Using advanced intercross lines for high-resolution mapping of HDL cholesterol quantitative trait loci. Genome Res. 13: 1654–1664.
Workman, C., Jensen, L.J., Jarmer, H., Berka, R., Gautier, L., Nielser, H.B., Nielsen, C., Brunak, S., and Knudsen, S. 2002. A new nonlinear normalization method for reducing variability in DNA microarray experiments. In Genome Biol. Vol. 3.
Received July 19, 2007; accepted in revised format October 16, 2007.