Genomes of Ellobius species provide insight into the evolutionary dynamics of mammalian sex chromosomes

Willy M. Baarends (1)

  1. Department of Developmental Biology, Erasmus MC, 3015CN, Rotterdam, The Netherlands
  2. Institut Curie, Genetics and Developmental Biology Unit, 75248, Paris, France
  3. Erasmus Center for Biomics, Erasmus MC, 3015CN, Rotterdam, The Netherlands
  4. Institute of Human Genetics, University of Ulm, 89081, Ulm, Germany

Corresponding author: w.baarends@erasmusmc.nl

Abstract

The X and Y sex chromosomes of placental mammals show hallmarks of a tumultuous evolutionary past. The X Chromosome has a rich and conserved gene content, while the Y Chromosome has lost most of its genes. In the Transcaucasian mole vole Ellobius lutescens, the Y Chromosome including Sry has been lost, and both females and males have a 17,X diploid karyotype. Similarly, the closely related Ellobius talpinus has a 54,XX karyotype in both females and males. Here, we report the sequencing and assembly of the E. lutescens and E. talpinus genomes. The results indicate that the loss of the Y Chromosome in E. lutescens and E. talpinus occurred in two independent events. Four functional homologs of mouse Y-Chromosomal genes were detected in both female and male E. lutescens, of which three were also detected in the E. talpinus genome. One of these is Eif2s3y, known as the only Y-derived gene that is crucial for successful male meiosis. Female and male E. lutescens can carry one and the same X Chromosome with a largely conserved gene content, including all genes known to function in X Chromosome inactivation. The availability of the genomes of these mole vole species provides unique models to study the dynamics of sex chromosome evolution.

Sex determination may occur through a wide variety of systems: genetic, epigenetic, or environmental. In cases where sex determination is genetic, this may be associated with the presence of sex chromosomes that are largely heterologous, such as the X and Y system in mammals, where males carry an X and Y Chromosome and females carry two X Chromosomes (Hughes and Page 2015). These sex chromosomes evolved from an autosomal pair. In the case of the XX/XY system in therian mammals, X and Y Chromosome evolution most likely started with an initiating dominant genetic mutation in one of the ancestral Sox3 alleles combined with a gene fusion event that changed its regulation (Sato et al. 2010). Together, this generated the male-determining Sry gene (Lovell-Badge and Robertson 1990; Koopman et al. 1991). In time, the region that contained Sry became nonrecombining, possibly through inversions (Bachtrog 2013). This nonrecombining region gradually expanded and is thought to have undergone gene loss due to different evolutionary processes acting on the nonrecombining region, resulting in an accumulation of deleterious mutations and lack of adaptation (Charlesworth and Charlesworth 2000; Bachtrog 2013). Together with gene acquisition events, this has resulted in the current structure and gene composition of mammalian sex chromosomes. The X Chromosome is gene rich (about 1000 coding genes in human) (Ross et al. 2005), and the Y Chromosome is gene poor (about 70 coding genes in human) (Skaletsky et al. 2003), and the two chromosomes share only a relatively small homologous pseudoautosomal region, which still recombines during male meiosis (Hinch et al. 2014). The Y Chromosome has become enriched for male beneficial genes via relocation (and amplification) of such genes from autosomes or via gradual genetic changes in ancestral genes that gained a more sex-specific function compared with their X-linked paralog (Bellott et al. 2014; Cortez et al. 2014; Soh et al. 2014). 
In addition, some dosage-sensitive and critical ancestral Y-Chromosomal genes were specifically retained on the Y (Bellott et al. 2014; Cortez et al. 2014). The loss of genes from the Y Chromosome has created a dosage imbalance between X-linked and autosomal genes in males. This has most likely resulted in a twofold up-regulation of dosage-sensitive X-Chromosomal gene expression in males (Nguyen and Disteche 2006; Deng et al. 2011, 2013; Brockdorff and Turner 2015). In addition, changes in autosomal genes whose products act in conjunction with X-encoded proteins contributed to a re-establishment of balanced gene expression (Julien et al. 2012; Lin et al. 2012; Necsulea and Kaessmann 2014). It is thought that up-regulation of the dosage-sensitive genes on the X forced the specific inactivation of one of the two X Chromosomes in females (X Chromosome inactivation [XCI]) (Lyon 1961). The XX/XY sex chromosome system is constantly evolving. Selection acts differentially on the X and Y for three reasons: the above-described differences in gene content and regulation, the transmission of the Y Chromosome exclusively through male gametogenesis, and the presence of the X Chromosome in the male germ line during only one-third of its lifetime (Meisel and Connallon 2013).

Although the Y Chromosome has become reduced in size and lost most of its gene content during evolution, complete loss of the Y Chromosome is predicted to be rare among mammals and unlikely to occur for the human Y Chromosome (Bellott et al. 2014; Hughes et al. 2015). Still, complete loss of the Y Chromosome, including Sry, has been reported for the Ryukyu spiny rat (Tokudaia osimensis) (Arakawa et al. 2002; Kuroiwa et al. 2010) and two Ellobius species (Ellobius lutescens, Ellobius talpinus). The loss of Sry indicates that a novel sex-determining mechanism must be operative in these species. Here, we have investigated E. lutescens (17,X males and females) (Just et al. 2007) and E. talpinus (54,XX males and females) (Romanenko et al. 2007). For these species, there is cytogenetic evidence that the X Chromosome has remained intact, based on G-banding patterns and on localization of field mole and hamster X painting probes to the X Chromosome of E. lutescens and E. talpinus, with no signal of autosomal probes localizing on the X (Vorontsov et al. 1980; Romanenko et al. 2007). Furthermore, for E. talpinus, there is no cytogenetic difference between the two X Chromosomes, except a difference in chromosome pairing behavior during male and female meiosis (Kolomiets et al. 2010). Finally, for both E. talpinus and E. lutescens, it has been established that the X Chromosomes do not show sex-specific segregation (Vorontsov et al. 1980; Just et al. 2007). We have applied DNA sequencing, RNA sequencing, and molecular biology approaches to investigate the fate of the Y Chromosome, as well as the consequences of its loss, in these two species.

Results

Genome sequencing, assembly, and phylogenetic analyses

To evaluate the genetic consequences related to loss of the Y Chromosome, we performed sequencing (Illumina sequencing platform, 130-fold coverage) and de novo assembly of the male E. lutescens genome. Scaffolds were generated with an N50 of 242 kbp, to a total of 2.3 Gbp. In addition, superscaffolds were obtained using an optical mapping technique (OpGen), yielding an N50 of 1.10 Mbp for 73% of the whole genome. In addition to the male genome, we sequenced and assembled the genome of a female E. lutescens and a female E. talpinus at lower coverage. Furthermore, we performed RNA sequencing and de novo transcriptome assembly on male E. lutescens total-testis RNA (Table 1; Supplemental Tables S1–S10). From phylogenetic analyses, we estimated that the last common ancestor of E. lutescens, E. talpinus, and Ellobius fuscocapillus (the latter is another Ellobius species with an XY/XX Sry-based sex-determination system) (Kolomiets et al. 2010) lived 2–5 million years ago (Mya) and that E. lutescens separated from E. fuscocapillus ∼0.6 Mya (Fig. 1A; for more detailed analyses of E. lutescens and E. talpinus, see Supplemental Fig. S1). Phylogeographical data, as well as the palaeogeographical history of the area near the Caspian Sea where the animals have their habitats (www.iucnredlist.org) (Fig. 1B), are in agreement with such a phylogenetic origin. Together with the sequence analyses, this indicates that the loss of the Y Chromosome in E. lutescens and E. talpinus occurred as independent events in each species.

Figure 1.

Phylogenetic relationship and geographic ranges of Ellobius species. (A) Phylogenetic tree (based on Zfx and Atrx) showing that the loss of the Y Chromosome occurred independently in Ellobius talpinus and Ellobius lutescens. Bootstrap values are shown on three nodes. (B) Present habitats of E. lutescens and E. talpinus separated by the Caspian Sea and the Caspian Mountains 8–7 Mya (obtained from the IUCN Red List of Threatened Species Version 2015.1; http://www.iucnredlist.org).

Table 1.

Genome assembly statistics

Identification of Y Chromosome–derived genes

Impaired SRY actions in the ancestor of both E. lutescens and E. talpinus, for example, due to mutations in the Sox9 promoter as has been suggested (Bagheri-Fam et al. 2012), may have provided an opportunity for a new sex-determining system to evolve, going hand in hand with further deterioration of SRY functions. This would have paved the way for Y Chromosome loss. However, in addition to Sry, the Y Chromosome contains several genes that are required for male function. In XX/XY mammalian species, the loss of single long-lived (ancestral) genes from the male-specific region of the Y Chromosome (MSY) is usually accompanied by transfer of a copy of such a gene to an autosome or to the X (Hughes et al. 2015). It has been shown that in addition to a testis-determining factor, Eif2s3y (or overexpression of its X-linked paralog) is the sole Y-derived gene that is required for progression of 39,X male germ cells through the meiotic divisions in mouse (Yamauchi et al. 2014, 2016). In addition, Rbmy, Ddx3y, Zfy, and Uty are the only long-lived ancestral Y-Chromosomal genes that are conserved among all analyzed mammals. Thus, these might also be expected to have been present in the ancestors of E. lutescens and E. talpinus and to perform essential functions in spermatogenesis (Hughes et al. 2015). It is highly likely that some essential MSY genes relocated in the ancestors of both E. lutescens and E. talpinus once the new sex-determining system was in place. Such events may have allowed preservation of male fertility upon loss of the Y, which occurred later. Thus, these species provide an excellent opportunity to identify Y-Chromosomal genes that are most essential for spermatogenesis. We therefore probed for the presence or absence of Y-derived genes in the genomes of E. lutescens and E. talpinus. In accordance with previous analyses (Just et al. 1995), no Sry gene sequence was detected in our assemblies of the E. lutescens and E. talpinus genomes.
However, in both species, we detected orthologs of the mouse ancestral Y-linked genes Zfy1/2 and Eif2s3y. We found a complete Eif2s3y gene with introns for E. talpinus, but we could not assemble this gene for E. lutescens, although exonic regions were detected. In addition, we detected orthologs of the mouse X-Chromosomal paralogs of all these genes in both genomes (Fig. 2A,B; Supplemental Figs. S2, S3). Interestingly, although Usp9x was present in the genome of both species, a Usp9y ortholog was detected only in the genome of E. lutescens (Fig. 2A,B; Supplemental Fig. S4). This adds support to the notion that the Y was lost in independent events in E. lutescens and E. talpinus. An ortholog of the mouse Y-Chromosomal-added and -amplified gene Ssty was present in both the E. lutescens and E. talpinus genomes (Fig. 2A,B; Supplemental Fig. S5). Analysis of the RNA sequencing data confirmed the presence and expression of all identified Y-derived genes in E. lutescens (Fig. 2C). Furthermore, in a quantitative analysis of gene expression in E. lutescens liver (male and female), ovary, and testis, we observed that the Y-derived Eif2s3y, Zfy, and Ssty genes were expressed in testis only (Fig. 2D; Supplemental Fig. S6). In contrast, the X-linked paralogs Eif2s3x and Zfx were expressed in all analyzed tissues at comparable levels. For Usp9x and Usp9y, the X-linked gene was expressed at its highest level in the testis, but the Y-derived paralog was expressed at a much lower level that was similar in all tissues examined (Fig. 2D; Supplemental Fig. S6). The assembled Ssty transcript encoded the complete open reading frame (ORF; 57%/58% amino acid identity to mouse SSTY1/2). For Eif2s3y, we could recover the complete ORF upon RT-PCR on testis cDNA using primer sets that were designed based on homology of the E. 
lutescens RNA sequences to the first exon of mouse Eif2s3y and using primers based on the assembled RNA sequence that corresponded to the 3′ region of the cDNA (Supplemental Table S13; Supplemental Fig. S3B). Interestingly, E. lutescens EIF2S3Y shows an exceptionally high level of amino acid conservation (93% identity to mouse EIF2S3Y), pointing to an essential function under strong selection (Supplemental Table S14). For the other two Y-derived genes (Zfy, Usp9y), the longest assembled transcripts were uninterrupted and showed high percentages of amino acid identity to their mouse orthologs (ZFY1/2: 62%/64%; USP9Y: 77%) (Supplemental Tables S15, S16). It should be noted that we cannot exclude that additional Y-derived genes that were present on the Y Chromosome of the E. lutescens and E. talpinus common ancestor, but not on the mouse Y Chromosome, may also have relocated.

Figure 2.

Y-derived genes in the genomes of E. lutescens and E. talpinus. (A) Y-derived ancestral and added genes identified in the genomes of E. lutescens and E. talpinus. (mc) Multicopy; (ps) pseudogene. (B) Schematic representation of the mouse Y Chromosome. Genes indicated in pink are present in the E. lutescens genome. (C) The expression level of Y-Chromosomal genes and their X-Chromosomal homologs in testis of E. lutescens based on RNA sequencing data. E. lutescens testis RNA was assembled and annotated using mouse cDNA, and transcript abundance was quantified and plotted using the mouse chromosomal map. Lowly expressed genes (genes below the 25th percentile of the data) were removed before plotting. (FPKM) Fragments per kilobase of transcript per million fragments mapped. (D) Real-time quantitative RT-PCR on RNA isolated from ovary (Ov), testes (T), and female and male liver (Li F and Li M, respectively) for the E. lutescens homologs of mouse Eif2s3x and Eif2s3y (top graph), Zfx and Zfy (second from top), Usp9x and Usp9y (third from top), and Ssty (bottom graph). mRNA expression levels are normalized to beta actin; error bars, SD values from two separate animals.

Localization of the Y-derived genes to the nonrecombining and meiotically silenced X Chromosome in E. lutescens

To investigate the localization of the MSY-related genes in the E. lutescens genome, we generated a FISH probe through combination of PCR probes spanning genomic regions of Usp9y and Zfy (Eif2s3y could not be assembled, and Ssty is too small for this approach). This combined probe generated a signal only on the E. lutescens X Chromosome (Fig. 3A), strongly indicating that Usp9y and Zfy have been translocated from the Y to the X Chromosome. Eif2s3y and Ssty may also have been translocated to the X or to an autosome.

Figure 3.

Y-derived genes localize to the nonrecombining and meiotically silenced X Chromosome in E. lutescens. (A) Localization of FISH probes for E. lutescens homologs of mouse Usp9y and Zfy (green) on the single X (encircled) in combination with an E. lutescens X-Chromosomal BAC FISH probe (red) in E. lutescens meiotic spreads also immunostained for SYCP3 (purple). Enlargements of indicated areas are shown on the right. DAPI counterstaining of the DNA is shown in the top and middle images. (B) Immunostaining for H2AFX and SYCP3 (top), polymerase (RNA) II (DNA-directed) polypeptide A (POLR2A) and SYCP3 (middle), and MLH1 and SYCP3 (bottom) on spread pachytene E. lutescens spermatocytes. Phosphorylated H2AFX marks the single X chromatin (encircled) in pachytene. POLR2A is depleted from the X Chromosome region in comparison with the rest of the nucleus. MLH1 marks crossover sites along the synaptonemal complexes of all autosomes. Enlargements of indicated areas are shown on the right. (C) Expression level of X-Chromosomal genes compared with autosomal genes. E. lutescens testis RNA was assembled and annotated using mouse cDNA; transcript abundance was quantified and plotted using the mouse chromosomal map. Lowly expressed genes (genes below the 25th percentile of the data) were removed before plotting. Expression level is expressed in FPKM (log2 scale). (D) Female E. lutescens reads were aligned to the male E. lutescens reference genome, and SNVs were called that are homozygous within the female but differ from the assembled male genome. High-quality SNVs were plotted using the mouse chromosome annotation and normalized to the number of coding genes per chromosome. None were found for the X Chromosome.

Although these findings indicate that functional conservation of only a few essential genes may have been sufficient to allow loss of the Y, the detrimental effects of this event may not have been completely overcome given the reported signs of impaired spermatogenesis in E. lutescens (Just et al. 2007). To investigate the behavior of the single X Chromosome in E. lutescens meiosis, we performed immunocytochemical analyses of spread nuclei from frozen testis samples. All autosomes formed pairs that are recombining, as evidenced by the presence of MLH1 foci (Fig. 3B). As expected, the unpaired X Chromosome is incorporated into a transcriptionally silenced chromatin region, similar to the XY body in other mammals, through meiotic sex chromosome inactivation (MSCI) (Turner 2007). Accordingly, it is marked by increased phosphorylation of H2AFX and reduced polymerase (RNA) II (DNA directed) polypeptide A (POLR2A) levels in the majority of pachytene nuclei (Fig. 3B). Nevertheless, RNA sequencing of E. lutescens whole-testis RNA did not reveal a global reduction in the median expression level of X-Chromosomal versus autosomal genes (Fig. 3C). For the chimpanzee, we have previously shown that analyses of total-testis RNA can reveal the MSCI signature in a comparison of autosomal versus X-linked expression levels (Mulugeta Achame et al. 2010a). However, MSCI is more evident when purified germ cell fractions are analyzed (Namekawa et al. 2006; Mueller et al. 2008; Sin et al. 2012; Lesch et al. 2013; Soumillon et al. 2013; Federici et al. 2015). In human, MSCI is not apparent when whole-testis RNA is analyzed (Mulugeta Achame et al. 2010a), and as we also suggested for human, such a result might indicate that in E. lutescens, MSCI is not as complete as in mouse. On the other hand, it should be taken into account that spermatogenesis in E. lutescens is impaired (Just et al. 
2007) and that the contribution of mRNA from meiotic and post-meiotic cells (undergoing MSCI) to the pool of mRNAs might therefore be lower in comparison to other species like mouse.

Analyses of gene content and evolutionary changes on the nonrecombining X Chromosome in E. lutescens

In contrast to the meiotically recombining E. talpinus X Chromosomes, which are expected to be stably maintained, the nonrecombining single X in E. lutescens is expected to face possible degradation. However, we did not observe major gene loss for X-linked genes in E. lutescens compared with E. talpinus (Supplemental Fig. S7A). This is in accordance with previous data obtained using X-Chromosomal painting probes (Romanenko et al. 2007). Thus, the X Chromosome of E. lutescens still complies with Ohno's law of conservation of the gene content of the mammalian X Chromosome (Ohno 1967). For the 17,X E. lutescens species, it appears that zygotes lacking the X (karyotype 16) and zygotes with two X Chromosomes (18,XX) are not viable, since there is lethality of ∼50% of the embryos (Lyapunova et al. 1975). If 18,XX individuals do not survive, the XCI mechanism may no longer be functional. Analysis of the E. lutescens X inactivation center (Xic) indicated that all genes known to control XCI are still present (Supplemental Fig. S7B,C), and the whole region showed >90% conservation between E. lutescens and E. talpinus (in which the XCI mechanism is most likely still intact) (Fredga and Lyapunova 1991). These results indicate that if XCI is dysfunctional, this happened recently and did not involve major genomic alterations of genes known to be involved in XCI.
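The ∼50% embryonic lethality cited above follows from simple gamete arithmetic, which can be sketched as below. The sketch assumes, purely for illustration, that each 17,X parent produces X-bearing and X-lacking gametes in equal proportions; the names GAMETES and zygote_classes are hypothetical and not part of our analysis.

```python
from fractions import Fraction
from itertools import product

# Assumption (illustrative only): every 17,X parent produces gametes with
# the X and gametes without it in equal proportions.
GAMETES = {"X": Fraction(1, 2), "0": Fraction(1, 2)}


def zygote_classes():
    """Return the expected frequency of zygotes carrying 0, 1, or 2 X Chromosomes."""
    freq = {}
    for (egg, p_egg), (sperm, p_sperm) in product(GAMETES.items(), repeat=2):
        n_x = (egg == "X") + (sperm == "X")  # number of X Chromosomes in the zygote
        freq[n_x] = freq.get(n_x, Fraction(0)) + p_egg * p_sperm
    return freq


classes = zygote_classes()
viable = classes[1]               # 17,X zygotes
lethal = classes[0] + classes[2]  # karyotype 16 (no X) and 18,XX zygotes
print(viable, lethal)             # 1/2 1/2
```

Under this assumption, half of all zygotes carry zero or two X Chromosomes, which matches the reported lethality if both of these classes fail to develop.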

Next we compared the female and male X in E. lutescens and looked for high-quality single-nucleotide variant (SNV) positions for which all female nucleotide sequence reads differed from the assembled male genome reads. In this analysis, we identified 653 SNVs for orthologs of sequences localized on mouse autosomal chromosomes, but no SNVs corresponding to locations on the mouse X (Fig. 3D). Thus, the female and male E. lutescens used for the present genome assembly carried one and the same X Chromosome (perhaps due to inbreeding in the colony), which does not segregate with sex. This observation is in agreement with a previous analysis of a breeding colony of E. lutescens (Just et al. 2007). In E. talpinus, both X Chromosomes have an identical G-banding pattern, but heterozygosity and an autosomal inheritance pattern of the X Chromosomes have been shown (Vorontsov et al. 1980).
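The selection criterion used above (positions at which all female reads differ from the male assembly) can be illustrated with a minimal sketch. The Site record, the depth and quality thresholds, and the function names are hypothetical stand-ins, not the actual pipeline or its parameters, which are described in the Supplemental Methods.

```python
from dataclasses import dataclass


@dataclass
class Site:
    chrom: str       # mouse chromosome carrying the orthologous sequence
    ref: str         # base in the male assembly
    read_bases: str  # bases seen in the female reads covering this position
    qual: float      # variant quality score

# Hypothetical thresholds, chosen only for illustration.
MIN_DEPTH = 8
MIN_QUAL = 30.0


def is_fixed_difference(site):
    """True if the site passes the filters and every female read carries
    the same base, which differs from the male assembly."""
    bases = set(site.read_bases.upper())
    return (
        len(site.read_bases) >= MIN_DEPTH
        and site.qual >= MIN_QUAL
        and len(bases) == 1
        and site.ref.upper() not in bases
    )


def count_by_chromosome(sites):
    """Count retained SNVs per mouse chromosome."""
    counts = {}
    for site in sites:
        if is_fixed_difference(site):
            counts[site.chrom] = counts.get(site.chrom, 0) + 1
    return counts


# Toy input, invented for this sketch (not data from this study).
toy_sites = [
    Site("1", "A", "GGGGGGGGGG", 50.0),  # all reads differ: retained
    Site("2", "T", "TTTTGGGGGG", 45.0),  # mixed reads: rejected
    Site("X", "C", "CCCCCCCCCC", 60.0),  # identical to assembly: rejected
]
print(count_by_chromosome(toy_sites))
```

A chromosome that never appears in the returned counts, as the X does in this toy input, has no fixed differences between the two animals at the sites examined.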

Figure 4.

dN and dS analyses of autosomal and X-linked genes in E. lutescens. (A) Boxplots showing dN values of autosomal and X-Chromosomal E. lutescens orthologs of mouse genes. Values were averaged per chromosome and plotted using the mouse chromosomal map. (B) Boxplots showing dS values of autosomal and X-Chromosomal E. lutescens orthologs of mouse genes. Values were averaged per chromosome and plotted using the mouse chromosomal map. (C) Boxplots of dN (left) and dS (middle) values of all autosomal genes together compared with the X Chromosome, and the corresponding dN/dS values for the autosomal genes and the X-linked genes (right), calculated from the summed values. P values were calculated using R (Wilcoxon rank-sum test) and indicate significant differences between autosomal and X-Chromosomal values. (D) Scatter plot showing the distribution of dN and dS for the E. lutescens X Chromosome (green) and autosomes (red).

In XX/XY placental mammals, the hemizygosity of X-linked genes in males advances the fixation of recessive mutations through positive natural and sexual selection. Consequently, the number of nonsynonymous substitutions (leading to an amino acid change in the encoded protein) per nonsynonymous site (dN) is more abundant on the X Chromosome than on the autosomes in interspecies comparisons (Meisel and Connallon 2013). On the other hand, the number of synonymous substitutions (which do not result in a change in the encoded protein) per synonymous site (dS) is less frequent on the X Chromosome than on the autosomes, which is explained by a “male-driven evolution” effect associated with a higher rate of mutations per generation in the male germ line compared with the female germ line (Meisel and Connallon 2013). Together, these features result in the so-called faster-X effect (Meisel and Connallon 2013). In placental mammals with XX females and XY males, the X Chromosome is transmitted one-third of the time through the male germ line and hence is exposed to this “male-driven evolution” to a lesser extent compared with the autosomes (Hurst and Ellegren 1998). In E. lutescens, the X Chromosome is transmitted through the male and female germ lines with equal frequency and can never undergo meiotic recombination. The hemizygosity of the X in both sexes could still be associated with a faster-X effect and therefore a higher dN/dS for the X compared with autosomes, but in E. talpinus, the homozygous state of the X in both sexes might gradually “normalize” the dN/dS ratio of the X Chromosome. dN may also increase for the single X in E. lutescens due to gene degeneration caused by the absence of meiotic recombination. However, since the X Chromosome is gene rich and loss of the Y occurred only recently on an evolutionary time scale, we expect that such degeneration might still be suppressed by purifying selection. 
This is further supported by the normal overall expression levels from the X Chromosome (Fig. 3C), as well as the overall conservation of gene content (Supplemental Fig. S7A). We have investigated the possible impact of the difference in sex chromosome constitution between our Ellobius species and the closest relative for which a completely sequenced genome is available by performing a global dN and dS analysis. We observed that E. lutescens and E. talpinus orthologs of mouse X-Chromosomal genes show a higher dN/dS ratio compared with the E. lutescens and E. talpinus orthologs of mouse autosomal genes (Fig. 4; Supplemental Fig. S8), similar to what has been found for X-versus-autosomal genes for other placental mammals (Wolfe and Sharp 1993; The Chimpanzee Sequencing and Analysis Consortium 2005; Lu and Wu 2005; Mank et al. 2010). Hence, the genome-wide hallmarks of the XX/XY female/male evolutionary past have not yet been erased in E. talpinus.
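The legend of Figure 4C states that dN/dS was calculated from the summed values. A minimal sketch of one reading of that statement, a ratio of summed dN to summed dS per gene class, is given below. The function name and the toy numbers are hypothetical and are not data from this study.

```python
def dn_ds_from_sums(genes):
    """Return summed dN divided by summed dS for an iterable of (dN, dS) pairs."""
    genes = list(genes)
    total_dn = sum(dn for dn, _ in genes)
    total_ds = sum(ds for _, ds in genes)
    if total_ds == 0:
        raise ValueError("summed dS is zero; the ratio is undefined")
    return total_dn / total_ds


# Toy (dN, dS) pairs, invented for illustration only.
autosomal = [(0.02, 0.20), (0.01, 0.25), (0.03, 0.15)]
x_linked = [(0.03, 0.18), (0.02, 0.12)]

ratio_autosomal = dn_ds_from_sums(autosomal)
ratio_x = dn_ds_from_sums(x_linked)
print(ratio_x > ratio_autosomal)
```

Dividing the sums, rather than averaging per-gene ratios, keeps genes with a dS close to zero from dominating the result with very large individual ratios.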

In E. lutescens and E. talpinus, the evolution of a new dominant sex-determining system may have been facilitated by a somewhat destabilized Sry-driven mechanism in the common ancestor of the extant Ellobius species, as indicated by changes in the promoter of the Sox9 gene in both E. lutescens and the XX/XY E. fuscocapillus, which may result in Sry-independent up-regulation (Bagheri-Fam et al. 2012). We analyzed E. lutescens homologs of known mouse genes that are involved in sex determination, and found that all were present in the current assemblies of both the male and female genomes (Supplemental Table S11). Thus, at present, we cannot name a new candidate sex-determining gene.

Discussion

The results from the genome assemblies of male and female E. lutescens and of female E. talpinus, the RNA sequencing, the phylogeographical data, the palaeogeographical history of the area near the Caspian Sea, and the differences in the Y-derived signatures in the two species provide strong indications for the occurrence of two independent losses of the Y Chromosome that occurred ∼0.6 and 2–5 Mya for E. lutescens and E. talpinus, respectively. There may have been a genetic predisposition toward development of a new sex-determination system in the common ancestor (Bagheri-Fam et al. 2012), explaining how such a rare event could occur twice in the same lineage. Sex determination can be viewed as a genetic battle between sexes. In species with different dominant sex-determining genes, the general pathways of the system are usually well conserved, but components may have moved up in the hierarchy to become the sex-determining factor (Herpin and Schartl 2015). For instance, AMH, a protein involved in Müllerian duct degeneration and ovarian function in mammals, most likely functions as a primary sex-determination factor in monotremes (Cortez et al. 2014) and the teleost fish Patagonian pejerrey (Hattori et al. 2012). So far, we have been able to establish that genes encoding all currently known factors involved in the primary sex-determination cascade in mammals are present in both the male and female E. lutescens genomes, but analyses of a large number of individuals will be required to identify a small functional genetic change in any of these genes that could result in dominant sex-determining action.

The action of such a new dominant sex-determining factor on an autosome, in an Ellobius XX/XY ancestor, would have been compatible with generation of both males and females that could be either XX or XY, whereby Sry would be nonfunctional if still present. Such XY females would be expected to have very limited fertility, based on what is known from studies on the deleterious effect of expression of Y-linked genes in oocytes on meiosis and early embryonic development, in XY Sry female mice (Vernet et al. 2014b). Also in X*Y female wood lemmings (more closely related to the Ellobius genus), whereby a modified X* dominantly induces female differentiation, the oocytes are almost invariably X*X* due to a double nondisjunction event in oocyte precursors (Fredga et al. 1976; Winking et al. 1981). Together, this provides support to the hypothesis that the presence of the Y Chromosome interferes with oogenesis in rodents. Similarly, in XX Sry+ male mice and also in XXY male mice, the presence of two X Chromosomes leads to elimination of most germ cells before or just after birth (Cattanach et al. 1971; Lue et al. 2001), and also XXY wood lemmings are infertile (Gropp et al. 1976). This indicates that the dosage of X-linked genes is critical during early male gametogenesis in rodents. Loss of the Y could be easily envisioned in an Ellobius ancestor, since XY nondisjunction occurs relatively frequently in mammals, in particular when proper segregation depends on XY recombination and if there is a short PAR (Blackmon and Demuth 2015). Evolutionary adaptations that generate fertility in all the offspring would have a subsequent selective advantage. We propose that essential differences between E. lutescens and E. talpinus in the evolution of XCI mechanisms may have led to successful reproduction and maintenance of X and XX males, respectively. In E. lutescens, loss of XCI would lead to all fertile 17,X offspring and lethality of embryos with two or zero X Chromosomes. 
In contrast, in E. talpinus, fertile XX males have evolved perhaps through maintenance of inactivation of one X Chromosome specifically in the male germ line. Differential regulation of the X in the male and female germ line in E. talpinus is supported by the reported partial asynapsis of the two X Chromosomes in male meiotic prophase only, most likely in association with MSCI (for the model and further explanation of possible courses of events in E. lutescens and E. talpinus, see Supplemental Fig. S9; Kolomiets et al. 2010).

In human, strong negative selection over an extended period of millions of years has thus far protected the Y Chromosome from losing its last few important genes (Hughes et al. 2012; Bellott et al. 2014; Cortez et al. 2014; Hughes et al. 2015), but abrupt evolutionary changes have eradicated the Y Chromosome in the Ellobius genus, as described here, and also in spiny rats of the Tokudaia genus (Kuroiwa et al. 2010). Interestingly, orthologs of Zfy and Eif2s3y have also been detected in the genome of the Ryukyu spiny rat (Tokudaia osimensis) (Arakawa et al. 2002; Kuroiwa et al. 2010). There is strong evidence for important general and testis-specific functions of these two genes in the Muroidea lineage (Luoh et al. 1997; Vernet et al. 2014a). Transgenic expression of Eif2s3y is sufficient to induce formation of a small number of haploid cells in a 39,X mouse carrying an Sry transgene as the only other Y-derived gene (Vernet et al. 2011; Yamauchi et al. 2014). In a recent analysis, it was shown that although primates have lost EIF2S3Y from the Y, a retroposed copy is present in the genome of all studied primate species (Hughes et al. 2015), emphasizing the importance of this gene. For most of the other ancestral Y-Chromosomal genes, including Usp9y, it has been shown that they can be completely lost in some lineages (Cortez et al. 2014; Hughes et al. 2015).

The more recently evolved multicopy Ssty genes are members of the Spindlin/Ssty family of genes. SPIN1 is encoded by an autosome; it contains three SPIN/SSTY motifs that form specific binding domains that recognize the combination of a specific H3K4 and H3R8 methylation pattern (H3K4me3-H3R8me2a) (Su et al. 2014). It plays an important role in cell cycle regulation and in WNT signaling. The multicopy Ssty and paralogous X-linked Spin2-like genes encode proteins with a similar overall structure and are thought to function as important regulators of post-meiotic XY gene expression (Comptour et al. 2014). Although some other analyzed mammalian genomes encode SPIN1 on an autosome and three paralogs on the X Chromosome, a more extreme amplification of Spin2-like genes on the X (at least 32 copies in mouse) (Mulugeta Achame et al. 2010b) and the presence of the multicopy Ssty genes on the Y are most likely specific for the Rodentia (Cortez et al. 2014).

Thus, our data provide new evidence for essential roles of only two of the ancestral MSY genes, Eif2s3y and Zfy, in spermatogenesis in Muroidea. In addition, the presence of Ssty orthologs in species of this rodent superfamily indicates that these MSY-added genes are important for fertility and arose before the split between mole voles and Old World rats and mice. The genomes of the mole vole species described herein present opportunities to study properties of a new pair of sex chromosomes and a new sex-determination system and to evaluate the selective mechanisms acting on a hemizygous post-X Chromosome and other aspects of sex chromosome evolution.

Methods

Genome sequencing and assembly

DNA was isolated from the liver of a single male E. lutescens (Elut33), a single female E. lutescens (Elut46, born from the same parents as the male), and a single female E. talpinus. Libraries were prepared using Illumina library preparation protocols. Whole-genome DNA sequence information was generated from the libraries using Illumina Genome Analyzer (GA) and HiSeq 2000 platforms (Supplemental Table S1).

De novo sequence assembly was performed, based on the de Bruijn graph algorithm, using ABySS 1.2.5 software (Simpson et al. 2009) on the Huygens supercomputer (SARA) equipped with 512 GB of RAM and 64 dual-core processors. The E. lutescens genome was assembled more than 20 times, testing various parameters and data set combinations, to obtain the best de novo assembly. The male E. lutescens genome was assembled to a total of 2.3 Gbp, with a maximum scaffold length of 2.2 Mbp (Table 1). The genome of the female E. lutescens was assembled to a total of 2.2 Gbp, with a maximum scaffold length of ∼160 kbp (Table 1). The genome of E. talpinus was assembled to a total of 2.2 Gbp, with a maximum scaffold length of ∼160 kbp (Table 1). Using the same quality-controlled reads as for the ABySS assembly, the genome of the male E. lutescens was also assembled with CLCbio version 4.9 (https://www.qiagenbioinformatics.com/) with default parameters. We also assembled the female E. lutescens genome with CLCbio. The CLCbio assemblies were used to verify part of the results obtained from analysis of the ABySS assembly, which is our primary assembly. Superscaffolds were generated using an optical mapping technology (OpGen). Optical mapping data were generated by the Argus Whole Genome Mapping System and processed using Genome-Builder (OpGen) as described in the Supplemental Methods. A data summary is shown in Supplemental Table S2. Quality controls, including alignment to previously sequenced E. lutescens genes, and GC and repeat content are summarized in the Supplemental Methods and Supplemental Tables S7 and S8.
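
To illustrate the de Bruijn graph idea underlying the assembly step, the following minimal Python sketch builds such a graph from a handful of toy reads. It is not the ABySS implementation, which distributes the graph across processors and adds error correction and scaffolding; the function name `build_de_bruijn`, the toy reads, and the value of `k` are assumptions made for illustration only.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Build a de Bruijn graph from reads (illustrative sketch only).

    Each k-mer contributes one edge from its (k-1)-mer prefix to its
    (k-1)-mer suffix. Duplicate edges are collapsed.
    """
    graph = defaultdict(set)
    for read in reads:
        # Slide a window of width k along the read.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    # Return plain, sorted lists so the result is deterministic.
    return {node: sorted(successors) for node, successors in graph.items()}


# Hypothetical toy reads that overlap by four bases.
toy_reads = ["ACGTC", "CGTCA"]
toy_graph = build_de_bruijn(toy_reads, k=3)
```

Walking the unbranched path AC, CG, GT, TC, CA in `toy_graph` spells out the contig ACGTCA, which is how overlapping reads are merged without pairwise read alignment.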

Gene hunting and sequence alignment

We searched for Y-Chromosomal genes using mouse Y-Chromosomal gene and transcript sequences as described in the Supplemental Methods.

RNA sequencing and assembly

RNA was extracted from adult E. lutescens testis, and RNA sequencing was performed using Illumina HiSeq 2000. Transcriptome assembly was performed using the Trinity short-read de novo assembly software (Grabherr et al. 2011). The E. lutescens Eif2s3y cDNA containing the complete ORF was generated from overlapping fragments amplified using primer sets Eif2s3yc, Eif2s3yg, and Eif2s3y2 (Supplemental Table S13) and was further compared to the assembled RNA sequence.

Global analysis of dN and dS

We used the genome analysis tool gKaKs (version 1.2.1) (Zhang et al. 2013) to compare E. lutescens and mouse, and PAML (CodeML) (Yang 1997, 2007; Xu and Yang 2013) to compute dN, dS, and dN/dS between mouse CDSs and orthologous sequences in the E. lutescens genome.
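
For reference, the quantities compared here follow the standard definitions below. This is a general summary rather than a description of the exact settings used; CodeML estimates these rates by maximum likelihood under a codon substitution model rather than by simple counting.

```latex
\[
\omega = \frac{d_N}{d_S}, \qquad
d_N = \frac{\text{nonsynonymous substitutions}}{\text{nonsynonymous sites}}, \qquad
d_S = \frac{\text{synonymous substitutions}}{\text{synonymous sites}}
\]
```

Values of ω below 1 are conventionally read as purifying selection, values near 1 as neutral evolution, and values above 1 as positive selection.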

SNV analysis

All Illumina sequence reads obtained for the female E. lutescens were aligned to the assembled male E. lutescens genome using the Burrows-Wheeler Aligner (version 0.5.9) (Li and Durbin 2010), and SNVs were called using SAMtools (version 0.1.18) (Li et al. 2009; Li 2011).
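
The principle of calling variants from reads aligned to a reference can be sketched with a simple threshold rule. This is a toy illustration only: SAMtools uses a statistical genotype-likelihood model, not the fixed cutoffs shown here, and the function name `call_snvs`, the thresholds, and the input layout are all hypothetical.

```python
def call_snvs(reference, pileup, min_depth=8, min_alt_fraction=0.2):
    """Call candidate SNVs from per-position base counts (toy sketch).

    reference: reference sequence as a string (0-based positions).
    pileup: dict mapping position -> dict mapping base -> read count.
    Returns a list of (position, ref_base, alt_base, alt_fraction).
    """
    calls = []
    for pos in sorted(pileup):
        counts = pileup[pos]
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to make a call
        ref_base = reference[pos]
        alt_counts = {b: c for b, c in counts.items() if b != ref_base}
        if not alt_counts:
            continue  # all reads agree with the reference
        # Take the most frequent non-reference base as the candidate allele.
        alt_base = max(alt_counts, key=alt_counts.get)
        fraction = alt_counts[alt_base] / depth
        if fraction >= min_alt_fraction:
            calls.append((pos, ref_base, alt_base, fraction))
    return calls


# Hypothetical example: only position 1 passes both thresholds.
example_calls = call_snvs(
    "ACGT",
    {0: {"A": 10}, 1: {"C": 5, "T": 5}, 2: {"G": 3, "A": 1}, 3: {"T": 9, "G": 1}},
)
```

In this example, position 2 is skipped for low depth and position 3 because the alternative base is too rare, which mirrors the kind of filtering that distinguishes true variants from sequencing errors.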

Immunostaining of meiotic nuclei

Spread nuclei of E. lutescens spermatocytes were obtained from frozen testis material essentially as previously described (Peters et al. 1997), with modifications, and were stained using antibodies as described in the Supplemental Methods.

Quantitative PCR analyses

Standard quantitative RT-PCR (RT-qPCR) was performed using the Bio-Rad CFX96 real-time system as described in the Supplemental Methods. The primer sequences are listed in Supplemental Table S13.

Gene expression levels were normalized to beta actin gene expression according to the 2^(−ΔCT) method (Livak and Schmittgen 2001).
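
As a minimal sketch of this normalization, the relative expression of a target gene can be computed from threshold cycle (CT) values as follows. The function name `relative_expression` and the example CT values are hypothetical, and the calculation assumes a perfect doubling of product in every cycle.

```python
def relative_expression(ct_target, ct_reference):
    """Return target expression relative to a reference gene (2^-dCT).

    ct_target: threshold cycle of the gene of interest.
    ct_reference: threshold cycle of the reference gene (here beta actin).
    Assumes 100% amplification efficiency for both primer pairs.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical CT values: the target crosses the threshold 5 cycles
# later than the reference gene.
example = relative_expression(ct_target=25.0, ct_reference=20.0)
```

A target that reaches the threshold five cycles after the reference gene is thus estimated at 2^(−5), about 3% of the reference transcript level.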

DNA FISH

To generate a FISH probe for E. lutescens Zfy and Usp9y, primer sets were designed to cover the available genomic sequences (Supplemental Table S12), and PCR products were isolated from gel, purified, and pooled. FISH and subsequent immunostainings were performed as described in the Supplemental Methods.

Data access

All whole-genome sequence data generated for this study have been submitted to the NCBI Genome database (http://www.ncbi.nlm.nih.gov/genome/). The raw sequence reads have been submitted to the NCBI Sequence Read Archive (SRA; http://www.ncbi.nlm.nih.gov/sra/). Individual genes have been submitted to the NCBI GenBank (http://www.ncbi.nlm.nih.gov/genbank/). All sequencing data are linked to the NCBI BioProject number PRJNA305123 (http://www.ncbi.nlm.nih.gov/bioproject/). Accession numbers are listed in Supplemental Tables S17 through S20.

Acknowledgments

E.M. and J.G. were supported by European Research Council (ERC) grant 260587 and Netherlands Organization for Scientific Research (NWO) grant SH-173-10.

Author contributions: E.M. contributed to the conception and design of the study, performed experiments, assembled the genomes and the transcriptome, performed other bioinformatic analyses, interpreted the data, and wrote the manuscript. E.W. and E.S. performed experiments and interpreted data. W.F.J.v.IJ. performed DNA sequencing and contributed to the writing of the manuscript. W.J. contributed to the design of the study, provided materials, and contributed to the writing of the manuscript. E.H. supervised the transcriptome data analyses and interpretation and contributed to the writing of the manuscript. J.A.G. conceived of the study, participated in the design of the experiments and interpretation of the data, and wrote the manuscript. J.G. conceived of the study and participated in the design of the experiments, interpretation of the data, and writing of the manuscript. W.M.B. conceived of the study, participated in the design of the experiments and interpretation of the data, and wrote the manuscript.

Footnotes

  • Received November 12, 2015.
  • Accepted July 11, 2016.

This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.