Published online before print
June 7, 2007, 10.1101/gr.6214107 Genome Res. 17:1072-1081, 2007 ©2007 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/07 $5.00
Letter
Genome-wide comparative analysis of copia retrotransposons in Triticeae, rice, and Arabidopsis reveals conserved ancient evolutionary lineages and distinct dynamics of individual copia families
Institute of Plant Biology, University of Zürich, Zollikerstrasse 107, 8008 Zürich, Switzerland
Although copia retrotransposons are major components of all plant genomes, the evolutionary relationships between individual copia families and between elements from different plant species are only poorly studied. We used 20 copia families from the large-genome plants barley and wheat to identify 46 families of homologous copia elements from rice and 22 from Arabidopsis, two plant species with much smaller genomes. In total, 599 copia elements were analyzed. Phylogenetic analysis showed that copia elements from the four species can be classified into six ancient lineages that existed before the divergence of monocots and dicots. The six lineages show a surprising degree of conservation in sequence organization and other characteristics across species. Additionally, the phylogenetic data suggest at least one case of horizontal gene transfer between the Arabidopsis and rice lineages. Insertion time estimates for 522 high-copy elements showed that retrotransposons from rice were active at different times in waves of activity lasting 0.5–2 million years, depending on the family, whereas elements from wheat and barley had longer periods of activity. We estimated that half of the rice copia elements are truncated or otherwise rearranged after 790,000 yr, which is almost twice the half-life of Arabidopsis elements. In contrast, wheat and barley copia elements appear to have a massively longer half-life, beyond our ability to estimate from the available data. These findings suggest that genome size can be explained by the specific rate of DNA removal from the genome and the length of active periods of retrotransposon families.
Retrotransposons are the predominant class of transposable elements in plants and are largely responsible for the vast differences in genome sizes. The relatively small and compact genome of Arabidopsis has a size of 120 Mbp and contains only 10% repetitive DNA (Arabidopsis Genome Initiative 2000) [...] 5500 Mbp (Bennett and Smith 1976).
copia retrotransposons are flanked by long terminal repeats (LTRs) that contain promoter and downstream control elements. Their internal domain usually contains the genes required for reproduction such as reverse transcriptase (RT), integrase (INT), and gag, which is probably involved in packaging RNA or DNA during replication (for review, see Wilhelm and Wilhelm 2001).
Because only one of the two LTR sequences serves as template for reverse transcription, both LTRs are identical at the time of insertion. This characteristic can be used to estimate the age of retrotransposons (SanMiguel et al. 1998).
The vast majority of plant retrotransposons studied so far were estimated to be <3 million years old (SanMiguel et al. 1998).
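The dating principle described above can be sketched in a few lines. This is a minimal illustration, not the procedure used in the study: it assumes the simple relation T = K / (2r), where K is the fraction of mismatched sites between the two LTRs and r is a substitution rate. The function name `insertion_age` and the default rate of 1.3e-8 substitutions per site per year are assumptions made for this example; the text here does not state which rate was applied.

```python
def insertion_age(mismatches, aligned_bases, rate_per_site_per_year=1.3e-8):
    """Estimate years since insertion from the divergence of an LTR pair.

    The default rate is an assumed, illustrative value and is not taken
    from this text.
    """
    if aligned_bases <= 0:
        raise ValueError("aligned_bases must be positive")
    # Fraction of aligned sites that differ between the two LTRs.
    divergence = mismatches / aligned_bases
    # Both LTRs accumulate mutations independently after insertion,
    # so the divergence grows at twice the per-site rate.
    return divergence / (2.0 * rate_per_site_per_year)


if __name__ == "__main__":
    # Hypothetical LTR pair: 26 differences over 1000 aligned bases.
    print(round(insertion_age(26, 1000)))
```

The sketch uses the uncorrected proportion of differences; a distance correction for multiple substitutions at the same site could be applied to `divergence` before dividing.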
Many retrotransposons are nonautonomous elements that rely for their replication completely or in part on proteins expressed by other elements elsewhere in the genome (Vitte and Panaud 2005).
The objective of this study was a comparative analysis of copia elements from plant genomes that differ greatly in size and also represent different clades of the phylogenetic tree. We used the sequences from the Triticeae repeat database (TREP) as a starting point for a homology search in rice and Arabidopsis. At the time this study was done, 32 families of copia retrotransposons were represented at TREP. We were able to identify rice homologs for all Triticeae families and traced the evolutionary lineages back to the divergence of monocots and dicots by comparison with Arabidopsis elements. We identified six ancient lineages of copia elements that existed before the divergence of monocots and dicots. Each of the six lineages shows a surprising degree of conservation and has distinct characteristics. Our analysis also demonstrated that specific retrotransposon families invade genomes in waves of high activity that are followed by long periods of relative silence.
Of the 32 families of copia elements deposited at TREP, 20 have at least one copy that is complete, whereas 12 families are only represented by fragments. A "complete" element is defined as a copy that has intact ends, but does not make any statement about whether the element is actually functional or whether it contains internal deletions. In fact, the majority of the complete elements deposited at TREP have degenerated coding regions that contain frameshifts, stop codons, or deletions. For this analysis, we used the 20 families for which complete elements were available. For eight of the 20 copia families, only a single complete element is deposited at TREP. Nine families have two to four members (Ale, Barbara, Bianca, HORPIA2, Ikeros, Inga, Maximus, Oref, and TAR1) and most abundant are the closely related Angela, BARE1, and WIS families for which 15, 17, and 26 complete copies are available, respectively (Supplemental Table 1). Multicopy elements were used in multiple sequence alignments to obtain a consensus sequence. In most cases, this consensus sequence gave rise to an intact open reading frame (ORF), because frameshifts and stop codons are usually the result of mutations in one specific copy and are eliminated in the consensus sequence. For single-copy elements, the hypothetical protein sequences were deduced by comparison with proteins from similar copia elements, and frameshifts were removed manually to obtain the putative ORFs.
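The reasoning that copy-specific frameshifts and stop codons disappear in a consensus can be illustrated with a simple majority-rule consensus over an existing multiple alignment. This is a sketch under stated assumptions: the function name `majority_consensus` and the toy sequences are hypothetical, and the study itself does not specify the exact consensus rule.

```python
from collections import Counter


def majority_consensus(aligned, gap="-"):
    """Return the majority-rule consensus of equally long aligned sequences.

    Columns in which the gap character is the most frequent symbol are
    dropped, which removes insertions found in only a minority of copies.
    """
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    consensus = []
    for column in zip(*aligned):
        symbol, _count = Counter(column).most_common(1)[0]
        if symbol != gap:
            consensus.append(symbol)
    return "".join(consensus)


if __name__ == "__main__":
    copies = [
        "ATG-CCGAATGG",  # reference-like copy
        "ATG-CCTAATGG",  # copy with a point mutation (G to T)
        "ATGACCGAATGG",  # copy with a 1-bp insertion (frameshift)
    ]
    print(majority_consensus(copies))
```

In this toy alignment the point mutation and the single-base insertion each occur in only one copy, so neither survives in the consensus.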
Most Triticeae copia elements have homologs in the rice genome
For this study, we defined a "family" based on common LTR DNA identity. LTRs are among the most rapidly evolving parts of retrotransposons, because they do not encode any proteins. Often, conserved parts of the coding region of retrotransposons can be 85%–90% identical, while their LTRs are highly divergent. Thus, we considered two elements as belonging to the same family if their LTRs are at least 80% identical. This was for a practical reason, as we wanted to be able to derive consensus sequences for families with multiple copies. Using these criteria, we could classify 31 of the 46 identified rice copia families as previously described elements. Sixteen families are novel. Five of them have an internal domain that is very similar to that of previously described elements (>90% DNA sequence identity), but have highly divergent LTRs (Supplemental Table 2). In some cases, different copia elements from Triticeae identified the same family (or groups of families) of rice elements, indicating that these Triticeae elements diverged after the Triticeae/rice separation. Analogously, some families from Triticeae identified multiple rice families.
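The 80% LTR identity criterion can be turned into a grouping procedure in more than one way. The sketch below assumes single-linkage grouping (two elements share a family if a chain of pairwise identities of at least 80% connects them), which is one possible reading and not necessarily how families were delimited in the study. The function name `group_into_families`, the element names, and the identity values are hypothetical.

```python
def group_into_families(names, identity, threshold=0.80):
    """Group elements whose LTRs are linked by identity >= threshold.

    `identity` maps (name_a, name_b) pairs to a fractional LTR identity.
    Single-linkage grouping is an assumption of this sketch.
    """
    parent = {name: name for name in names}

    def find(x):
        # Follow parent links to the group representative, compressing paths.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), value in identity.items():
        if value >= threshold:
            parent[find(a)] = find(b)

    families = {}
    for name in names:
        families.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in families.values())


if __name__ == "__main__":
    elements = ["e1", "e2", "e3", "e4"]
    pairwise = {
        ("e1", "e2"): 0.92,
        ("e2", "e3"): 0.81,
        ("e1", "e3"): 0.74,
        ("e3", "e4"): 0.55,
    }
    print(group_into_families(elements, pairwise))
```

Note that e1 and e3 end up in the same group although their direct identity is below the threshold; a stricter rule (for example, requiring every pair to pass) would split them.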
Arabidopsis homologs of grass copia elements
Due to the low level of DNA sequence conservation, all other Arabidopsis homologs of rice or Triticeae families were identified by a TBLASTN search of the predicted protein sequences against the Arabidopsis genome. This search identified 22 families of Arabidopsis copia elements, 17 of which were previously described. Again, in some cases, multiple Triticeae elements identified the same Arabidopsis copia family (or group of families), indicating a divergence of the grass elements after the monocot/dicot separation. Isolation and characterization of Arabidopsis copia elements was done in the same way as for rice elements. However, due to the lower copy number of most elements, consensus sequences could not be obtained for all of them. For some, the protein sequences had to be predicted from one single element by comparison with similar protein sequences.
Most copia elements have a low copy number, while few are very abundant
Due to the limited data set for Triticeae, one can only speculate about the actual copy number of retrotransposons in the entire genome. The fact that for the majority of retrotransposons only one or a few copies were available indicates that the Triticeae genomes must contain a vast number of different families with moderate copy numbers. In this study, we considered every Triticeae copia family for which more than one copy was available as high-copy. The only copia elements that are present in high copy numbers in the available Triticeae sequences are the closely related Angela, BARE1, and WIS families.
copia elements from grasses and their Arabidopsis homologs can be separated into six major evolutionary lineages
The sequences cluster into six major evolutionary lineages (Angela, Ale, Bianca, Ivana, Maximus, and TAR) (Fig. 1). We defined the lineages primarily as large groups that are in a common branch of the tree with a very high bootstrap value (>95) and secondarily by shared characteristics (see below). Some branches that separate the six major lineages have lower bootstrap values, indicating that the precise relationships between the lineages are difficult to assess and that they had been evolving independently for a long time. Relationships within the Bianca and TAR lineages reflect the expected pattern with elements from rice and Triticeae more closely related and Arabidopsis elements placed in a separate branch. The Bianca lineage contains only three families, one for each Triticeae, rice, and Arabidopsis, whereas the TAR lineage contains three families of rice elements, indicating that they diverged after the rice/Triticeae separation (Fig. 1). Within the Ale lineage, the relationships are less clear, which is reflected in lower bootstrap values. This could indicate that either the families within these lineages diverged within a short evolutionary time or that sequence exchange has occurred between families. Specifically in Arabidopsis, a large number of families were identified. In contrast, both the Ivana and Maximus lineages show a wide variety in Triticeae and rice families, whereas in Arabidopsis, only one and two families were identified, respectively. For the Ikeros family, a rice homolog (Ostonor1), but no Arabidopsis homologs, could be identified. This could either indicate that these particular families became extinct in Arabidopsis or that the actual homolog is simply too divergent to be identified by the criteria used (Fig. 1). Similarly, there is no clear Arabidopsis homolog for the group containing the Inav and Ale elements from Triticeae.
Elements from the same lineage have similar characteristics in Triticeae, rice, and Arabidopsis
The Ale lineage is the most diverse, as it contains 36 families. Interestingly, for 28 families, only one single full-length copy was found, and the most abundant family (rn154162 from rice) was found in only five complete copies. Representatives of the Ale lineage are the smallest of all copia elements studied, and range in size from 4.4 to 5.5 kb. Especially in Arabidopsis, Ale elements diverged massively, as 14 families were identified, most of them with only one complete copy in the genome (Supplemental Figure 1; Supplemental Table 1).
The largest elements (8.7–14.4 kb) were found in the Maximus lineage. Their unusual size is caused by long LTRs, large stretches of noncoding sequences in the internal domain, as well as by the presence of a unique second open reading frame (ORF2) downstream of the gag-INT-RT domain. ORF2 is very divergent among different Maximus families, and no information as to the function could be found. However, its position suggests it to be homologous to the env gene of retroviruses (Wilhelm and Wilhelm 2001).
The three families of the Bianca lineage also contain a unique ORF, which is located upstream of the gag-INT-RT domain (in contrast to ORF2 of Maximus families, which is located downstream of gag-INT-RT). Bianca probably represents an ancient lineage, as it is placed on a separate branch most distantly related to all other lineages in the phylogenetic tree. The Angela and TAR lineages are similar in size and overall sequence organization with long LTRs and several hundred base pairs of noncoding DNA in their internal domain. TAR contains the most abundant copia family identified in this study (Houba from rice), whereas Angela contains three of the most abundant Triticeae families (Angela, BARE1, and WIS). The families from the Ivana lineage also have relatively high copy numbers. In comparison with TAR and Angela, they are compact, with short LTRs, and almost the entire internal domain consists of coding region.
Primer binding sites and polypurine tracts reflect the relationships between lineages
The Maximus lineage contains multiple populations of deletion derivatives
Many copia elements analyzed in this study contain deletions, but most of them are unique to one specific copy and probably resulted in disruption of functionality of the element. However, some retrotransposons that have lost some parts in a deletion are obviously still able to replicate by relying on proteins encoded by full-length elements. Thus, a population of deletion derivatives (which we refer to as "subfamily") is established in the genome, in addition to the population of autonomous elements. Curiously, all four deletion-derivative populations identified belonged to families of the Maximus lineage. Osr9 from rice does not have an ORF2 region, but is otherwise very similar in size and sequence to other Maximus elements from rice (e.g., COPIO) (Fig. 3). It is likely that the Osr9 family originated from a single deletion event that eliminated the ORF2 region.
Barbara_A and Barbara_B from wheat show a more complex pattern, as both subfamilies share common sequences only in the LTRs, whereas their internal domains are completely different (Fig. 3). It appears that Barbara_A has lost its gag and ORF2 regions in two independent deletions. Interestingly, the Barbara_B family almost perfectly complements Barbara_A, as it lacks only the INT-RT region but still contains both gag and ORF2. Barbara_A and Barbara_B have very similar LTRs ( 83% identity), indicating that the divergence of these two subfamilies occurred relatively recently.
The high-copy family Osr8 consists of two populations (Osr8 and Osr8B). Osr8 represents a putative autonomous element and is more abundant (61 complete copies) than Osr8B (16 complete copies). Osr8B appears to be a deletion derivative of Osr8, which contains only the gag domain but lacks the INT-RT region. Interestingly, the LTRs plus the 5' part of the gag region are very well conserved between Osr8 and Osr8B, whereas the 3' part of gag is divergent (Fig. 3). One explanation is that one of the two is a recombined element that carries part of the coding region of a different subfamily of an Osr8-like element. Du et al. (2006) [...]
The Arabidopsis genome contains a small population of ATCopia58 elements, which consists of one putatively autonomous copy of ATCopia58, four deletion derivatives (ATCopia58B), and four solo-LTRs. All four deletion derivatives have the same structure, namely, four different deletions compared with the full-length ATCopia58 element, indicating that they originated from the same ancestor element (Fig. 3).
Waves of genome invasion
Graphical display of LTR divergence of members of copia families revealed that different families were active at different times during evolution (Fig. 4). Bonferroni-corrected Kolmogorov-Smirnov tests showed the insertion distributions of all but four families to be significantly different from a uniform distribution (Fig. 4; Supplemental Table 3). Most of the analyzed families apparently had active periods of 1–2 million years, which were followed by periods of relative silence. The most abundant family, Houba, shows two active periods, one within the last 500,000 yr and a second one 1–3 Mya. The rice Adena family had a highly active period 1–2.5 Mya and apparently became silenced after that, since no copies younger than one million years were found. In contrast, the Ostnor1, Osr8, and Osr10 families were obviously active for more than two million years, but at a lower level, producing fewer copies than the Houba family. The Beyla, COPIO, and Echidne families each had one wave of moderate activity during roughly one million years.
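A test of insertion times against a uniform distribution, with a Bonferroni correction across families, can be sketched as follows. This is an illustration only: the function names `ks_uniform` and `bonferroni` are hypothetical, the p-value uses a standard asymptotic approximation of the Kolmogorov distribution rather than an exact small-sample method, and the time window passed in as `low` and `high` is an assumption the caller must choose.

```python
import math


def ks_uniform(values, low, high):
    """One-sample Kolmogorov-Smirnov test against Uniform(low, high).

    Returns (D, p), where p comes from an asymptotic series approximation.
    """
    n = len(values)
    if n == 0 or high <= low:
        raise ValueError("need data and a non-empty interval")
    xs = sorted((v - low) / (high - low) for v in values)
    d = 0.0
    for i, x in enumerate(xs):
        # Largest gap between the empirical and the uniform CDF.
        d = max(d, (i + 1) / n - x, x - i / n)
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.3:
        # The series converges poorly here; the true p-value is close to 1.
        return d, 1.0
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                 for k in range(1, 101))
    return d, min(1.0, max(0.0, 2.0 * series))


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


if __name__ == "__main__":
    # Hypothetical insertion ages (Mya) clustered in a short window.
    clustered = [1.0 + 0.01 * i for i in range(40)]
    d, p = ks_uniform(clustered, 0.0, 3.0)
    print(d, p, bonferroni([p, 0.2, 0.5]))
```

A family whose insertions are spread evenly over the window yields a small D and a p-value near 1, whereas a burst of insertions within a narrow interval yields a large D.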
Osr8 elements were very active for a long period
The three copia families from wheat and barley have waves of activity that are much more spread out in time than those of the rice elements. All three families had a period of relatively high activity 0.5–1.5 Mya, but also produced copies during the entire last three million years (Fig. 4). The youngest element (Angela_AF326781-6) inserted only
copia elements are truncated or removed from the rice genome at a half-life rate of about 790,000 yr, and more slowly from the wheat genome
Assuming that repetitive sequences are removed from the genome at a constant rate, insertion time distribution can be described by an exponential function with a constant half-life rate. To estimate the average half-life of copia elements, we fitted an exponential function with half-life and starting value as variables to the observed data by minimizing the sum of distances between the calculated and the observed value for each bin. The calculations resulted in an average half-life of 794,000 yr. Thus, on average, half of the full-length elements are at least partially removed from the genome after 794,000 yr. The observed distribution deviates to some degree from the calculated distribution, especially for the period of 1–2 Mya, where multiple families were active (Figs. 4, 5A).
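The fitting step can be sketched as a small grid search. This is a minimal illustration under stated assumptions, not the procedure used in the study: it models the count in each age bin as N(t) = N0 · 0.5^(t / h), takes "distance" to mean the absolute difference per bin, and searches user-supplied candidate grids. The function name `fit_half_life`, the grids, and the synthetic counts are all hypothetical.

```python
def fit_half_life(bin_midpoints, counts, half_lives, start_values):
    """Grid-search the half-life h and starting value N0 of an exponential.

    Minimizes the sum over bins of |observed - N0 * 0.5 ** (t / h)|.
    Returns (best_half_life, best_start_value, best_cost).
    """
    best = None
    for h in half_lives:
        for n0 in start_values:
            cost = sum(abs(c - n0 * 0.5 ** (t / h))
                       for t, c in zip(bin_midpoints, counts))
            if best is None or cost < best[0]:
                best = (cost, h, n0)
    if best is None:
        raise ValueError("candidate grids must not be empty")
    return best[1], best[2], best[0]


if __name__ == "__main__":
    # Synthetic example: bins of 0.5 My, counts generated from a known curve.
    midpoints = [0.25 + 0.5 * i for i in range(8)]
    counts = [100 * 0.5 ** (t / 0.8) for t in midpoints]
    candidate_half_lives = [k / 10 for k in range(1, 31)]  # 0.1 to 3.0 My
    candidate_starts = range(50, 151, 5)
    print(fit_half_life(midpoints, counts, candidate_half_lives, candidate_starts))
```

With real binned data the minimum cost will not be zero, and bins in which several families were active at once would be expected to sit above the fitted curve.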
It is important to emphasize that this analysis only included complete elements. If a retrotransposon is only partially deleted (e.g., a part of one LTR is missing), it is not considered in the analysis. Thus, our estimate does not make a statement on how long it takes until repetitive elements are completely removed from the genome as previous studies have done (Vitte and Bennetzen 2006).
Despite the smaller sample size for Triticeae elements, an attempt was made to obtain an estimate of their half-life. Indeed, the distribution of insertion times of 86 copia elements from barley and wheat does not resemble an exponential distribution at all, making it meaningless to find the best-fitting exponential function (Fig. 5B). Although the available data set is too small to draw definite conclusions, the observed distribution may reflect a very long half-life of copia elements in the Triticeae genomes and, thus, a fundamental difference of small and large genome plants.
Evidence for horizontal gene transfer between Arabidopsis and rice
The presented comparative analysis of copia elements from wheat, barley, rice, and Arabidopsis provided a broad insight into the evolution of these retrotransposons. Our analysis compared copia elements from plant genomes with very different characteristics. First, they represent the major clades of the monocots and dicots and, second, they differ greatly in genome size: Arabidopsis represents a small-genome species, rice a medium-sized one, and barley and wheat large-genome species. Our data showed that copia elements in these genomes are descendants of only a few ancient evolutionary lineages, each with distinct conserved characteristics. Individual families were shown to be active for a few million years and are then silenced for similarly long periods. Additionally, our data suggest that at least in one case, horizontal gene transfer might have occurred between the Arabidopsis and rice lineages.
Is genome size the product of a species-specific rate of DNA removal and the length of waves of retrotransposon activity?
Interestingly, the three families of high-copy elements from wheat and barley show much longer waves of activity than all of the rice families studied. This could reflect one of the fundamental differences between rice and Triticeae species and contribute to the vast difference in genome sizes between the two. A permanent high activity of certain retrotransposon families could, therefore, have caused the expansion of the Triticeae genomes. This expansion phase presumably started after the divergence from its closest relative with a small compact genome, Brachypodium,
The question of whether the Triticeae genomes are still expanding remains unanswered. The fact that all Triticeae have similar genome sizes implies that their genome sizes have been constant at least since their divergence 11–14 Mya (Wolfe et al. 1989).
It was repeatedly speculated that the balance of deletion and generation of DNA through retrotransposition is responsible for the differences in genome sizes (for review, see Vitte and Panaud 2005).
From these data, we suggest that genome size is largely the product of DNA removal rate and the length of periods of retrotransposon activity. Already, small differences in the rates of DNA removal and/or retrotransposon activity could have a profound effect on genome size due to the additive nature of the two factors. One can imagine that a genome can grow rapidly if the rate of DNA removal is decreased slightly, while periods of retrotransposon activity are prolonged slightly. Analogously, the genome can contract quickly when the situation is reversed. Thus, the genome sizes we observe today are probably very different from what they were in the recent evolutionary past or from what they will be in the near future.
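The interplay of removal rate and length of activity can be made concrete with a toy calculation. This is an illustrative model only, not one proposed in the study: it assumes a family inserts copies at a constant rate during one active window and that each copy is then lost with a fixed half-life. The function name `surviving_copies` and all parameter values are hypothetical.

```python
import math


def surviving_copies(rate_per_my, start_mya, end_mya, half_life_my):
    """Expected number of copies surviving today in a toy model.

    Copies insert at a constant rate between start_mya and end_mya
    (million years ago) and decay exponentially with the given half-life.
    This is the integral of rate * 0.5 ** (t / h) over the active window.
    """
    if start_mya < end_mya or half_life_my <= 0:
        raise ValueError("need start_mya >= end_mya and a positive half-life")
    scale = half_life_my / math.log(2)
    return rate_per_my * scale * (0.5 ** (end_mya / half_life_my)
                                  - 0.5 ** (start_mya / half_life_my))


if __name__ == "__main__":
    base = surviving_copies(100, 2.0, 0.0, 0.8)
    slower_removal = surviving_copies(100, 2.0, 0.0, 1.6)
    longer_activity = surviving_copies(100, 4.0, 0.0, 0.8)
    both = surviving_copies(100, 4.0, 0.0, 1.6)
    print(base, slower_removal, longer_activity, both)
```

In this toy setting, lengthening the half-life and lengthening the active period each raise the number of surviving copies, and changing both at once raises it further, which matches the qualitative argument made above.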
Conserved characteristics of copia lineages
Repetitive DNA is generally believed to be free from selection pressure and often referred to as "selfish" DNA (Petrov 2001).
Concluding remarks
In our view, the most viable alternative explanation is frequent horizontal transfer of these elements across species boundaries, which would lead to an overall homogenization of the retrotransposon gene pool. A previous study actually suggested that conservation of lineages across distantly related species is due to frequent horizontal transfer between species (McCarthy et al. 2002).
Between closely related cross-breeding species, one can expect that transfer of transposable elements is frequent, and thus provides a stabilizing force that prevents individual families from extinction. However, this can only have an effect for relatively short evolutionary time spans, as long as freshly diverged species can still cross-breed.
One could argue that copia elements are mostly selfish, but that the presence of certain types of elements (e.g., representatives of the six evolutionary lineages described) can somehow have beneficial effects on the host population. Recently, it was shown that levels of retrotransposon silencing vary in a species-specific manner, indicating that permanent background activity occurs at least in some species (Vitte and Bennetzen 2006).
Sequence analysis tools
For sequence analysis, BLAST (Altschul et al. 1997) [...]
Multiple alignments were done with CLUSTALW (Thompson et al. 1994).
Isolation of complete copia elements from the rice and Arabidopsis genomes
Nested insertions were detected by alignment of the extracted sequences with the respective consensus sequence using the program WATER. This process was also automated by means of a Perl script that included the detection of target-site duplications to exclude elements that were the product of inter-element recombination. Alignments of LTR pairs were done with the program WATER using a gap creation penalty of 30 and a gap extension penalty of 0.5. A Perl script was written to extract the number of aligned bases and number of mutations (transitions and transversions) from large numbers of WATER outputs. All consensus DNA and protein sequences, as well as the complete set of copia sequences used for this study are available upon request.
copia elements were classified by BLAST against publicly available repeat databases such as RepBase (http://www.girinst.org/repbase/update/), the TIGR repeat database (www.tigr.org), RetrOryza (http://www.retroryza.org), and the data set described by Pereira (2004).
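The counting step described above (aligned bases, transitions, and transversions for an LTR pair) can be sketched as follows. The study used a Perl script that read WATER output; the Python sketch below is a stand-in that does not parse that output format and instead assumes the two aligned LTR strings have already been extracted. The function name `count_mutations` and the example sequences are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_mutations(seq_a, seq_b, gap="-"):
    """Count aligned bases, transitions, and transversions in an alignment.

    Positions with a gap or a non-ACGT symbol in either sequence are skipped.
    Returns (aligned_bases, transitions, transversions).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    aligned = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap or a not in "ACGT" or b not in "ACGT":
            continue
        aligned += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1   # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    return aligned, transitions, transversions


if __name__ == "__main__":
    ltr_5prime = "ACGTACGT-A"
    ltr_3prime = "ATGTGCCTAA"
    print(count_mutations(ltr_5prime, ltr_3prime))
```

The three counts are the inputs a divergence-based age estimate needs: the sum of transitions and transversions divided by the number of aligned bases gives the uncorrected distance between the two LTRs.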
We thank Dr. Vini Pereira for kindly providing his dataset of Arabidopsis copia elements and Adrian Roellin for his help with statistical analyses. This work was supported by the Swiss National Science Foundation (SNF) grant 3100AV-105620.
1 Corresponding author.
E-mail wicker{at}botinst.unizh.ch; fax 41-44-634-82-04. [Supplemental material is available online at www.genome.org.] Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.6214107
Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J.H., Zhang, Z., Miller, W., and Lipman, D.J. 1997. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res. 25: 3389–3402.
Arabidopsis Genome Initiative. 2000. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408: 796–815.
Bennett, M.D. and Smith, J.B. 1976. Nuclear DNA amounts in angiosperms. Philos. Trans. R. Soc. Lond. B Biol. Sci. 274: 227–274.
Bossolini, E., Wicker, T., Knobel, P., and Keller, B. 2007. Comparison of orthologous loci from small grass genomes Brachypodium and rice: Implications for wheat genomics and grass genome annotation. Plant J. 49: 704–717.
Bureau, T. and Wessler, S.R. 1994. Stowaway: A new family of inverted repeat elements associated with the genes of both monocotyledonous and dicotyledonous plants. Proc. Natl. Acad. Sci. 9: 1411–1415.
Chaw, S.M., Chang, C.C., Chen, H.L., and Li, W.H. 2004. Dating the monocot-dicot divergence and the origin of core eudicots using whole chloroplast genomes. J. Mol. Evol. 58: 424–441.
Devos, K.M., Brown, J.K.M., and Bennetzen, J.L. 2002. Genome size reduction through illegitimate recombination counteracts genome expansion in Arabidopsis. Genome Res. 12: 1075–1079.
Du, C., Swigonova, Z., and Messing, J. 2006. Retrotranspositions in orthologous regions of closely related grass species. BMC Evol. Biol. 6: 62. doi: 10.1186/1471-2148-6-62.
Gabriel, A., Willems, M., Mules, E.H., and Boeke, J.D. 1996. Replication infidelity during a single cycle of Ty1 retrotransposition. Proc. Natl. Acad. Sci. 93: 7767–7771.
Gao, L., McCarthy, E.M., Ganko, E.W., and McDonald, J.F. 2004. Evolutionary history of Oryza sativa LTR retrotransposons: A preliminary survey of the rice genome sequences. BMC Genomics 5: 18.
Gaut, B.S., Morton, B.R., McCaig, B.C., and Clegg, M.T. 1996. Substitution rate comparisons between grasses and palms: Synonymous rate differences at the nuclear gene Adh parallel rate differences at the plastid gene rbcL. Proc. Natl. Acad. Sci. 93: 10274–10279.
Holub, E.B. 2001. The arms race is ancient history in Arabidopsis, the wildflower. Nat. Rev. Genet. 2: 516–527.
International Rice Genome Sequencing Project. 2005. The map-based sequence of the rice genome. Nature 436: 793–800.
Kalendar, R., Tanskanen, J., Immonen, S., Nevo, E., and Schulman, A.H. 2000. Genome evolution of wild barley (Hordeum spontaneum) by BARE-1 retrotransposon dynamics in response to sharp microclimatic divergence. Proc. Natl. Acad. Sci. 97: 6603–6607.
Kalendar, R., Vicient, C.M., Peleg, O., Anamthawat-Jonsson, K., Bolshoy, A., and Schulman, A.H. 2004. Large retrotransposon derivatives: Abundant, conserved but nonautonomous retroelements of barley and related genomes. Genetics 166: 1437–1450.
Keulen, W., Nijhuis, M., Schuurman, R., Berkhout, B., and Boucher, C. 1997. Reverse transcriptase fidelity and HIV-1 variation. Science 275: 229–231.
Koch, M.A., Haubold, B., and Mitchell-Olds, T. 2000. Comparative evolutionary analysis of chalcone synthase and alcohol dehydrogenase loci in Arabidopsis, Arabis, and related genera (Brassicaceae). Mol. Biol. Evol. 17: 1483–1498.
Ma, J. and Bennetzen, J.L. 2004. Rapid recent growth and divergence of rice nuclear genomes. Proc. Natl. Acad. Sci. 101: 12404–12410.
Ma, J., Devos, K.M., and Bennetzen, J.L. 2004. Analyses of LTR-retrotransposon structures reveal recent and rapid genomic DNA loss in rice. Genome Res. 14: 860–869.
McCarthy, E.M., Liu, J., Lizhi, G., and McDonald, J.F. 2002. Long terminal repeat retrotransposons of Oryza sativa. Genome Biol. 3: RESEARCH0053.1–0053.11. doi: 10.1186/gb-2002-3-10-research0053.
Paterson, A.H., Bowers, J.E., and Chapman, B.A. 2004. Ancient polyploidization predating divergence of the cereals, and its consequences for comparative genomics. Proc. Natl. Acad. Sci. 101: 9903–9908.
Pereira, V. 2004. Insertion bias and purifying selection of retrotransposons in the Arabidopsis thaliana genome. Genome Biol. 10: R79.
Pereira, A., Cuypers, H., Gierl, A., Sommer, Z.S., and Saedler, H. 1986. Molecular analysis of the En/Spm transposable element system of Zea mays. EMBO J. 5: 835–841.
Petrov, D.A. 2001. Evolution of genome size: New approaches to an old problem. Trends Genet. 17: 23–28.
Piegu, B., Guyot, R., Picault, N., Roulin, A., Saniyal, A., Kim, H., Collura, K., Brar, D.S., Jackson, S., Wing, R.A., et al. 2006. Doubling genome size without polyploidization: Dynamics of retrotransposition-driven genomic expansions in Oryza australiensis, a wild relative of rice. Genome Res. 16: 1262–1269.
Sabot, F. and Schulman, A.H. 2006. Parasitism and the retrotransposon life cycle in plants: A hitchhiker's guide to the genome. Heredity 97: 381–388.
SanMiguel, P., Gaut, B.S., Tikhonov, A., Nakajima, Y., and Bennetzen, J.L. 1998. The paleontology of intergene retrotransposons of maize. Nat. Genet. 20: 43–45.
SanMiguel, P.J., Ramakrishna, W., Bennetzen, J.L., Busso, C., and Dubcovsky, J. 2002. Transposable elements, genes and recombination in a 215-kb contig from wheat chromosome 5Am. Funct. Integr. Genomics 2: 70–80.
Sonnhammer, E.L. and Durbin, R. 1995. A dot-matrix program with dynamic threshold control suited for genomic DNA and protein sequence analysis. Gene 167: 1–10.
Thompson, J.D., Higgins, D.G., and Gibson, T.J. 1994. CLUSTAL W, improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22: 4673–4680.
Vicient, C.M., Suonemi, A., Anamthawat-Jonsson, K., Tanskanen, J., Beharav, A., Nevo, E., and Schulman, A.H. 1999. Retrotransposon BARE-1 and its role in genome evolution in the genus Hordeum. Plant Cell 11: 1769–1784.
Vitte, C. and Bennetzen, J. 2006. Analysis of retrotransposon structural diversity uncovers properties and propensities in angiosperm genome evolution. Proc. Natl. Acad. Sci. 103: 17638–17643.
Vitte, C. and Panaud, O. 2005. LTR retrotransposons and flowering plant genome size: Emergence of the increase/decrease model. Cytogenet. Genome Res. 110: 91–107.
Wicker, T., Yahiaoui, N., Guyot, R., Schlagenhauf, E., Liu, Z.-D., Dubcovsky, J., and Keller, B. 2003a. Rapid genome divergence at orthologous low molecular weight glutenin loci of the A and Am genomes of wheat. Plant Cell 15: 1186–1197.
Wicker, T., Guyot, R., Yahiaoui, N., and Keller, B. 2003b. CACTA transposons in Triticeae. A diverse family of high-copy repetitive elements. Plant Physiol. 132: 52–63.
Wicker, T., Zimmermann, W., Perovic, D., Paterson, A.H., Ganal, M., Graner, A., and Stein, N. 2005a. A detailed look at 7 million years of genome evolution in a 439 kb contiguous sequence at the barley Hv-eIF4E locus: Recombination, rearrangements and repeats. Plant J. 41: 184–194.
Wicker, T., Robertson, J.S., Schulze, S.R., Feltus, F.A., Magrini, V., Morrison, J.A., Mardis, E.R., Wilson, R.K., Peterson, D.G., Paterson, A.H., et al. 2005b. The repetitive landscape of the chicken genome. Genome Res. 15: 126–136.
Wilhelm, M. and Wilhelm, F.X. 2001. Reverse transcription of retroviruses and LTR retrotransposons. Cell. Mol. Life Sci. 58: 1246–1262.
Wolfe, K.H., Gouy, M.L., Yang, Y.W., Sharp, P.M., and Li, W.H. 1989. Date of the monocot-dicot divergence estimated from chloroplast DNA-sequence data. Proc. Natl. Acad. Sci. 86: 6201–6205.
Received December 15, 2006; accepted in revised format April 12, 2007.