EnsemblCompara GeneTrees: Complete, duplication-aware phylogenetic trees in vertebrates
- Albert J. Vilella¹
- Jessica Severin¹,³
- Abel Ureta-Vidal¹,⁴
- Li Heng²
- Richard Durbin²
- Ewan Birney¹,⁵
- ¹EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- ²Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1HH, United Kingdom
Abstract
We have developed a comprehensive gene-orientated phylogenetic resource, EnsemblCompara GeneTrees, based on a computational pipeline to handle clustering, multiple alignment, and tree generation, including the handling of large gene families. We developed two novel non-sequence-based metrics of gene tree correctness and benchmarked a number of tree methods. The TreeBeST method from TreeFam shows the best performance in our hands. We also compared this phylogenetic approach to clustering approaches for ortholog prediction, showing a large increase in coverage using the phylogenetic approach. All data are made available in a number of formats and will be kept up to date with the Ensembl project.
Footnotes
- Present addresses: ³RIKEN Yokohama Institute, Genomic Sciences Center (GSC), 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan; ⁴Eagle Genomics, 19 Forge End, Stapleford, Cambridge CB22 5BN, UK.
- ⁵Corresponding author. E-mail birney{at}ebi.ac.uk; fax 44-1223-494919.
- [Supplemental material is available online at www.genome.org.]
- Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.073585.107. Freely available online through the Genome Research Open Access option.
- Received October 26, 2007.
- Accepted November 18, 2008.
- Copyright © 2009 by Cold Spring Harbor Laboratory Press











