EnsemblCompara GeneTrees: Analysis of complete, duplication-aware phylogenetic trees in vertebrates.

Albert J Vilella (1), Jessica Severin (1), Abel Ureta-Vidal (1), Richard Durbin (2), Li Heng (2), and Ewan Birney (3,4)

  1. European Bioinformatics Institute - EMBL
  2. Sanger Institute
  3. EBI

Abstract

We have developed a comprehensive gene-oriented phylogenetic resource, EnsemblCompara GeneTrees, based on a computational pipeline that handles clustering, multiple alignment, and tree generation, including for large gene families. We developed two novel non-sequence-based metrics of gene tree correctness and used them to benchmark a number of tree methods. The TreeBeST method from TreeFam shows the best performance in our hands. We also compared this phylogenetic approach with clustering approaches to ortholog prediction and found a large increase in coverage with the phylogenetic approach. All data are made available in a number of formats and will be kept up to date with the Ensembl project.

Footnotes

    • Received October 26, 2007.
    • Accepted November 18, 2008.
    • This manuscript is Open Access.
