Identification of genomic features using microsyntenies of domains: Domain teams

  1. Sophie Pasek1,4,5,
  2. Anne Bergeron3,
  3. Jean-Loup Risler1,
  4. Alexandra Louis2,
  5. Emmanuelle Ollivier1, and
  6. Mathieu Raffinot1
  1. Laboratoire Génome et Informatique, CNRS/UEVE, 91034 Evry cedex, France
  2. Infobiogen, 91034 Evry cedex, France
  3. LacIM, Université du Québec à Montréal, Montréal, Québec, Canada
  4. Soluscience, Biopôle Clermont-Limagne, 63360 Saint-Beauzire, France

Abstract

Detecting local conservation of gene content and proximity across several genomes considerably helps the prediction of features of interest, such as gene fusions or physical and functional interactions. Here, we aim to handle realistic models of chromosomes, in which genes (or genomic segments of several genes) can be duplicated within a chromosome or be absent from other chromosomes. Our approach temporarily sets genes aside and works directly with protein “domains” such as those found in Pfam. This allows the detection of strings of domains that are conserved in their content, but not necessarily in their order, which we refer to as domain teams. The prominent feature of the method is that it relaxes the rigidity of the orthology criterion and avoids many of the pitfalls of gene-family identification methods, which are often hampered by multidomain proteins or low levels of sequence similarity. This approach, which allows both inter- and intrachromosomal comparisons, proves to be more sensitive than the classical methods based on pairwise sequence comparisons, particularly in the simultaneous treatment of many species. The automated and fast detection of domain teams, together with its increased sensitivity at identifying segments of identical (protein-coding) gene content as well as gene fusions, should prove a useful complement to existing methods.
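
The idea of strings of domains "conserved in their content, but not necessarily in their order" can be illustrated with a naive sketch. This is not the authors' algorithm: it simply compares the domain content of short windows on two chromosomes. The function names, the `max_len` and `min_size` parameters, and the single-letter domain labels (standing in for Pfam accessions) are all assumptions made for illustration.

```python
from collections import defaultdict


def window_contents(domains, max_len):
    """Map each domain set to the (start, end) windows holding exactly that set."""
    found = defaultdict(list)
    for start in range(len(domains)):
        content = set()
        # Grow the window one domain at a time, up to max_len domains.
        for end in range(start, min(start + max_len, len(domains))):
            content.add(domains[end])
            found[frozenset(content)].append((start, end))
    return found


def shared_domain_sets(chrom_a, chrom_b, min_size=2, max_len=6):
    """Return domain sets found in a window of both chromosomes, ignoring order."""
    a = window_contents(chrom_a, max_len)
    b = window_contents(chrom_b, max_len)
    return {s: (a[s], b[s]) for s in a.keys() & b.keys() if len(s) >= min_size}


# Hypothetical chromosomes written as ordered lists of domain labels.
chrom_a = ["A", "B", "C", "X"]
chrom_b = ["Y", "C", "A", "B"]
teams = shared_domain_sets(chrom_a, chrom_b)
for domain_set, (windows_a, windows_b) in teams.items():
    print(sorted(domain_set), windows_a, windows_b)
```

In this toy input, the domains A, B, and C occur next to each other on both chromosomes but in a different order, so the set {A, B, C} is reported together with the window positions on each chromosome. Because windows are compared as sets, duplicated domains within a window do not affect the match.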

Footnotes

  • [Supplemental material is available online at www.genome.org.]

  • Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.3638405. Article published online before print in May 2005.

  • 5 Corresponding author. E-mail pasek{at}genopole.cnrs.fr; fax 33-1-60-87-38-97.

    • Accepted March 28, 2005.
    • Received January 3, 2005.