DupMasker: A tool for annotating primate segmental duplications

  1. Zhaoshi Jiang [1],
  2. Robert Hubley [2],
  3. Arian Smit [2], and
  4. Evan E Eichler [1,3]

  1. University of Washington;
  2. Institute for Systems Biology

Abstract

Segmental duplications (SDs) play an important role in genome rearrangement, evolution, and copy-number variation (CNV) in primate genomes. Such sequences are difficult to detect a priori because they share no defining sequence features that distinguish them from unique portions of the genome. Current annotation of segmental duplications requires computationally intensive, genome-wide self-comparisons that cannot be easily implemented on new datasets. Building on the successful implementation of RepeatMasker, we developed a new genome annotation tool, DupMasker. The program uses a library of non-redundant consensus sequences of human segmental duplications, for which a majority of the ancestral origins have been determined by comparison to mammalian outgroup genomes. Using DupMasker, new human and non-human primate (NHP) sequences can be readily queried to provide details on the origin and degree of sequence identity of each duplicon. The program can be applied to delineate the order and orientation of duplicons within complex duplication blocks and to characterize structural variation between sequenced human haplotypes. We predict that this tool will be valuable in the annotation of large-insert sequence clones, allowing putative unique and duplicated regions of the genome to be annotated prior to whole-genome assembly comparisons.

Footnotes

    • Received March 17, 2008.
    • Accepted May 19, 2008.

ACCEPTED MANUSCRIPT

This Article

  1. Genome Res. gr.078477.108 Copyright © 2008, Cold Spring Harbor Laboratory Press
