RT Journal
A1 Earl, Dent
A1 Nguyen, Ngan
A1 Hickey, Glenn
A1 Harris, Robert S.
A1 Fitzgerald, Stephen
A1 Beal, Kathryn
A1 Seledtsov, Igor
A1 Molodtsov, Vladimir
A1 Raney, Brian J.
A1 Clawson, Hiram
A1 Kim, Jaebum
A1 Kemena, Carsten
A1 Chang, Jia-Ming
A1 Erb, Ionas
A1 Poliakov, Alexander
A1 Hou, Minmei
A1 Herrero, Javier
A1 Kent, William James
A1 Solovyev, Victor
A1 Darling, Aaron E.
A1 Ma, Jian
A1 Notredame, Cedric
A1 Brudno, Michael
A1 Dubchak, Inna
A1 Haussler, David
A1 Paten, Benedict
T1 Alignathon: a competitive assessment of whole-genome alignment methods
JF Genome Research
JO Genome Research
YR 2014
FD December 01
VO 24
IS 12
SP 2077
OP 2089
DO 10.1101/gr.174920.114
UL http://genome.cshlp.org/content/24/12/2077.abstract
AB Multiple sequence alignments (MSAs) are a prerequisite for a wide variety of evolutionary analyses. Published assessments and benchmark data sets for protein and, to a lesser extent, global nucleotide MSAs are available, but less effort has been made to establish benchmarks in the more general problem of whole-genome alignment (WGA). Using the same model as the successful Assemblathon competitions, we organized a competitive evaluation in which teams submitted their alignments and then assessments were performed collectively after all the submissions were received. Three data sets were used: two were simulated, based on primate and mammalian phylogenies, and one comprised 20 real fly genomes. In total, 35 submissions were assessed, submitted by 10 teams using 12 different alignment pipelines. We found agreement between independent simulation-based and statistical assessments, indicating that there are substantial accuracy differences between contemporary alignment tools. We saw considerable differences in the alignment quality of differently annotated regions and found that few tools aligned the duplications analyzed. We found that many tools worked well at shorter evolutionary distances, but fewer performed competitively at longer distances. We provide all data sets, submissions, and assessment programs for further study and provide, as a resource for future benchmarking, a convenient repository of code and data for reproducing the simulation assessments.