RT Journal
A1 Earl, Dent
A1 Nguyen, Ngan K.
A1 Hickey, Glenn
A1 Harris, Robert S.
A1 Fitzgerald, Stephen
A1 Beal, Kathryn
A1 Seledtsov, Igor
A1 Molodtsov, Vladimir
A1 Raney, Brian
A1 Clawson, Hiram
A1 Kim, Jaebum
A1 Kemena, Carsten
A1 Chang, Jia-Ming
A1 Erb, Ionas
A1 Poliakov, Alexander
A1 Hou, Minmei
A1 Herrero, Javier
A1 Solovyev, Victor
A1 Darling, Aaron E.
A1 Ma, Jian
A1 Notredame, Cedric
A1 Brudno, Michael
A1 Dubchak, Inna
A1 Haussler, David
A1 Paten, Benedict
T1 Alignathon: A competitive assessment of whole genome alignment methods
JF Genome Research
JO Genome Research
YR 2014
FD October 01
DO 10.1101/gr.174920.114
SP gr.174920.114
UL http://genome.cshlp.org/content/early/2014/10/01/gr.174920.114.abstract
AB Multiple sequence alignments (MSAs) are a prerequisite for a wide variety of evolutionary analyses. Published assessments and benchmark datasets for protein and, to a lesser extent, global nucleotide MSAs are available, but less effort has been made to establish benchmarks for the more general problem of whole genome alignment (WGA). Using the same model as the successful Assemblathon competitions, we organized a competitive evaluation in which teams submitted their alignments, and assessments were then performed collectively after all submissions were received. Three datasets were used: two were simulated, based on primate and mammalian phylogenies, and one was composed of 20 real fly genomes. In total, 35 submissions were assessed, submitted by ten teams using 12 different alignment pipelines. We found agreement between independent simulation-based and statistical assessments, indicating that there are substantial accuracy differences between contemporary alignment tools. We saw considerable differences in the alignment quality of differently annotated regions and found that few tools aligned the duplications analysed. Many tools worked well at shorter evolutionary distances, but fewer performed competitively at longer distances. We provide all datasets, submissions and assessment programs for further study and, as a resource for future benchmarking, a convenient repository of code and data for reproducing the simulation assessments.