TY  - JOUR
A1  - Huang, Dandan
A1  - Yi, Xianfu
A1  - Zhou, Yao
A1  - Yao, Hongcheng
A1  - Xu, Hang
A1  - Wang, Jianhua
A1  - Zhang, Shijie
A1  - Nong, Wenyan
A1  - Wang, Panwen
A1  - Shi, Lei
A1  - Xuan, Chenghao
A1  - Li, Miaoxin
A1  - Wang, Junwen
A1  - Li, Weidong
A1  - Kwan, Hoi Shan
A1  - Sham, Pak Chung
A1  - Wang, Kai
A1  - Li, Mulin Jun
T1  - Ultrafast and scalable variant annotation and prioritization with big functional genomics data
Y1  - 2020/12/01
JF  - Genome Research
JO  - Genome Research
SP  - 1789
EP  - 1801
DO  - 10.1101/gr.267997.120
VL  - 30
IS  - 12
UR  - http://genome.cshlp.org/content/30/12/1789.abstract
N2  - The advances of large-scale genomics studies have enabled compilation of cell type–specific, genome-wide DNA functional elements at high resolution. With the growing volume of functional annotation data and sequencing variants, existing variant annotation algorithms lack the efficiency and scalability to process big genomic data, particularly when annotating whole-genome sequencing variants against a huge database with billions of genomic features. Here, we develop VarNote to rapidly annotate genome-scale variants in large and complex functional annotation resources. Equipped with a novel index system and a parallel random-sweep searching algorithm, VarNote shows substantial performance improvements (two to three orders of magnitude) over existing algorithms at different scales. It supports both region-based and allele-specific annotations and introduces advanced functions for the flexible extraction of annotations. By integrating massive base-wise and context-dependent annotations in the VarNote framework, we introduce three efficient and accurate pipelines to prioritize the causal regulatory variants for common diseases, Mendelian disorders, and cancers.
ER  -