A fast and adaptive detection framework for genome-wide chromatin loop mapping from Hi-C data
- Siyuan Chen (1, 2, 3, 8)
- Jiuming Wang (4, 8)
- Inkyung Jung (5)
- Zhaowen Qiu (6)
- Xin Gao (1, 2, 3)
- Yu Li (4, 7)

Affiliations
- 1. Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- 2. Center of Excellence on Smart Health, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- 3. Center of Excellence for Generative AI, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- 4. Department of Computer Science and Engineering, The Chinese University of Hong Kong (CUHK), Hong Kong SAR 999077, China
- 5. Department of Biological Sciences, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
- 6. Institute of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China
- 7. The CUHK Shenzhen Research Institute, Hi-Tech Park, Nanshan, Shenzhen 518057, China
- 8. These authors contributed equally to this work.
Abstract
Chromatin loop identification plays an important role in molecular biology and 3D genomics research, as it constitutes a fundamental process in transcription and gene regulation. Such precise chromatin structures can be identified across genome-wide interaction matrices via Hi-C data analysis, which is essential for unraveling the intricacies of transcriptional regulation. Given the increasing number of genome-wide contact maps, derived from both in situ Hi-C and single-cell Hi-C experiments, there is a pressing need for efficient and resilient algorithms capable of processing data from diverse experiments rapidly and adaptively. Here, we propose YOLOOP, a novel detection-based framework that is different from the conventional paradigm. YOLOOP stands out for its speed, surpassing the performance of previous state-of-the-art (SOTA) chromatin loop detection methods. It achieves a 30-fold acceleration compared with classification-based methods, up to 20-fold acceleration compared with the SOTA kernel-based framework, and a fivefold acceleration compared with statistical algorithms. Furthermore, the proposed framework is capable of generalizing across various cell types, multiresolution Hi-C maps, and diverse experimental protocols. Compared with the existing paradigms, YOLOOP shows up to a 10% increase in recall and a 15% increase in F1-score, particularly noteworthy in the GM12878 cell line. YOLOOP also offers fast adaptability with straightforward fine-tuning, making it readily applicable to extremely sparse single-cell Hi-C contact maps. It maintains its exceptional speed, completing genome-wide detection at a 10 kb resolution for a single-cell contact map within 1 min and for a 900-cell-superimposed contact map within 3 min, enabling fast analysis of large-scale single-cell data.
Footnotes
- [Supplemental material is available for this article.]
- Article published online before print. Article, supplemental material, and publication date are at https://www.genome.org/cgi/doi/10.1101/gr.279274.124.
- Freely available online through the Genome Research Open Access option.
- Received March 5, 2024.
- Accepted August 8, 2024.
This article, published in Genome Research, is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.











