RT Journal
A1 Gustafson, Jonas A.
A1 Gibson, Sophia B.
A1 Damaraju, Nikhita
A1 Zalusky, Miranda P.G.
A1 Hoekzema, Kendra
A1 Twesigomwe, David
A1 Yang, Lei
A1 Snead, Anthony A.
A1 Richmond, Phillip A.
A1 De Coster, Wouter
A1 Olson, Nathan D.
A1 Guarracino, Andrea
A1 Li, Qiuhui
A1 Miller, Angela L.
A1 Goffena, Joy
A1 Anderson, Zachary B.
A1 Storz, Sophie H.R.
A1 Ward, Sydney A.
A1 Sinha, Maisha
A1 Gonzaga-Jauregui, Claudia
A1 Clarke, Wayne E.
A1 Basile, Anna O.
A1 Corvelo, André
A1 Reeves, Catherine
A1 Helland, Adrienne
A1 Musunuri, Rajeeva Lochan
A1 Revsine, Mahler
A1 Patterson, Karynne E.
A1 Paschal, Cate R.
A1 Zakarian, Christina
A1 Goodwin, Sara
A1 Jensen, Tanner D.
A1 Robb, Esther
A1 The 1000 Genomes ONT Sequencing Consortium
A1 University of Washington Center for Rare Disease Research (UW-CRDR)
A1 Genomics Research to Elucidate the Genetics of Rare Diseases (GREGoR) Consortium
A1 McCombie, William Richard
A1 Sedlazeck, Fritz J.
A1 Zook, Justin M.
A1 Montgomery, Stephen B.
A1 Garrison, Erik
A1 Kolmogorov, Mikhail
A1 Schatz, Michael C.
A1 McLaughlin, Richard N.
A1 Dashnow, Harriet
A1 Zody, Michael C.
A1 Loose, Matt
A1 Jain, Miten
A1 Eichler, Evan E.
A1 Miller, Danny E.
T1 High-coverage nanopore sequencing of samples from the 1000 Genomes Project to build a comprehensive catalog of human genetic variation
JF Genome Research 
JO Genome Research 
YR 2024 
FD November 01 
VO 34 
IS 11 
SP 2061 
OP 2073 
DO 10.1101/gr.279273.124 
UL http://genome.cshlp.org/content/34/11/2061.abstract 
AB Fewer than half of individuals with a suspected Mendelian or monogenic condition receive a precise molecular diagnosis after comprehensive clinical genetic testing. Improvements in data quality and costs have heightened interest in using long-read sequencing (LRS) to streamline clinical genomic testing, but the absence of control data sets for variant filtering and prioritization has made tertiary analysis of LRS data challenging. To address this, the 1000 Genomes Project (1KGP) Oxford Nanopore Technologies Sequencing Consortium aims to generate LRS data from at least 800 of the 1KGP samples. Our goal is to use LRS to identify a broader spectrum of variation so we may improve our understanding of normal patterns of human variation. Here, we present data from analysis of the first 100 samples, representing all 5 superpopulations and 19 subpopulations. These samples, sequenced to an average depth of coverage of 37× and sequence read N50 of 54 kbp, have high concordance with previous studies for identifying single nucleotide and indel variants outside of homopolymer regions. Using multiple structural variant (SV) callers, we identify an average of 24,543 high-confidence SVs per genome, including shared and private SVs likely to disrupt gene function as well as pathogenic expansions within disease-associated repeats that were not detected using short reads. Evaluation of methylation signatures revealed expected patterns at known imprinted loci, samples with skewed X-inactivation patterns, and novel differentially methylated regions. All raw sequencing data, processed data, and summary statistics are publicly available, providing a valuable resource for the clinical genetics community to discover pathogenic SVs.