Searching journal content for articles similar to Jourdain et al.

Displaying results 1-10 of 6136
  1. ...of our knowledge, ScisTree2 is the first model-based cell lineage tree inference and genotype calling approach that is capable of handling data sets from tens of thousands of cells or more. An adult human body contains about 30 trillion cells that perform various functions. These cells are interconnected...
  2. ...and calculated the cost per thousand fully genotyped loci (Methods) (Supplemental Fig. S14A,B; Supplemental Note). When analyzing only the loci from our large-scale panel, our method was significantly more cost effective than WGS at all coverage levels necessary to achieve detection of Mendelian discordant loci...
  3. ...-Relate on the UK Biobank and All of Us data sets. On a data set of 200,000 individuals split between two parties, SF-Relate detects 97% of third-degree or closer relatives within 15 h of runtime. Our work enables secure identification of relatives across large-scale genomic data sets. Collaborative studies that aim...
  4. ..., morphology, and disease susceptibility. We explore the limitations of current data sets for variant interpretation, tradeoffs between sequencing strategies, and the burgeoning role of long-read genomes for capturing structural variants. In addition, we consider how large-scale collections of whole-genome sequence data...
  5. ...prediction. We build on these data, implementing further filtering steps to remove families with homology with other species, to identify putative SSOGs in the gut microbiome and to study them systematically. By looking for patterns in large-scale comparisons, we attempt to disentangle the evolutionary...
  6. ...data sets, we introduce a filtering approach called GDI that combines genotype probability (GP), alternate allele dosage (DS), and INFO score filters. We demonstrate that the imputation tools QUILT and GLIMPSE2 achieve similar accuracy, which is high enough for broad-scale ancestry mapping...
  7. ...strategy to improve computational efficiency and enhance model generalizability (Methods). This approach leverages subsets of cells to update the cell embedding matrix, reducing memory usage and mitigating overfitting. This strategy is particularly valuable when processing large-scale scRNA-seq data sets...
  8. ...://creativecommons.org/licenses/by-nc/4.0/. References: Abraham G, Qiu Y, Inouye M. 2017. FlashPCA2: principal component analysis of biobank-scale genotype data sets. Bioinformatics 33: 2776–2778. doi:10.1093/bioinformatics/btx299. Agrawal A, Chiu AM, Le M, Halperin E, Sankararaman S. 2020. Scalable probabilistic PCA for large-scale...
  9. ..., we performed Oxford Nanopore Technologies long-read sequencing on 72 cervical cancer genomes from a Ugandan data set that was previously characterized using short-read sequencing. We find recurrent structural rearrangement patterns at HPV integration events, which we categorize as del(etion)-like, dup...
  10. ...to protect large-scale haplotype-level data sets and demonstrated their utility for outsourcing genotype imputation. Proxy panels combine mosaic haplotype generation, noise addition, locality-sensitive random hashing, and random permutations to protect against well-known linking and reidentification...