Analyzing the large and complex SFARI autism cohort data using the Genotypes and Phenotypes in Families (GPF) platform
- Liubomir Chorbadjiev1,7,
- Murat Cokol2,7,
- Zohar Weinstein3,
- Kevin Shi4,
- Christopher Fleisch5,
- Nikolay Dimitrov1,
- Svetlin Mladenov1,
- Ivo Todorov1,
- Iordan Ivanov1,
- Simon Xu5,
- Steven Ford5,
- Yoon-ha Lee6,
- Boris Yamrom6,
- Steven Marks6,
- Adriana Munoz6,
- Alex Lash5,
- Natalia Volfovsky5 and
- Ivan Iossifov6
- 1SeqPipe Limited, Sofia 1000, Bulgaria;
- 2Rodop Biotechnology, 54050 Sakarya, Turkey;
- 3Atrius Health, Cambridge, Massachusetts 02139, USA;
- 4New York Genome Center, New York, New York 10013, USA;
- 5Simons Foundation, New York, New York 10010, USA;
- 6Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- 7These authors contributed equally to this work.
Abstract
The exploration of genotypic variants impacting phenotypes is a cornerstone in genetics research. The emergence of vast collections containing deeply genotyped and phenotyped families has made it possible to pursue the search for variants associated with complex diseases. However, managing these large-scale data sets requires specialized computational tools to organize and analyze the extensive data. Genotypes and Phenotypes in Families (GPF) is an open-source platform that manages genotypes and phenotypes derived from collections of families. GPF provides interactive exploration of genetic variants, enrichment analysis for de novo mutations, phenotype/genotype association tools, and secure data sharing. GPF is used to disseminate two family collection data sets, SSC and SPARK, for the study of autism, built by the Simons Foundation. The GPF instance at the Simons Foundation (GPF-SFARI) provides protected access to comprehensive genotypic and phenotypic data for SSC and SPARK. GPF-SFARI also provides public access to an extensive collection of de novo mutations from individuals with autism and related disorders and to gene-level statistics of the protected data sets characterizing the genes’ roles in autism. However, GPF is versatile and can manage genotypic data from other small or large family collections. Here, we highlight the primary features of GPF within the context of GPF-SFARI.
Footnotes
- [Supplemental material is available for this article.]
- Article published online before print. Article, supplemental material, and publication date are at https://www.genome.org/cgi/doi/10.1101/gr.280356.124.
- Freely available online through the Genome Research Open Access option.
- Received December 13, 2024.
- Accepted July 31, 2025.
This article, published in Genome Research, is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.











