EPIGEN-Brazil Initiative resources: a Latin American imputation panel and the Scientific Workflow
- Wagner C.S. Magalhães1,2,10,
- Nathalia M. Araujo1,10,
- Thiago P. Leal1,10,
- Gilderlanio S. Araujo1,
- Paula J.S. Viriato1,
- Fernanda S. Kehdy1,3,
- Gustavo N. Costa4,
- Mauricio L. Barreto4,5,
- Bernardo L. Horta6,
- Maria Fernanda Lima-Costa7,
- Alexandre C. Pereira8,
- Eduardo Tarazona-Santos1,11,
- Maíra R. Rodrigues1,9,11,
- The Brazilian EPIGEN Consortium12
- 1Departamento de Biologia Geral, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, 31270-901, Brazil;
- 2Instituto Mario Penna, Núcleo de Ensino e Pesquisa, Belo Horizonte, Minas Gerais, 30380-472, Brazil;
- 3Laboratório de Hanseníase, Instituto Oswaldo Cruz, Fundação Oswaldo Cruz, Rio de Janeiro, Rio de Janeiro, 21040-900, Brazil;
- 4Instituto de Saúde Coletiva, Universidade Federal da Bahia, Salvador, Bahia, 40110-040, Brazil;
- 5Center for Data and Knowledge Integration for Health, Institute Gonçalo Muniz, Fundação Oswaldo Cruz, Salvador, Bahia, 40296-710, Brazil;
- 6Programa de Pós-Graduação em Epidemiologia, Universidade Federal de Pelotas, Pelotas, Rio Grande do Sul, 96020-220, Brazil;
- 7Instituto de Pesquisa Rene Rachou, Fundação Oswaldo Cruz, Belo Horizonte, Minas Gerais, 30190-009, Brazil;
- 8Instituto do Coração, Universidade de São Paulo, São Paulo, São Paulo, 05403-900, Brazil;
- 9Faculdade de Ciências Médicas e Instituto de Matemática, Estatística e Ciência da Computação, Universidade de Campinas, São Paulo, 13083-894, Brazil
- 13Departamento de Biologia Geral, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, 31270-901, Brazil
- 14Instituto Nacional de Salud, Lima, 9, Peru
- 15Instituto de Pesquisa Rene Rachou, Fundação Oswaldo Cruz, Belo Horizonte, Minas Gerais, 30190-009, Brazil
- 16Laboratory of Translational Genomics, National Institute of Health, Bethesda, MD 20877, USA
- 17Laboratório Multiusuário de Genômica, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, 31270-901, Brazil
- 18Beagle. Belo Horizonte, Minas Gerais, 31710-550, Brazil
Abstract
EPIGEN-Brazil is one of the largest Latin American initiatives at the interface of human genomics, public health, and computational biology. Here, we present two resources that address two obstacles to the global dissemination of precision medicine and to the development of the bioinformatics know-how needed to support it. To address the underrepresentation of non-European individuals in human genome diversity studies, we present the EPIGEN-5M+1KGP imputation panel, which merges the public 1000 Genomes Project (1KGP) Phase 3 imputation panel with haplotypes derived from the EPIGEN-5M data set (4.3 million SNPs genotyped in 265 admixed individuals from the EPIGEN-Brazil Initiative). When we imputed a target SNP data set (6487 admixed individuals from the EPIGEN-Brazil project genotyped for 2.2 million SNPs) with the EPIGEN-5M+1KGP panel, we gained 140,452 more SNPs in total than with the 1KGP Phase 3 panel alone, and 788,873 additional high-confidence SNPs (info score ≥ 0.8). Thus, including the EPIGEN-5M data set in this new imputation panel not only adds SNPs but also improves the quality of imputation. To address the lack of transparency and reproducibility of bioinformatics protocols, we present a conceptual Scientific Workflow: a website that models the scientific process by bringing together publications, flowcharts, masterscripts, documents, and bioinformatics protocols, making that process accessible and interactive. We demonstrate its applicability through the development of our EPIGEN-5M+1KGP imputation panel. The Scientific Workflow also serves as a repository of bioinformatics resources.
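The two figures quoted in the abstract (total SNPs gained and high-confidence SNPs gained) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy SNP identifiers, and the toy info scores are all hypothetical, and it assumes each "gain" is simply the difference in SNP counts between the two imputation runs. Only the 0.8 info-score cutoff comes from the text.

```python
from typing import Dict, Set

# High-confidence cutoff quoted in the abstract (info score >= 0.8).
INFO_THRESHOLD = 0.8


def high_confidence(info_by_snp: Dict[str, float],
                    threshold: float = INFO_THRESHOLD) -> Set[str]:
    """Return the SNP IDs whose info score meets the threshold."""
    return {snp for snp, info in info_by_snp.items() if info >= threshold}


def compare_panels(reference: Dict[str, float],
                   merged: Dict[str, float],
                   threshold: float = INFO_THRESHOLD) -> Dict[str, int]:
    """Compare two imputation runs over the same target data set.

    Assumption: a "gain" is the difference in SNP counts between runs.
    """
    ref_hc = high_confidence(reference, threshold)
    merged_hc = high_confidence(merged, threshold)
    return {
        "total_gain": len(merged) - len(reference),
        "high_confidence_gain": len(merged_hc) - len(ref_hc),
    }


# Hypothetical toy data: SNP ID -> info score from each imputation run.
panel_1kgp = {"rs1": 0.95, "rs2": 0.60, "rs3": 0.75}
panel_merged = {"rs1": 0.97, "rs2": 0.85, "rs3": 0.82, "rs4": 0.90}

summary = compare_panels(panel_1kgp, panel_merged)
print(summary)
```

In this toy case the high-confidence gain exceeds the total gain, because SNPs already imputed with the first panel cross the 0.8 threshold with the second. That is the same pattern the abstract reports when it says the merged panel improves quality and not only SNP count.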
Footnotes
- 12 A complete list of the Brazilian EPIGEN Consortium authors appears at the end of this paper.
- [Supplemental material is available for this article.]
- Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.225458.117.
- Received June 1, 2017.
- Accepted May 24, 2018.
This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.











