Genotype imputation of the human leukocyte antigen (HLA) region is a cost-effective means to infer classical HLA alleles from inexpensive and dense SNP array data. In the research setting, imputation avoids the cost of wet lab-based HLA typing and thus renders association analyses of the HLA in large cohorts feasible. Yet, most HLA imputation reference panels target Caucasian ethnicities, and multi-ethnic panels are scarce. We compiled a high-quality multi-ethnic reference panel based on genotypes measured with Illumina's Immunochip genotyping array and HLA types established using a high-resolution next-generation sequencing approach. Our reference panel includes more than 1,300 samples from Germany, Malta, China, India, Iran, Japan and Korea, as well as samples of African American ancestry, and covers all classical HLA class I and II genes, including HLA-DRB3/4/5. Applying extensive cross-validation, we benchmarked the imputation using the HLA imputation tool HIBAG, our multi-ethnic reference and an independent, previously published data set composed of subpopulations of the 1000 Genomes project. We achieved average imputation accuracies higher than 0.924 for the commonly studied HLA-A, -B, -C, -DQB1 and -DRB1 genes across all ethnicities. We investigated allele-specific imputation challenges with regard to the geographic origin of the samples, using sensitivity and specificity measurements as well as allele frequencies, and identified the HLA alleles that are challenging to impute in each population separately. In conclusion, our new multi-ethnic reference data set allows for high-resolution HLA imputation of genotypes at all classical HLA class I and II genes, including the HLA-DRB3/4/5 loci, based on diverse ancestry populations.
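As a rough illustration of the kind of measures named above, the following Python sketch computes an overall allele-level accuracy and a per-allele sensitivity and specificity from typed versus imputed genotypes at one locus. It is a minimal sketch under stated assumptions, not the authors' pipeline and not HIBAG's API: the function names, the unphased two-allele-per-sample representation and the toy genotypes are all hypothetical, and the exact definitions used in the study may differ.

```python
from collections import Counter


def allele_accuracy(true_pairs, pred_pairs):
    """Fraction of allele calls that match, ignoring phase (assumed definition)."""
    matched = 0
    for t, p in zip(true_pairs, pred_pairs):
        # Multiset intersection counts how many of the two alleles agree.
        matched += sum((Counter(t) & Counter(p)).values())
    return matched / (2 * len(true_pairs))


def allele_sens_spec(true_pairs, pred_pairs, allele):
    """Sensitivity and specificity for one allele, counted per allele call."""
    tp = fn = fp = tn = 0
    for t, p in zip(true_pairs, pred_pairs):
        n_true, n_pred = t.count(allele), p.count(allele)
        tp += min(n_true, n_pred)       # copies present and imputed
        fn += max(n_true - n_pred, 0)   # copies present but missed
        fp += max(n_pred - n_true, 0)   # copies imputed but absent
        tn += 2 - max(n_true, n_pred)   # remaining calls, correctly not this allele
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec


# Hypothetical toy data: typed (true) and imputed genotypes for three samples.
true_pairs = [("01:01", "02:01"), ("02:01", "02:01"), ("03:01", "01:01")]
pred_pairs = [("02:01", "01:01"), ("02:01", "03:01"), ("03:01", "01:01")]

accuracy = allele_accuracy(true_pairs, pred_pairs)
sens_0201, spec_0201 = allele_sens_spec(true_pairs, pred_pairs, "02:01")
sens_0301, spec_0301 = allele_sens_spec(true_pairs, pred_pairs, "03:01")
```

Counting per allele call rather than per sample means a homozygous sample contributes two observations, which is one common convention; a rare allele can then show high specificity but low sensitivity even when overall accuracy is high.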
Bibliographical note
Funding Information:
German Research Foundation (DFG) (Research Training Group 1743, ‘Genes, Environment and Inflammation’ to M.W.); DFG Excellence Cluster No. 306 ‘Inflammation at Interfaces’; European Union Seventh Framework Programme (FP7-PEOPLE-2013-COFUND) (No. 609020; Scientia Fellows to E.E.); Funding for the Multicenter African American IBD Study (MAAIS) samples was provided by the USA National Institutes of Health (DK062431 to S.R.B.); University Medical Center Groningen, Groningen, The Netherlands (to S.A.); Institute for Digestive System Disease, Tehran University of Medical Sciences, Tehran, Iran (to S.A.); BioBank Japan Project and, in part, by a Grant-in-Aid for Scientific Research (B) (26293180) funded by the Ministry of Education, Culture, Sports, Science, and Technology, Japan; Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (HI18C0094); Funding for the Indian samples was provided by the Centre of Excellence in Genome Sciences and Predictive Medicine (BT/01/COE/07/UDSC/2008 from the Department of Biotechnology, Government of India); BMBF e:Med research and funding concept (SysInflame grant 01ZX1306A; GB-XMAP grant 01ZX1709); J.D.R. holds a Canada Research Chair and this work was supported by National Institutes of Health grant DK62432. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
All Science Journal Classification (ASJC) codes
- Molecular Biology