Publication:
Validation of genotype imputation in Southeast Asian populations and the effect of single nucleotide polymorphism annotation on imputation outcome

dc.contributor.author: Worachart Lert-itthiporn [en_US]
dc.contributor.author: Bhoom Suktitipat [en_US]
dc.contributor.author: Harald Grove [en_US]
dc.contributor.author: Anavaj Sakuntabhai [en_US]
dc.contributor.author: Prida Malasit [en_US]
dc.contributor.author: Nattaya Tangthawornchaikul [en_US]
dc.contributor.author: Fumihiko Matsuda [en_US]
dc.contributor.author: Prapat Suriyaphol [en_US]
dc.contributor.other: Mahidol University [en_US]
dc.contributor.other: Thailand National Center for Genetic Engineering and Biotechnology [en_US]
dc.contributor.other: Faculty of Medicine, Siriraj Hospital, Mahidol University [en_US]
dc.contributor.other: Kyoto University [en_US]
dc.contributor.other: CNRS Centre National de la Recherche Scientifique [en_US]
dc.contributor.other: Institut Pasteur, Paris [en_US]
dc.date.accessioned: 2019-08-23T10:37:15Z
dc.date.available: 2019-08-23T10:37:15Z
dc.date.issued: 2018-02-13 [en_US]
dc.description.abstract: © 2018 The Author(s). Background: Imputation involves the inference of untyped single nucleotide polymorphisms (SNPs) in genome-wide association studies. The haplotypic reference of choice for imputation in Southeast Asian populations is unclear. Moreover, the influence of SNP annotation on imputation results has not been examined. Methods: This study was divided into two parts. In the first part, we applied imputation to genotyped SNPs from Southeast Asian populations from the Pan-Asian SNP database. Five percent of the total SNPs were removed. The remaining SNPs were used as input for imputation with IMPUTE2. The imputed outcomes were verified against the removed SNPs. We compared imputation references from Chinese and Japanese haplotypes from the HapMap phase II (HMII) and the complete set of haplotypes from the 1000 Genomes Project (1000G). The second part assessed imputation accuracy and yield in a Thai patient dataset. Half of the autosomal SNPs were removed to create Set 1. Another dataset, Set 2, was then created where we switched which half of the SNPs were removed. Both Set 1 and Set 2 were imputed with HMII to create a complete imputed SNP dataset. The dataset was used to validate association testing, SNP annotation and imputation outcome. Results: The accuracy was highest for all populations when using the HMII reference, but at the cost of a lower yield. Thai genotypes showed the highest accuracy over other populations in both HMII and 1000G panels, although accuracy and yield varied across chromosomes. Imputation was tested in a clinical dataset to compare accuracy in gene-related regions, and coding regions were found to have a higher accuracy and yield. Conclusions: This work provides the first evidence of imputation reference selection for Southeast Asian studies and highlights the effects of SNP locations relative to genes on imputation outcome. Researchers will need to consider the trade-off between accuracy and yield in future imputation studies. [en_US]
dc.identifier.citation: BMC Medical Genetics. Vol.19, No.1 (2018) [en_US]
dc.identifier.doi: 10.1186/s12881-018-0534-8 [en_US]
dc.identifier.issn: 14712350 [en_US]
dc.identifier.other: 2-s2.0-85042107898 [en_US]
dc.identifier.uri: https://repository.li.mahidol.ac.th/handle/123456789/45245
dc.rights: Mahidol University [en_US]
dc.rights.holder: SCOPUS [en_US]
dc.source.uri: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85042107898&origin=inward [en_US]
dc.subject: Biochemistry, Genetics and Molecular Biology [en_US]
dc.subject: Medicine [en_US]
dc.title: Validation of genotype imputation in Southeast Asian populations and the effect of single nucleotide polymorphism annotation on imputation outcome [en_US]
dc.type: Article [en_US]
dspace.entity.type: Publication
mu.datasource.scopus: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85042107898&origin=inward [en_US]
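
The masking validation summarized in the abstract (hide a subset of genotyped SNPs, impute them, then compare the imputed calls with the hidden genotypes) can be illustrated with a small Python sketch. This is not the authors' pipeline and does not call IMPUTE2. The abstract does not define "accuracy" or "yield", so the definitions used here are assumptions: accuracy is the share of confident imputed calls that match the hidden genotype, and yield is the share of hidden genotypes that receive a confident call. The 0.9 probability threshold and all function names are hypothetical.

```python
import random


def mask_snps(snp_ids, fraction, seed=0):
    """Choose a random subset of SNP ids to hide before imputation."""
    rng = random.Random(seed)
    count = round(len(snp_ids) * fraction)
    return set(rng.sample(sorted(snp_ids), count))


def complementary_halves(snp_ids, seed=0):
    """Split SNP ids into two disjoint halves.

    Mirrors the Set 1 / Set 2 idea: the first half is hidden in one
    dataset, and the other half is hidden in the second dataset.
    """
    rng = random.Random(seed)
    shuffled = sorted(snp_ids)
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    return set(shuffled[:mid]), set(shuffled[mid:])


def accuracy_and_yield(truth, imputed, masked, min_prob=0.9):
    """Compare imputed calls with hidden genotypes.

    truth:   {(sample, snp): genotype}
    imputed: {(sample, snp): (genotype, call_probability)}
    masked:  set of SNP ids that were hidden before imputation
    min_prob is an assumed confidence threshold for accepting a call.
    """
    total = called = correct = 0
    for (sample, snp), true_genotype in truth.items():
        if snp not in masked:
            continue
        total += 1
        call = imputed.get((sample, snp))
        if call is None or call[1] < min_prob:
            continue  # no confident call: lowers yield, not accuracy
        called += 1
        if call[0] == true_genotype:
            correct += 1
    accuracy = correct / called if called else 0.0
    yield_rate = called / total if total else 0.0
    return accuracy, yield_rate
```

Under these assumed definitions, raising `min_prob` discards more uncertain calls, which tends to raise accuracy and lower yield. That is one way to picture the accuracy/yield trade-off the abstract mentions.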
