Direct inference and control of genetic population structure from RNA sequencing data

dc.contributor.author: Fachrul M.
dc.contributor.author: Karkey A.
dc.contributor.author: Shakya M.
dc.contributor.author: Judd L.M.
dc.contributor.author: Harshegyi T.
dc.contributor.author: Sim K.S.
dc.contributor.author: Tonks S.
dc.contributor.author: Dongol S.
dc.contributor.author: Shrestha R.
dc.contributor.author: Salim A.
dc.contributor.author: Adhikari A.
dc.contributor.author: Banda H.C.
dc.contributor.author: Blohmke C.
dc.contributor.author: Darton T.C.
dc.contributor.author: Farooq Y.
dc.contributor.author: Ghimire M.
dc.contributor.author: Hill J.
dc.contributor.author: Hoang N.T.
dc.contributor.author: Jere T.M.
dc.contributor.author: Kamzati M.
dc.contributor.author: Kao Y.H.
dc.contributor.author: Masesa C.
dc.contributor.author: Mbewe M.
dc.contributor.author: Msuku H.
dc.contributor.author: Munthali P.
dc.contributor.author: Nga T.V.T.
dc.contributor.author: Nkhata R.
dc.contributor.author: Saad N.J.
dc.contributor.author: Van Tan T.
dc.contributor.author: Thindwa D.
dc.contributor.author: Khanam F.
dc.contributor.author: Meiring J.
dc.contributor.author: Clemens J.D.
dc.contributor.author: Dougan G.
dc.contributor.author: Pitzer V.E.
dc.contributor.author: Qadri F.
dc.contributor.author: Heyderman R.S.
dc.contributor.author: Gordon M.A.
dc.contributor.author: Voysey M.
dc.contributor.author: Baker S.
dc.contributor.author: Pollard A.J.
dc.contributor.author: Khor C.C.
dc.contributor.author: Dolecek C.
dc.contributor.author: Basnyat B.
dc.contributor.author: Dunstan S.J.
dc.contributor.author: Holt K.E.
dc.contributor.author: Inouye M.
dc.contributor.other: Mahidol University
dc.date.accessioned: 2023-08-11T18:00:58Z
dc.date.available: 2023-08-11T18:00:58Z
dc.date.issued: 2023-08-02
dc.description.abstract: RNAseq data can be used to infer genetic variants, yet its use for estimating genetic population structure remains underexplored. Here, we construct a freely available computational tool (RGStraP) to estimate RNAseq-based genetic principal components (RG-PCs) and assess whether RG-PCs can be used to control for population structure in gene expression analyses. Using whole blood samples from understudied Nepalese populations and the Geuvadis study, we show that RG-PCs had comparable results to paired array-based genotypes, with high genotype concordance and high correlations of genetic principal components, capturing subpopulations within the dataset. In differential gene expression analysis, we found that inclusion of RG-PCs as covariates reduced test statistic inflation. Our paper demonstrates that genetic population structure can be directly inferred and controlled for using RNAseq data, thus facilitating improved retrospective and future analyses of transcriptomic data.
dc.identifier.citation: Communications biology Vol.6 No.1 (2023), 804
dc.identifier.doi: 10.1038/s42003-023-05171-9
dc.identifier.eissn: 23993642
dc.identifier.pmid: 37532769
dc.identifier.scopus: 2-s2.0-85166425755
dc.identifier.uri: https://repository.li.mahidol.ac.th/handle/20.500.14594/88276
dc.rights.holder: SCOPUS
dc.subject: Biochemistry, Genetics and Molecular Biology
dc.title: Direct inference and control of genetic population structure from RNA sequencing data
dc.type: Article
mu.datasource.scopus: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85166425755&origin=inward
oaire.citation.issue: 1
oaire.citation.title: Communications biology
oaire.citation.volume: 6
oairecerif.author.affiliation: Mahidol Oxford Tropical Medicine Research Unit
oairecerif.author.affiliation: Oxford University Clinical Research Unit
oairecerif.author.affiliation: Department of Medicine
oairecerif.author.affiliation: Department of Public Health and Primary Care
oairecerif.author.affiliation: School of Mathematics and Statistics
oairecerif.author.affiliation: School of Biosciences
oairecerif.author.affiliation: Melbourne School of Population and Global Health
oairecerif.author.affiliation: The Peter Doherty Institute for Infection and Immunity
oairecerif.author.affiliation: Friends of Patan Hospital Nepal
oairecerif.author.affiliation: Baker Heart and Diabetes Institute
oairecerif.author.affiliation: London School of Hygiene & Tropical Medicine
oairecerif.author.affiliation: A-Star, Genome Institute of Singapore
oairecerif.author.affiliation: University of Cambridge
oairecerif.author.affiliation: University of Melbourne
oairecerif.author.affiliation: Faculty of Medicine, Nursing and Health Sciences
oairecerif.author.affiliation: Nuffield Department of Medicine
oairecerif.author.affiliation: University of Oxford Medical Sciences Division
