VPatho: a deep learning-based two-stage approach for accurate prediction of gain-of-function and loss-of-function variants
dc.contributor.author | Ge F. | |
dc.contributor.author | Li C. | |
dc.contributor.author | Iqbal S. | |
dc.contributor.author | Muhammad A. | |
dc.contributor.author | Li F. | |
dc.contributor.author | Thafar M.A. | |
dc.contributor.author | Yan Z. | |
dc.contributor.author | Worachartcheewan A. | |
dc.contributor.author | Xu X. | |
dc.contributor.author | Song J. | |
dc.contributor.author | Yu D.J. | |
dc.contributor.other | Mahidol University | |
dc.date.accessioned | 2023-05-19T07:36:20Z | |
dc.date.available | 2023-05-19T07:36:20Z | |
dc.date.issued | 2023-01-19 | |
dc.description.abstract | Determining the pathogenicity and functional impact (i.e. gain-of-function, GOF, or loss-of-function, LOF) of a variant is vital for unraveling the genetic-level mechanisms of human diseases. To provide a 'one-stop' framework for the accurate identification of the pathogenicity and functional impact of variants, we developed a two-stage deep-learning-based computational solution, termed VPatho, which was trained using a total of 9619 pathogenic GOF/LOF and 138 026 neutral variants curated from various databases. A total of 138 variant-level, 262 protein-level and 103 genome-level features were extracted for constructing the models of VPatho. The development of VPatho consists of two stages: (i) a random under-sampling multi-scale residual neural network (ResNet) with a newly defined weighted-loss function (RUS-Wg-MSResNet) was proposed to predict variants' pathogenicity on the gnomAD_NV + GOF/LOF dataset; and (ii) an XGBOD model was constructed to predict the functional impact of the given variants. Benchmarking experiments demonstrated that RUS-Wg-MSResNet achieved the highest prediction performance when the weights were calculated from the ratios of neutral versus pathogenic variants. Independent tests showed that both RUS-Wg-MSResNet and XGBOD achieved outstanding performance. Moreover, when assessed using variants from the CAGI6 competition, RUS-Wg-MSResNet achieved superior performance compared to state-of-the-art predictors. The fine-trained XGBOD models were further used to blind-test the whole LOF dataset downloaded from gnomAD and, accordingly, we identified 31 nonLOF variants that were previously labeled as LOF/uncertain variants. As an implementation of the developed approach, a webserver of VPatho is made publicly available at http://csbio.njust.edu.cn/bioinf/vpatho/ to facilitate community-wide efforts for profiling and prioritizing query variants with respect to their pathogenicity and functional impact. | |
dc.identifier.citation | Briefings in bioinformatics Vol.24 No.1 (2023) | |
dc.identifier.doi | 10.1093/bib/bbac535 | |
dc.identifier.eissn | 14774054 | |
dc.identifier.pmid | 36528806 | |
dc.identifier.scopus | 2-s2.0-85147044971 | |
dc.identifier.uri | https://repository.li.mahidol.ac.th/handle/20.500.14594/81682 | |
dc.rights.holder | SCOPUS | |
dc.subject | Biochemistry, Genetics and Molecular Biology | |
dc.title | VPatho: a deep learning-based two-stage approach for accurate prediction of gain-of-function and loss-of-function variants | |
dc.type | Article | |
mu.datasource.scopus | https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85147044971&origin=inward | |
oaire.citation.issue | 1 | |
oaire.citation.title | Briefings in bioinformatics | |
oaire.citation.volume | 24 | |
oairecerif.author.affiliation | Bengbu University | |
oairecerif.author.affiliation | The Peter Doherty Institute for Infection and Immunity | |
oairecerif.author.affiliation | Taif University | |
oairecerif.author.affiliation | Anhui Polytechnic University | |
oairecerif.author.affiliation | Northwest A&F University | |
oairecerif.author.affiliation | Monash University | |
oairecerif.author.affiliation | Faculty of Medicine, Nursing and Health Sciences | |
oairecerif.author.affiliation | Mahidol University | |
oairecerif.author.affiliation | Nanjing University of Science and Technology |
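
The abstract mentions random under-sampling and a loss weighted by the ratio of neutral to pathogenic variants, but this record does not give the exact formulation. The following is only a minimal sketch of what such a scheme could look like, using the two variant counts stated in the abstract. The function names (`class_weights`, `weighted_bce`, `random_under_sample`), the choice of binary cross-entropy, and the 1:1 sampling target are illustrative assumptions, not the authors' implementation.

```python
import math
import random

# Counts taken from the abstract.
N_PATHOGENIC = 9619   # pathogenic GOF/LOF variants
N_NEUTRAL = 138026    # neutral variants


def class_weights(n_neutral, n_pathogenic):
    """Assumed scheme: up-weight the minority (pathogenic, label 1) class
    by the neutral/pathogenic ratio; the neutral class (label 0) keeps weight 1."""
    return {0: 1.0, 1: n_neutral / n_pathogenic}


def weighted_bce(y_true, p_pred, weights, eps=1e-12):
    """Mean binary cross-entropy with a per-class weight applied to each sample."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        loss = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        total += weights[y] * loss
    return total / len(y_true)


def random_under_sample(labels, seed=0):
    """Keep every pathogenic index and an equally sized random subset of
    neutral indices (hypothetical 1:1 target)."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    keep_neg = rng.sample(neg, min(len(pos), len(neg)))
    return sorted(pos + keep_neg)


weights = class_weights(N_NEUTRAL, N_PATHOGENIC)
print(f"pathogenic class weight: {weights[1]:.2f}")
```

With the counts above, the pathogenic class is weighted roughly 14 times more heavily than the neutral class, so a misclassified pathogenic variant contributes correspondingly more to the loss. How the published model combines under-sampling with this weighting is described in the article itself, not in this record.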