Development and validation of supervised machine learning multivariable prediction models for the diagnosis of Pneumocystis jirovecii pneumonia using nasopharyngeal swab PCR in adults in a low-HIV prevalence setting

dc.contributor.author: Chew R.
dc.contributor.author: Woods M.L.
dc.contributor.author: Paterson D.L.
dc.contributor.correspondence: Chew R.
dc.contributor.other: Mahidol University
dc.date.accessioned: 2025-09-21T18:35:25Z
dc.date.available: 2025-09-21T18:35:25Z
dc.date.issued: 2025-09-01
dc.description.abstract: Background: The global burden of the opportunistic fungal disease Pneumocystis jirovecii pneumonia (PJP) remains substantial. Polymerase chain reaction (PCR) on nasopharyngeal swabs (NPS) has high specificity and may be a viable alternative to the gold standard diagnostic of PCR on invasively collected lower respiratory tract specimens, but has low sensitivity. Sensitivity may be improved by incorporating NPS PCR results into machine learning models. Methods: Three supervised multivariable diagnostic models (random forest, logistic regression and extreme gradient boosting) were constructed and validated using a 111-person Australian dataset. The predictors were age, gender, immunosuppression type and NPS PCR result. Model performance metrics such as accuracy, sensitivity, specificity and predictive values were compared to select the best-performing model. Results: The logistic regression model performed best, with 80% accuracy, improving sensitivity to 86% and maintaining acceptable specificity of 70%. Using this model, positive and negative NPS PCR results indicated post-test probabilities of 84% (likely PJP) and 26% (unlikely PJP), respectively. Conclusions: The logistic regression model should be externally validated in a wider range of settings. As the predictors are simple, routinely collected patient variables, this model may represent a diagnostic advance suitable for settings where collection of lower respiratory tract specimens is difficult but PCR is available.
dc.identifier.citation: International Health Vol.17 No.5 (2025), 804-808
dc.identifier.doi: 10.1093/inthealth/ihae052
dc.identifier.eissn: 18763405
dc.identifier.issn: 18763413
dc.identifier.pmid: 39206512
dc.identifier.scopus: 2-s2.0-105015526047
dc.identifier.uri: https://repository.li.mahidol.ac.th/handle/123456789/112088
dc.rights.holder: SCOPUS
dc.subject: Medicine
dc.subject: Social Sciences
dc.title: Development and validation of supervised machine learning multivariable prediction models for the diagnosis of Pneumocystis jirovecii pneumonia using nasopharyngeal swab PCR in adults in a low-HIV prevalence setting
dc.type: Article
mu.datasource.scopus: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=105015526047&origin=inward
oaire.citation.endPage: 808
oaire.citation.issue: 5
oaire.citation.startPage: 804
oaire.citation.title: International Health
oaire.citation.volume: 17
oairecerif.author.affiliation: National University of Singapore
oairecerif.author.affiliation: Nuffield Department of Medicine
oairecerif.author.affiliation: Royal Brisbane and Women's Hospital
oairecerif.author.affiliation: Faculty of Medicine
oairecerif.author.affiliation: Mahidol Oxford Tropical Medicine Research Unit
