Unsupervised machine learning clustering approach for hospitalized COVID-19 pneumonia patients

dc.contributor.author: Nalinthasnai N.
dc.contributor.author: Thammasudjarit R.
dc.contributor.author: Tassaneyasin T.
dc.contributor.author: Eksombatchai D.
dc.contributor.author: Sungkanuparph S.
dc.contributor.author: Boonsarngsuk V.
dc.contributor.author: Sutherasan Y.
dc.contributor.author: Junhasavasdikul D.
dc.contributor.author: Theerawit P.
dc.contributor.author: Petnak T.
dc.contributor.correspondence: Nalinthasnai N.
dc.contributor.other: Mahidol University
dc.date.accessioned: 2025-02-27T18:21:04Z
dc.date.available: 2025-02-27T18:21:04Z
dc.date.issued: 2025-12-01
dc.description.abstract: Background: Identification of distinct clinical phenotypes of diseases can guide personalized treatment. This study aimed to classify hospitalized COVID-19 pneumonia subgroups using an unsupervised machine learning approach. Methods: We included hospitalized COVID-19 pneumonia patients from July to September 2021. K-means clustering, an unsupervised machine learning method, was performed to identify clinical phenotypes based on clinical and laboratory variables collected within 24 hours of admission. Variables were normalized before clustering to ensure equal contribution to the analysis. The optimal number of clusters was determined using the elbow method and silhouette scores. Cox proportional hazards models were used to compare the risk of intubation and 90-day mortality across the identified clusters. Results: Three clinically distinct clusters were identified among 538 hospitalized COVID-19 pneumonia patients. Cluster 1 (N = 27) consisted predominantly of males and showed significantly elevated serum liver enzymes and LDH levels. Cluster 2 (N = 370) was characterized by lower chest X-ray scores and higher serum albumin levels. Cluster 3 (N = 141) was characterized by older age, diabetes mellitus, higher chest X-ray scores, more severe vital signs, higher creatinine levels, lower hemoglobin levels, lower lymphocyte counts, higher C-reactive protein, higher D-dimer, and higher LDH levels. Compared with cluster 2, cluster 3 was significantly associated with an increased risk of 90-day mortality (HR, 6.24; 95% CI, 2.42–16.09) and intubation (HR, 5.26; 95% CI, 2.37–11.72). In contrast, cluster 1 had a 100% survival rate with a non-significant increase in intubation risk compared with cluster 2 (HR, 1.40; 95% CI, 0.18–11.04). Conclusions: We identified three distinct clinical phenotypes of COVID-19 pneumonia patients, with cluster 3 associated with an increased risk of respiratory failure and mortality. These findings may guide tailored clinical management strategies.
dc.identifier.citation: BMC Pulmonary Medicine Vol.25 No.1 (2025)
dc.identifier.doi: 10.1186/s12890-025-03536-w
dc.identifier.eissn: 14712466
dc.identifier.scopus: 2-s2.0-85218268259
dc.identifier.uri: https://repository.li.mahidol.ac.th/handle/123456789/105465
dc.rights.holder: SCOPUS
dc.subject: Medicine
dc.title: Unsupervised machine learning clustering approach for hospitalized COVID-19 pneumonia patients
dc.type: Article
mu.datasource.scopus: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85218268259&origin=inward
oaire.citation.issue: 1
oaire.citation.title: BMC Pulmonary Medicine
oaire.citation.volume: 25
oairecerif.author.affiliation: Faculty of Medicine Ramathibodi Hospital, Mahidol University
oairecerif.author.affiliation: Srinakharinwirot University
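
The abstract describes normalizing the variables, running K-means, and choosing the number of clusters with the elbow method and silhouette scores. The following is a minimal sketch of that general workflow in plain Python, not the study's implementation. The two features (age and C-reactive protein), all data values, the helper names, and the deterministic farthest-first seeding are illustrative assumptions introduced here so the example is reproducible.

```python
import math

def zscore(rows):
    """Scale each column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]

def farthest_first(points, k):
    """Assumed seeding: start at the first point, then repeatedly add the
    point farthest from all centers chosen so far (deterministic)."""
    chosen = [points[0]]
    while len(chosen) < k:
        chosen.append(max(points,
                          key=lambda p: min(math.dist(p, c) for c in chosen)))
    return chosen

def kmeans(points, k, iters=100):
    """Lloyd's algorithm. Returns (labels, inertia), where inertia is the
    within-cluster sum of squared distances used by the elbow method."""
    centers = [list(c) for c in farthest_first(points, k)]
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    inertia = sum(math.dist(p, centers[l]) ** 2 for p, l in zip(points, labels))
    return labels, inertia

def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters contribute 0."""
    clusters = {}
    for i, l in enumerate(labels):
        clusters.setdefault(l, []).append(i)
    total = 0.0
    for i, p in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue
        a = sum(math.dist(p, points[j]) for j in own if j != i) / (len(own) - 1)
        b = min(sum(math.dist(p, points[j]) for j in idx) / len(idx)
                for l, idx in clusters.items() if l != labels[i])
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

# Hypothetical toy data: (age in years, C-reactive protein in mg/L).
# These values are invented for illustration and are not from the study.
group_centers = [(40.0, 10.0), (60.0, 150.0), (80.0, 20.0)]
offsets = [(0, 0), (2, 5), (-2, 5), (2, -5), (-2, -5)]
raw = [[cx + dx, cy + dy] for cx, cy in group_centers for dx, dy in offsets]
scaled = zscore(raw)  # normalize so both variables contribute equally

results = {}
for k in range(2, 6):
    labels_k, inertia = kmeans(scaled, k)
    results[k] = (inertia, silhouette(scaled, labels_k), labels_k)
    print(f"k={k}  inertia={inertia:.3f}  silhouette={results[k][1]:.3f}")

best_k = max(results, key=lambda k: results[k][1])
labels = results[best_k][2]
print("best k by silhouette:", best_k)
```

On this toy data the silhouette score selects three clusters, and the inertia values show the corresponding elbow. Without the normalization step, the variable with the larger numeric range (here C-reactive protein) would dominate the Euclidean distances.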
