Authors: Yang J.; Simmachan T.; Shakya S.; Boonkrong P.
Affiliation: Mahidol University
Dates: 2025-10-12; 2025-10-12; 2025-01-01
Source: Engineering Proceedings Vol.108 No.1 (2025)
URI: https://repository.li.mahidol.ac.th/handle/123456789/112524

Abstract: We developed a machine-learning model for International Classification of Diseases, 10th Revision (ICD-10) classification using data from 5108 patients. Nine features, including age, gender, body-mass index (BMI), and vital signs, were extracted to classify the top three ICD-10 categories: intestinal infections, tuberculosis, and other bacterial diseases. Decision tree, random forest, and XGBoost models were tested, with the synthetic minority over-sampling technique (SMOTE) and class weights used to mitigate class imbalance. The data were split into training and testing sets in an 80:20 ratio, and five-fold cross-validation was applied. The random forest model with class weights showed the best performance. Shapley additive explanations (SHAP) analysis highlighted BMI, gender, and pulse as key features. The developed model showed potential for enhancing ICD-10 classification in real-time and personalized medical applications.

Subject: Engineering
Title: Classification of Infectious and Parasitic Diseases by Smart Healthcare System
Type: Article
Indexed in: SCOPUS
DOI: 10.3390/engproc2025108014
Scopus ID: 2-s2.0-105017844595
ISSN: 26734591
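
Illustrative note: the abstract mentions an 80:20 train/test split and class weights for handling imbalance but does not give the exact weighting scheme or the class counts. The sketch below is only one plausible reading, assuming inverse-frequency ("balanced") weights, n_samples / (n_classes * class_count), which is the formula scikit-learn uses for class_weight="balanced". The label counts, label names, and the helper names balanced_class_weights and stratified_split are hypothetical and are not taken from the paper.

```python
from collections import Counter
import random


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), holding out test_fraction of each class."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    train_idx, test_idx = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical, imbalanced label counts (NOT the paper's data).
labels = (["intestinal"] * 700
          + ["tuberculosis"] * 200
          + ["other_bacterial"] * 100)

# 80:20 split that preserves class proportions in both parts.
train_idx, test_idx = stratified_split(labels, test_fraction=0.2)

# Weights are computed on the training labels only, so the test set
# does not influence how the classes are re-balanced.
weights = balanced_class_weights([labels[i] for i in train_idx])
for name, w in sorted(weights.items()):
    print(f"{name}: {w:.3f}")
```

Under this scheme the rarer a class is, the larger its weight, so errors on minority classes cost more during training. SMOTE, the other technique named in the abstract, instead adds synthetic minority samples and is not shown here.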