Bilingual Audio Depression Identification Model by Machine Learning
Issued Date
2025-01-01
Resource Type
Scopus ID
2-s2.0-105016359557
Journal Title
2025 International Technical Conference on Circuits/Systems, Computers and Communications (ITC-CSCC 2025)
Rights Holder(s)
SCOPUS
Bibliographic Citation
2025 International Technical Conference on Circuits/Systems, Computers and Communications (ITC-CSCC 2025) (2025)
Suggested Citation
Poomrittigul S., Kiatrungrit K., Homsiang P., Treebupachatsakul T. Bilingual Audio Depression Identification Model by Machine Learning. 2025 International Technical Conference on Circuits/Systems, Computers and Communications (ITC-CSCC 2025) (2025). doi:10.1109/ITC-CSCC66376.2025.11137688. Retrieved from: https://repository.li.mahidol.ac.th/handle/123456789/112290
Title
Bilingual Audio Depression Identification Model by Machine Learning
Author's Affiliation
Corresponding Author(s)
Other Contributor(s)
Abstract
The number of people with depression worldwide, and in Thailand in particular, continues to rise. Depression screening commonly relies on self-report questionnaires; however, these instruments provide only subjective assessments. Recent advances in machine learning offer potential improvements in diagnostic accuracy through more objective measures. This study evaluates the effectiveness of machine learning models in classifying depression using a bilingual audio dataset comprising Thai and English speech. Such models have the potential to assist clinicians by providing objective preliminary screening for depression based on vocal analysis, enhancing diagnostic precision and clinical decision-making. Several machine learning models were implemented, including KNN, MLP, Random Forest, Decision Tree, SGD, Logistic Regression, SVM, AdaBoost, and Gaussian Naïve Bayes, using MFCC features extracted from the audio datasets. The results indicate that machine learning models can classify and identify depression effectively even on bilingual audio datasets, compared with single-language models, with the highest accuracy of 0.95 achieved by MLP and KNN when the trained model was tested on single-language Thai audio.
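
For illustration, the pipeline described in the abstract, MFCC feature extraction from audio followed by training conventional classifiers such as KNN and MLP, might look roughly like the sketch below. It assumes librosa for MFCC extraction and scikit-learn for the classifiers; the file names, labels, feature averaging, and hyperparameters are illustrative assumptions, not details taken from the paper.

# Minimal sketch of an MFCC + classical-classifier pipeline (assumptions noted above).
import numpy as np
import librosa
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

def extract_mfcc(path, n_mfcc=13):
    """Load an audio file and return a fixed-length feature vector:
    MFCCs averaged over time frames (the averaging step is an assumption)."""
    signal, sr = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)

# Hypothetical file list and labels (0 = non-depressed, 1 = depressed);
# a real dataset would contain many Thai and English recordings.
audio_paths = ["thai_001.wav", "thai_002.wav", "eng_001.wav", "eng_002.wav"]
labels = [0, 1, 0, 1]

X = np.vstack([extract_mfcc(p) for p in audio_paths])
y = np.array(labels)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train and evaluate two of the classifiers mentioned in the abstract.
for name, clf in [("KNN", KNeighborsClassifier(n_neighbors=3)),
                  ("MLP", MLPClassifier(hidden_layer_sizes=(64,), max_iter=500))]:
    clf.fit(X_train, y_train)
    print(name, "accuracy:", accuracy_score(y_test, clf.predict(X_test)))

In practice, the same trained model could then be evaluated separately on Thai-only and English-only test sets to compare bilingual against single-language performance, as the study does.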
