Teacher-to-Teacher: Harmonizing Dual Expertise into a Unified Speech Emotion Model

dc.contributor.author: Singkul S.
dc.contributor.author: Yuenyong S.
dc.contributor.author: Wongpatikaseree K.
dc.contributor.correspondence: Singkul S.
dc.contributor.other: Mahidol University
dc.date.accessioned: 2025-02-24T18:20:51Z
dc.date.available: 2025-02-24T18:20:51Z
dc.date.issued: 2024-01-01
dc.description.abstract: This paper introduces the Teacher-to-Teacher (T2T) framework, a novel approach in speech emotion recognition (SER) specifically tailored for the Thai language. Leveraging the dual expertise of the Wav2Vec and Wav2Vec2 models, the T2T framework draws on knowledge from unsupervised and self-supervised learning to effectively address the unique challenges posed by tonal languages. By integrating these two powerful models into a unified SER framework, T2T enhances its capability to process and interpret nuanced emotional cues in speech, achieving superior performance compared to traditional SER methods. Evaluated across three major datasets - ThaiSER, EMOLA, and MU - the framework demonstrates significant improvements in unweighted accuracy and F1-score. Innovations such as emotional clustering representation and targeted emotional representation contribute to its high precision in detecting and differentiating subtle emotional states. Additionally, the integration of a fine-tuned teacher module aligns these advancements with practical SER applications, further increasing the framework's accuracy and sensitivity in real-world scenarios. The successful implementation of the T2T framework opens new avenues for enhancing SER technologies in other low-resource languages and extends its applicability to real-time processing applications, thereby advancing the field of computational emotion recognition.
dc.identifier.citation: Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics (2024), 2882-2887
dc.identifier.doi: 10.1109/SMC54092.2024.10830986
dc.identifier.issn: 1062922X
dc.identifier.scopus: 2-s2.0-85217844053
dc.identifier.uri: https://repository.li.mahidol.ac.th/handle/20.500.14594/105407
dc.rights.holder: SCOPUS
dc.subject: Computer Science
dc.subject: Engineering
dc.title: Teacher-to-Teacher: Harmonizing Dual Expertise into a Unified Speech Emotion Model
dc.type: Conference Paper
mu.datasource.scopus: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85217844053&origin=inward
oaire.citation.endPage: 2887
oaire.citation.startPage: 2882
oaire.citation.title: Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics
oairecerif.author.affiliation: Mahidol University
oairecerif.author.affiliation: Ltd
