Publication: Speech and prosodic processing for assistive technology
Issued Date
2013-12-01
ISSN
0922-6389
Other identifier(s)
2-s2.0-84894597328
Rights
Mahidol University
Rights Holder(s)
SCOPUS
Bibliographic Citation
Frontiers in Artificial Intelligence and Applications. Vol.253, (2013), 36-48
Suggested Citation
Lalita Narupiyakul, Vlado Keselj, Nick Cercone, Booncharoen Sirinaovakul. Speech and prosodic processing for assistive technology. Frontiers in Artificial Intelligence and Applications. Vol.253, (2013), 36-48. doi:10.3233/978-1-61499-258-5-36 Retrieved from: https://repository.li.mahidol.ac.th/handle/20.500.14594/31604
Title
Speech and prosodic processing for assistive technology
Abstract
A speaker's utterance may convey a meaning to the hearer that differs from what the speaker intended. Such ambiguities can be resolved by placing emphasis (accent) at different positions. In human communication, an utterance is emphasized at its focus to mark the important content and reduce ambiguity. Our Focus-to-Emphasize Tone (FET) system determines how a speaker's utterances are influenced by focus and by the speaker's intention. We investigate the relationships among focus information, speaker's intention and prosodic phenomena in order to recognize intonation patterns and annotate a sentence with prosodic marks. The proposed FET analysis includes: (i) generating the constraints for foci, speaker's intention and prosodic features, (ii) defining the intonation patterns, and (iii) labelling a set of prosodic marks for a sentence. We also design the FET structure to support the analysis and to contain focus, speaker's intention and prosodic components. An implementation of the system is described, and evaluation results on the CMU Communicator (CMU-COM) dataset are presented. © 2013 The authors and IOS Press. All rights reserved.