Publication:
A Comparative Study of Using Bag-of-Words and Word-Embedding Attributes in the Spoiler Classification of English and Thai Text

dc.contributor.author: Rangsipan Marukatat [en_US]
dc.contributor.other: Mahidol University [en_US]
dc.date.accessioned: 2020-01-27T03:32:11Z
dc.date.available: 2020-01-27T03:32:11Z
dc.date.issued: 2020-01-01 [en_US]
dc.description.abstract: © 2020, Springer Nature Switzerland AG. This research compares the effectiveness of using traditional bag-of-words and word-embedding attributes to classify movie comments into spoiler or non-spoiler. Both approaches were applied to comments in English, an inflectional language; and in Thai, a non-inflectional language. Experimental results suggested that in terms of classification performance, word embedding was not clearly better than bag of words. Yet, a decision to choose it over bag of words could be due to its scalability. Between Word2Vec and FastText embeddings, the former was favorable when few out-of-vocabulary (OOV) words were present. Finally, although FastText was expected to be helpful with a large number of OOV words, its benefit was hardly seen for Thai language. [en_US]
dc.identifier.citation: Studies in Computational Intelligence. Vol.847, (2020), 81-93 [en_US]
dc.identifier.doi: 10.1007/978-3-030-25217-5_7 [en_US]
dc.identifier.issn: 18609503 [en_US]
dc.identifier.issn: 1860949X [en_US]
dc.identifier.other: 2-s2.0-85077127495 [en_US]
dc.identifier.uri: https://repository.li.mahidol.ac.th/handle/123456789/49583
dc.rights: Mahidol University [en_US]
dc.rights.holder: SCOPUS [en_US]
dc.source.uri: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85077127495&origin=inward [en_US]
dc.subject: Computer Science [en_US]
dc.title: A Comparative Study of Using Bag-of-Words and Word-Embedding Attributes in the Spoiler Classification of English and Thai Text [en_US]
dc.type: Conference Paper [en_US]
dspace.entity.type: Publication
mu.datasource.scopus: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85077127495&origin=inward [en_US]
