Rangsipan Marukatat
Mahidol University
2020-01-27
2020-01-27
2020-01-01
Studies in Computational Intelligence. Vol.847, (2020), 81-93
18609503
1860949X
2-s2.0-85077127495
https://repository.li.mahidol.ac.th/handle/123456789/49583
© 2020, Springer Nature Switzerland AG.

This research compares the effectiveness of using traditional bag-of-words and word-embedding attributes to classify movie comments into spoiler or non-spoiler. Both approaches were applied to comments in English, an inflectional language; and in Thai, a non-inflectional language. Experimental results suggested that in terms of classification performance, word embedding was not clearly better than bag of words. Yet, a decision to choose it over bag of words could be due to its scalability. Between Word2Vec and FastText embeddings, the former was favorable when few out-of-vocabulary (OOV) words were present. Finally, although FastText was expected to be helpful with a large number of OOV words, its benefit was hardly seen for Thai language.

Mahidol University
Computer Science
A Comparative Study of Using Bag-of-Words and Word-Embedding Attributes in the Spoiler Classification of English and Thai Text
Conference Paper
SCOPUS
10.1007/978-3-030-25217-5_7