Publication:
Similarity measurement for sentiment classification on textual reviews

dc.contributor.author: Tan Thongtan (en_US)
dc.contributor.author: Tanasanee Phienthrakul (en_US)
dc.contributor.other: Mahidol University (en_US)
dc.date.accessioned: 2019-08-23T10:57:36Z
dc.date.available: 2019-08-23T10:57:36Z
dc.date.issued: 2018-03-24 (en_US)
dc.description.abstract: © 2018 Association for Computing Machinery. Sentiment classification on textual reviews refers to classifying textual reviews based on whether they are positive or negative. This research focuses on classifying movie reviews, and is benchmarked on the IMDB dataset, which consists of long movie reviews, using accuracy as the evaluation metric. In sentiment classification, each document must be mapped to a fixed length vector. Document embedding models map each document to a dense, low-dimensional vector in continuous vector space. This research proposes to train document embedding using cosine similarity instead of dot product. Experiments on the IMDB dataset show that accuracy is improved when using cosine similarity compared to using dot product, while using feature combination with Naïve-Bayes weighted bag of n-grams achieves a new state of the art accuracy of 97.4%. (en_US)
dc.identifier.citation: ACM International Conference Proceeding Series. (2018), 24-28 (en_US)
dc.identifier.doi: 10.1145/3206185.3206204 (en_US)
dc.identifier.other: 2-s2.0-85057605424 (en_US)
dc.identifier.uri: https://repository.li.mahidol.ac.th/handle/20.500.14594/45644
dc.rights: Mahidol University (en_US)
dc.rights.holder: SCOPUS (en_US)
dc.source.uri: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85057605424&origin=inward (en_US)
dc.subject: Computer Science (en_US)
dc.title: Similarity measurement for sentiment classification on textual reviews (en_US)
dc.type: Conference Paper (en_US)
dspace.entity.type: Publication
mu.datasource.scopus: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85057605424&origin=inward (en_US)
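
The abstract contrasts cosine similarity with the dot product as the similarity measure used when training document embeddings. The record does not give the paper's training objective, so the following is only a minimal sketch of the two measures themselves, not the authors' method. The function names and the toy vectors `doc` and `word` are hypothetical, chosen for illustration.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine_similarity(u, v):
    """Dot product divided by the product of the two vector lengths."""
    norm_u = math.sqrt(dot(u, u))
    norm_v = math.sqrt(dot(v, v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot(u, v) / (norm_u * norm_v)


# Hypothetical 2-dimensional embeddings (illustrative values only).
doc = [3.0, 4.0]
word = [6.0, 8.0]

# Scaling an embedding changes the dot product but not the cosine similarity.
doc_scaled = [2.0 * x for x in doc]

dot_original = dot(doc, word)
dot_scaled = dot(doc_scaled, word)
cos_original = cosine_similarity(doc, word)
cos_scaled = cosine_similarity(doc_scaled, word)
```

In this toy case the dot product doubles when `doc` is scaled, while the cosine similarity is unchanged, because cosine similarity depends only on the angle between the vectors and is bounded to the range [-1, 1].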
