Publication: Classification of Tweets Related to Illegal Activities in Thai Language
dc.contributor.author | Sumeth Yuenyong | en_US |
dc.contributor.author | Narit Hnoohom | en_US |
dc.contributor.author | Konlakorn Wongpatikaseree | en_US |
dc.contributor.author | Teerapong Pheungbun Na Ayutthaya | en_US |
dc.contributor.other | Mahidol University | en_US |
dc.date.accessioned | 2019-08-23T10:56:15Z | |
dc.date.available | 2019-08-23T10:56:15Z | |
dc.date.issued | 2018-07-02 | en_US |
dc.description.abstract | © 2018 IEEE. This paper presents the classification of tweets related to illegal activities in the Thai language. The unfiltered nature of Twitter allows it to be used as a platform for communication about illegal activities. The sheer number of tweets makes automatic tweet classification necessary to detect these illegal tweets. Very little has been done about this issue, especially in the Thai language. Tweet classification is more difficult than standard text classification due to the short length and colloquial nature of tweets. Furthermore, the training data is imbalanced because legal tweets are very easy to find while illegal tweets of specific types are quite hard to come by. We propose a tree-like hierarchical model where each node is a full deep neural network based on a convolutional LSTM architecture. To deal with the highly imbalanced training data, tweets were classified in two stages: first as legal or illegal, and then among the illegal classes. Furthermore, ensemble classifiers were used to detect difficult illegal classes that were misclassified as legal by the first stage. Experimental results show that this approach performs significantly better than the baseline of using a single network to classify among all classes in a single stage. | en_US |
dc.identifier.citation | 2018 International Joint Symposium on Artificial Intelligence and Natural Language Processing, iSAI-NLP 2018 - Proceedings. (2018) | en_US |
dc.identifier.doi | 10.1109/iSAI-NLP.2018.8692858 | en_US |
dc.identifier.other | 2-s2.0-85065081604 | en_US |
dc.identifier.uri | https://repository.li.mahidol.ac.th/handle/20.500.14594/45615 | |
dc.rights | Mahidol University | en_US |
dc.rights.holder | SCOPUS | en_US |
dc.source.uri | https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85065081604&origin=inward | en_US |
dc.subject | Computer Science | en_US |
dc.subject | Medicine | en_US |
dc.title | Classification of Tweets Related to Illegal Activities in Thai Language | en_US |
dc.type | Conference Paper | en_US |
dspace.entity.type | Publication | |
mu.datasource.scopus | https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85065081604&origin=inward | en_US |
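
The two-stage routing described in the abstract can be illustrated with a minimal sketch. It shows only the control flow: a legal/illegal gate, a second classifier over the illegal classes, and ensemble detectors that re-check tweets the gate marked as legal. Everything here is an assumption made for illustration: the function names, the class labels ("drugs", "gambling"), the majority-vote rule, and the keyword rules that stand in for the paper's convolutional LSTM networks. None of it is taken from the paper itself.

```python
from typing import Callable, Dict, List

# Hypothetical type aliases; in the paper each node is a neural network.
BinaryClassifier = Callable[[str], bool]
MultiClassifier = Callable[[str], str]


def majority_vote(members: List[BinaryClassifier], tweet: str) -> bool:
    """Return True when more than half of the ensemble members fire."""
    votes = sum(1 for member in members if member(tweet))
    return votes * 2 > len(members)


def classify(
    tweet: str,
    is_illegal: BinaryClassifier,
    illegal_type: MultiClassifier,
    rescue_ensembles: Dict[str, List[BinaryClassifier]],
) -> str:
    """Route a tweet through the two stages, then the rescue ensembles."""
    # Stage 1: legal vs. illegal.
    if is_illegal(tweet):
        # Stage 2: choose among the illegal classes only.
        return illegal_type(tweet)
    # Rescue step: re-check tweets that stage 1 labelled legal against
    # ensembles for classes assumed to be hard for stage 1 to catch.
    for label, members in rescue_ensembles.items():
        if majority_vote(members, tweet):
            return label
    return "legal"


# Toy keyword rules standing in for trained models (assumptions only).
toy_is_illegal: BinaryClassifier = lambda t: "pills" in t
toy_illegal_type: MultiClassifier = lambda t: "drugs"
toy_rescue: Dict[str, List[BinaryClassifier]] = {
    "gambling": [
        lambda t: "bet" in t,
        lambda t: "odds" in t,
        lambda t: "casino" in t,
    ],
}

if __name__ == "__main__":
    for text in ["cheap pills here", "bet now, best odds", "nice weather today"]:
        print(text, "->", classify(text, toy_is_illegal, toy_illegal_type, toy_rescue))
```

Splitting the decision this way means the second-stage classifier never sees the abundant legal class, which is one plausible reading of how the staged design addresses the class imbalance mentioned in the abstract.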