Publication:
Tree-based text stream clustering with application to spam mail classification

dc.contributor.author: Phimphaka Taninpong [en_US]
dc.contributor.author: Sudsanguan Ngamsuriyaroj [en_US]
dc.contributor.other: Mahidol University [en_US]
dc.date.accessioned: 2019-08-23T10:43:37Z
dc.date.available: 2019-08-23T10:43:37Z
dc.date.issued: 2018-01-01 [en_US]
dc.description.abstract: Copyright © 2018 Inderscience Enterprises Ltd. This paper proposes a new text clustering algorithm based on a tree structure. The main idea of the clustering algorithm is that a sub-tree at a specific node represents a document cluster. Our clustering algorithm is a single-pass scanning algorithm which traverses down the tree to search for all clusters without having to predefine the number of clusters. Thus, it fits our objectives of producing document clusters with high cohesion and keeping the number of clusters to a minimum. Moreover, an incremental learning process is performed after a new document is inserted into the tree, and the clusters are rebuilt to accommodate the new information. In addition, we applied the proposed clustering algorithm to spam mail classification, and the experimental results show that the tree-based text clustering spam filter gives higher accuracy and specificity than Cobweb clustering, naïve Bayes and KNN. [en_US]
dc.identifier.citation: International Journal of Data Mining, Modelling and Management. Vol.10, No.4 (2018), 353-370 [en_US]
dc.identifier.doi: 10.1504/IJDMMM.2018.095354 [en_US]
dc.identifier.issn: 17591171 [en_US]
dc.identifier.issn: 17591163 [en_US]
dc.identifier.other: 2-s2.0-85054534251 [en_US]
dc.identifier.uri: https://repository.li.mahidol.ac.th/handle/20.500.14594/45390
dc.rights: Mahidol University [en_US]
dc.rights.holder: SCOPUS [en_US]
dc.source.uri: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85054534251&origin=inward [en_US]
dc.subject: Business, Management and Accounting [en_US]
dc.subject: Computer Science [en_US]
dc.subject: Mathematics [en_US]
dc.title: Tree-based text stream clustering with application to spam mail classification [en_US]
dc.type: Article [en_US]
dspace.entity.type: Publication
mu.datasource.scopus: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85054534251&origin=inward [en_US]
