Publication: Incremental naïve bayesian spam mail filtering and variant incremental training
Issued Date
2009-11-10
Other identifier(s)
2-s2.0-70350706167
Rights
Mahidol University
Rights Holder(s)
SCOPUS
Bibliographic Citation
Proceedings of the 2009 8th IEEE/ACIS International Conference on Computer and Information Science, ICIS 2009. (2009), 383-387
Suggested Citation
Phimphaka Taninpong, Sudsanguan Ngamsuriyaroj. Incremental naïve bayesian spam mail filtering and variant incremental training. Proceedings of the 2009 8th IEEE/ACIS International Conference on Computer and Information Science, ICIS 2009. (2009), 383-387. doi:10.1109/ICIS.2009.176 Retrieved from: https://repository.li.mahidol.ac.th/handle/123456789/27485
Abstract
This paper proposes an incremental spam mail filter based on Naïve Bayesian classification, chosen for its simplicity and adaptability. To keep the training set small and bounded in size, a sliding window is applied and the training set is updated as new emails are received. In effect, the features in the training set are incrementally updated, so the model can adapt to new spam patterns. In addition, we present three incremental training schemes: a window containing only the most recent emails, a window containing the previous batch of emails, and a window containing all emails seen so far. The proposed model is evaluated on two spam corpora: Trec05p-1 [12] and Trec06p [13]. In our experiments, the window size is varied, and the processing time per message and the ham and spam misclassification rates are measured. The results show that the third incremental training scheme gives the best results, and that the window size significantly affects both the misclassification rates and the processing time. © 2009 IEEE.
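
The sliding-window idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `SlidingWindowNB`, the whitespace-style token lists, the multinomial model with add-one smoothing, and the use of `window_size=None` to stand for "all emails seen so far" are all assumptions made for illustration. The sketch keeps per-class token counts in step with the window, so adding a new message and evicting the oldest one are both incremental updates rather than a full retrain.

```python
import math
from collections import Counter, deque


class SlidingWindowNB:
    """Illustrative multinomial Naive Bayes trained on a sliding window of messages."""

    def __init__(self, window_size=None):
        # Assumption: window_size=None means the window never evicts anything.
        self.window_size = window_size
        self.window = deque()  # (tokens, label) pairs, oldest first
        self.token_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = Counter()

    def _apply(self, tokens, label, sign):
        """Add (sign=+1) or remove (sign=-1) one message from the counts."""
        self.doc_counts[label] += sign
        counts = self.token_counts[label]
        for tok in tokens:
            counts[tok] += sign
            if counts[tok] <= 0:
                del counts[tok]  # drop features that left the window

    def update(self, tokens, label):
        """Insert a labeled message and evict the oldest ones if the window is full."""
        tokens = list(tokens)
        self.window.append((tokens, label))
        self._apply(tokens, label, +1)
        while self.window_size is not None and len(self.window) > self.window_size:
            old_tokens, old_label = self.window.popleft()
            self._apply(old_tokens, old_label, -1)

    def classify(self, tokens):
        """Return the label with the higher smoothed log-posterior."""
        vocab_size = len(set(self.token_counts["spam"]) | set(self.token_counts["ham"]))
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in ("spam", "ham"):
            counts = self.token_counts[label]
            total_tokens = sum(counts.values())
            # Add-one smoothing on the class prior and on token likelihoods;
            # the extra +1 in the denominator reserves mass for unseen tokens.
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            for tok in tokens:
                score += math.log((counts[tok] + 1) / (total_tokens + vocab_size + 1))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


nb = SlidingWindowNB(window_size=4)
nb.update(["cheap", "pills", "buy"], "spam")
nb.update(["meeting", "agenda", "monday"], "ham")
nb.update(["buy", "cheap", "watches"], "spam")
nb.update(["project", "meeting", "notes"], "ham")
print(nb.classify(["cheap", "buy"]))
print(nb.classify(["meeting", "notes"]))
```

With a small `window_size`, old features such as a retired spam keyword disappear from the counts as soon as the messages that carried them are evicted, which is one plausible reading of how the window size trades adaptability against accuracy and processing time.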
