Incremental Naïve Bayesian Spam Mail Filtering and Variant Incremental Training
Issued Date
2009
Resource Type
Language
eng
ISBN
978-0-7695-3641-5
Rights
Mahidol University
Rights Holder(s)
IEEEXPLORE
Suggested Citation
Phimphaka Taninpong, Sudsanguan Ngamsuriyaroj, สุดสงวน งามสุริยโรจน์ (2009). Incremental Naïve Bayesian Spam Mail Filtering and Variant Incremental Training. Retrieved from: https://repository.li.mahidol.ac.th/handle/20.500.14594/10453
Title
Incremental Naïve Bayesian Spam Mail Filtering and Variant Incremental Training
Abstract
This paper proposes an incremental spam mail filtering approach based on Naïve Bayesian classification, chosen for its simplicity and adaptability. To keep the training set small and bounded in size, a sliding window is applied and the training set is updated as new emails are received. In effect, the features in the training set are updated incrementally, so the model can adapt to new spam patterns. In addition, we present three incremental training schemes: a window containing only the most recent emails, a window containing the previous batch of emails, and a window containing all emails seen so far. The proposed model is evaluated on two spam corpora: Trec05p-1 [12] and Trec06p [13]. In our experiments, the window size is varied, and the processing time per message and the ham and spam misclassification rates are measured. The results show that the third incremental training scheme gives the best outcomes, and that the window size significantly affects both the misclassification rates and the processing time.
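The sliding-window idea in the abstract can be illustrated with a minimal sketch. It loosely corresponds to the first scheme (a window holding only the most recent emails) and is not the authors' implementation: the class name `SlidingWindowNB`, the whitespace tokenizer, the use of Laplace (add-one) smoothing, and the per-message eviction policy are all assumptions made for illustration.

```python
import math
from collections import Counter, deque


class SlidingWindowNB:
    """Hypothetical multinomial Naive Bayes over the most recent messages."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque()  # (tokens, label) pairs, oldest first
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = Counter()

    @staticmethod
    def tokenize(text):
        # Assumed tokenizer: lowercase and split on whitespace.
        return text.lower().split()

    def add(self, text, label):
        """Add one labelled message and evict the oldest if the window is full."""
        tokens = self.tokenize(text)
        self.window.append((tokens, label))
        self.word_counts[label].update(tokens)
        self.doc_counts[label] += 1
        if len(self.window) > self.window_size:
            old_tokens, old_label = self.window.popleft()
            self.word_counts[old_label].subtract(old_tokens)
            # Unary plus drops entries whose count fell to zero.
            self.word_counts[old_label] = +self.word_counts[old_label]
            self.doc_counts[old_label] -= 1

    def classify(self, text):
        """Return the label with the higher smoothed log-posterior."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in ("spam", "ham"):
            # Add-one smoothed class prior.
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            total_words = sum(self.word_counts[label].values())
            # The extra +1 reserves probability mass for unseen tokens.
            denom = total_words + len(vocab) + 1
            for tok in self.tokenize(text):
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


nb = SlidingWindowNB(window_size=4)
nb.add("cheap pills buy now", "spam")
nb.add("meeting agenda for monday", "ham")
nb.add("buy cheap watches now", "spam")
nb.add("lunch on monday", "ham")
print(nb.classify("buy cheap pills"))
print(nb.classify("monday meeting"))
```

Because counts are added when a message enters the window and subtracted when it leaves, each update touches only the tokens of at most two messages, so the model never has to be retrained from scratch. A larger `window_size` keeps more history at the cost of more memory and a larger vocabulary.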
Description
The 8th IEEE/ACIS International Conference on Computer and Information Science (ICIS 2009). Pine City Hotel, Shanghai, China, pages 383-387