Publication: Web proxy log classification for burst behavior
Issued Date
2017-02-08
Resource Type
ISSN
21593450
21593442
Other identifier(s)
2-s2.0-85015402538
Rights
Mahidol University
Rights Holder(s)
SCOPUS
Bibliographic Citation
IEEE Region 10 Annual International Conference, Proceedings/TENCON. (2017), 472-477
Suggested Citation
Nattapol Kiatkumjounwong, Sudsanguan Ngamsuriyaroj, Anon Plangprasopchok. Web proxy log classification for burst behavior. IEEE Region 10 Annual International Conference, Proceedings/TENCON. (2017), 472-477. doi:10.1109/TENCON.2016.7848044 Retrieved from: https://repository.li.mahidol.ac.th/handle/20.500.14594/42440
Title
Web proxy log classification for burst behavior
Other Contributor(s)
Abstract
© 2016 IEEE. Many organizations and most Internet service providers need to keep a history of web accesses in the form of proxy logs. Such logs are later used to analyze web usage and to investigate abnormal activities such as abuse, misuse, or fraud. This paper classifies web proxy logs into three classes: normal, non-burst, and burst. To filter out normal logs, we use the Apriori algorithm in the Weka mining tool to detect outliers based on the duration and bandwidth of logs for each file category. Burst logs are then separated from the remaining outlier logs using threshold rates computed for each file type. The experimental results show that about 80% of logs are normal, while burst logs account for about 2% and should be further investigated for unusual behavior. Because the volume of stored logs can be very large, processing them in a timely manner is difficult. We therefore measure the performance of parallel log processing on a Hadoop system while varying the data size and the number of nodes. We found that the speedup of log processing corresponds well to the increasing workload, which suggests that processing logs in real time is feasible.
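The three-way classification described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define how the rate is computed, so the sketch assumes rate = bytes transferred / duration. The `LogEntry` fields, the `is_outlier` flag (a stand-in for the result of the Apriori/Weka step, which is not reproduced here), and the threshold values are all hypothetical. The `speedup` helper uses the standard definition, single-node time divided by n-node time, which the abstract does not spell out.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class LogEntry:
    """Hypothetical, simplified view of one proxy log record."""
    file_type: str      # e.g. "video", "image" (assumed categories)
    bytes_sent: int     # bytes transferred for the request
    duration_s: float   # request duration in seconds
    is_outlier: bool    # assumed output of a separate outlier-detection step


# Hypothetical per-file-type threshold rates in bytes per second.
# The paper computes its own thresholds; these values are invented.
BURST_THRESHOLDS: Dict[str, float] = {
    "video": 2_000_000.0,
    "image": 500_000.0,
}


def classify(entry: LogEntry, thresholds: Dict[str, float]) -> str:
    """Return "normal", "non-burst", or "burst" for one log entry."""
    if not entry.is_outlier:
        return "normal"
    # Assumption: a rate cannot be computed for non-positive durations,
    # so such outliers are left in the non-burst class.
    if entry.duration_s <= 0:
        return "non-burst"
    rate = entry.bytes_sent / entry.duration_s
    threshold = thresholds.get(entry.file_type)
    # Outliers whose rate exceeds the threshold for their file type are bursts.
    if threshold is not None and rate > threshold:
        return "burst"
    return "non-burst"


def speedup(time_one_node_s: float, time_n_nodes_s: float) -> float:
    """Standard parallel speedup: single-node time over n-node time."""
    return time_one_node_s / time_n_nodes_s
```

For example, under these assumed thresholds an outlier video request that moves 10,000,000 bytes in 2 seconds has a rate of 5,000,000 bytes per second and is labelled a burst, while the same request moving 1,000,000 bytes stays non-burst.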