Publication: Improving pseudo-code detection in ubiquitous scholarly data using ensemble machine learning
Issued Date
2017-02-21
Other identifier(s)
2-s2.0-85016200381
Rights
Mahidol University
Rights Holder(s)
SCOPUS
Bibliographic Citation
20th International Computer Science and Engineering Conference: Smart Ubiquitos Computing and Knowledge, ICSEC 2016. (2017)
Suggested Citation
Suppawong Tuarob. Improving pseudo-code detection in ubiquitous scholarly data using ensemble machine learning. 20th International Computer Science and Engineering Conference: Smart Ubiquitos Computing and Knowledge, ICSEC 2016. (2017). doi:10.1109/ICSEC.2016.7859944 Retrieved from: https://repository.li.mahidol.ac.th/handle/123456789/42401
Abstract
© 2016 IEEE. New algorithms emerge constantly as computer science and other computation-related disciplines grow in sophistication and complexity. The majority of these algorithms are developed by professional researchers, who publish their algorithmic advances in scholarly articles, especially in the form of pseudo-code. The ability to automatically collect, manage, and index this pseudo-code could prove useful for computer scientists and software developers seeking cutting-edge algorithmic solutions to their problems. As a step towards automatic retrieval of pseudo-code, a machine-learning-based approach that detects and extracts pseudo-code in large-scale collections of scholarly documents has recently been proposed. In this paper, we extend the previous findings by investigating whether ensemble learning techniques can enhance the previously proposed classification methodology. The results show that Random Forest is by far the most effective ensemble learning method, improving classification performance by 13% over the best base classifier.
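As a rough illustration of the ensemble idea the abstract refers to, the sketch below trains several weak classifiers on bootstrap samples and combines them by majority vote. It is not the paper's implementation. It uses bagged decision stumps, a much-simplified relative of Random Forest (which additionally grows full trees and samples candidate features at each split). The two features, the toy values, and all function names are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def train_stump(rows, labels):
    """Fit a one-feature threshold rule (decision stump) with the fewest training errors."""
    best = None
    for f in range(len(rows[0])):
        for threshold in sorted({r[f] for r in rows}):
            for positive_above in (True, False):
                # Count rows where the rule's prediction disagrees with the label.
                errors = sum(
                    ((r[f] >= threshold) == positive_above) != bool(y)
                    for r, y in zip(rows, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, threshold, positive_above)
    _, f, threshold, positive_above = best
    return f, threshold, positive_above


def stump_predict(stump, row):
    """Return 1 (pseudo-code) or 0 (other text) for a single feature row."""
    f, threshold, positive_above = stump
    return int((row[f] >= threshold) == positive_above)


def train_bagged_stumps(rows, labels, n_estimators=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(rows)
    ensemble = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        ensemble.append(
            train_stump([rows[i] for i in idx], [labels[i] for i in idx])
        )
    return ensemble


def majority_vote(votes):
    """Return the most frequent vote."""
    return Counter(votes).most_common(1)[0][0]


def ensemble_predict(ensemble, row):
    """Combine the individual stump predictions by majority vote."""
    return majority_vote(stump_predict(s, row) for s in ensemble)


# Hypothetical features per text block (assumed, not taken from the paper):
# (fraction of lines starting with a control keyword, fraction of lines with math symbols)
rows = [(0.8, 0.6), (0.7, 0.5), (0.9, 0.4), (0.6, 0.7),
        (0.1, 0.1), (0.2, 0.0), (0.0, 0.2), (0.1, 0.3)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = pseudo-code, 0 = other text

ensemble = train_bagged_stumps(rows, labels)
print(ensemble_predict(ensemble, (0.95, 0.6)))  # a block that looks like pseudo-code
print(ensemble_predict(ensemble, (0.05, 0.1)))  # a block that looks like ordinary prose
```

An odd number of estimators is used so that a binary majority vote cannot tie. In practice one would likely reach for an established library implementation of Random Forest and for features derived from real documents, rather than this toy setup.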
