Publication: On evaluating the quality of rule-based classification systems
Issued Date
2017-10-01
ISSN
1881-803X
Other identifier(s)
2-s2.0-85030100427
Rights
Mahidol University
Rights Holder(s)
SCOPUS
Bibliographic Citation
ICIC Express Letters. Vol.11, No.10 (2017), 1515-1523
Suggested Citation
Nassim Dehouche. On evaluating the quality of rule-based classification systems. ICIC Express Letters. Vol.11, No.10 (2017), 1515-1523. Retrieved from: https://repository.li.mahidol.ac.th/handle/20.500.14594/42324
Author(s)
Nassim Dehouche
Abstract
© 2017. Two indicators are classically used to evaluate the quality of rule-based classification systems: predictive accuracy, i.e., the system’s ability to successfully reproduce the learning data, and coverage, i.e., the proportion of possible cases to which the logical rules constituting the system apply. In this work, we claim that these two indicators may be insufficient and that additional measures of quality may need to be developed. We show theoretically that classification systems presenting “good” predictive accuracy and coverage can nonetheless be trivially improved, and we illustrate this proposition with examples. To formalize our main claim, we characterize a property of reducibility: a classification system is said to be reducible if and only if its constituent rules can be replaced by a subset of their elementary conditions while preserving the quality of the system. We derive a time-efficient constructive algorithm that tests this property and, when the answer is positive, improves the system’s predictive accuracy and coverage. Furthermore, we provide a set of sufficient conditions that can be used to detect non-reducibility and thus validate rule-based classification systems. We use the proposed approach to evaluate a previously published work applied to a public dataset on business bankruptcy prediction, using three popular machine learning approaches (namely genetic algorithms, inductive learning and neural networks). The results of this application support our main claim. We conclude by suggesting that a classification system’s ability to clarify trade-offs between attributes should be measured and used as an additional performance indicator; developing such an indicator is a possible further extension of this work.
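The reducibility property described in the abstract can be illustrated with a brute-force check. This is a minimal sketch, not the paper's time-efficient algorithm: the rule representation, the first-match classification scheme, and the accuracy/coverage definitions below are assumptions made for illustration only.

```python
from itertools import combinations

# A rule is (conditions, label); a condition is (attribute, op, threshold).
def matches(conds, x):
    return all(x[a] >= t if op == ">=" else x[a] < t for (a, op, t) in conds)

def coverage(rules, data):
    # Proportion of examples matched by at least one rule.
    return sum(any(matches(c, x) for c, _ in rules) for x, _ in data) / len(data)

def accuracy(rules, data):
    # Fraction of covered examples whose first matching rule predicts correctly.
    covered = correct = 0
    for x, y in data:
        for conds, label in rules:
            if matches(conds, x):
                covered += 1
                correct += (label == y)
                break
    return correct / covered if covered else 0.0

def is_reducible(rules, data):
    # Reducible: some rule's conditions can be replaced by a proper, non-empty
    # subset without degrading either accuracy or coverage.
    base_acc, base_cov = accuracy(rules, data), coverage(rules, data)
    for i, (conds, label) in enumerate(rules):
        for k in range(1, len(conds)):
            for subset in combinations(conds, k):
                trial = rules[:i] + [(list(subset), label)] + rules[i + 1:]
                if (accuracy(trial, data) >= base_acc
                        and coverage(trial, data) >= base_cov):
                    return True
    return False
```

For example, a single rule "a >= 1 AND b >= 1 → class 1" is reducible on data where the condition on b never discriminates: dropping it leaves accuracy and coverage unchanged. Note that this exhaustive check is exponential in the number of conditions per rule, which is precisely why a time-efficient test of the kind the paper derives matters.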