Publication:
PKIP: Feature selection in text categorization for item banks

dc.contributor.author: Atom Nuntiyagul [en_US]
dc.contributor.author: Kanlaya Naruedomkul [en_US]
dc.contributor.author: Nick Cercone [en_US]
dc.contributor.author: Damras Wongsawang [en_US]
dc.contributor.other: Mahidol University [en_US]
dc.contributor.other: Dalhousie University [en_US]
dc.date.accessioned: 2018-06-21T08:13:39Z
dc.date.available: 2018-06-21T08:13:39Z
dc.date.issued: 2005-12-01 [en_US]
dc.description.abstract: We propose an alternative approach to text categorization for item banks. An item bank is a collection of textual data in which each item consists of short sentences and has only a few words relevant for categorization; some items can belong to several categories. Traditional categorization techniques cannot provide sufficiently accurate results because of this "lack of words" problem. We observe that items in the same category always share the same group of terms (or keywords), and that similar locations of these terms in phrases suggest that the items have a high probability of being in the same category. Our new methodology, PKIP (patterned keywords in phrase), is proposed to improve categorization accuracy and recover from the "lack of words" problem. The k highest-weighted words in each category are selected as keywords, and their patterns are mapped for feature selection. The value of k affects the classification result. The item bank categorization process is based on a supervised machine learning technique. The item bank sample used in this research is a collection of Thai primary mathematics problems, and we use SVM in the Weka machine learning software package as our classifier. The classification results show that our approach produces acceptable results, with the best classification result obtained when k = 12. © 2005 IEEE. [en_US]
dc.identifier.citation: Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI. Vol.2005, (2005), 212-216 [en_US]
dc.identifier.doi: 10.1109/ICTAI.2005.95 [en_US]
dc.identifier.issn: 10823409 [en_US]
dc.identifier.other: 2-s2.0-33845883004 [en_US]
dc.identifier.uri: https://repository.li.mahidol.ac.th/handle/20.500.14594/16500
dc.rights: Mahidol University [en_US]
dc.rights.holder: SCOPUS [en_US]
dc.source.uri: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=33845883004&origin=inward [en_US]
dc.subject: Engineering [en_US]
dc.title: PKIP: Feature selection in text categorization for item banks [en_US]
dc.type: Conference Paper [en_US]
dspace.entity.type: Publication
mu.datasource.scopus: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=33845883004&origin=inward [en_US]
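
Illustrative sketch (not part of the repository record): the abstract describes selecting the k highest-weighted words per category as keywords and using the locations of those keywords in phrases as features. The record does not say how the weights are computed or how the patterns are encoded, so everything below is an assumption made for illustration: the weight is a simple term-frequency times inverse-category-frequency score, the "pattern" is reduced to the 1-based position of each keyword's first occurrence, and the names `top_k_keywords` and `keyword_position_features` are hypothetical. It is not the authors' implementation, and the toy items are invented English stand-ins for the Thai data.

```python
import math
from collections import Counter, defaultdict


def top_k_keywords(items, k):
    """Return {category: list of the k highest-weighted words}.

    items: list of (tokens, category) pairs.
    ASSUMPTION: weight = count of the word in the category
    * log((1 + number of categories) / number of categories containing it).
    The abstract does not specify the weighting scheme.
    """
    term_counts = defaultdict(Counter)  # category -> word -> count
    for tokens, category in items:
        term_counts[category].update(tokens)

    n_categories = len(term_counts)
    category_freq = Counter()  # word -> number of categories containing it
    for counts in term_counts.values():
        category_freq.update(counts.keys())

    keywords = {}
    for category, counts in term_counts.items():
        def weight(word, counts=counts):
            idf = math.log((1 + n_categories) / category_freq[word])
            return counts[word] * idf

        # Highest weight first; ties broken alphabetically for determinism.
        ranked = sorted(counts, key=lambda w: (-weight(w), w))
        keywords[category] = ranked[:k]
    return keywords


def keyword_position_features(tokens, keyword_list):
    """One feature per keyword: 1-based position of its first occurrence
    in the item, or 0 if the keyword is absent.

    ASSUMPTION: a stand-in for the "patterned keywords in phrase" idea.
    """
    return [tokens.index(w) + 1 if w in tokens else 0 for w in keyword_list]


# Toy data, invented for illustration only.
items = [
    (["add", "two", "apples"], "addition"),
    (["add", "three", "apples"], "addition"),
    (["subtract", "two", "apples"], "subtraction"),
]
keywords = top_k_keywords(items, k=2)
features = keyword_position_features(["please", "add", "three"], keywords["addition"])
```

The resulting vectors could then be passed to any supervised classifier; the abstract reports using SVM in Weka and finding k = 12 to work best on its data, which this toy example does not attempt to reproduce.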
