Publication:
Automatic Classification of Algorithm Citation Functions in Scientific Literature

dc.contributor.author: Suppawong Tuarob [en_US]
dc.contributor.author: Sung Woo Kang [en_US]
dc.contributor.author: Poom Wettayakorn [en_US]
dc.contributor.author: Chanatip Pornprasit [en_US]
dc.contributor.author: Tanakitti Sachati [en_US]
dc.contributor.author: Saeed Ul Hassan [en_US]
dc.contributor.author: Peter Haddawy [en_US]
dc.contributor.other: Information Technology University [en_US]
dc.contributor.other: Inha University, Incheon [en_US]
dc.contributor.other: Mahidol University [en_US]
dc.date.accessioned: 2020-10-05T04:39:38Z
dc.date.available: 2020-10-05T04:39:38Z
dc.date.issued: 2020-10-01 [en_US]
dc.description.abstract: © 1989-2012 IEEE. Computer sciences and related disciplines evolve around developing, evaluating, and applying algorithms. Typically, an algorithm is not developed from scratch, but uses and builds upon existing ones, which often are proposed and published in scholarly articles. The ability to capture this evolution relationship among these algorithms in scientific literature would not only allow us to understand how a particular algorithm is composed, but also shed light on large-scale analysis of algorithmic evolution through different temporal spans and thematic scales. We propose to capture such evolution relationship between two algorithms by investigating the knowledge represented in citation contexts, where authors explain how cited algorithms are used in their works. A set of heterogeneous ensemble machine-learning methods is proposed, where the combination of two base classifiers trained with heterogeneous feature types is used to automatically identify the algorithm usage relationship. The proposed heterogeneous ensemble methods achieve the best average F1 of 0.749 and 0.905 for fine-grained and binary algorithm citation function classification, respectively. The success of this study will allow us to generate a large-scale algorithm citation network from a collection of scholarly documents representing multiple time spans, venues, and fields of study. Such a network will be used as an instrument not only to answer critical questions in algorithm search, such as identifying the most influential and generalizable algorithms, but also to study the evolution of algorithmic development and trends over time. [en_US]
dc.identifier.citation: IEEE Transactions on Knowledge and Data Engineering. Vol.32, No.10 (2020), 1881-1896 [en_US]
dc.identifier.doi: 10.1109/TKDE.2019.2913376 [en_US]
dc.identifier.issn: 15582191 [en_US]
dc.identifier.issn: 10414347 [en_US]
dc.identifier.other: 2-s2.0-85091230018 [en_US]
dc.identifier.uri: https://repository.li.mahidol.ac.th/handle/123456789/59041
dc.rights: Mahidol University [en_US]
dc.rights.holder: SCOPUS [en_US]
dc.source.uri: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85091230018&origin=inward [en_US]
dc.subject: Computer Science [en_US]
dc.title: Automatic Classification of Algorithm Citation Functions in Scientific Literature [en_US]
dc.type: Article [en_US]
dspace.entity.type: Publication
mu.datasource.scopus: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85091230018&origin=inward [en_US]
