Publication: Semi-automated augmentation of pandas dataframes
dc.contributor.author | Steven Lynden | en_US |
dc.contributor.author | Waran Taveekarn | en_US |
dc.contributor.other | National Institute of Advanced Industrial Science and Technology | en_US |
dc.contributor.other | Mahidol University | en_US |
dc.date.accessioned | 2020-01-27T08:23:27Z | |
dc.date.available | 2020-01-27T08:23:27Z | |
dc.date.issued | 2019-01-01 | en_US |
dc.description.abstract | © 2019, Springer Nature Singapore Pte Ltd. Creative feature engineering is an important part of machine learning prediction tasks, and it can be facilitated by augmenting datasets with additional data to improve predictions. This paper presents a semi-automatic approach to augmenting existing datasets, represented as pandas dataframes, with data from open data sources. The approach aims to (1) automatically suggest data augmentation options given an existing set of features, and (2) automatically augment the data when the user selects a suggestion. The paper demonstrates the performance of the approach in aligning typical machine learning datasets with open data sources and in suggesting useful augmentation options, and it describes the design and implementation of a software tool implementing the approach, available as open-source software. | en_US |
dc.identifier.citation | Communications in Computer and Information Science. Vol.1071, (2019), 70-79 | en_US |
dc.identifier.doi | 10.1007/978-981-32-9563-6_8 | en_US |
dc.identifier.issn | 18650937 | en_US |
dc.identifier.issn | 18650929 | en_US |
dc.identifier.other | 2-s2.0-85070014915 | en_US |
dc.identifier.uri | https://repository.li.mahidol.ac.th/handle/20.500.14594/50680 | |
dc.rights | Mahidol University | en_US |
dc.rights.holder | SCOPUS | en_US |
dc.source.uri | https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85070014915&origin=inward | en_US |
dc.subject | Computer Science | en_US |
dc.subject | Mathematics | en_US |
dc.title | Semi-automated augmentation of pandas dataframes | en_US |
dc.type | Conference Paper | en_US |
dspace.entity.type | Publication | |
mu.datasource.scopus | https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85070014915&origin=inward | en_US |
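
The two aims stated in the abstract (suggesting augmentation options, then augmenting on selection) can be illustrated with a minimal sketch. This is not the authors' tool or algorithm: the function names `suggest_augmentations` and `augment`, the value-overlap heuristic, the `min_overlap` threshold, and the in-memory "open data" table with made-up values are all assumptions chosen for illustration. Only standard pandas calls are used.

```python
import pandas as pd


def suggest_augmentations(df, source, min_overlap=0.5):
    """Hypothetical helper: propose (df_column, source_column, overlap)
    join candidates, ranked by the share of df values found in source."""
    suggestions = []
    for left in df.columns:
        left_values = set(df[left].dropna().astype(str))
        if not left_values:
            continue
        for right in source.columns:
            right_values = set(source[right].dropna().astype(str))
            overlap = len(left_values & right_values) / len(left_values)
            if overlap >= min_overlap:
                suggestions.append((left, right, overlap))
    # Best-aligned column pair first
    return sorted(suggestions, key=lambda s: s[2], reverse=True)


def augment(df, source, left_on, right_on):
    """Hypothetical helper: left-join the source onto df so that every
    original row is kept, with NaN where the source has no match."""
    return df.merge(source, how="left", left_on=left_on, right_on=right_on)


# Made-up data standing in for a user dataset and an open data source
dataset = pd.DataFrame(
    {"region": ["north", "south", "east"], "sales": [120, 300, 210]}
)
open_data = pd.DataFrame(
    {"name": ["south", "north", "west"], "rainfall_mm": [800.0, 650.0, 400.0]}
)

suggestions = suggest_augmentations(dataset, open_data)
left_col, right_col, _ = suggestions[0]  # the user "selects" the top suggestion
augmented = augment(dataset, open_data, left_col, right_col)
print(augmented)
```

A left join is used here so that the original dataset never loses rows; unmatched rows simply receive missing values in the new columns, which a later feature engineering step would need to handle.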