Publication:
Semi-automated augmentation of pandas dataframes

dc.contributor.author: Steven Lynden [en_US]
dc.contributor.author: Waran Taveekarn [en_US]
dc.contributor.other: National Institute of Advanced Industrial Science and Technology [en_US]
dc.contributor.other: Mahidol University [en_US]
dc.date.accessioned: 2020-01-27T08:23:27Z
dc.date.available: 2020-01-27T08:23:27Z
dc.date.issued: 2019-01-01 [en_US]
dc.description.abstract: © 2019, Springer Nature Singapore Pte Ltd. Creative feature engineering is an important aspect within machine learning prediction tasks which can be facilitated by augmenting datasets with additional data to improve predictions. This paper presents an approach towards augmenting existing datasets represented as pandas dataframes with data from open data sources, semi-automatically, with the aims of (1) automatically suggesting data augmentation options given an existing set of features, and (2) automatically augmenting the data when a suggestion is selected by the user. This paper demonstrates the performance of the approach in terms of aligning typical machine learning datasets with open data sources, suggesting useful augmentation options, and the design and implementation of a software tool implementing the approach, available as open-source software. [en_US]
dc.identifier.citation: Communications in Computer and Information Science. Vol.1071, (2019), 70-79 [en_US]
dc.identifier.doi: 10.1007/978-981-32-9563-6_8 [en_US]
dc.identifier.issn: 18650937 [en_US]
dc.identifier.issn: 18650929 [en_US]
dc.identifier.other: 2-s2.0-85070014915 [en_US]
dc.identifier.uri: https://repository.li.mahidol.ac.th/handle/20.500.14594/50680
dc.rights: Mahidol University [en_US]
dc.rights.holder: SCOPUS [en_US]
dc.source.uri: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85070014915&origin=inward [en_US]
dc.subject: Computer Science [en_US]
dc.subject: Mathematics [en_US]
dc.title: Semi-automated augmentation of pandas dataframes [en_US]
dc.type: Conference Paper [en_US]
dspace.entity.type: Publication
mu.datasource.scopus: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85070014915&origin=inward [en_US]
