Publication: A novel label aggregation with attenuated scores for ground-Truth identification of dataset annotation with crowdsourcing
dc.contributor.author | Ratchainant Thammasudjarit | en_US |
dc.contributor.author | Anon Plangprasopchok | en_US |
dc.contributor.author | Charnyote Pluempitiwiriyawej | en_US |
dc.contributor.other | Mahidol University | en_US |
dc.contributor.other | Thailand National Electronics and Computer Technology Center | en_US |
dc.date.accessioned | 2018-12-21T07:21:04Z | |
dc.date.accessioned | 2019-03-14T08:03:24Z | |
dc.date.available | 2018-12-21T07:21:04Z | |
dc.date.available | 2019-03-14T08:03:24Z | |
dc.date.issued | 2017-04-01 | en_US |
dc.description.abstract | © 2017 The Institute of Electronics, Information and Communication Engineers. Ground-truth identification, the process of inferring the most probable labels for a dataset from crowdsourced annotations, is a crucial step in making the dataset usable, e.g., for a supervised learning problem. Nevertheless, the process is challenging because annotations from multiple annotators are inconsistent and noisy. Existing methods require a set of data samples with corresponding ground-truth labels to precisely estimate annotator performance, but such samples are difficult to obtain in practice. Moreover, the process requires a post-editing step to validate indefinite labels, which are generally unidentifiable without thoroughly inspecting the whole annotated dataset. To address these challenges, this paper introduces: 1) the attenuated score (A-score), an indicator that locally measures annotator performance for segments of annotation sequences, and 2) a label aggregation method that applies the A-score for ground-truth identification. The experimental results demonstrate that A-score label aggregation outperforms majority vote on all datasets by accurately recovering more labels. It also achieves higher F1 scores than the strong baselines on all multi-class data. Additionally, the results suggest that the A-score is a promising indicator that helps identify indefinite labels for the post-editing procedure. | en_US |
dc.identifier.citation | IEICE Transactions on Information and Systems. Vol.E100D, No.4 (2017), 750-757 | en_US |
dc.identifier.doi | 10.1587/transinf.2016DAP0024 | en_US |
dc.identifier.issn | 17451361 | en_US |
dc.identifier.issn | 09168532 | en_US |
dc.identifier.other | 2-s2.0-85017700020 | en_US |
dc.identifier.uri | https://repository.li.mahidol.ac.th/handle/20.500.14594/42357 | |
dc.rights | Mahidol University | en_US |
dc.rights.holder | SCOPUS | en_US |
dc.source.uri | https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85017700020&origin=inward | en_US |
dc.subject | Computer Science | en_US |
dc.subject | Engineering | en_US |
dc.title | A novel label aggregation with attenuated scores for ground-Truth identification of dataset annotation with crowdsourcing | en_US |
dc.type | Article | en_US |
dspace.entity.type | Publication | |
mu.datasource.scopus | https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85017700020&origin=inward | en_US |