Publication:
Fast Streaming k-Means Clustering with Coreset Caching

dc.contributor.author: Yu Zhang [en_US]
dc.contributor.author: Kanat Tangwongsan [en_US]
dc.contributor.author: Srikanta Tirthapura [en_US]
dc.contributor.other: Mahidol University [en_US]
dc.contributor.other: Iowa State University [en_US]
dc.date.accessioned: 2020-10-05T04:42:58Z
dc.date.available: 2020-10-05T04:42:58Z
dc.date.issued: 2020-01-01 [en_US]
dc.description.abstract: We present new algorithms for k-means clustering on a data stream, with a focus on providing fast responses to clustering queries. Compared to the state of the art, our algorithms provide substantial improvements in the query time for cluster centers while retaining the desirable properties of provably small approximation error and low space usage. Our proposed clustering algorithms systematically reuse the "coresets" (summaries of data) computed for recent queries in answering the current clustering query, a novel technique that we refer to as coreset caching. We also present an algorithm called OnlineCC that integrates the coreset caching idea with a simple sequential streaming k-means algorithm. In practice, the OnlineCC algorithm can provide constant query time. We present both theoretical analysis and detailed experiments demonstrating the correctness, accuracy, and efficiency of all our proposed clustering algorithms. [en_US]
dc.identifier.citation: IEEE Transactions on Knowledge and Data Engineering. (2020) [en_US]
dc.identifier.doi: 10.1109/TKDE.2020.3018744 [en_US]
dc.identifier.issn: 15582191 [en_US]
dc.identifier.issn: 10414347 [en_US]
dc.identifier.other: 2-s2.0-85090466944 [en_US]
dc.identifier.uri: https://repository.li.mahidol.ac.th/handle/20.500.14594/59046
dc.rights: Mahidol University [en_US]
dc.rights.holder: SCOPUS [en_US]
dc.source.uri: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85090466944&origin=inward [en_US]
dc.subject: Computer Science [en_US]
dc.title: Fast Streaming k-Means Clustering with Coreset Caching [en_US]
dc.type: Article [en_US]
dspace.entity.type: Publication
mu.datasource.scopus: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85090466944&origin=inward [en_US]