Publication:
Element matching across data-oriented XML sources using a multi-strategy clustering model

dc.contributor.author: Charnyote Pluempitiwiriyawej [en_US]
dc.contributor.author: Joachim Hammer [en_US]
dc.contributor.other: Mahidol University [en_US]
dc.contributor.other: University of Florida [en_US]
dc.date.accessioned: 2018-07-24T03:40:54Z
dc.date.available: 2018-07-24T03:40:54Z
dc.date.issued: 2004-03-01 [en_US]
dc.description.abstract: We describe a family of heuristics-based clustering strategies to support the merging of XML data from multiple sources. As part of this research, we have developed a comprehensive classification for schematic and semantic conflicts that can occur when reconciling related XML data from multiple sources. Given the fact that element clustering is compute-intensive, especially when comparing large numbers of data elements that exhibit great representational diversity, performance is a critical, yet so far neglected aspect of the merging process. We have developed five heuristics for clustering data in the multi-dimensional metric space. Equivalence of data elements within the individual clusters is determined using several distance functions that calculate the semantic distances among the elements. The research described in this article is conducted within the context of the Integration Wizard (IWIZ) project at the University of Florida. IWIZ enables users to access and retrieve information from multiple XML-based sources through a consistent, integrated view. The results of our qualitative analysis of the clustering heuristics have validated the feasibility of our approach as well as its superior performance when compared to other similarity search techniques. © 2002 Elsevier Science B.V. All rights reserved. [en_US]
dc.identifier.citation: Data and Knowledge Engineering. Vol.48, No.3 (2004), 297-333 [en_US]
dc.identifier.doi: 10.1016/j.datak.2003.06.001 [en_US]
dc.identifier.issn: 0169023X [en_US]
dc.identifier.other: 2-s2.0-1142288175 [en_US]
dc.identifier.uri: https://repository.li.mahidol.ac.th/handle/20.500.14594/21294
dc.rights: Mahidol University [en_US]
dc.rights.holder: SCOPUS [en_US]
dc.source.uri: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=1142288175&origin=inward [en_US]
dc.subject: Decision Sciences [en_US]
dc.title: Element matching across data-oriented XML sources using a multi-strategy clustering model [en_US]
dc.type: Article [en_US]
dspace.entity.type: Publication
mu.datasource.scopus: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=1142288175&origin=inward [en_US]
