A Novel Method for Data Conflict Resolution using Multiple Rules

Zhang Yong-Xin1, Li Qing-Zhong1 and Peng Zhao-Hui1

  1. School of Computer Science and Technology, Shandong University
    Jinan 250101, China
    waterzyx@gmail.com, {lqz; pzh}@sdu.edu.cn

Abstract

In data integration, data conflict resolution is a crucial issue that is closely tied to the quality of the integrated data. Current research focuses on resolving data conflicts on a single attribute; it considers neither the conflict degree of different attributes nor the interrelationship of data conflict resolution across attributes, which can reduce the accuracy of the resolution results. This paper proposes a novel two-stage data conflict resolution approach based on Markov Logic Networks. Our approach divides attributes according to their conflict degree and then resolves data conflicts in two steps: (1) For the weakly conflicting attributes, we exploit a few common rules, such as voting and mutual implication between facts. (2) We then resolve the strongly conflicting attributes based on the results of the first step. In this step, additional rules are added to the rule set, such as the inter-dependency between sources and facts, the mutual dependency between sources, and the influence of weakly conflicting attributes on strongly conflicting attributes. Experimental results on a large amount of real-world data collected from two domains show that the proposed approach can significantly improve the accuracy of data conflict resolution.
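
The two-stage idea can be illustrated with a minimal Python sketch. It is not the Markov Logic Network model of the paper: plain and weighted voting stand in for the learned rules, and several details are assumptions made only for this illustration. These include the definition of `conflict_degree` (the share of claims that disagree with the most common value), the `threshold` of 0.4, the smoothed per-source `trust` score, and the toy `claims` data. Stage one resolves the weakly conflicting attributes by voting; stage two reuses each source's agreement with those results to weight its vote on the strongly conflicting attributes.

```python
from collections import Counter


def conflict_degree(values):
    """Share of claims that disagree with the most common value (assumed definition)."""
    top_count = Counter(values).most_common(1)[0][1]
    return 1 - top_count / len(values)


def vote(values, weights=None):
    """Weighted vote; with no weights every claim counts once."""
    weights = weights or [1.0] * len(values)
    score = {}
    for value, weight in zip(values, weights):
        score[value] = score.get(value, 0.0) + weight
    # On a tie, max() keeps the value that was claimed first.
    return max(score, key=score.get)


def resolve(claims, threshold=0.4):
    """Two-stage resolution over claims[source][attribute] = value."""
    sources = list(claims)
    attributes = sorted({a for row in claims.values() for a in row})

    def column(attr):
        return [(s, claims[s][attr]) for s in sources if attr in claims[s]]

    def degree(attr):
        return conflict_degree([v for _, v in column(attr)])

    weak = [a for a in attributes if degree(a) <= threshold]
    strong = [a for a in attributes if a not in weak]

    # Stage 1: plain voting on the weakly conflicting attributes.
    resolved = {a: vote([v for _, v in column(a)]) for a in weak}

    # Source trust: smoothed share of stage-1 results the source agrees with.
    trust = {}
    for s in sources:
        seen = [a for a in weak if a in claims[s]]
        hits = sum(claims[s][a] == resolved[a] for a in seen)
        trust[s] = (hits + 1) / (len(seen) + 2)

    # Stage 2: trust-weighted voting on the strongly conflicting attributes.
    for a in strong:
        col = column(a)
        resolved[a] = vote([v for _, v in col], [trust[s] for s, _ in col])
    return resolved


# Hypothetical claims about one book from six sources.
claims = {
    "s1": {"title": "T", "year": 2001, "publisher": "X"},
    "s2": {"title": "T", "year": 2001, "publisher": "X"},
    "s3": {"title": "U", "year": 2001, "publisher": "Y"},
    "s4": {"title": "U", "year": 1999, "publisher": "Y"},
    "s5": {"title": "T", "year": 1999, "publisher": "Y"},
    "s6": {"title": "T", "year": 2001, "publisher": "Z"},
}

print(resolve(claims))
```

In this toy data, unweighted voting on `publisher` would pick "Y" (three of six claims), but the sources claiming "Y" are the ones that disagree with the stage-one results for `title` and `year`, so the weighted vote in stage two selects "X" instead.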

Key words

Data integration; Data conflict resolution; Markov Logic Networks

Digital Object Identifier (DOI)

https://doi.org/10.2298/CSIS110613005Y

Publication information

Volume 10, Issue 1 (January 2013)
Year of Publication: 2013
ISSN: 1820-0214 (Print) 2406-1018 (Online)
Publisher: ComSIS Consortium

Full text

Available in PDF (Portable Document Format)

How to cite

Yong-Xin, Z., Qing-Zhong, L., Zhao-Hui, P.: A Novel Method for Data Conflict Resolution using Multiple Rules. Computer Science and Information Systems, Vol. 10, No. 1, 215-235. (2013)