Summary: Using Domain Knowledge Provided by Ontologies for
Improving Data Quality Management
Stefan Brueggemann
(OFFIS, Oldenburg, Germany
brueggemann@offis.de)
Fabian Gruening
(University of Oldenburg, Germany
fabian.gruening@informatik.uni-oldenburg.de)
Abstract: Several data quality management (DQM) tasks, such as duplicate detection and
consistency checking, depend on domain-specific knowledge. Many DQM approaches
have the potential to bring together domain knowledge and DQM metadata. We provide
an approach that uses this knowledge as modeled in ontologies instead of acquiring
it through cost-intensive interviews with domain experts. These ontologies can
be annotated directly with DQM-specific metadata. With our approach, a synergy effect
can be achieved between modeling a domain ontology, e.g. for defining a shared
vocabulary for improved interoperability, and performing DQM. We present three DQM
applications that directly use knowledge provided by domain ontologies. These
applications use the ontology structure itself to provide correction suggestions for
invalid data, to identify duplicates, and to store data quality annotations at the
schema and instance levels.
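
As a rough illustration of the first application, the following minimal sketch shows how valid instances taken from a domain ontology could drive correction suggestions for an invalid value. It is not the authors' implementation: the toy ontology, the concept names, and the function suggest_corrections are hypothetical, and plain string similarity from Python's difflib stands in for whatever matching the actual approach uses.

```python
import difflib

# Hypothetical toy ontology: each concept maps to its valid instance labels.
# A real system would read these from a domain ontology instead.
ONTOLOGY = {
    "City": ["Oldenburg", "Bremen", "Hamburg"],
    "Country": ["Germany", "France"],
}


def suggest_corrections(concept, value, cutoff=0.6):
    """Return valid instances of `concept` that closely match `value`.

    A value that is already valid is returned unchanged. An unknown
    concept yields an empty list, since there is nothing to compare against.
    """
    valid = ONTOLOGY.get(concept, [])
    if value in valid:
        return [value]
    # String similarity is an assumption made for this sketch.
    return difflib.get_close_matches(value, valid, n=3, cutoff=cutoff)


if __name__ == "__main__":
    # A misspelled city name is checked against the instances of "City".
    print(suggest_corrections("City", "Oldenbrug"))
```

The lookup is keyed by concept so that a candidate value is only compared against instances of the class it is supposed to belong to, which is the kind of domain knowledge the ontology contributes.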