Summary: Sasquatch: Semantical Approach of Assuring
high Data Quality by Applying Data Mining Techniques
Fabian Grüning
OFFIS, Betriebliches Informationsmanagement
Escherweg 2, 26121 Oldenburg
fabian.gruening@informatik.uni-oldenburg.de, fabian.gruening@offis.de
Abstract
Sasquatch is a holistic approach to assuring high data quality in
enterprises' data management systems. It rests on two main ideas.
First, the given data schemas of one or more data management systems
are mapped to the concepts and properties of a generic or, preferably,
a domain-specific ontology. This mapping abstracts from characteristics
specific to a data management system, such as normalization in
relational database management systems. Second, data mining techniques
are used to learn the data's characteristics in order to decide whether
or not a data tuple is correct. The mapping benefits this step as well:
the conceptual view increases the "semantical density" of the data, so
that the machine learning algorithms used perform optimally.
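
The two ideas can be illustrated with a minimal sketch. It is not the Sasquatch implementation. The tables, the column names, the MAPPING dictionary and the ontology properties (name, postalCode, city) are all hypothetical, and a simple majority vote stands in for the data mining techniques the abstract refers to. The sketch first joins two normalized tables into instances of one assumed concept, then learns which city usually goes with a postal code and flags instances that deviate.

```python
from collections import Counter, defaultdict

# Hypothetical normalized relational data: two tables linked by address_id.
CUSTOMERS = [
    {"id": 1, "name": "Meyer", "address_id": 10},
    {"id": 2, "name": "Schulz", "address_id": 11},
    {"id": 3, "name": "Peters", "address_id": 12},
    {"id": 4, "name": "Jansen", "address_id": 13},
]
ADDRESSES = [
    {"id": 10, "zip": "26121", "town": "Oldenburg"},
    {"id": 11, "zip": "26121", "town": "Oldenburg"},
    {"id": 12, "zip": "26121", "town": "Oldenburg"},
    {"id": 13, "zip": "26121", "town": "Bremen"},
]

# Assumed mapping: ontology property -> (table, column) in the source schema.
MAPPING = {
    "name": ("customer", "name"),
    "postalCode": ("address", "zip"),
    "city": ("address", "town"),
}


def to_concept_instances(customers, addresses, mapping):
    """Join the normalized tables and rename columns to ontology properties."""
    address_by_id = {a["id"]: a for a in addresses}
    instances = []
    for customer in customers:
        rows = {"customer": customer, "address": address_by_id[customer["address_id"]]}
        instances.append(
            {prop: rows[table][column] for prop, (table, column) in mapping.items()}
        )
    return instances


def learn_majority(instances, given, target):
    """For each value of `given`, learn the most frequent value of `target`."""
    counts = defaultdict(Counter)
    for inst in instances:
        counts[inst[given]][inst[target]] += 1
    return {key: counter.most_common(1)[0][0] for key, counter in counts.items()}


def suspicious(instances, given, target):
    """Return instances whose `target` deviates from the learned majority."""
    model = learn_majority(instances, given, target)
    return [inst for inst in instances if inst[target] != model[inst[given]]]


instances = to_concept_instances(CUSTOMERS, ADDRESSES, MAPPING)
flagged = suspicious(instances, "postalCode", "city")
```

In this toy data the learner never sees the foreign key or the table split. It works on one flat view per concept, which is the sense in which the conceptual view is meant to help the learning step. A flagged instance is only a candidate for review, not a proven error.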
Data quality management analyses the data continually to find data of