
Title: Mining entity-identification rules for database integration

Conference
OSTI ID:421296
;  [1];  [2]
  1. Univ. of Minnesota, Minneapolis, MN (United States)
  2. Apertus Technologies, Inc., Eden Prairie, MN (United States)

Entity identification (EI) is the identification and integration of all records that represent the same real-world entity, and is an important task in the database integration process. When a common identification mechanism for similar records across heterogeneous databases is not readily available, EI is performed by examining the relationships between various attribute values among the records. We propose the use of distances between attribute values as a measure of similarity between the records they represent. Record-matching conditions for EI can then be expressed as constraints on the attribute distances. We show how knowledge discovery techniques can be used to automatically derive these conditions (expressed as decision trees) directly from the data, using a distance-based framework.
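
As a rough illustration of the distance-based idea described in the abstract, the following Python sketch computes per-attribute distances for pairs of records and learns a single threshold on one distance from labeled pairs. It is not the authors' method: the record fields (`name`, `zip`), the sample data, the use of `difflib.SequenceMatcher` as a string distance, and the helper names are all assumptions made for illustration, and the one-level "stump" stands in for the full decision trees the abstract mentions.

```python
from difflib import SequenceMatcher


def string_distance(a, b):
    """Distance in [0, 1]: 0.0 for identical strings (case-insensitive)."""
    return 1.0 - SequenceMatcher(None, a.lower(), b.lower()).ratio()


def distance_vector(r1, r2):
    """Per-attribute distances between two records (hypothetical schema)."""
    return {
        "name": string_distance(r1["name"], r2["name"]),
        "zip": abs(r1["zip"] - r2["zip"]),
    }


def learn_stump(examples, attribute):
    """Find threshold t for the rule 'match if distance[attribute] <= t'.

    examples: list of (distance_vector, is_match) pairs.
    Returns (t, training_errors) for the first threshold with fewest errors.
    This is a depth-1 decision tree, a simplification of a full tree learner.
    """
    best_t, best_err = None, None
    for t in sorted({d[attribute] for d, _ in examples}):
        err = sum((d[attribute] <= t) != label for d, label in examples)
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t, best_err


def is_match(r1, r2, name_threshold):
    """Matching condition expressed as constraints on attribute distances."""
    d = distance_vector(r1, r2)
    return d["name"] <= name_threshold and d["zip"] == 0


# Hypothetical records from two databases with no shared key.
db_a = [{"name": "John Smith", "zip": 55455},
        {"name": "Mary Jones", "zip": 55344}]
db_b = [{"name": "Jon Smith", "zip": 55455},
        {"name": "Marie Jones", "zip": 55344}]

# Labeled pairs: same index means same real-world entity.
training = [
    (distance_vector(a, b), i == j)
    for i, a in enumerate(db_a)
    for j, b in enumerate(db_b)
]

threshold, errors = learn_stump(training, "name")
print("learned name-distance threshold:", round(threshold, 3),
      "training errors:", errors)
print("John Smith ~ Jon Smith:", is_match(db_a[0], db_b[0], threshold))
```

The learned rule reads as a constraint on attribute distances, for example "name distance at most the learned threshold and zip distance equal to zero", which is the kind of record-matching condition the abstract describes. A real system would use more attributes, domain-specific distance functions, and a multi-level tree.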

OSTI ID:
421296
Report Number(s):
CONF-960830-; TRN: 96:005928-0051
Resource Relation:
Conference: 2. international conference on knowledge discovery and data mining, Portland, OR (United States), 2-4 Aug 1996; Other Information: PBD: 1996; Related Information: Is Part Of Proceedings of the second international conference on knowledge discovery & data mining; Simoudis, E.; Han, J.; Fayyad, U. [eds.]; PB: 405 p.
Country of Publication:
United States
Language:
English