OSTI.GOV
U.S. Department of Energy, Office of Scientific and Technical Information

Title: Nonparametric Bayesian Modeling for Automated Database Schema Matching

Abstract

The problem of merging databases arises in many government and commercial applications. Schema matching, a common first step, identifies equivalent fields between databases. We introduce a schema matching framework that builds nonparametric Bayesian models for each field and compares them by computing the probability that a single model could have generated both fields. Our experiments show that our method is more accurate and faster than existing instance-based matching algorithms, in part because of its use of nonparametric Bayesian models.
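
The comparison described in the abstract (would a single model have generated both fields?) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes categorical field values, a Dirichlet process prior with concentration `alpha` whose base measure is uniform over a hypothetical universe of `base_size` possible values, and a prior match probability `prior`. All function and parameter names here are illustrative.

```python
import math
from collections import Counter


def log_marginal(values, alpha=1.0, base_size=1e6):
    """Log marginal likelihood of a sequence of field values under a
    Dirichlet process with a uniform base measure (an assumed model)."""
    counts = Counter()
    logp = 0.0
    for i, v in enumerate(values):
        # Predictive probability of v given the i values seen so far:
        # (times v was seen + alpha * G0(v)) / (i + alpha), G0(v) = 1 / base_size
        logp += math.log((counts[v] + alpha / base_size) / (i + alpha))
        counts[v] += 1
    return logp


def log_bayes_factor(field_a, field_b, alpha=1.0, base_size=1e6):
    """Log ratio of 'one shared model' to 'two independent models'."""
    joint = log_marginal(list(field_a) + list(field_b), alpha, base_size)
    separate = (log_marginal(field_a, alpha, base_size)
                + log_marginal(field_b, alpha, base_size))
    return joint - separate


def match_probability(field_a, field_b, prior=0.5, alpha=1.0, base_size=1e6):
    """Posterior probability that both fields came from a single model."""
    log_odds = (log_bayes_factor(field_a, field_b, alpha, base_size)
                + math.log(prior / (1.0 - prior)))
    # Numerically stable logistic function
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


# Toy columns (made-up data): two state-code fields and one ZIP-code field
states_1 = ["TN", "FL", "TN", "GA"]
states_2 = ["FL", "TN", "TN"]
zip_codes = ["37830", "33101", "30301"]

print(match_probability(states_1, states_2))
print(match_probability(states_1, zip_codes))
```

Under these assumptions, fields that share values score well above 0.5 and fields with disjoint values score below it. A real matcher would compute this score for every candidate field pair and would likely need richer per-field models (for numeric or free-text data, for example) than this categorical sketch.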

Authors:
Ferragut, Erik M [1]; Laska, Jason A [1]
  1. ORNL
Publication Date:
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1330510
DOE Contract Number:  
AC05-00OR22725
Resource Type:
Conference
Resource Relation:
Conference: International Conference on Machine Learning Applications, Miami, FL, USA, December 9-11, 2015
Country of Publication:
United States
Language:
English
Subject:
probabilistic modeling; Bayesian; machine learning; schema matching

Citation Formats

Ferragut, Erik M, and Laska, Jason A. Nonparametric Bayesian Modeling for Automated Database Schema Matching. United States: N. p., 2015. Web. doi:10.1109/ICMLA.2015.235.
Ferragut, Erik M, & Laska, Jason A. (2015). Nonparametric Bayesian Modeling for Automated Database Schema Matching. United States. doi:10.1109/ICMLA.2015.235.
Ferragut, Erik M, and Laska, Jason A. 2015. "Nonparametric Bayesian Modeling for Automated Database Schema Matching". United States. doi:10.1109/ICMLA.2015.235.
@article{osti_1330510,
title = {Nonparametric Bayesian Modeling for Automated Database Schema Matching},
author = {Ferragut, Erik M and Laska, Jason A},
abstractNote = {The problem of merging databases arises in many government and commercial applications. Schema matching, a common first step, identifies equivalent fields between databases. We introduce a schema matching framework that builds nonparametric Bayesian models for each field and compares them by computing the probability that a single model could have generated both fields. Our experiments show that our method is more accurate and faster than existing instance-based matching algorithms, in part because of its use of nonparametric Bayesian models.},
doi = {10.1109/ICMLA.2015.235},
place = {United States},
year = {2015}
}
