U.S. Department of Energy
Office of Scientific and Technical Information

A probabilistic NF2 relational algebra for integrated information retrieval and database systems

Conference
OSTI ID: 501651
Author affiliation: Universitaet Dortmund (Germany)
Integrating information retrieval (IR) and database systems requires a data model that can represent documents as entities, express uncertainty and vagueness, and support uncertain inference. For this purpose, we present a probabilistic data model based on relations in non-first-normal-form (NF2). In this model, each tuple is assigned a probabilistic weight giving the probability that the tuple belongs to its relation. Thus, the set of weighted index terms of a document is represented as a probabilistic subrelation. Similarly, imprecise attribute values are modelled as set-valued attributes. We redefine the relational operators for this type of relation such that the result of each operator is again a probabilistic NF2 relation, where the weight of a tuple gives the probability that the tuple belongs to the result. Ordering the tuples by decreasing probability yields a ranking of answers, as in most IR models. This ranking can also be used for typical database queries involving imprecise attribute values, as well as for combinations of database and IR queries.
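The idea of weighted tuples, nested subrelations, and ranked results can be illustrated with a small Python sketch. This is not the algebra defined in the paper: the function names (`unnest`, `select`, `project`, `rank`), the sample documents, and in particular the assumption that tuple events are independent (so weights multiply on unnesting and combine as 1 - ∏(1 - p) when duplicates merge) are all illustrative choices made here.

```python
from math import prod

# A probabilistic relation is sketched as a list of (tuple, weight) pairs,
# where weight is the probability that the tuple belongs to the relation.
# Hypothetical sample data: each document carries a nested subrelation of
# weighted index terms (the NF2 aspect).
docs = [
    ({"docno": 1, "index": [({"term": "ir"}, 0.9), ({"term": "db"}, 0.5)]}, 1.0),
    ({"docno": 2, "index": [({"term": "ir"}, 0.4), ({"term": "db"}, 0.8)]}, 1.0),
]

def unnest(relation, attr):
    """Flatten the subrelation stored under attr.
    ASSUMPTION: outer and inner tuple events are independent, so weights multiply."""
    out = []
    for t, p in relation:
        for sub, q in t[attr]:
            flat = {k: v for k, v in t.items() if k != attr}
            flat.update(sub)
            out.append((flat, p * q))
    return out

def select(relation, predicate):
    """Keep the tuples that satisfy predicate; weights are unchanged."""
    return [(t, p) for t, p in relation if predicate(t)]

def project(relation, attrs):
    """Project onto attrs and merge duplicate tuples.
    ASSUMPTION: merged tuples are independent events, so the merged weight
    is 1 - prod(1 - p_i)."""
    groups = {}
    for t, p in relation:
        key = tuple(t[a] for a in attrs)
        groups.setdefault(key, []).append(p)
    return [
        (dict(zip(attrs, key)), 1 - prod(1 - p for p in ps))
        for key, ps in groups.items()
    ]

def rank(relation):
    """Order tuples by decreasing probability, as in an IR ranking."""
    return sorted(relation, key=lambda pair: pair[1], reverse=True)

# Query: documents indexed with "ir" or "db", ranked by probability.
flat = unnest(docs, "index")
hits = select(flat, lambda t: t["term"] in {"ir", "db"})
ranked = rank(project(hits, ["docno"]))
for t, p in ranked:
    print(t["docno"], round(p, 2))
```

Every operator here returns a relation of the same shape as its input, which is what lets the operators compose and lets the final weights be read as a ranking. The independence assumption is the simplest way to combine weights and would not hold for correlated tuples.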
Report Number(s): CONF-961239--
Country of Publication: United States
Language: English