Summary: Analyzing Quantitative Databases: Image is Everything
Amihood Amir* Reuven Kashi† Nathan S. Netanyahu‡
Bar-Ilan University Bar-Ilan University Bar-Ilan University
Georgia Tech Univ. of Maryland
Traditional statistical methods deal with corroborating given hypotheses on a given body of data.
However, generating the hypothesis itself is a matter of intuition and ingenuity. It is clearly infeasible
to test all possible hypotheses on a database with millions of records and hundreds of fields.
Data mining has attempted to bridge this gap. Association generation is
a method of creating such statistical hypotheses for binary data. For quantitative databases the
situation is still unsatisfactory. Several methods are known. One reduces the data to binary form
by partitioning each field into intervals and then generating associations. This method is
computationally expensive. Another method generates only associations that are statistically
interesting. This method also has been tried only on small databases and applies only to binary
relations, e.g., in certain ranges of field X, field Y lies significantly outside its average.
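As an illustration of the first method mentioned above (reduction to binary data by intervals), the following is a minimal sketch only, not the procedure of any cited work. The function names, the interval edges, the toy records, and the support threshold are all hypothetical. Each numeric field value is mapped to a binary item of the form "field lies in [lo, hi)", and pairs of items that co-occur in enough records are kept as candidate associations.

```python
from collections import Counter
from itertools import combinations


def to_interval_items(record, edges):
    """Map each numeric field to one binary item (field, lo, hi) with lo <= value < hi."""
    items = set()
    for field, value in record.items():
        bounds = edges[field]
        for lo, hi in zip(bounds, bounds[1:]):
            if lo <= value < hi:
                items.add((field, lo, hi))
                break
    return items


def frequent_pairs(records, edges, min_support):
    """Keep pairs of interval items that co-occur in at least min_support of the records."""
    counts = Counter()
    for record in records:
        items = sorted(to_interval_items(record, edges))
        for a, b in combinations(items, 2):
            counts[(a, b)] += 1
    threshold = min_support * len(records)
    return {pair: n for pair, n in counts.items() if n >= threshold}


# Hypothetical toy data: two numeric fields, X and Y (assumed for illustration).
records = [
    {"X": 1.0, "Y": 12.0},
    {"X": 2.0, "Y": 14.0},
    {"X": 7.0, "Y": 3.0},
    {"X": 8.0, "Y": 4.0},
]
edges = {"X": [0, 5, 10], "Y": [0, 10, 20]}  # assumed interval boundaries
pairs = frequent_pairs(records, edges, min_support=0.5)
```

In this sketch the number of candidate pairs grows quadratically with the total number of interval items, which is in line with the remark above that the reduction is computationally expensive.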
We suggest a method that addresses some of the problems with the current techniques. The idea is
to use visualization and image processing techniques to rank subsets of fields according
to the relation between them in the database. This ranking suggests the hypotheses to be statistically