Fast Counting with AV-Space for Efficient Rule Induction
Linyan Wang and Aijun An*
 

Abstract*
We present AV-space, a new data structure for caching
data set statistics for efficiently learning classification
rules from large data sets. The AV-space is designed to
work with sequential-covering rule induction algorithms.
It is used to accelerate queries about the count of the
examples in a data set that satisfy a conjunction of
attribute-value pairs. With an AV-space, the learning algorithm
does not have to access the training data to obtain the
statistics about the data. We present the structure of an
AV-space, algorithms for building and querying an AV-
space, and procedures for dynamically updating the AV-
space during the rule induction process. We present an
experimental evaluation that compares the AV-space with
a commonly used data structure that simply loads the
(encoded) training examples into memory. We show that
the use of AV-space significantly improves the speed of
rule induction and that it consumes less memory on large
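
The abstract does not describe the internal layout of an AV-space, so the following is not the authors' structure. It is only a minimal sketch of the kind of query the abstract talks about: counting the examples that satisfy a conjunction of attribute-value pairs, and updating those counts as a sequential-covering learner removes covered examples. The class name `ConjunctionCounter`, its methods, and the toy data are all hypothetical, and the sketch swaps in a plain inverted index with a memo table rather than anything taken from the paper.

```python
from collections import defaultdict


class ConjunctionCounter:
    """Hypothetical helper, NOT the AV-space from the paper.

    Counts examples matching a conjunction of attribute-value pairs
    using an inverted index plus a cache of previously answered queries.
    """

    def __init__(self, examples):
        # Map each (attribute, value) pair to the ids of examples containing it.
        self.index = defaultdict(set)
        # Ids of examples not yet covered by an induced rule.
        self.active = set()
        for row_id, example in enumerate(examples):
            self.active.add(row_id)
            for pair in example.items():
                self.index[pair].add(row_id)
        # Cache of counts keyed by the (order-independent) set of pairs.
        self.cache = {}

    def matching_ids(self, conjunction):
        # Intersect the id sets of every pair in the conjunction.
        ids = set(self.active)
        for pair in conjunction.items():
            ids &= self.index.get(pair, set())
        return ids

    def count(self, conjunction):
        key = frozenset(conjunction.items())
        if key not in self.cache:
            self.cache[key] = len(self.matching_ids(conjunction))
        return self.cache[key]

    def remove_covered(self, conjunction):
        # Sequential covering drops the examples covered by a new rule;
        # cached counts are then stale, so this sketch simply clears them.
        covered = self.matching_ids(conjunction)
        self.active -= covered
        self.cache.clear()
        return len(covered)


# Toy data, invented for illustration only.
examples = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "no", "play": "yes"},
]
counter = ConjunctionCounter(examples)
sunny_and_play = counter.count({"outlook": "sunny", "play": "yes"})
```

Clearing the whole cache after each removal is the simplest correct choice for a sketch. The abstract says the AV-space is updated dynamically during rule induction, which suggests something more incremental, but the details are not given here.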

  

Source: An, Aijun - Department of Computer Science, York University (Toronto)

 

Collections: Computer Technologies and Information Sciences