An Algorithm for In-Core Frequent Itemset Mining on Streaming Data
 

Summary: An Algorithm for In-Core Frequent Itemset Mining on Streaming Data
Ruoming Jin, Gagan Agrawal
Department of Computer and Information Sciences
Ohio State University, Columbus OH 43210
{jinr,agrawal}@cis.ohio-state.edu
Abstract
Frequent itemset mining is a core data mining operation and
has been extensively studied over the last decade. This paper
takes a new approach to this problem and makes two major
contributions. First, we present a one-pass algorithm for
frequent itemset mining that has deterministic bounds on
its accuracy and does not require any out-of-core summary
structure. Second, because our one-pass algorithm does not
produce any false negatives, it can be easily extended to a
two-pass accurate algorithm. Our two-pass algorithm is very
memory efficient and allows mining of datasets with a large
number of distinct items and/or very low support levels.
Our detailed experimental evaluation on synthetic and real
datasets shows the following. First, our one-pass algorithm is
very accurate in practice. Second, our algorithm requires sig-
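
The abstract's argument that a one-pass method with no false negatives extends naturally to an accurate two-pass method can be illustrated with a small sketch. The code below is not the authors' algorithm: it handles single items rather than itemsets, and it swaps in the well-known Misra-Gries counter summary as the one-pass step. The function names (one_pass_candidates, two_pass_frequent), the threshold parameter theta, and the make_stream callable are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def one_pass_candidates(stream, theta):
    """Misra-Gries summary (an assumed stand-in, not the paper's method).

    Returns a superset of the items whose frequency exceeds theta * n,
    where n is the stream length. False positives are possible,
    false negatives are not.
    """
    # With k - 1 counters, every item occurring more than n / k times survives.
    capacity = math.ceil(1 / theta) - 1
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < capacity:
            counters[item] = 1
        else:
            # No free counter: decrement all and drop those that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return set(counters)


def two_pass_frequent(make_stream, theta):
    """Second pass: count only the candidates exactly, then filter.

    make_stream is a hypothetical zero-argument callable that returns a
    fresh iterator over the same data each time it is called.
    """
    candidates = one_pass_candidates(make_stream(), theta)
    exact = Counter()
    n = 0
    for item in make_stream():
        n += 1
        if item in candidates:
            exact[item] += 1
    # Exact counts remove any false positives left by the first pass.
    return {item: count for item, count in exact.items() if count > theta * n}


data = ["a", "b", "c", "a", "b", "d", "a", "a", "b", "a"]
frequent = two_pass_frequent(lambda: iter(data), theta=0.25)
```

Because the first pass never drops a truly frequent item, the second pass only needs memory proportional to the candidate set, which is the property the abstract relies on when it describes the two-pass extension as memory efficient.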

  

Source: Agrawal, Gagan - Department of Computer Science and Engineering, Ohio State University
Jin, Ruoming - Department of Computer Science, Kent State University

 

Collections: Computer Technologies and Information Sciences