DisClose: Discovering Colossal Closed Itemsets via a Memory Efficient Compact Row-Tree
Itemset mining has recently focused on the discovery of frequent itemsets from high-dimensional datasets with relatively few rows and a larger number of items. Because running time increases exponentially as average row length increases, most conventional algorithms are impractical for mining such datasets. Unfortunately, large cardinality closed itemsets are likely to be more informative than small cardinality closed itemsets in this type of dataset. This paper proposes an approach, called DisClose, to extract large cardinality (colossal) closed itemsets from high-dimensional datasets. The approach relies on a memory-efficient Compact Row-Tree data structure to represent itemsets during the search process. The search strategy explores the transposed representation of the dataset. Large cardinality itemsets are enumerated first, followed by smaller ones. In addition, we utilize a minimum cardinality threshold to further reduce the search space. Experimental results show that DisClose can complete the extraction of colossal closed itemsets in the considered dataset, even for low support thresholds. The algorithm immediately discovers closed itemsets without needing to check if each new closed itemset has previously been found.
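The ideas named in the abstract (a transposed item-to-rows view, closed itemsets obtained by intersecting rows, and pruning by a minimum cardinality threshold) can be illustrated with a small sketch. This is not the DisClose algorithm and does not implement the Compact Row-Tree: the function name `colossal_closed_itemsets` and its parameters are hypothetical, and, unlike DisClose as described above, the sketch uses a `seen` set to detect itemsets it has already produced. It relies only on the standard fact that every closed itemset is the intersection of some set of rows.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple


def colossal_closed_itemsets(
    rows: Iterable[Iterable[str]],
    min_support: int,
    min_cardinality: int,
) -> List[Tuple[FrozenSet[str], int]]:
    """Illustrative sketch only (hypothetical helper, not DisClose itself)."""
    if min_cardinality < 1:
        raise ValueError("min_cardinality must be at least 1")
    row_sets = [frozenset(r) for r in rows]

    # Transposed representation: item -> ids of the rows that contain it.
    tidsets: Dict[str, Set[int]] = {}
    for rid, row in enumerate(row_sets):
        for item in row:
            tidsets.setdefault(item, set()).add(rid)

    def support(itemset: FrozenSet[str]) -> int:
        # Rows containing every item = intersection of the items' row-id sets.
        return len(set.intersection(*(tidsets[i] for i in itemset)))

    # Intersecting with a row can only shrink an itemset, so the search moves
    # from large itemsets to smaller ones and stops below min_cardinality.
    seen: Set[FrozenSet[str]] = set()  # simplification: DisClose avoids this check
    stack = [r for r in row_sets if len(r) >= min_cardinality]
    while stack:
        itemset = stack.pop()
        if itemset in seen:
            continue
        seen.add(itemset)
        for row in row_sets:
            smaller = itemset & row
            if len(smaller) >= min_cardinality and smaller not in seen:
                stack.append(smaller)

    result = [(s, support(s)) for s in seen if support(s) >= min_support]
    # Report larger itemsets first; ties are broken alphabetically.
    result.sort(key=lambda pair: (-len(pair[0]), sorted(pair[0])))
    return result


if __name__ == "__main__":
    toy_rows = [{"a", "b", "c", "d"}, {"a", "b", "c"}, {"a", "b", "e"}, {"c", "d"}]
    for itemset, sup in colossal_closed_itemsets(toy_rows, 2, 2):
        print(sorted(itemset), sup)
```

On the toy rows above, the itemsets {a, b, c, d} and {a, b, e} are closed but occur in only one row each, so a support threshold of 2 filters them out, while {c} is never generated because it falls below the cardinality threshold of 2.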
- Research Organization: Pacific Northwest National Laboratory (PNNL), Richland, WA (US)
- Sponsoring Organization: USDOE
- DOE Contract Number: AC05-76RL01830
- OSTI ID: 1076707
- Report Number(s): PNNL-SA-85968; 400470000
- Country of Publication: United States
- Language: English
Similar Records
An efficient approach to discovering knowledge from large databases
Conference · Mon Dec 30 23:00:00 EST 1996 · OSTI ID: 535538
A fast distributed algorithm for mining association rules
Conference · Mon Dec 30 23:00:00 EST 1996 · OSTI ID: 535540
Hash based parallel algorithms for mining association rules
Conference · Mon Dec 30 23:00:00 EST 1996 · OSTI ID: 535539