Compiler and Runtime Support for Shared Memory Parallelization of Data Mining Algorithms

Xiaogang Li Ruoming Jin Gagan Agrawal
Department of Computer and Information Sciences
Ohio State University, Columbus OH 43210
{xgli,jinr,agrawal}@cis.ohio-state.edu
Abstract. Data mining techniques focus on finding novel and useful patterns or
models from large datasets. Because of the volume of the data to be analyzed, the
amount of computation involved, and the need for rapid or even interactive
analysis, data mining applications require the use of parallel machines. We have
been developing compiler and runtime support for developing scalable
implementations of data mining algorithms. Our work encompasses shared memory
parallelization, distributed memory parallelization, and optimizations for
processing disk-resident datasets.

In this paper, we focus on compiler and runtime support for shared memory
parallelization of data mining algorithms. We have developed a set of
parallelization techniques that apply across algorithms for a variety of mining
tasks. We describe the interface of the middleware where these techniques are
implemented. Then, we present compiler techniques for translating data parallel
code to the middleware specification. Finally, we present a brief evaluation of
our compiler using


Source: Agrawal, Gagan - Department of Computer Science and Engineering, Ohio State University


Collections: Computer Technologies and Information Sciences