Title: GPU-Accelerated Text Mining

Accelerating hardware devices hold new promise for improving performance in many problem domains, but it is not clear which accelerators suit which domains. Because there is little room left in general-purpose processor design to significantly increase processor frequency, developers are instead resorting to multi-core chips that duplicate conventional computing capabilities on a single die. Accelerators, by contrast, offer more radical designs with a much higher level of parallelism and novel programming environments. The present work assesses the viability of text mining on CUDA. Text mining has become prominent as an effective means of indexing the Internet, but its applications extend beyond that scope to providing document similarity metrics, the subject of this work. We have developed and optimized text search algorithms for GPUs to exploit their potential for massive data processing. We discuss the algorithmic challenges of parallelizing text search problems on GPUs and demonstrate the potential of these devices in experiments by reporting significant speedups. Our study may be one of the first to assess the suitability of GPU devices for more complex text search problems, and it may also be one of the first to exploit and report on the atomic instructions that have recently become available in NVIDIA devices.
Authors:
 [1] ;  [2] ;  [1] ;  [1]
  1. ORNL
  2. North Carolina State University
Publication Date:
OSTI Identifier:
962625
DOE Contract Number:
DE-AC05-00OR22725
Resource Type:
Conference
Resource Relation:
Conference: Workshop on Exploiting Parallelism using GPUs and other Hardware-Assisted Methods, Seattle, WA, USA, March 22, 2009
Research Org:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Org:
USDOE Laboratory Directed Research and Development (LDRD) Program
Country of Publication:
United States
Language:
English
Subject:
43 PARTICLE ACCELERATORS; ACCELERATORS; ALGORITHMS; DATA PROCESSING; DESIGN; INTERNET; METRICS; MINING; PERFORMANCE; PROGRAMMING; RADICALS; VIABILITY