
Parallel data mining techniques on Graphics Processing Unit with Compute Unified Device Architecture (CUDA)


References (33)

Publisher: Springer Journals
Copyright: © 2011 by Springer Science+Business Media, LLC
Subject: Computer Science; Programming Languages, Compilers, Interpreters; Processor Architectures; Computer Science, general
ISSN: 0920-8542
eISSN: 1573-0484
DOI: 10.1007/s11227-011-0672-7

Abstract

Recent developments in Graphics Processing Units (GPUs) have enabled inexpensive high-performance computing for general-purpose applications. The Compute Unified Device Architecture (CUDA) programming model provides programmers with adequate C-like APIs to better exploit the parallel power of the GPU. Data mining is widely used and has significant applications in various domains. However, current data mining toolkits cannot meet the speed requirements of applications with large-scale databases. In this paper, we propose three techniques to speed up fundamental problems in data mining algorithms on the CUDA platform: a scalable thread scheduling scheme for irregular patterns, a parallel distributed top-k scheme, and a parallel high-dimension reduction scheme. They play a key role in our CUDA-based implementations of three representative data mining algorithms: CU-Apriori, CU-KNN, and CU-K-means. These parallel implementations significantly outperform other state-of-the-art implementations on an HP xw8600 workstation with a Tesla C1060 GPU and a quad-core Intel Xeon CPU. Our results show that the GPU + CUDA parallel architecture is feasible and promising for data mining applications.

Journal

The Journal of Supercomputing, Springer Journals

Published: Aug 26, 2011
