Efficient Solving of Scan Primitive on Multi-GPU Systems
Use this link to cite: http://hdl.handle.net/2183/40835
Bibliographic citation
A. P. Diéguez, M. Amor, R. Doallo, A. Nukada and S. Matsuoka, "Efficient Solving of Scan Primitive on Multi-GPU Systems," 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS), Vancouver, BC, Canada, 2018, pp. 794-803, doi: 10.1109/IPDPS.2018.00089.
Abstract
GPUs fulfill high computation demands, but it is necessary to develop code carefully, selecting algorithms well suited to the GPU architecture and applying different optimizations. This article presents a GPU-suitable algorithm and a tuning strategy for performing the scan primitive over large problem sizes in CUDA. This tuning strategy defines different performance premises to find the GPU execution parameters that maximize performance. Taking these premises into consideration, we easily develop the kernels using CUDA skeletons to ensure efficiency and portability. Based on this, we describe an optimal proposal analyzed over different multiple GPU environments, the first multiple-GPU batch scan proposal to the best of our knowledge. The resulting implementations outperform other well-known libraries in most cases, such as CUDPP, ModernGPU, Thrust, CUB and LightScan.
Description
This version of the article has been accepted for publication, after peer review. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The Version of Record is available online at: https://doi.org/10.1109/IPDPS.2018.00089
Presented at: 32nd IEEE International Parallel and Distributed Processing Symposium, IPDPS 2018, Vancouver, 21-25 May 2018
Rights
Copyright © 2018, IEEE