Efficient Solving of Scan Primitive on Multi-GPU Systems

UDC.coleccion: Research
UDC.conferenceTitle: IPDPS 2018
UDC.departamento: Enxeñaría de Computadores
UDC.endPage: 803
UDC.grupoInv: Grupo de Arquitectura de Computadores (GAC)
UDC.startPage: 794
dc.contributor.author: Pérez Diéguez, Adrián
dc.contributor.author: Amor, Margarita
dc.contributor.author: Doallo, Ramón
dc.contributor.author: Nukada, Akira
dc.contributor.author: Matsuoka, Satoshi
dc.date.accessioned: 2025-01-22T12:17:34Z
dc.date.available: 2025-01-22T12:17:34Z
dc.date.issued: 2018
dc.description: This version of the article has been accepted for publication, after peer review. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The Version of Record is available online at: https://doi.org/10.1109/IPDPS.2018.00089
dc.description: Presented at: 32nd IEEE International Parallel and Distributed Processing Symposium, IPDPS 2018, Vancouver, 21-25 May 2018
dc.description.abstract: GPUs fulfill high computation demands, but it is necessary to develop code carefully, selecting algorithms well suited to the GPU architecture and applying different optimizations. This article presents a GPU-suitable algorithm and a tuning strategy for performing the scan primitive over large problem sizes in CUDA. This tuning strategy defines different performance premises to find the GPU execution parameters that maximize performance. Taking these premises into consideration, we easily develop the kernels using CUDA skeletons to ensure efficiency and portability. Based on this, we describe an optimal proposal analyzed over different multiple GPU environments, the first multiple-GPU batch scan proposal to the best of our knowledge. The resulting implementations outperform other well-known libraries in most cases, such as CUDPP, ModernGPU, Thrust, CUB and LightScan.
dc.description.sponsorship: This work was cofunded by the Government of Galicia and ERDF funds from the EU, under the Consolidation Programme of Competitive Reference Groups [ED431C 2017/04] and Competitive Research Units [R2014/049 and R2016/037]; by the Ministry of Economy and Competitiveness of Spain and ERDF funds [TIN2016-75845-P]; and by the Ministry of Education of Spain (FPU14/02801).
dc.description.sponsorship: Xunta de Galicia; ED431C 2017/04
dc.description.sponsorship: Xunta de Galicia; R2014/049
dc.description.sponsorship: Xunta de Galicia; R2016/037
dc.identifier.citation: A. P. Diéguez, M. Amor, R. Doallo, A. Nukada and S. Matsuoka, "Efficient Solving of Scan Primitive on Multi-GPU Systems," 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS), Vancouver, BC, Canada, 2018, pp. 794-803, doi: 10.1109/IPDPS.2018.00089.
dc.identifier.doi: 10.1109/IPDPS.2018.00089
dc.identifier.isbn: 978153864368-6
dc.identifier.issn: 1530-2075
dc.identifier.uri: http://hdl.handle.net/2183/40835
dc.language.iso: eng
dc.publisher: Institute of Electrical and Electronics Engineers Inc.
dc.relation.projectID: info:eu-repo/grantAgreement/MINECO/Plan Estatal de Investigación Científica y Técnica y de Innovación 2013-2016/TIN2016-75845-P/ES/NUEVOS DESAFIOS EN COMPUTACION DE ALTAS PRESTACIONES: DESDE ARQUITECTURAS HASTA APLICACIONES (II)
dc.relation.projectID: info:eu-repo/grantAgreement/MECD/Programa Estatal de Promoción del Talento y su Empleabilidad/FPU14%2F02801/ES/
dc.relation.uri: https://doi.org/10.1109/IPDPS.2018.00089
dc.rights: Copyright © 2018, IEEE
dc.rights.accessRights: open access
dc.subject: Graphics processing units
dc.subject: Instruction sets
dc.subject: Kernel
dc.subject: Tuning
dc.subject: Peer-to-peer computing
dc.subject: Libraries
dc.subject: Registers
dc.subject: CUDA
dc.subject: MultiGPU
dc.subject: MPI
dc.subject: Scan
dc.title: Efficient Solving of Scan Primitive on Multi-GPU Systems
dc.type: conference output
dspace.entity.type: Publication
relation.isAuthorOfPublication: 31d7c9d0-70ef-44ef-af1d-e40f560c41bc
relation.isAuthorOfPublication: c98c1fe1-2016-44c1-9225-43fe1c6b8088
relation.isAuthorOfPublication: b3302f65-05d3-4b2c-b8b3-8503e58bba5e
relation.isAuthorOfPublication.latestForDiscovery: 31d7c9d0-70ef-44ef-af1d-e40f560c41bc

Files

Original bundle

Name: Amor_Margarita_2018_Efficient_Solving_of_Scan_Primitive_on_Multi_GPU_Systems.pdf
Size: 1.82 MB
Format: Adobe Portable Document Format
Description: Accepted version