Designing Efficient Index-Digit Algorithms for CUDA GPU Architectures

UDC.coleccion: Research [es_ES]
UDC.departamento: Enxeñaría de Computadores [es_ES]
UDC.endPage: 1343 [es_ES]
UDC.grupoInv: Grupo de Arquitectura de Computadores (GAC) [es_ES]
UDC.issue: 5 [es_ES]
UDC.journalTitle: IEEE Transactions on Parallel and Distributed Systems [es_ES]
UDC.startPage: 1331 [es_ES]
UDC.volume: 27 [es_ES]
dc.contributor.author: Lobeiras Blanco, Jacobo
dc.contributor.author: Amor, Margarita
dc.contributor.author: Doallo, Ramón
dc.date.accessioned: 2025-01-14T11:32:57Z
dc.date.available: 2025-01-14T11:32:57Z
dc.date.issued: 2016-05
dc.description: This version of the article has been accepted for publication, after peer review. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The Version of Record is available online at: https://doi.org/10.1109/TPDS.2015.2450718. [es_ES]
dc.description.abstract: Modern graphics processing units (GPUs) offer very high computing power at relatively low cost. Nevertheless, designing efficient algorithms for GPUs normally requires additional time and effort, even for experienced programmers. In this work we present a tuning methodology for designing index-digit algorithms on CUDA-enabled GPU architectures, that is, algorithms whose data movement can be described as permutations of the digits comprising the indices of the data elements. This methodology, based on two stages identified as GPU resource analysis and operators string manipulation, is applied to FFT and tridiagonal system solver algorithms, analyzing their performance features and the most adequate solutions. The resulting implementation is compact and outperforms other well-known and commonly used state-of-the-art libraries, with an improvement of up to 19.2 percent over NVIDIA's complex CUFFT, and more than 3000 percent over NVIDIA's CUDPP for real data tridiagonal systems. [es_ES]
dc.description.sponsorship: This research has been supported by the Galician Government (Xunta de Galicia) under the Consolidation Program of Competitive Reference Groups, cofunded by FEDER funds of the EU (Ref. GRC2013/055); and by the Ministry of Economy and Competitiveness of Spain and FEDER funds of the EU (Project TIN2013-42148-P). [es_ES]
dc.description.sponsorship: Xunta de Galicia; GRC2013/055 [es_ES]
dc.identifier.citation: J. Lobeiras, M. Amor and R. Doallo, "Designing Efficient Index-Digit Algorithms for CUDA GPU Architectures," in IEEE Transactions on Parallel and Distributed Systems, vol. 27, no. 5, pp. 1331-1343, 1 May 2016, doi: 10.1109/TPDS.2015.2450718. [es_ES]
dc.identifier.doi: 10.1109/TPDS.2015.2450718
dc.identifier.uri: http://hdl.handle.net/2183/40697
dc.language.iso: eng [es_ES]
dc.publisher: IEEE [es_ES]
dc.relation.projectID: info:eu-repo/grantAgreement/MINECO/Plan Estatal de Investigación Científica y Técnica y de Innovación 2013-2016/TIN2013-42148-P/ES/NUEVOS DESAFIOS EN COMPUTACION DE ALTAS PRESTACIONES: DESDE ARQUITECTURAS HASTA APLICACIONES [es_ES]
dc.relation.uri: https://doi.org/10.1109/TPDS.2015.2450718 [es_ES]
dc.rights: © 2016 IEEE. [es_ES]
dc.rights.accessRights: open access [es_ES]
dc.subject: CUDA [es_ES]
dc.subject: FFT [es_ES]
dc.subject: GPGPU [es_ES]
dc.subject: Operators string [es_ES]
dc.subject: Tridiagonal systems solver [es_ES]
dc.subject: Tuning [es_ES]
dc.title: Designing Efficient Index-Digit Algorithms for CUDA GPU Architectures [es_ES]
dc.type: journal article [es_ES]
dspace.entity.type: Publication
relation.isAuthorOfPublication: 0124b851-fdc5-473b-a559-32a1954aafd0
relation.isAuthorOfPublication: c98c1fe1-2016-44c1-9225-43fe1c6b8088
relation.isAuthorOfPublication: b3302f65-05d3-4b2c-b8b3-8503e58bba5e
relation.isAuthorOfPublication.latestForDiscovery: 0124b851-fdc5-473b-a559-32a1954aafd0

Files

Original bundle

Name: Amor_Margarita_2016_Designing_Efficient_Index_Digit_Algorithms_for_CUDA.pdf
Size: 1.03 MB
Format: Adobe Portable Document Format
Description: Accepted version