Adapt-S: Effective DNN Pruning via Unified Accuracy and Performance Tuning

UDC.coleccionInvestigación
UDC.conferenceTitleIPDPS 2025
UDC.departamentoEnxeñaría de Computadores
UDC.grupoInvGrupo de Arquitectura de Computadores (GAC)
UDC.institutoCentroCITIC - Centro de Investigación de Tecnoloxías da Información e da Comunicación
dc.contributor.authorCastro, Roberto L.
dc.contributor.authorAndrade, Diego
dc.contributor.authorFraguela, Basilio B.
dc.date.accessioned2025-10-20T12:56:15Z
dc.date.available2025-10-20T12:56:15Z
dc.date.issued2025-07
dc.descriptionPaper presented at: 2025 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 03-07 June 2025, Milan, Italy. © 2025 IEEE. This version of the paper has been accepted for publication. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The final published paper is available online at: https://doi.org/10.1109/IPDPS64566.2025.00018
dc.description.abstract[Abstract]: Model sparsification has emerged as a promising approach to reducing model size with minimum impact on accuracy. This is achieved through the removal of some model parameters, a process also known as Deep Neural Network (DNN) pruning. The irregular nature of the generated sparse tensors poses a great challenge in the development of efficient GPU kernels optimized for these workloads. This challenge has been recently addressed through the use of hardware-aware semistructured sparsification methods designed to conform to specialized sparse formats and codesigned with template-based kernel implementations. These methods are commonly based on grouping the non-pruned values in blocks of a given size to generate regularity or on generating patterns that fit specialized hardware units. This pruning pattern-format-kernel triplet presents a high degree of tunability, both at the pruning and kernel sides, which can be used to fit certain accuracy-to-performance tradeoffs. On the pruning side, using larger blocks of consecutive non-pruned values favors performance over accuracy, as the weight selection for removal policy becomes less flexible. On the kernel side, recent studies have proven that the tuning of the configuration parameters of template-based multi-level tiling kernel implementations can yield an extra performance boost. This paper presents AdAPT-S, an autotuning system that generates DNN pruning recipes and optimized kernel configurations to fit an accuracy-to-performance specification. This is done through a cost model that integrates both aspects. AdAPT-S gets extra benefits from the exploitation of layer sensitivity by providing per-layer pruning recipes and kernel configurations. The results show that our approach can achieve superior accuracy-to-performance trade-offs and that this can be used to produce models that fit the user requirements.
dc.description.sponsorshipThis research was supported by grants PID2019-104184RB-I00 and PID2022-136435NB-I00, funded by MCIN/AEI/10.13039/501100011033, PID2022 also funded by “ERDF A way of making Europe”, EU, the predoctoral grant of Roberto L. Castro (FPU19/03974), and by the Xunta de Galicia co-funded by the European Regional Development Fund (ERDF) under the Consolidation Programme of Competitive Reference Groups (ED431C 2021/30). CITIC, as a center accredited for excellence within the Galician University System and a member of the CIGUS Network, receives subsidies from the Department of Education, Science, Universities, and Vocational Training of the Xunta de Galicia. Additionally, it is co-financed by the EU through the FEDER Galicia 2021-27 operational program (Ref. ED431G 2023/01). Innovation Study ESPLAG has received funding through the Inno4scale project, which is funded by the European High-Performance Computing Joint Undertaking (JU) under Grant Agreement No 101118139. The JU receives support from the European Union’s Horizon Europe Programme.
dc.description.sponsorshipXunta de Galicia; ED431C 2021/30
dc.description.sponsorshipXunta de Galicia; ED431G 2023/01
dc.identifier.citationR. L. Castro, D. Andrade, and B. B. Fraguela, "Adapt-S: Effective DNN Pruning via Unified Accuracy and Performance Tuning", 2025 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 03-07 June 2025, Milan, Italy, doi: 10.1109/IPDPS64566.2025.00018
dc.identifier.doi10.1109/IPDPS64566.2025.00018
dc.identifier.isbn979-8-3315-3237-6
dc.identifier.issn1530-2075
dc.identifier.urihttps://hdl.handle.net/2183/46024
dc.language.isoeng
dc.publisherIEEE
dc.relation.projectIDinfo:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2019-104184RB-I00/ES/DESAFÍOS ACTUALES EN HPC: ARQUITECTURAS, SOFTWARE Y APLICACIONES
dc.relation.projectIDinfo:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PID2022-136435NB-I00/ES/ARQUITECTURAS, FRAMEWORKS Y APLICACIONES DE LA COMPUTACION DE ALTAS PRESTACIONES
dc.relation.projectIDinfo:eu-repo/grantAgreement/MCIU/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/FPU19%2F03974/ES/
dc.relation.projectIDinfo:eu-repo/grantAgreement/EC/HE/101118139
dc.relation.urihttps://doi.org/10.1109/IPDPS64566.2025.00018
dc.rights© 2025, IEEE
dc.rights.accessRightsopen access
dc.subjectDNN Pruning
dc.subjectTensor Core Unit
dc.subjectGPU programming
dc.subjectCUDA
dc.titleAdapt-S: Effective DNN Pruning via Unified Accuracy and Performance Tuning
dc.typeconference output
dspace.entity.typePublication
relation.isAuthorOfPublicationba3b1a6d-65dd-4366-a7d4-f6c802c5f07a
relation.isAuthorOfPublication7f5bae1c-08f6-4204-b22a-fbe20407a6e4
relation.isAuthorOfPublication.latestForDiscoveryba3b1a6d-65dd-4366-a7d4-f6c802c5f07a

Files

Original bundle

Name: Andrade_Diego_2025_AdaptS.pdf
Size: 1.46 MB
Format: Adobe Portable Document Format