Adapt-S: Effective DNN Pruning via Unified Accuracy and Performance Tuning
| UDC.coleccion | Investigación | |
| UDC.conferenceTitle | IPDPS 2025 | |
| UDC.departamento | Enxeñaría de Computadores | |
| UDC.grupoInv | Grupo de Arquitectura de Computadores (GAC) | |
| UDC.institutoCentro | CITIC - Centro de Investigación de Tecnoloxías da Información e da Comunicación | |
| dc.contributor.author | Castro, Roberto L. | |
| dc.contributor.author | Andrade, Diego | |
| dc.contributor.author | Fraguela, Basilio B. | |
| dc.date.accessioned | 2025-10-20T12:56:15Z | |
| dc.date.available | 2025-10-20T12:56:15Z | |
| dc.date.issued | 2025-07 | |
| dc.description | Paper presented at: 2025 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 03-07 June 2025, Milan, Italy. © 2025 IEEE. This version of the paper has been accepted for publication. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The final published paper is available online at: https://doi.org/10.1109/IPDPS64566.2025.00018 | |
| dc.description.abstract | [Abstract]: Model sparsification has emerged as a promising approach to reducing model size with minimum impact on accuracy. This is achieved through the removal of some model parameters, a process also known as Deep Neural Network (DNN) pruning. The irregular nature of the generated sparse tensors poses a great challenge in the development of efficient GPU kernels optimized for these workloads. This challenge has been recently addressed through the use of hardware-aware semistructured sparsification methods designed to conform to specialized sparse formats and codesigned with template-based kernel implementations. These methods are commonly based on grouping the non-pruned values in blocks of a given size to generate regularity or on generating patterns that fit specialized hardware units. This pruning pattern-format-kernel triplet presents a high degree of tunability, both at the pruning and kernel sides, which can be used to fit certain accuracy-to-performance tradeoffs. On the pruning side, using larger blocks of consecutive non-pruned values favors performance over accuracy, as the weight selection for removal policy becomes less flexible. On the kernel side, recent studies have proven that the tuning of the configuration parameters of template-based multi-level tiling kernel implementations can yield an extra performance boost. This paper presents AdAPT-S, an autotuning system that generates DNN pruning recipes and optimized kernel configurations to fit an accuracy-to-performance specification. This is done through a cost model that integrates both aspects. AdAPT-S gets extra benefits from the exploitation of layer sensitivity by providing per-layer pruning recipes and kernel configurations. The results show that our approach can achieve superior accuracy-to-performance trade-offs and that this can be used to produce models that fit the user requirements. | |
| dc.description.sponsorship | This research was supported by grants PID2019-104184RB-I00 and PID2022-136435NB-I00, funded by MCIN/AEI/10.13039/501100011033, PID2022 also funded by “ERDF A way of making Europe”, EU, the predoctoral grant of Roberto L. Castro (FPU19/03974), and by the Xunta de Galicia co-funded by the European Regional Development Fund (ERDF) under the Consolidation Programme of Competitive Reference Groups (ED431C 2021/30). CITIC, as a center accredited for excellence within the Galician University System and a member of the CIGUS Network, receives subsidies from the Department of Education, Science, Universities, and Vocational Training of the Xunta de Galicia. Additionally, it is co-financed by the EU through the FEDER Galicia 2021-27 operational program (Ref. ED431G 2023/01). Innovation Study ESPLAG has received funding through the Inno4scale project, which is funded by the European High-Performance Computing Joint Undertaking (JU) under Grant Agreement No 101118139. The JU receives support from the European Union’s Horizon Europe Programme. | |
| dc.description.sponsorship | Xunta de Galicia; ED431C 2021/30 | |
| dc.description.sponsorship | Xunta de Galicia; ED431G 2023/01 | |
| dc.identifier.citation | R. L. Castro, D. Andrade, and B. B. Fraguela, "Adapt-S: Effective DNN Pruning via Unified Accuracy and Performance Tuning", 2025 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 03-07 June 2025, Milan, Italy, doi: 10.1109/IPDPS64566.2025.00018 | |
| dc.identifier.doi | 10.1109/IPDPS64566.2025.00018 | |
| dc.identifier.isbn | 979-8-3315-3237-6 | |
| dc.identifier.issn | 1530-2075 | |
| dc.identifier.uri | https://hdl.handle.net/2183/46024 | |
| dc.language.iso | eng | |
| dc.publisher | IEEE | |
| dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2019-104184RB-I00/ES/DESAFÍOS ACTUALES EN HPC: ARQUITECTURAS, SOFTWARE Y APLICACIONES | |
| dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PID2022-136435NB-I00/ES/ARQUITECTURAS, FRAMEWORKS Y APLICACIONES DE LA COMPUTACION DE ALTAS PRESTACIONES | |
| dc.relation.projectID | info:eu-repo/grantAgreement/MCIU/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/FPU19%2F03974/ES/ | |
| dc.relation.projectID | info:eu-repo/grantAgreement/EC/HE/101118139 | |
| dc.relation.uri | https://doi.org/10.1109/IPDPS64566.2025.00018 | |
| dc.rights | © 2025, IEEE | |
| dc.rights.accessRights | open access | |
| dc.subject | DNN Pruning | |
| dc.subject | Tensor Core Unit | |
| dc.subject | GPU programming | |
| dc.subject | CUDA | |
| dc.title | Adapt-S: Effective DNN Pruning via Unified Accuracy and Performance Tuning | |
| dc.type | conference output | |
| dspace.entity.type | Publication | |
| relation.isAuthorOfPublication | ba3b1a6d-65dd-4366-a7d4-f6c802c5f07a | |
| relation.isAuthorOfPublication | 7f5bae1c-08f6-4204-b22a-fbe20407a6e4 | |
| relation.isAuthorOfPublication.latestForDiscovery | ba3b1a6d-65dd-4366-a7d4-f6c802c5f07a |
Files
Original bundle
- Name: Andrade_Diego_2025_AdaptS.pdf
- Size: 1.46 MB
- Format: Adobe Portable Document Format

