A Fast Solver for Large Tridiagonal Systems on Multi-Core Processors (Lass Library)

Valero-Lara, Pedro; Andrade, Diego; Sirvent, Raül; Labarta, Jesús; Fraguela, Basilio B.; Doallo, Ramón

UDC.coleccion	Investigación	es_ES
UDC.departamento	Enxeñaría de Computadores	es_ES
UDC.endPage	23378	es_ES
UDC.grupoInv	Grupo de Arquitectura de Computadores (GAC)	es_ES
UDC.journalTitle	IEEE Access	es_ES
UDC.startPage	23365	es_ES
UDC.volume	7	es_ES
dc.contributor.author	Valero-Lara, Pedro
dc.contributor.author	Andrade, Diego
dc.contributor.author	Sirvent, Raül
dc.contributor.author	Labarta, Jesús
dc.contributor.author	Fraguela, Basilio B.
dc.contributor.author	Doallo, Ramón
dc.date.accessioned	2023-11-16T19:43:29Z
dc.date.available	2023-11-16T19:43:29Z
dc.date.issued	2019
dc.description.abstract	[Abstract]: Many problems of industrial and scientific interest require solving tridiagonal linear systems. This paper presents several implementations for the parallel solution of large tridiagonal systems on multi-core architectures, using the OmpSs programming model. The parallelization strategy combines two existing algorithms, parallel cyclic reduction (PCR) and Thomas. The Thomas algorithm, which cannot be parallelized, requires the fewest floating-point operations. The PCR algorithm is the most popular parallel method, but it is more computationally expensive than Thomas. The method proposed in this paper starts by applying the PCR algorithm to break down one large tridiagonal system into a set of smaller, independent ones. In a second step, these independent systems are solved concurrently using Thomas. The paper also contains an analytical study of the best point at which to switch from PCR to Thomas. In addition, the paper addresses the main performance issues of combining PCR and Thomas by proposing a set of alternative implementations, some of which involve algorithmic changes. The performance evaluation shows that the best implementation achieves a peak speedup of 4 with respect to the Intel MKL counterpart routine and 2.5 with respect to a single-threaded Thomas.	es_ES
dc.description.sponsorship	This work was supported in part by the European Union’s Horizon 2020 Framework Programme for Research and Innovation under the Specific Grant Agreements Human Brain Project SGA1 and Human Brain Project SGA2 under Grant 720270 and Grant 785907, in part by the Spanish Ministry of Economy and Competitiveness under the Project Computación de Altas Prestaciones VII under Grant TIN2015-65316-P, in part by the Departament d’Innovació, Universitats i Empresa de la Generalitat de Catalunya, under project MPEXPAR: Models de Programació i Entorns d’Execució Paral·lels under Grant 2014-SGR-1051, in part by the Juan de la Cierva under Grant IJCI-2017-33511, in part by Fujitsu under the Barcelona Supercomputing Center-Fujitsu Joint Project: Math Libraries Migration and Optimization, in part by the Ministerio de Economía, Industria y Competitividad of Spain, in part by the Fondo Europeo de Desarrollo Regional Funds of the European Union under Grant TIN2016-75845-P, in part by the Xunta de Galicia co-funded by the European Regional Development Fund (ERDF) under the Consolidation Programme of Competitive Reference Groups under Grant ED431C 2017/04, and in part by the Centro Singular de Investigación de Galicia accreditation 2016-2019 under Grant ED431G/01.	es_ES
dc.description.sponsorship	Xunta de Galicia; ED431C 2017/04	es_ES
dc.description.sponsorship	Xunta de Galicia; ED431G/01	es_ES
dc.description.sponsorship	Generalitat de Catalunya; 2014-SGR-1051	es_ES
dc.identifier.citation	P. Valero-Lara, D. Andrade, R. Sirvent, J. Labarta, B. B. Fraguela and R. Doallo, "A Fast Solver for Large Tridiagonal Systems on Multi-Core Processors (Lass Library)," in IEEE Access, vol. 7, pp. 23365-23378, 2019, doi: 10.1109/ACCESS.2019.2900122.	es_ES
dc.identifier.doi	10.1109/ACCESS.2019.2900122
dc.identifier.issn	2169-3536
dc.identifier.uri	http://hdl.handle.net/2183/34272
dc.language.iso	eng	es_ES
dc.publisher	Institute of Electrical and Electronics Engineers	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/MINECO/Plan Estatal de Investigación Científica y Técnica y de Innovación 2013-2016/TIN2016-75845-P/ES/NUEVOS DESAFIOS EN COMPUTACION DE ALTAS PRESTACIONES: DESDE ARQUITECTURAS HASTA APLICACIONES (II)/	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/AEI/ Plan Estatal de Investigación Científica y Técnica y de Innovación/IJCI-2017-33511/ES/	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/MINECO/Plan Estatal de Investigación Científica y Técnica y de Innovación 2013-2016/TIN2015-65316-P/ES/COMPUTACION DE ALTAS PRESTACIONES VII/	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/EC/H2020/720270	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/EC/H2020/785907	es_ES
dc.relation.uri	https://ieeexplore.ieee.org/document/8643931	es_ES
dc.rights	Atribución-NoComercial-SinDerivadas 3.0 España	es_ES
dc.rights.accessRights	open access	es_ES
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/es/	*
dc.subject	Tridiagonal solve	es_ES
dc.subject	Multi-core	es_ES
dc.subject	Auto-tuning	es_ES
dc.subject	OmpSs	es_ES
dc.subject	Mathematical model	es_ES
dc.subject	Linear systems	es_ES
dc.subject	Computational modeling	es_ES
dc.subject	Memory management	es_ES
dc.title	A Fast Solver for Large Tridiagonal Systems on Multi-Core Processors (Lass Library)	es_ES
dc.type	journal article	es_ES
dspace.entity.type	Publication
relation.isAuthorOfPublication	ba3b1a6d-65dd-4366-a7d4-f6c802c5f07a
relation.isAuthorOfPublication	7f5bae1c-08f6-4204-b22a-fbe20407a6e4
relation.isAuthorOfPublication	b3302f65-05d3-4b2c-b8b3-8503e58bba5e
relation.isAuthorOfPublication.latestForDiscovery	ba3b1a6d-65dd-4366-a7d4-f6c802c5f07a
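
The two-stage strategy summarized in the abstract (a few PCR steps to split the system, then Thomas on each resulting sub-system) can be illustrated with a minimal sequential sketch. This is not the Lass library code and does not use OmpSs; the function names `thomas`, `pcr_step` and `pcr_thomas` and the `steps` parameter are hypothetical, and the sketch assumes a well-conditioned (for example, diagonally dominant) system so that no pivoting is needed.

```python
def thomas(a, b, c, d):
    """Solve a tridiagonal system with sub-diagonal a, diagonal b,
    super-diagonal c and right-hand side d. a[0] and c[-1] are unused."""
    n = len(d)
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):  # forward elimination
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def pcr_step(a, b, c, d, s):
    """One PCR reduction with stride s: afterwards equation i couples
    only x[i - 2s], x[i] and x[i + 2s]."""
    n = len(d)
    na, nb, nc, nd = [0.0] * n, [0.0] * n, [0.0] * n, [0.0] * n
    for i in range(n):  # every row is independent of the others
        lo, hi = i - s, i + s
        alpha = -a[i] / b[lo] if lo >= 0 else 0.0
        gamma = -c[i] / b[hi] if hi < n else 0.0
        nb[i] = b[i]
        nd[i] = d[i]
        if lo >= 0:
            na[i] = alpha * a[lo]
            nb[i] += alpha * c[lo]
            nd[i] += alpha * d[lo]
        if hi < n:
            nc[i] = gamma * c[hi]
            nb[i] += gamma * a[hi]
            nd[i] += gamma * d[hi]
    return na, nb, nc, nd


def pcr_thomas(a, b, c, d, steps):
    """Apply `steps` PCR reductions, then solve the resulting
    independent sub-systems with Thomas (sequentially in this sketch)."""
    a, b, c, d = list(a), list(b), list(c), list(d)
    a[0] = 0.0   # no neighbour below the first row
    c[-1] = 0.0  # no neighbour above the last row
    n = len(d)
    s = 1
    for _ in range(steps):
        a, b, c, d = pcr_step(a, b, c, d, s)
        s *= 2
    x = [0.0] * n
    for r in range(min(s, n)):  # sub-system r holds rows r, r+s, r+2s, ...
        idx = list(range(r, n, s))
        sol = thomas([a[i] for i in idx], [b[i] for i in idx],
                     [c[i] for i in idx], [d[i] for i in idx])
        for i, v in zip(idx, sol):
            x[i] = v
    return x
```

With `steps=0` the function reduces to a plain Thomas solve; each additional step doubles the number of independent sub-systems, which is what would let the second stage run concurrently in a parallel setting.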

Files

Original bundle

Name: Valero_Lara_Pedro_2019_A_Fast_Solver_for_Large_Tridiagonal_Systems_on_Multi-Core_Processors_Lass_Library.pdf
Size: 5.9 MB
Format: Adobe Portable Document Format

Collections

Investigación (FIC)