dc.contributor.author: Georganas, Evangelos
dc.contributor.author: González-Domínguez, Jorge
dc.contributor.author: Solomonik, Edgar
dc.contributor.author: Zheng, Yili
dc.contributor.author: Touriño, Juan
dc.contributor.author: Yelick, Katherine
dc.date.accessioned: 2019-07-03T17:30:20Z
dc.date.available: 2019-07-03T17:30:20Z
dc.date.issued: 2013-02-25
dc.identifier.citation: E. Georganas, J. Gonzalez-Dominguez, E. Solomonik, Y. Zheng, J. Tourino and K. Yelick, "Communication avoiding and overlapping for numerical linear algebra," SC '12: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, Salt Lake City, UT, 2012, pp. 1-11.
dc.identifier.other: INSPEC Accession Number: 13372346
dc.identifier.uri: http://hdl.handle.net/2183/23395
dc.description: This is a post-peer-review, pre-copyedit version. The final authenticated version is available online at: http://dx.doi.org/10.1109/SC.2012.32
dc.description.abstract: [Abstract] To efficiently scale dense linear algebra problems to future exascale systems, communication cost must be avoided or overlapped. Communication-avoiding 2.5D algorithms improve scalability by reducing inter-processor data transfer volume at the cost of extra memory usage. Communication overlap attempts to hide messaging latency by pipelining messages and overlapping with computational work. We study the interaction and compatibility of these two techniques for two matrix multiplication algorithms (Cannon and SUMMA), triangular solve, and Cholesky factorization. For each algorithm, we construct a detailed performance model that considers both critical path dependencies and idle time. We give novel implementations of 2.5D algorithms with overlap for each of these problems. Our software employs UPC, a partitioned global address space (PGAS) language that provides fast one-sided communication. We show communication avoidance and overlap provide a cumulative benefit as core counts scale, including results using over 24K cores of a Cray XE6 system.
dc.description.sponsorship: Office of Science of the U.S. Department of Energy; DE-AC02-05CH11231
dc.description.sponsorship: Office of Science of the U.S. Department of Energy; DARPA HR0011-10-9-0008
dc.description.sponsorship: Ministerio de Ciencia e Innovación; TIN2010-16735
dc.description.sponsorship: Ministerio de Educación; AP2008-01578
dc.description.sponsorship: Krell Department of Energy Computational Science Graduate Fellowship; DE-FG02-7ER25308
dc.language.iso: eng
dc.publisher: IEEE Computer Society
dc.relation.uri: http://dx.doi.org/10.1109/SC.2012.32
dc.subject: Program processors
dc.subject: Bandwidth
dc.subject: Partitioning algorithms
dc.subject: Linear algebra
dc.subject: Message systems
dc.subject: Hardware
dc.subject: Layout
dc.title: Communication avoiding and overlapping for numerical linear algebra
dc.type: info:eu-repo/semantics/conferenceObject
dc.rights.access: info:eu-repo/semantics/openAccess
UDC.startPage: 1
UDC.endPage: 11
dc.identifier.doi: 10.1109/SC.2012.32
UDC.conferenceTitle: SC '12: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
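
Illustrative note: the abstract names Cannon's algorithm as one of the two matrix multiplication algorithms studied. The sketch below is a sequential, single-process simulation of the classic 2D Cannon scheme on a q x q grid of blocks. It is offered only to make the block-shifting pattern concrete. It is not the authors' UPC implementation, and it models neither the 2.5D data replication nor the communication overlap described in the abstract. All function names (`cannon`, `split`, `join`, `matmul_add`) are hypothetical.

```python
def matmul_add(c, a, b):
    """Accumulate c += a * b for square blocks stored as lists of rows."""
    n = len(a)
    for i in range(n):
        for k in range(n):
            aik = a[i][k]
            for j in range(n):
                c[i][j] += aik * b[k][j]


def split(m, q):
    """Cut an n x n matrix into a q x q grid of (n/q) x (n/q) blocks."""
    s = len(m) // q
    return [[[row[j * s:(j + 1) * s] for row in m[i * s:(i + 1) * s]]
             for j in range(q)] for i in range(q)]


def join(blocks):
    """Reassemble a q x q grid of blocks into one matrix."""
    q = len(blocks)
    s = len(blocks[0][0])
    return [sum((blocks[i][j][r] for j in range(q)), [])
            for i in range(q) for r in range(s)]


def cannon(a, b, q):
    """Multiply a by b with Cannon's algorithm on a simulated q x q grid."""
    n = len(a)
    assert n % q == 0, "matrix size must be divisible by the grid size"
    s = n // q
    A, B = split(a, q), split(b, q)
    # Initial skew: block row i of A shifts left by i positions,
    # block column j of B shifts up by j positions.
    A = [[A[i][(j + i) % q] for j in range(q)] for i in range(q)]
    B = [[B[(i + j) % q][j] for j in range(q)] for i in range(q)]
    C = [[[[0] * s for _ in range(s)] for _ in range(q)] for _ in range(q)]
    for _ in range(q):
        # Local multiply-accumulate on every simulated processor (i, j).
        for i in range(q):
            for j in range(q):
                matmul_add(C[i][j], A[i][j], B[i][j])
        # Stands in for communication: A moves one step left, B one step up.
        A = [[A[i][(j + 1) % q] for j in range(q)] for i in range(q)]
        B = [[B[(i + 1) % q][j] for j in range(q)] for i in range(q)]
    return join(C)
```

In a real distributed run, each of the two list rebuilds at the end of the loop would be a neighbor-to-neighbor message exchange; that exchange is the cost that communication-avoiding and overlapping techniques target.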

