ScalaParBiBit: Scaling the Binary Biclustering in Distributed-Memory Systems

UDC.coleccion: Investigación
UDC.departamento: Enxeñaría de Computadores
UDC.endPage: 2268
UDC.grupoInv: Grupo de Arquitectura de Computadores (GAC)
UDC.issue: 3
UDC.journalTitle: Cluster Computing
UDC.startPage: 2249
UDC.volume: 24
dc.contributor.author: Fraguela, Basilio B.
dc.contributor.author: Andrade, Diego
dc.contributor.author: González-Domínguez, Jorge
dc.date.accessioned: 2021-08-24T09:48:01Z
dc.date.embargoEndDate: 2022-03-19
dc.date.embargoLift: 2022-03-19
dc.date.issued: 2021-03-19
dc.description.abstract: [Abstract] Biclustering is a data mining technique that finds groups of rows and columns that are highly correlated in a 2D dataset. Although several software applications exist to perform biclustering, most of them suffer from a high computational complexity that prevents their use on large datasets. In this work we present ScalaParBiBit, a parallel tool to find biclusters on binary data, which are quite common in many research fields such as text mining, marketing, and bioinformatics. ScalaParBiBit takes advantage of the special characteristics of these binary datasets, as well as of an efficient parallel implementation and algorithm, to accelerate the biclustering procedure in distributed-memory systems. The experimental evaluation shows that our tool is significantly faster and more scalable than the state-of-the-art tool ParBiBit in a cluster with 32 nodes and 768 cores. Our tool, together with its reference manual, is freely available at https://github.com/fraguela/ScalaParBiBit
dc.description.sponsorship: This research was supported by the Ministry of Science and Innovation of Spain (TIN2016-75845-P and PID2019-104184RB-I00, AEI/FEDER/EU, 10.13039/501100011033), and by the Xunta de Galicia, co-funded by the European Regional Development Fund (ERDF) under the Consolidation Programme of Competitive Reference Groups (ED431C 2017/04). We also acknowledge the support from the Centro Singular de Investigación de Galicia “CITIC”, funded by Xunta de Galicia and the European Union (European Regional Development Fund - Galicia 2014-2020 Program), by grant ED431G 2019/01. We also acknowledge the Centro de Supercomputación de Galicia (CESGA) for the usage of their resources.
dc.description.sponsorship: Xunta de Galicia; ED431C 2017/04
dc.description.sponsorship: Xunta de Galicia; ED431G 2019/01
dc.identifier.citation: Fraguela, B.B., Andrade, D. & González-Domínguez, J. ScalaParBiBit: scaling the binary biclustering in distributed-memory systems. Cluster Comput 24, 2249–2268 (2021). https://doi.org/10.1007/s10586-021-03261-z
dc.identifier.doi: 10.1007/s10586-021-03261-z
dc.identifier.uri: http://hdl.handle.net/2183/28278
dc.language.iso: eng
dc.publisher: SpringerLink
dc.relation.projectID: info:eu-repo/grantAgreement/MINECO/Plan Estatal de Investigación Científica y Técnica y de Innovación 2013-2016/TIN2016-75845-P/ES/NUEVOS DESAFIOS EN COMPUTACION DE ALTAS PRESTACIONES: DESDE ARQUITECTURAS HASTA APLICACIONES (II)
dc.relation.projectID: info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2019-104184RB-I00/ES/DESAFIOS ACTUALES EN HPC: ARQUITECTURAS, SOFTWARE Y APLICACIONES
dc.relation.uri: https://doi.org/10.1007/s10586-021-03261-z
dc.rights.accessRights: open access
dc.subject: Biclustering
dc.subject: High performance computing
dc.subject: Multicore clusters
dc.subject: MPI
dc.subject: Master–slave
dc.title: ScalaParBiBit: Scaling the Binary Biclustering in Distributed-Memory Systems
dc.type: journal article
dspace.entity.type: Publication
relation.isAuthorOfPublication: 7f5bae1c-08f6-4204-b22a-fbe20407a6e4
relation.isAuthorOfPublication: ba3b1a6d-65dd-4366-a7d4-f6c802c5f07a
relation.isAuthorOfPublication: 84d13059-7f4b-4cb5-ac65-0e07a77271f0
relation.isAuthorOfPublication.latestForDiscovery: 7f5bae1c-08f6-4204-b22a-fbe20407a6e4

Files

Original bundle

Name: Fraguela_Rodriguez_Basilio_2021_ScalaParBiBit_ Scaling_Binary_Biclustering.pdf
Size: 606.46 KB
Format: Adobe Portable Document Format