In-Transit Molecular Dynamics Analysis with Apache Flink

UDC.coleccion: Research [es_ES]
UDC.conferenceTitle: ISAV '18: Proceedings of the Workshop on In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization [es_ES]
UDC.departamento: Enxeñaría de Computadores [es_ES]
UDC.endPage: 32 [es_ES]
UDC.grupoInv: Grupo de Arquitectura de Computadores (GAC) [es_ES]
UDC.startPage: 25 [es_ES]
dc.contributor.author: Zanúz, Henrique C.
dc.contributor.author: Raffin, Bruno
dc.contributor.author: Mures, Omar A.
dc.contributor.author: Padrón, Emilio J.
dc.date.accessioned: 2022-12-30T09:01:12Z
dc.date.available: 2022-12-30T09:01:12Z
dc.date.issued: 2018-11
dc.description: SC18 workshop [es_ES]
dc.description.abstract: [Abstract] This paper proposes an on-line parallel analytics framework that processes and stores, in transit, all the data generated by a Molecular Dynamics (MD) simulation run, using staging nodes in the same cluster that executes the simulation. Implementing and deploying such a parallel workflow with standard HPC tools, while managing problems such as data partitioning and load balancing, can be a hard task for scientists. We propose to leverage Apache Flink, a scalable stream processing engine from the Big Data domain, in this HPC context. Flink lets users program analyses in a simple window-based map/reduce model, while the runtime takes care of deployment, load balancing and fault tolerance. We build a complete in-transit analytics workflow that connects an MD simulation to Apache Flink and to a distributed database, Apache HBase, to persist all the desired data. To demonstrate the expressivity of this programming model and its suitability for HPC scientific environments, we implemented two common analytics from the MD field. We assessed the performance of this framework and conclude that it can handle simulations of the sizes used in the literature, while providing an effective and versatile tool for scientists to easily incorporate on-line parallel analytics into their current workflows. [es_ES]
dc.description.sponsorship: Ministerio de Economía y Competitividad; TIN2016-75845-P [es_ES]
dc.description.sponsorship: Xunta de Galicia; ED431C2017/04 [es_ES]
dc.identifier.citation: Henrique C. Zanúz, Bruno Raffin, Omar A. Mures, and Emilio J. Padrón. 2018. In-transit molecular dynamics analysis with Apache Flink. In Proceedings of the Workshop on In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization (ISAV '18). Association for Computing Machinery, New York, NY, USA, 25–32. https://doi.org/10.1145/3281464.3281469 [es_ES]
dc.identifier.doi: 10.1145/3281464.3281469
dc.identifier.isbn: 978-1-4503-6579-6
dc.identifier.uri: http://hdl.handle.net/2183/32255
dc.language.iso: eng [es_ES]
dc.publisher: Association for Computing Machinery (ACM) [es_ES]
dc.relation.uri: https://doi.org/10.1145/3281464.3281469 [es_ES]
dc.rights.accessRights: open access [es_ES]
dc.subject: In-transit analysis [es_ES]
dc.subject: Apache Flink [es_ES]
dc.subject: Molecular dynamics [es_ES]
dc.subject: Online analysis [es_ES]
dc.title: In-Transit Molecular Dynamics Analysis with Apache Flink [es_ES]
dc.type: conference output [es_ES]
dspace.entity.type: Publication
relation.isAuthorOfPublication: 532a32fe-d0a1-4634-84b5-d8f87c2ccae3
relation.isAuthorOfPublication: bdccb1db-e727-4b63-b2ca-1941cc096c00
relation.isAuthorOfPublication.latestForDiscovery: 532a32fe-d0a1-4634-84b5-d8f87c2ccae3

Files

Original bundle

Name: Padron_Emilio_2018_In-transit_analysis_Apache_flink.pdf
Size: 1.22 MB
Format: Adobe Portable Document Format
Description: Main article with artifact description appendix