Performance Evaluation of Big Data Frameworks for Large-Scale Data Analytics

Veiga, Jorge; Expósito, Roberto R.; Pardo, Xoán C.; Taboada, Guillermo L.; Touriño, Juan

dc.contributor.author	Veiga, Jorge
dc.contributor.author	Expósito, Roberto R.
dc.contributor.author	Pardo, Xoán C.
dc.contributor.author	Taboada, Guillermo L.
dc.contributor.author	Touriño, Juan
dc.date.accessioned	2019-07-02T14:28:27Z
dc.date.available	2019-07-02T14:28:27Z
dc.date.issued	2017-02-06
dc.identifier.citation	J. Veiga, R. R. Expósito, X. C. Pardo, G. L. Taboada and J. Touriño, "Performance evaluation of big data frameworks for large-scale data analytics," 2016 IEEE International Conference on Big Data (Big Data), Washington, DC, 2016, pp. 424-431.	es_ES
dc.identifier.uri	http://hdl.handle.net/2183/23359
dc.description	This is a post-peer-review, pre-copyedit version of an article published. The final authenticated version is available online at: http://dx.doi.org/10.1109/BigData.2016.7840633	es_ES
dc.description.abstract	[Abstract] The increasing adoption of Big Data analytics has led to a high demand for efficient technologies in order to manage and process large datasets. Popular MapReduce frameworks such as Hadoop are being replaced by emerging ones like Spark or Flink, which improve both the programming APIs and performance. However, few works have focused on comparing these frameworks. This paper addresses this issue by performing a comparative evaluation of Hadoop, Spark and Flink using representative Big Data workloads and considering factors like performance and scalability. Moreover, the behavior of these frameworks has been characterized by modifying some of the main parameters of the workloads such as HDFS block size, input data size, interconnect network or thread configuration. The analysis of the results has shown that replacing Hadoop with Spark or Flink can lead to a reduction in execution times by 77% and 70% on average, respectively, for non-sort benchmarks.	es_ES
dc.description.sponsorship	Ministerio de Economía y Competitividad; TIN2013-42148-P	es_ES
dc.description.sponsorship	Ministerio de Educación; FPU14/02805	es_ES
dc.language.iso	eng	es_ES
dc.publisher	IEEE Computer Society	es_ES
dc.relation.uri	http://dx.doi.org/10.1109/BigData.2016.7840633	es_ES
dc.subject	Sparks	es_ES
dc.subject	Benchmark testing	es_ES
dc.subject	Big Data	es_ES
dc.subject	Generators	es_ES
dc.subject	Programming	es_ES
dc.subject	Clustering algorithms	es_ES
dc.subject	Computational modeling	es_ES
dc.title	Performance Evaluation of Big Data Frameworks for Large-Scale Data Analytics	es_ES
dc.type	info:eu-repo/semantics/conferenceObject	es_ES
dc.rights.access	info:eu-repo/semantics/openAccess	es_ES
UDC.startPage	424	es_ES
UDC.endPage	431	es_ES
dc.identifier.doi	10.1109/BigData.2016.7840633
UDC.conferenceTitle	2016 IEEE International Conference on Big Data (Big Data)	es_ES

Files in this item

Name: J.Veiga_2016_Performance_Evalu ...
Size: 289.9Kb
Format: PDF

This item appears in the following collection(s)

GI-GAC - Congresos, conferencias, etc. [55]
