    • Analysis and evaluation of MapReduce solutions on an HPC cluster

      Veiga, Jorge; Expósito, Roberto R.; Taboada, Guillermo L.; Touriño, Juan (Pergamon Press, 2016-02)
      [Abstract] The ever growing needs of Big Data applications are demanding challenging capabilities which cannot be handled easily by traditional systems, and thus more and more organizations are adopting High Performance ...
    • Assessment, Design and Implementation of a Private Cloud for MapReduce Applications 

      Salgueiro, M.; González, Patricia; Fernández Pena, Tomás; Cabaleiro, José Carlos (Stowarzyszenie Komputerowej Nauki o Materialach i Inzynierii Powierzchni, Association of Computational Materials Science and Surface Engineering, 2014)
      [Abstract] Scientific computation and data intensive analyses are ever more frequent. On the one hand, the MapReduce programming model has gained a lot of attention for its applicability in large parallel data analyses and ...
    • Design and Implementation of MapReduce using the PGAS Programming Model with UPC 

      Teijeiro Barjas, Carlos; Taboada, Guillermo L.; Touriño, Juan; Doallo, Ramón (IEEE Computer Society, 2012-01-03)
      [Abstract] MapReduce is a powerful tool for processing large data sets used by many applications running in distributed environments. However, despite the increasing number of computationally intensive problems that require ...
    • Enabling Hardware Affinity in JVM-Based Applications: A Case Study for Big Data 

      Expósito, Roberto R.; Veiga, Jorge; Touriño, Juan (Springer, 2020)
      [Abstract] Java has been the backbone of Big Data processing for more than a decade due to its interesting features such as object orientation, cross-platform portability and good programming productivity. In fact, most ...
    • Enhancing in-memory Efficiency for MapReduce-based Data Processing 

      Veiga Fachal, Jorge; Expósito, Roberto R.; Taboada, Guillermo L.; Touriño, Juan (Academic Press, 2018-10)
      [Abstract] As the memory capacity of computational systems increases, the in-memory data management of Big Data processing frameworks becomes more crucial for performance. This paper analyzes and improves the memory ...
    • Evaluation of Parallel Differential Evolution Implementations on MapReduce and Spark 

      Teijeiro, Diego; Pardo, Xoán C.; Penas, David R.; González, Patricia; Banga, Julio R.; Doallo, Ramón (Springer, 2017-09)
      [Abstract] Global optimization problems arise in many areas of science and engineering, computational and systems biology and bioinformatics among them. Many research efforts have focused on developing parallel metaheuristics ...
    • Flame-MR: An event-driven architecture for MapReduce applications 

      Veiga, Jorge; Expósito, Roberto R.; Taboada, Guillermo L.; Touriño, Juan (Elsevier BV * North-Holland, 2016)
      [Abstract] Nowadays, many organizations analyze their data with the MapReduce paradigm, most of them using the popular Apache Hadoop framework. As the data size managed by MapReduce applications is steadily increasing, the ...
    • Implementing cloud-based parallel metaheuristics: an overview 

      González, Patricia; Pardo, Xoán C.; Doallo, Ramón; Banga, Julio R. (Universidad Nacional de la Plata - Facultad de Informatica, 2018-12-12)
      [Abstract] Metaheuristics are among the most popular methods for solving hard global optimization problems in many areas of science and engineering. Their parallel implementation applying HPC techniques is a common ...
    • MarDRe: efficient MapReduce-based removal of duplicate DNA reads in the cloud 

      Expósito, Roberto R.; Veiga, Jorge; González-Domínguez, Jorge; Touriño, Juan (Oxford University Press, 2017)
      [Abstract] This article presents MarDRe, a de novo cloud-ready duplicate and near-duplicate removal tool that can process single- and paired-end reads from FASTQ/FASTA datasets. MarDRe takes advantage of the widely adopted ...
    • MREv: An Automatic MapReduce Evaluation Tool for Big Data Workloads 

      Veiga, Jorge; Expósito, Roberto R.; Taboada, Guillermo L.; Touriño, Juan (Elsevier, 2015)
      [Abstract] The popularity of Big Data computing models like MapReduce has caused the emergence of many frameworks oriented to High Performance Computing (HPC) systems. The suitability of each one to a particular use case ...
    • Performance Evaluation of Data-Intensive Computing Applications on a Public IaaS Cloud 

      Expósito, Roberto R.; Taboada, Guillermo L.; Ramos Garea, Sabela; Touriño, Juan; Doallo, Ramón (Oxford University Press, 2016)
      [Abstract] The advent of cloud computing technologies, which dynamically provide on-demand access to computational resources over the Internet, is offering new possibilities to many scientists and researchers. Nowadays, ...
    • RGen: Data Generator for Benchmarking Big Data Workloads 

      Pérez-Jove, Rubén; Expósito, Roberto R.; Touriño, Juan (MDPI, 2021)
      [Abstract] This paper presents RGen, a parallel data generator for benchmarking Big Data workloads, which integrates existing features and new functionalities in a standalone tool. The main functionalities developed in ...