    • A pipeline architecture for feature-based unsupervised clustering using multivariate time series from HPC jobs

      Enes, Jonatan; Expósito, Roberto R.; Fuentes Rodríguez, Jose; López Cacheiro, Javier; Touriño, Juan (Elsevier B.V., 2023-05)
      [Abstract] Time series are key across industrial and research areas for their ability to model behaviour across time, making them ideal for a wide range of use cases such as event monitoring, trend prediction or anomaly ...
    • Accelerating binary biclustering on platforms with CUDA-enabled GPUs 

      González-Domínguez, Jorge; Expósito, Roberto R. (Elsevier Ltd, 2018)
      [Abstract] Data mining is nowadays essential in many scientific fields to extract valuable information from large input datasets and transform it into an understandable structure. For instance, biclustering techniques are ...
    • Analysis and evaluation of MapReduce solutions on an HPC cluster 

      Veiga, Jorge; Expósito, Roberto R.; Taboada, Guillermo L.; Touriño, Juan (Pergamon Press, 2016-02)
      [Abstract] The ever-growing needs of Big Data applications are demanding challenging capabilities which cannot be handled easily by traditional systems, and thus more and more organizations are adopting High Performance ...
    • Analysis of I/O Performance on an Amazon EC2 Cluster Compute and High I/O Platform 

      Expósito, Roberto R.; Taboada, Guillermo L.; Ramos Garea, Sabela; González-Domínguez, Jorge; Touriño, Juan; Doallo, Ramón (Springer Netherlands, 2013-12)
      [Abstract] Cloud computing is currently being explored by the scientific community to assess its suitability for High Performance Computing (HPC) environments. In this novel paradigm, compute and storage resources, as well ...
    • BDEv 3.0: energy efficiency and microarchitectural characterization of Big Data processing frameworks 

      Veiga, Jorge; Enes, Jonatan; Expósito, Roberto R.; Touriño, Juan (Elsevier BV * North-Holland, 2018-09)
      [Abstract] As the size of Big Data workloads keeps increasing, the evaluation of distributed frameworks becomes a crucial task in order to identify potential performance bottlenecks that may delay the processing of large ...
    • BDWatchdog: real-time monitoring and profiling of Big Data applications and frameworks 

      Enes, Jonatan; Expósito, Roberto R.; Touriño, Juan (Elsevier BV * North-Holland, 2018-10)
      [Abstract] Current Big Data applications are characterized by a heavy use of system resources (e.g., CPU, disk) generally distributed across a cluster. To effectively improve their performance there is a critical need for ...
    • Big Data-Oriented PaaS Architecture with Disk-as-a-Resource Capability and Container-Based Virtualization 

      López Cacheiro, Javier; Expósito, Roberto R.; Touriño, Juan; Enes, Jonatan (Springer Netherlands, 2018-12)
      [Abstract] With the increasing adoption of Big Data technologies as basic tools for the ongoing Digital Transformation, there is a high demand for data-intensive applications. In order to efficiently execute such applications, ...
    • BigDEC: A multi-algorithm Big Data tool based on the k-mer spectrum method for scalable short-read error correction 

      Expósito, Roberto R.; González-Domínguez, Jorge (Elsevier, 2024-05)
      [Abstract] Despite the significant improvements in both throughput and cost provided by modern Next-Generation Sequencing (NGS) platforms, sequencing errors in NGS datasets can still degrade the quality of downstream ...
    • CUDA-JMI: Acceleration of feature selection on heterogeneous systems 

      González-Domínguez, Jorge; Expósito, Roberto R.; Bolón-Canedo, Verónica (Elsevier, 2020-01)
      [Abstract] Feature selection is a crucial step nowadays in machine learning and data analytics to remove irrelevant and redundant characteristics and thus to provide fast and reliable analyses. Many research works have ...
    • Design of Scalable Java Communication Middleware for Multi-Core Systems 

      Ramos Garea, Sabela; Taboada, Guillermo L.; Expósito, Roberto R.; Touriño, Juan; Doallo, Ramón (Oxford University Press, 2013-02-01)
      [Abstract] This paper presents smdev, a shared memory communication middleware for multi-core systems. smdev provides a simple and powerful messaging application program interface that is able to exploit the underlying ...
    • Design of scalable Java message-passing communications over InfiniBand 

      Expósito, Roberto R.; Taboada, Guillermo L.; Touriño, Juan; Doallo, Ramón (Springer New York LLC, 2012-07)
      [Abstract] This paper presents ibvdev, a scalable and efficient low-level Java message-passing communication device over InfiniBand. The continuous increase in the number of cores per processor underscores the need for ...
    • Enhancing in-memory Efficiency for MapReduce-based Data Processing 

      Veiga Fachal, Jorge; Expósito, Roberto R.; Taboada, Guillermo L.; Touriño, Juan (Academic Press, 2018-10)
      [Abstract] As the memory capacity of computational systems increases, the in-memory data management of Big Data processing frameworks becomes more crucial for performance. This paper analyzes and improves the memory ...
    • Evaluation of messaging middleware for high-performance cloud computing 

      Expósito, Roberto R.; Taboada, Guillermo L.; Ramos Garea, Sabela; Touriño, Juan; Doallo, Ramón (Springer UK, 2013-12)
      [Abstract] Cloud computing is posing several challenges, such as security, fault tolerance, access interface singularity, and network constraints, both in terms of latency and bandwidth. In this scenario, the performance ...
    • FastMPJ: a scalable and efficient Java message-passing library 

      Expósito, Roberto R.; Ramos Garea, Sabela; Taboada, Guillermo L.; Touriño, Juan; Doallo, Ramón (Springer New York LLC, 2014)
      [Abstract] The performance and scalability of communications are key for high performance computing (HPC) applications in the current multi-core era. Despite the significant benefits (e.g., productivity, portability, ...
    • Flame-MR: An event-driven architecture for MapReduce applications 

      Veiga, Jorge; Expósito, Roberto R.; Taboada, Guillermo L.; Touriño, Juan (Elsevier BV * North-Holland, 2016)
      [Abstract] Nowadays, many organizations analyze their data with the MapReduce paradigm, most of them using the popular Apache Hadoop framework. As the data size managed by MapReduce applications is steadily increasing, the ...
    • General‐purpose computation on GPUs for high performance cloud computing 

      Expósito, Roberto R.; Taboada, Guillermo L.; Ramos Garea, Sabela; Touriño, Juan; Doallo, Ramón (John Wiley & Sons Ltd., 2013-08)
      [Abstract] Cloud computing is offering new approaches for High Performance Computing (HPC) as it provides dynamically scalable resources as a service over the Internet. In addition, General‐Purpose computation on Graphical ...
    • HSRA: Hadoop-based spliced read aligner for RNA sequencing data 

      Expósito, Roberto R.; González-Domínguez, Jorge; Touriño, Juan (Public Library of Science, 2018-07-31)
      [Abstract] Nowadays, the analysis of transcriptome sequencing (RNA-seq) data has become the standard method for quantifying the levels of gene expression. In RNA-seq experiments, the mapping of short reads to a reference ...
    • Java in the High Performance Computing arena: Research, practice and experience 

      Taboada, Guillermo L.; Ramos Garea, Sabela; Expósito, Roberto R.; Touriño, Juan; Doallo, Ramón (Elsevier BV, 2013-05-01)
      [Abstract] The rising interest in Java for High Performance Computing (HPC) is based on the appealing features of this language for programming multi-core cluster architectures, particularly the built-in networking and ...
    • Low‐latency Java communication devices on RDMA‐enabled networks 

      Expósito, Roberto R.; Taboada, Guillermo L.; Ramos Garea, Sabela; Touriño, Juan; Doallo, Ramón (John Wiley & Sons Ltd., 2015)
      [Abstract] Providing high‐performance inter‐node communication is a key capability for running high performance computing applications efficiently on parallel architectures. In fact, current systems deployments are aggregating ...
    • MarDRe: efficient MapReduce-based removal of duplicate DNA reads in the cloud 

      Expósito, Roberto R.; Veiga, Jorge; González-Domínguez, Jorge; Touriño, Juan (Oxford University Press, 2017)
      [Abstract] This article presents MarDRe, a de novo cloud-ready duplicate and near-duplicate removal tool that can process single- and paired-end reads from FASTQ/FASTA datasets. MarDRe takes advantage of the widely adopted ...