    • Characterization of message-passing overhead on the AP3000 multicomputer 

      Touriño, Juan; Doallo, Ramón (IEEE, 2001-09)
      [Abstract] The performance of the communication primitives of parallel computers is critical for the overall system performance. The characterization of the communication overhead is very important to estimate the global ...
    • Clupiter: a Raspberry Pi mini-supercomputer for educational purposes 

      Rodríguez-Iglesias, Alonso; Martín, María J.; Touriño, Juan (Institute of Electrical and Electronics Engineers, 2024)
      [Abstract] The main objective of this work is to bring supercomputing and parallel processing closer to non-specialized audiences by building a Raspberry Pi cluster, called Clupiter, which emulates the operation of a ...
    • Communication avoiding and overlapping for numerical linear algebra 

      Georganas, Evangelos; González-Domínguez, Jorge; Solomonik, Edgar; Zheng, Yili; Touriño, Juan; Yelick, Katherine (IEEE Computer Society, 2013-02-25)
      [Abstract] To efficiently scale dense linear algebra problems to future exascale systems, communication cost must be avoided or overlapped. Communication-avoiding 2.5D algorithms improve scalability by reducing inter-processor ...
    • Compiler support for parallel code generation through kernel recognition 

      Arenaz Silva, Manuel; Touriño, Juan; Doallo, Ramón (IEEE Computer Society, 2004-06-07)
      [Abstract] Summary form only given. The automatic parallelization of loops that contain complex computations is still a challenge for current parallelizing compilers. The main limitations are related to the analysis of ...
    • Compiler-Assisted Checkpointing of Parallel Codes: The Cetus and LLVM Experience 

      Rodríguez, Gabriel; Martín, María J.; González, Patricia; Touriño, Juan; Doallo, Ramón (Springer New York LLC, 2013)
      [Abstract] With the evolution of high-performance computing, parallel applications have developed an increasing necessity for fault tolerance, most commonly provided by checkpoint and restart techniques. Checkpointing tools ...
    • CPPC: a compiler‐assisted tool for portable checkpointing of message‐passing applications 

      Rodríguez, Gabriel; Martín, María J.; González, Patricia; Touriño, Juan; Doallo, Ramón (John Wiley & Sons Ltd., 2010-11-19)
      [Abstract] With the evolution of high‐performance computing toward heterogeneous, massively parallel systems, parallel applications have developed new checkpoint and restart necessities. Whether due to a failure in the ...
    • CUDA acceleration of MI-based feature selection methods 

      Beceiro, Bieito; González-Domínguez, Jorge; Morán-Fernández, Laura; Bolón-Canedo, Verónica; Touriño, Juan (Elsevier, 2024-08)
      [Abstract] Feature selection algorithms are necessary nowadays for machine learning as they are capable of removing irrelevant and redundant information to reduce the dimensionality of the data and improve the quality of ...
    • Design and Implementation of an extended collectives library for Unified Parallel C 

      Teijeiro Barjas, Carlos; Taboada, Guillermo L.; Touriño, Juan; Doallo, Ramón; Mouriño, José C.; Mallón, Damián A.; Wibecan, Brian (Springer New York LLC, 2013)
      [Abstract] Unified Parallel C (UPC) is a parallel extension of ANSI C based on the Partitioned Global Address Space (PGAS) programming model, which provides a shared memory view that simplifies code development while it ...
    • Design and Implementation of MapReduce using the PGAS Programming Model with UPC 

      Teijeiro Barjas, Carlos; Taboada, Guillermo L.; Touriño, Juan; Doallo, Ramón (IEEE Computer Society, 2012-01-03)
      [Abstract] MapReduce is a powerful tool for processing large data sets used by many applications running in distributed environments. However, despite the increasing number of computationally intensive problems that require ...
    • Design of efficient Java message-passing collectives on multi-core clusters 

      Taboada, Guillermo L.; Ramos Garea, Sabela; Touriño, Juan; Doallo, Ramón (Springer New York LLC, 2011-02)
      [Abstract] This paper presents a scalable and efficient Message-Passing in Java (MPJ) collective communication library for parallel computing on multi-core architectures. The continuous increase in the number of cores per ...
    • Design of Scalable Java Communication Middleware for Multi-Core Systems 

      Ramos Garea, Sabela; Taboada, Guillermo L.; Expósito, Roberto R.; Touriño, Juan; Doallo, Ramón (Oxford University Press, 2013-02-01)
      [Abstract] This paper presents smdev, a shared memory communication middleware for multi-core systems. smdev provides a simple and powerful messaging application program interface that is able to exploit the underlying ...
    • Design of scalable Java message-passing communications over InfiniBand 

      Expósito, Roberto R.; Taboada, Guillermo L.; Touriño, Juan; Doallo, Ramón (Springer New York LLC, 2012-07)
      [Abstract] This paper presents ibvdev, a scalable and efficient low-level Java message-passing communication device over InfiniBand. The continuous increase in the number of cores per processor underscores the need for ...
    • Device level communication libraries for high‐performance computing in Java 

      Taboada, Guillermo L.; Touriño, Juan; Doallo, Ramón; Shafi, Aamir; Baker, Mark; Carpenter, Bryan (John Wiley & Sons Ltd., 2011-12-25)
      [Abstract] Since its release, the Java programming language has attracted considerable attention from the high‐performance computing (HPC) community because of its portability, high programming productivity, and built‐in ...
    • Efficient Java Communication Protocols on High-speed Cluster Interconnects 

      Taboada, Guillermo L.; Touriño, Juan; Doallo, Ramón (IEEE Computer Society, 2007-02-26)
      [Abstract] This paper presents communication strategies for achieving efficient parallel and distributed Java applications on clusters with high-speed interconnects. Communication performance is critical for the overall ...
    • Efficient Parallel Numerical Solver for the Elastohydrodynamic Reynolds–Hertz Problem 

      Arenaz Silva, Manuel; Doallo, Ramón; Touriño, Juan; Regueiro, Carlos V. (Elsevier BV * North-Holland, 2001-12-01)
      [Abstract] This work presents a parallel version of a complex numerical algorithm for solving an elastohydrodynamic piezoviscous lubrication problem studied in tribology. The numerical algorithm combines regula falsi, fixed ...
    • Enabling Hardware Affinity in JVM-Based Applications: A Case Study for Big Data 

      Expósito, Roberto R.; Veiga, Jorge; Touriño, Juan (Springer, 2020)
      [Abstract] Java has been the backbone of Big Data processing for more than a decade due to its interesting features such as object orientation, cross-platform portability and good programming productivity. In fact, most ...
    • Enhancing in-memory Efficiency for MapReduce-based Data Processing 

      Veiga Fachal, Jorge; Expósito, Roberto R.; Taboada, Guillermo L.; Touriño, Juan (Academic Press, 2018-10)
      [Abstract] As the memory capacity of computational systems increases, the in-memory data management of Big Data processing frameworks becomes more crucial for performance. This paper analyzes and improves the memory ...
    • Evaluation of Java for General Purpose GPU Computing 

      Docampo, Jorge; Ramos Garea, Sabela; Taboada, Guillermo L.; Expósito, Roberto R.; Touriño, Juan; Doallo, Ramón (IEEE Computer Society, 2013-07-01)
      [Abstract] The presence of many-core units as accelerators has been increasing due to their ability to improve the performance of highly parallel workloads. General Purpose GPU (GPGPU) computing has allowed the graphical ...
    • Evaluation of messaging middleware for high-performance cloud computing 

      Expósito, Roberto R.; Taboada, Guillermo L.; Ramos Garea, Sabela; Touriño, Juan; Doallo, Ramón (Springer UK, 2013-12)
      [Abstract] Cloud computing is posing several challenges, such as security, fault tolerance, access interface singularity, and network constraints, both in terms of latency and bandwidth. In this scenario, the performance ...
    • Exploiting locality in the run-time parallelization of irregular loops 

      Martín, María J.; Singh, David E.; Touriño, Juan; Rivera, Francisco F. (CRC Press, LLC, 2002-12-10)
      [Abstract] The goal of this work is the efficient parallel execution of loops with indirect array accesses, in order to be embedded in a parallelizing compiler framework. In this kind of loop pattern, dependences cannot ...