An Inspector-Executor Algorithm for Irregular Assignment Parallelization
(Springer, 2004)
[Abstract] A loop with irregular assignment computations contains loop-carried output data dependences that can only be detected at run-time. In this paper, a load-balanced method based on the inspector-executor model is ...
A PVM Based Library for Sparse Matrix Factorizations
(Springer, 1998)
[Abstract] We present 3LM, a C Linked List Management Library for parallel sparse factorizations on a PVM environment which takes into account the fill-in, an important drawback of sparse computations. It is restricted to ...
Compiler support for parallel code generation through kernel recognition
(IEEE Computer Society, 2004-06-07)
[Abstract] Summary form only given. The automatic parallelization of loops that contain complex computations is still a challenge for current parallelizing compilers. The main limitations are related to the analysis of ...
Exploiting locality in the run-time parallelization of irregular loops
(CRC Press, LLC, 2002-12-10)
[Abstract] The goal of this work is the efficient parallel execution of loops with indirect array accesses, in order to be embedded in a parallelizing compiler framework. In this kind of loop pattern, dependences cannot ...
Evaluation of Java for General Purpose GPU Computing
(IEEE Computer Society, 2013-07-01)
[Abstract] The presence of many-core units as accelerators has been increasing due to their ability to improve the performance of highly parallel workloads. General Purpose GPU (GPGPU) computing has allowed the graphical ...
Integrating the common information model with MDS4
(IEEE Computer Society, 2008-10-31)
[Abstract] The management and monitoring of static and dynamic resources is a key issue in grid environments. Information models are an abstract representation of software and hardware aspects of these resources, a common ...
Clupiter: a Raspberry Pi mini-supercomputer for educational purposes
(Institute of Electrical and Electronics Engineers, 2024)
[Abstract] The main objective of this work is to bring supercomputing and parallel processing closer to non-specialized audiences by building a Raspberry Pi cluster, called Clupiter, which emulates the operation of a ...
VENOM: A Vectorized N:M Format for Unleashing the Power of Sparse Tensor Cores
(Association for Computing Machinery, 2023-11)
[Abstract] The increasing success and scaling of Deep Learning models demand higher computational efficiency and power. Sparsification can lead to both smaller models and higher compute efficiency, and accelerated ...
Speed and accuracy improvement of higher-order epistasis detection on CUDA-enabled GPUs
(Springer, 2017)
[Abstract] The discovery of higher-order epistatic interactions is an important task in the field of genome-wide association studies, which allows for the identification of complex interaction patterns between multiple ...
Robust step counting for inertial navigation with mobile phones
(MDPI AG, 2018-09-19)
[Abstract] Mobile phones are increasingly used for purposes that have nothing to do with phone calls or simple data transfers, and one such use is indoor inertial navigation. Nevertheless, the development of a standalone ...