Buscar
Mostrando ítems 1-10 de 11
SeQual: Big Data Tool to Perform Quality Control and Data Preprocessing of Large NGS Datasets
(Institute of Electrical and Electronics Engineers, 2020-08-07)
[Abstract]
This paper presents SeQual, a scalable tool to efficiently perform quality control of large genomic datasets. Our tool currently supports more than 30 different operations (e.g., filtering, trimming, formatting) ...
Toxo: A Library for Calculating Penetrance Tables of High-Order Epistasis Models
(BioMed Central Ltd., 2020-04-09)
[Abstract]
Background
Epistasis is defined as the interaction between different genes when expressing a specific phenotype. The most common way to characterize an epistatic relationship is using a penetrance table, which ...
ScalaParBiBit: Scaling the Binary Biclustering in Distributed-Memory Systems
(SpringerLink, 2021-03-19)
[Abstract] Biclustering is a data mining technique that allows us to find groups of rows and columns that are highly correlated in a 2D dataset. Although there exist several software applications to perform biclustering, ...
Evaluation of Existing Methods for High-Order Epistasis Detection
(Institute of Electrical and Electronics Engineers, 2020-10-15)
[Abstract]
Finding epistatic interactions among loci when expressing a phenotype is a widely employed strategy to understand the genetic architecture of complex traits in GWAS. The abundance of methods dedicated to the ...
Parallel-FST: A feature selection library for multicore clusters
(Elsevier, 2022-11)
[Abstract]: Feature selection is a subfield of machine learning focused on reducing the dimensionality of datasets by performing a computationally intensive process. This work presents Parallel-FST, a publicly available ...
MPI-dot2dot: A Parallel Tool to Find DNA Tandem Repeats on Multicore Clusters
(Springer, 2022)
[Abstract] Tandem Repeats (TRs) are segments that occur several times in a DNA sequence, and each copy is adjacent to other. In the last few years, TRs have gained significant attention as they are thought to be related ...
PyToxo: a Python tool for calculating penetrance tables of high-order epistasis models
(BMC, 2022)
[Abstract] Background
Epistasis is the interaction between different genes when expressing a certain phenotype. If epistasis involves more than two loci it is called high-order epistasis. High-order epistasis is an area ...
A SIMD Algorithm for the Detection of Epistatic Interactions of Any Order
(Elsevier, 2022)
[Abstract] Epistasis is a phenomenon in which a phenotype outcome is determined by the interaction of genetic variation at two or more loci and it cannot be attributed to the additive combination of effects corresponding ...
CUDA-JMI: Acceleration of feature selection on heterogeneous systems
(Elsevier, 2020-01)
[Abstract]: Feature selection is a crucial step nowadays in machine learning and data analytics to remove irrelevant and redundant characteristics and thus to provide fast and reliable analyses. Many research works have ...
SMusket: Spark-based DNA error correction on distributed-memory systems
(Elsevier B.V., 2020)
[Abstract]: Next-Generation Sequencing (NGS) technologies have revolutionized genomics research over the last decade, bringing new opportunities for scientists to perform groundbreaking biological studies. Error correction ...