Showing items 1-10 of 15
SeQual: Big Data Tool to Perform Quality Control and Data Preprocessing of Large NGS Datasets
(Institute of Electrical and Electronics Engineers, 2020-08-07)
[Abstract]
This paper presents SeQual, a scalable tool to efficiently perform quality control of large genomic datasets. Our tool currently supports more than 30 different operations (e.g., filtering, trimming, formatting) ...
Toxo: A Library for Calculating Penetrance Tables of High-Order Epistasis Models
(BioMed Central Ltd., 2020-04-09)
[Abstract]
Background
Epistasis is defined as the interaction between different genes when expressing a specific phenotype. The most common way to characterize an epistatic relationship is using a penetrance table, which ...
Acceleration of a Feature Selection Algorithm Using High Performance Computing
(MDPI AG, 2020-09-01)
[Abstract]
Feature selection is a subfield of data analysis that is focused on reducing the dimensionality of datasets, so that subsequent analyses over them can be performed in affordable execution times while keeping the same ...
ScalaParBiBit: Scaling the Binary Biclustering in Distributed-Memory Systems
(SpringerLink, 2021-03-19)
[Abstract] Biclustering is a data mining technique that allows us to find groups of rows and columns that are highly correlated in a 2D dataset. Although there exist several software applications to perform biclustering, ...
Evaluation of Existing Methods for High-Order Epistasis Detection
(Institute of Electrical and Electronics Engineers, 2020-10-15)
[Abstract]
Finding epistatic interactions among loci when expressing a phenotype is a widely employed strategy to understand the genetic architecture of complex traits in GWAS. The abundance of methods dedicated to the ...
Parallel-FST: A feature selection library for multicore clusters
(Elsevier, 2022-11)
[Abstract] Feature selection is a subfield of machine learning focused on reducing the dimensionality of datasets by performing a computationally intensive process. This work presents Parallel-FST, a publicly available ...
PATO: genome-wide prediction of lncRNA-DNA triple helices
(Oxford University Press, 2023-03)
[Abstract] Motivation: Long non-coding RNA (lncRNA) plays a key role in many biological processes. For instance, lncRNA regulates chromatin using different molecular mechanisms, including direct RNA-DNA hybridization via ...
MPI-dot2dot: A Parallel Tool to Find DNA Tandem Repeats on Multicore Clusters
(Springer, 2022)
[Abstract] Tandem Repeats (TRs) are segments that occur several times in a DNA sequence, with each copy adjacent to the next. In the last few years, TRs have gained significant attention as they are thought to be related ...
PyToxo: a Python tool for calculating penetrance tables of high-order epistasis models
(BMC, 2022)
[Abstract] Background
Epistasis is the interaction between different genes when expressing a certain phenotype. If epistasis involves more than two loci it is called high-order epistasis. High-order epistasis is an area ...
A SIMD Algorithm for the Detection of Epistatic Interactions of Any Order
(Elsevier, 2022)
[Abstract] Epistasis is a phenomenon in which a phenotype outcome is determined by the interaction of genetic variation at two or more loci and it cannot be attributed to the additive combination of effects corresponding ...