ParDRe: faster parallel duplicated reads removal tool for sequencing studies

Use this link to cite
http://hdl.handle.net/2183/20962
Collections
- Research (FIC) [1685]
Metadata
Title
ParDRe: faster parallel duplicated reads removal tool for sequencing studies
Date
2016
Citation
Jorge González-Domínguez, Bertil Schmidt; ParDRe: faster parallel duplicated reads removal tool for sequencing studies, Bioinformatics, Volume 32, Issue 10, 15 May 2016, Pages 1562–1564, https://doi.org/10.1093/bioinformatics/btw038
Abstract
Summary: Current next-generation sequencing technologies often generate duplicated or near-duplicated reads that, depending on the application scenario, provide no interesting biological information but can increase the memory requirements and computational time of downstream analysis. In this work we present ParDRe, a de novo parallel tool that removes duplicated and near-duplicated reads by clustering Single-End or Paired-End sequences from fasta or fastq files. It uses a novel bitwise approach to compare the suffixes of DNA strings and employs hybrid MPI/multithreading to reduce runtime on multicore systems. We show that ParDRe is up to 27.29 times faster than Fulcrum (a representative state-of-the-art tool) on a platform with two 8-core Sandy-Bridge processors.
Availability and implementation: Source code in C++ and MPI running on Linux systems, as well as a reference manual, are available at https://sourceforge.net/projects/pardre/
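The bitwise suffix comparison mentioned in the summary can be illustrated with a minimal sketch. It is not taken from the ParDRe source code: the 2-bit base encoding, the function names (`pack`, `mismatches`, `near_duplicate`) and the parameters `prefix_len` and `max_mismatches` are all assumptions made for illustration. The sketch assumes that two reads are candidates for the same cluster when their prefixes are identical, and that their suffixes are then compared with XOR to count differing bases.

```python
# Hypothetical 2-bit encoding per base (an assumption, not ParDRe's actual layout).
CODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}


def pack(seq: str) -> int:
    """Pack a DNA string into one integer, two bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | CODE[base]  # raises KeyError on non-ACGT symbols such as N
    return value


def mismatches(a: str, b: str) -> int:
    """Count differing bases between two equal-length DNA strings using XOR."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    diff = pack(a) ^ pack(b)
    # Keep one bit per base: the low bit of each pair is set if either bit differs.
    mask = int("01" * len(a), 2) if a else 0
    collapsed = (diff | (diff >> 1)) & mask
    return bin(collapsed).count("1")


def near_duplicate(a: str, b: str, prefix_len: int, max_mismatches: int) -> bool:
    """Treat reads as near-duplicates if their prefixes match exactly and
    their suffixes differ in at most max_mismatches bases (assumed criterion)."""
    if len(a) != len(b) or a[:prefix_len] != b[:prefix_len]:
        return False
    return mismatches(a[prefix_len:], b[prefix_len:]) <= max_mismatches
```

For example, `near_duplicate("ACGTACGT", "ACGTACGA", prefix_len=4, max_mismatches=1)` compares only the last four bases bitwise once the first four are found to be identical. A production tool would typically pack bases into fixed-width machine words and use a hardware population count rather than Python integers.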
Keywords
ParDRe
Parallel tool
DNA strings
Hybrid MPI/multithreading
Description
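This is a pre-copyedited, author-produced version of an article accepted for publication in Bioinformatics following peer review. The version of record (Jorge González-Domínguez, Bertil Schmidt; ParDRe: faster parallel duplicated reads removal tool for sequencing studies, Bioinformatics, Volume 32, Issue 10, 15 May 2016, Pages 1562–1564) is available online at: https://doi.org/10.1093/bioinformatics/btw038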
Publisher's version
ISSN
1367-4803
1367-4811