HSRA: Hadoop-based spliced read aligner for RNA sequencing data

View/Open
R.R.Exósito_2018_HSRA_Hadoop-based_Spliced_Read_Aligner_for_RNA_Sequencing_Data.pdf (3.934Mb)
Use this link to cite
http://hdl.handle.net/2183/21813
Collections
  • Investigación (FIC) [1685]
Title
HSRA: Hadoop-based spliced read aligner for RNA sequencing data
Author(s)
Expósito, Roberto R.
González-Domínguez, Jorge
Touriño, Juan
Date
2018-07-31
Citation
Expósito RR, González-Domínguez J, Touriño J (2018) HSRA: Hadoop-based spliced read aligner for RNA sequencing data. PLoS ONE 13(7): e0201483. https://doi.org/10.1371/journal.pone.0201483
Abstract
The analysis of transcriptome sequencing (RNA-seq) data has become the standard method for quantifying gene expression levels. In RNA-seq experiments, mapping short reads to a reference genome or transcriptome is a crucial step and remains one of the most time-consuming. With the steady development of Next Generation Sequencing (NGS) technologies, unprecedented amounts of genomic data introduce significant challenges in terms of storage, processing and downstream analysis. As cost and throughput continue to improve, there is a growing need for new software solutions that minimize the impact of increasing data volume on RNA read alignment. In this work we introduce HSRA, a Big Data tool that uses the MapReduce programming model to extend the multithreading capabilities of a state-of-the-art spliced read aligner for RNA-seq data (HISAT2) to distributed memory systems such as multi-core clusters or cloud platforms. HSRA is built upon the Hadoop MapReduce framework and supports both single- and paired-end reads from FASTQ/FASTA datasets, providing output alignments in SAM format. Its design has been carefully optimized to avoid the main limitations and major causes of inefficiency found in previous Big Data mapping tools, which cannot fully exploit the raw performance of the underlying aligner. On a 16-node multi-core cluster, HSRA is on average 2.3 times faster than previous Hadoop-based tools. Source code in Java as well as a user’s guide are publicly available for download at http://hsra.dec.udc.es.
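The general idea of distributing read alignment with a map-style model can be illustrated with a minimal Python sketch. It is not HSRA's implementation (which is written in Java on Hadoop); it only shows how a FASTQ input might be cut into splits on record boundaries so that each split could be aligned independently. The function names (`fastq_records`, `split_fastq`, `align_splits`) and the `align_split` callback are hypothetical, and the sketch assumes the common 4-line FASTQ layout with no multi-line sequences.

```python
from typing import Callable, Iterable, Iterator, List

Record = List[str]  # one FASTQ record: header, sequence, separator, qualities


def fastq_records(lines: Iterable[str]) -> Iterator[Record]:
    """Group raw FASTQ lines into 4-line records, checking the '@' and '+' markers."""
    record: Record = []
    for line in lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            if not record[0].startswith("@") or not record[2].startswith("+"):
                raise ValueError(f"malformed FASTQ record: {record[0]!r}")
            yield record
            record = []
    if record:
        raise ValueError("truncated FASTQ record at end of input")


def split_fastq(lines: Iterable[str], records_per_split: int) -> Iterator[List[Record]]:
    """Yield contiguous splits that never cut a record in half."""
    if records_per_split < 1:
        raise ValueError("records_per_split must be at least 1")
    split: List[Record] = []
    for record in fastq_records(lines):
        split.append(record)
        if len(split) == records_per_split:
            yield split
            split = []
    if split:  # last, possibly shorter, split
        yield split


def align_splits(
    splits: Iterable[List[Record]],
    align_split: Callable[[List[Record]], List[str]],
) -> List[str]:
    """Apply a (hypothetical) per-split aligner and concatenate outputs in order.

    Runs sequentially here; in a cluster each split would go to its own map task.
    """
    output: List[str] = []
    for split in splits:
        output.extend(align_split(split))
    return output
```

In a real deployment the `align_split` callback would hand each split to the underlying aligner and return its SAM lines; here it is left as a parameter so the splitting logic can be exercised on its own.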
Keywords
Sequence alignment
Data processing
Genome analysis
RNA
Sequencing
Memory
RNA analysis
Preprocessing
RNA alignment
 
Publisher's version
https://doi.org/10.1371/journal.pone.0201483
ISSN
1932-6203
