Use this link to cite:
http://hdl.handle.net/2183/34023
Prototipo de sistema de reconocimiento de entidades para la extracción de información en fuentes no estructuradas
Authors
Prado-Valiño, Francisco
Other responsibilities
Universidade da Coruña. Facultade de Informática
Abstract
[Abstract]: Biomedical research requires the study of huge amounts of unstructured textual information, which is very time-consuming and resource-intensive for medical experts. For this reason, there is great interest in developing systems that automate these tasks through Text Mining. One of the key tasks of Text Mining is Entity Recognition, which extracts entities of interest from texts and classifies them into pre-established categories. Our work applies these techniques to automate the detection of entities in unstructured clinical texts. We focus on the field of antimicrobial resistance, specifically on bacterial resistance to antibiotics. This work is part of the GRALENIA project, which aims to improve the digital management of antimicrobial resistance in hospitals. The main objective is to implement a prototype of the Entity Recognition system that pre-identifies and pre-tags the symptomatic expressions of interest (symptoms, diseases, etc.) in clinical reports. Human labellers will use the results in later stages to train machine learning models that identify the expressions of interest more robustly. Because data and an evaluation corpus specific to the scope of the parent project are scarce, we add as a further objective the study of this lack of resources and of possible alternative solutions, which will later have to be adapted to the real data.
Rights
Attribution 3.0 Spain