Semantic Relation Extraction. Resources, Tools and Strategies

Use this link to cite
http://hdl.handle.net/2183/19316
Collections
- Investigación (FFIL) [840]
Metadata
Title
Semantic Relation Extraction. Resources, Tools and Strategies
Author(s)
Marcos Garcia
Date
2016-07
Bibliographic citation
Marcos Garcia, Semantic Relation Extraction. Resources, Tools and Strategies, in João Silva, Ricardo Ribeiro, Paulo Quaresma, André Adami, António Branco (eds.), Computational Processing of the Portuguese Language. 12th International Conference, PROPOR 2016, Tomar, Portugal, July 13-15, 2016, Proceedings, volume 9727 of Lecture Notes in Artificial Intelligence, pp. 141-152, Springer, 2016.
Abstract
Relation extraction is a subtask of information extraction that aims at obtaining instances of semantic relations present in texts. This information can be arranged in machine-readable formats, useful for several applications that need structured semantic knowledge. The work presented in this paper explores different strategies to automate the extraction of semantic relations from texts in Portuguese, Galician and Spanish. Both machine learning (distant-supervised and supervised) and rule-based techniques are investigated, and the impact of the different levels of linguistic knowledge is analyzed for the various approaches. Regarding domains, the experiments focus on the extraction of encyclopedic knowledge, by means of the development of biographical relation classifiers (in a closed domain) and the evaluation of an open information extraction tool. To implement the extraction systems, several natural language processing tools have been built for the three research languages, from sentence splitting and tokenization modules to part-of-speech taggers, named entity recognizers and coreference resolution systems. Furthermore, several lexica and corpora have been compiled and enriched with different levels of linguistic annotation, which are useful for both training and testing probabilistic and symbolic models. As a result of this work, new resources and tools are available for the automated processing of texts in Portuguese, Galician and Spanish.
Keywords
Information extraction
Natural language processing
Named entity recognition
Part-of-speech tagging
Coreference resolution
ISSN
0302-9743
ISBN
978-3-319-41551-2