Use this link to cite:
https://hdl.handle.net/2183/47072
GALIASdoc: Automatic Intermediate Language Generator for fast Syntactic Analysis over massive document sets
Identifiers
Publication date
Authors
Trabazo-Sardón, Diego
Silvelo, Arturo
Advisors
Other responsibilities
Journal Title
Bibliographic citation
Type of academic work
Academic degree
Abstract
GALIASdoc is a software system for extracting relevant information from large volumes of documents that share common formats but come from heterogeneous sources. The extracted data are ready for use by other applications, such as content management systems (CMS), enterprise resource planning (ERP) systems, databases, and similar platforms. The system identifies the document model in order to locate the semantic information the document contains. During ingestion, an initial plain-text version is obtained, applying optical character recognition (OCR) when necessary. The model includes geometric data defining the areas of interest in the document. The registered software has been in operational use since 2020 under two exploitation contracts signed with companies in the ICT sector.
Description
Intellectual property registration (software)
Editor version
Rights
Right holders: Universidade da Coruña (100%)







