Showing items 1-10 of 12
Regional Finite-State Error Repair
(Springer-Verlag, 2004)
[Abstract] We describe an algorithm to deal with error repair over finite-state architectures. Such a technique is of interest in spelling correction as well as approximate string matching in a variety of applications ...
Spelling correction on technical documents
(2005)
[Abstract] We describe a novel approach to spelling correction applied to technical documents, a task that requires a number of specific properties such as efficiency, safety and maintenance. In contrast to previous work, ...
Formal methods of tokenization for part-of-speech tagging
(Springer-Verlag, 2002)
[Abstract] One of the most important prior tasks for robust part-of-speech tagging is the correct tokenization or segmentation of the texts. This task can involve processes which are much more complex than the simple ...
Integrating external dictionaries into Part-of-speech taggers
(2001)
[Abstract] The highest performances in part-of-speech tagging have been obtained by using stochastic methods, such as hidden Markov models. The running parameters of a hidden Markov model for tagging can be estimated from ...
Tokenization and proper noun recognition for information retrieval
(IEEE Computer Society Press, 2005-11-21)
[Abstract] In this paper we consider a set of natural language processing techniques that can be used to analyze large amounts of text, focusing on the advanced tokenizer which accounts for a number of complex linguistic ...
Regional versus global finite-state error repair
(Springer-Verlag, 2005)
[Abstract] We focus on the domain of a regional least-cost strategy in order to illustrate the viability of non-global repair models over finite-state architectures. Our interest is justified by the difficulty, shared by ...
Practical NLP-Based Text Indexing
(Springer-Verlag, 2002)
A common solution for tokenization and part-of-speech tagging: one-pass Viterbi algorithm vs. Iterative approaches
(Springer-Verlag, 2002)
[Abstract] Current taggers assume that input texts are already tokenized, i.e. correctly segmented into tokens or high-level information units that identify each individual component of the texts. This working hypothesis is ...