ITPilot: a toolkit for industrial-strength Web data extraction
| UDC.coleccion | Research | es_ES |
| UDC.conferenceTitle | The 2005 IEEE/WIC/ACM International Conference on Web Intelligence (WI'05) | es_ES |
| UDC.departamento | Ciencias da Computación e Tecnoloxías da Información | es_ES |
| UDC.grupoInv | Telemática | es_ES |
| dc.contributor.author | Pan Bermúdez, Alberto | |
| dc.contributor.author | Raposo Santiago, Juan | |
| dc.contributor.author | Álvarez Díaz, Manuel | |
| dc.contributor.author | Montoto, Paula | |
| dc.contributor.author | Losada, José | |
| dc.contributor.author | Hidalgo, Justo | |
| dc.date.accessioned | 2025-05-07T09:48:44Z | |
| dc.date.available | 2025-05-07T09:48:44Z | |
| dc.date.issued | 2005-10-17 | |
| dc.description | © 2005 IEEE. This version of the paper has been accepted for publication. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. | es_ES |
| dc.description | Conference held from 19 to 22 September 2005, Compiègne, France | es_ES |
| dc.description.abstract | [Abstract]: In recent years, many research systems have been proposed to perform data extraction and automation tasks on Web sources. Since most of today's Web sources are "human-readable" but not "machine-readable", these systems must address a number of difficult challenges, such as dealing with complex navigation sequences, extracting data from HTML pages and reacting to source changes. Denodo Corporation has developed ITPilot, an industrial-strength solution that allows complex "wrappers" for Web sources to be graphically generated and automatically maintained. This paper presents the architecture and the basic ideas "behind the scenes" in ITPilot. | es_ES |
| dc.identifier.citation | A. Pan, J. Raposo, M. Alvarez, P. Montoto, J. Losada, and J. Hidalgo, "ITPilot: A Toolkit for Industrial-Strength Web Data Extraction," in The 2005 IEEE/WIC/ACM International Conference on Web Intelligence (WI’05), Compiègne, France: IEEE, 2005, pp. 798-801. doi: 10.1109/WI.2005.85 | es_ES |
| dc.identifier.isbn | 0-7695-2415-X | |
| dc.identifier.uri | http://hdl.handle.net/2183/41924 | |
| dc.language.iso | eng | es_ES |
| dc.publisher | IEEE | es_ES |
| dc.relation.uri | https://doi.org/10.1109/WI.2005.85 | es_ES |
| dc.rights | Copyright © 2005, IEEE | es_ES |
| dc.rights.accessRights | open access | es_ES |
| dc.subject | Data mining | es_ES |
| dc.subject | Books | es_ES |
| dc.subject | Navigation | es_ES |
| dc.subject | Web services | es_ES |
| dc.subject | HTML | es_ES |
| dc.subject | Java | es_ES |
| dc.subject | Computer languages | es_ES |
| dc.subject | Automation | es_ES |
| dc.subject | Computer architecture | es_ES |
| dc.subject | World Wide Web | es_ES |
| dc.title | ITPilot: a toolkit for industrial-strength Web data extraction | es_ES |
| dc.type | conference output | es_ES |
| dc.type.hasVersion | AM | es_ES |
| dspace.entity.type | Publication | |
| relation.isAuthorOfPublication | 79d8a555-94f9-4edc-b6d1-ad514f81941d | |
| relation.isAuthorOfPublication | 76f0a84a-79bb-4d46-8de5-a960191fb925 | |
| relation.isAuthorOfPublication | 8fb413a7-b40a-48ad-861f-985d0492628e | |
| relation.isAuthorOfPublication | 6711ba39-80ba-4e57-8881-db47fc022efd | |
| relation.isAuthorOfPublication | 400c236a-710a-4526-b9f3-f496a36ccfe0 | |
| relation.isAuthorOfPublication.latestForDiscovery | 79d8a555-94f9-4edc-b6d1-ad514f81941d |
Files
Original bundle
- Name: Pan_Alberto_2005_ITPilot.pdf
- Size: 785.46 KB
- Format: Adobe Portable Document Format
- Description: Accepted Manuscript

