
About the book
Automatic fact retrieval from text documents is a crucial technology in the Information Age, aimed at helping users extract valuable information from extensive data sources such as intranets and the World Wide Web. Most existing systems focus on document retrieval; few are tailored to querying facts from the vast amount of information available online. Recent advances in Information Extraction (IE) have led to methods that automatically create extraction procedures, known as wrappers, which allow documents to be treated like relational databases. These wrappers are essential for future Intelligent Information Systems, allowing users to query, compare, and combine information from diverse textual sources. This thesis introduces a Logic Programming and Inductive Logic Programming (ILP) framework for supervised learning of wrappers using only positive examples. Unlike existing systems, this approach employs a purely logical bottom-up learning method under a new IE-ILP semantics. It presents three classes of ILP algorithms: two one-step learning algorithms, a set of iterative algorithms, and an algorithm that integrates clustering techniques with iterative ILP. The study explores various extraction tasks, defines wrapper classes, and presents three wrapper models based on different document representations. The learning algorithms and models are evaluated against standard test cases, showi
Machine learning of information extraction procedures, Bernd Thomas
- Year of publication: 2005
- Binding: softcover