Intelligent Document Retrieval: Exploiting Markup Structure

by Udo Kruschwitz

Overview

Collections of digital documents can nowadays be found everywhere in institutions, universities or companies. Examples are Web sites or intranets. But searching them for information can still be painful. Searches often return either large numbers of matches or no suitable matches at all.

Such document collections can vary a lot in size and how much structure they carry. What they have in common is that they typically do have some structure and that they cover a limited range of topics. The second point is significantly different from documents on the Web in general.

The type of search system that we propose in this book can suggest ways of refining or relaxing the query to assist a user in the search process. To suggest sensible query modifications, the system needs to know what the documents are about, which calls for explicit knowledge about the document collection encoded in some electronic form. Typically, however, such knowledge is not available.

This book describes how that knowledge can be constructed automatically.

This book:

- demonstrates how document markup structure can be used to construct domain models for collections of partially structured documents
- shows how such knowledge can be utilized when searching the document collections
- presents two implemented search systems which demonstrate the usefulness of this approach.

Editorial Reviews

From the Publisher
From the reviews:

"The main idea of this book, based on the author’s PhD thesis, is to use markup information as a series of cues to the significance of words and concepts in a text, thus enhancing the indexing of that text. The technique is developed for collections of texts with a specific focus, such as a Web site or a collection of documents … . The presented approach is attractive, because it can be adapted to different contexts in a straightforward manner … ." (D. T. Barnard, Computing Reviews, July, 2006)

Product Details

ISBN-13: 9789048169573
Publisher: Springer Netherlands
Publication date: 01/11/2011
Series: The Information Retrieval Series, #17
Edition description: Softcover reprint of hardcover 1st ed. 2005
Pages: 198
Product dimensions: 6.14(w) x 9.21(h) x 0.46(d)
