Cross-Language Information Retrieval

Overview

The search for information is no longer limited to the user's native language; it increasingly extends to other languages. This gives rise to the problem of cross-language information retrieval (CLIR), whose goal is to find relevant information written in a language different from that of the query. In addition to the problems of monolingual information retrieval (IR), translation is the key problem in CLIR: either the query or the documents must be translated from one language to another. However, this translation problem is not identical to full-text machine translation (MT): the goal is not to produce a human-readable translation, but a translation suitable for finding relevant documents. Specific translation methods are thus required. The goal of this book is to provide a comprehensive description of the specific problems arising in CLIR, the solutions proposed in this area, and the problems that remain. The book starts with a general description of the monolingual IR and CLIR problems. Different classes of approaches to translation are then presented: approaches using an MT system, dictionary-based translation, and approaches based on parallel and comparable corpora. In addition, the typical retrieval effectiveness of the different approaches is compared. It will be shown that translation approaches specifically designed for CLIR can rival and even outperform high-quality MT systems. Finally, the book offers a look into the future that draws a strong parallel between query expansion in monolingual IR and query translation in CLIR, suggesting that many approaches developed in monolingual IR can be adapted to CLIR.
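To make the idea of query translation concrete, here is a minimal sketch of dictionary-based query translation followed by a simple term-overlap ranking. It is an illustration only, not the book's method: the toy French-to-English dictionary, the even split of weight among candidate translations, and the names `BILINGUAL_DICT`, `translate_query`, `score`, and `retrieve` are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical toy French-to-English dictionary (an assumption for illustration).
# A source word may have several candidate translations.
BILINGUAL_DICT = {
    "recherche": ["search", "research"],
    "information": ["information"],
    "traduction": ["translation"],
}


def translate_query(query, dictionary):
    """Map each query word to all of its translations.

    The weight of a source word is split evenly among its translations,
    so an ambiguous word does not dominate the translated query.
    Words missing from the dictionary are kept unchanged.
    """
    weights = Counter()
    for word in query.lower().split():
        translations = dictionary.get(word, [word])
        for target in translations:
            weights[target] += 1.0 / len(translations)
    return dict(weights)


def score(doc, weighted_query):
    """Sum of query-term weights times term frequency in the document."""
    term_freq = Counter(doc.lower().split())
    return sum(w * term_freq[t] for t, w in weighted_query.items())


def retrieve(query, docs, dictionary):
    """Rank documents by score and drop those with no matching term."""
    weighted_query = translate_query(query, dictionary)
    scored = [(score(d, weighted_query), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for s, d in scored if s > 0]
```

Note that the translated query is a weighted bag of words rather than a readable sentence, which matches the point above that translation for retrieval need not be human-readable. Calling `retrieve("recherche information", docs, BILINGUAL_DICT)` on a small list of English strings returns the matching documents in decreasing order of score.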

The book can be used as an introduction to CLIR. Advanced readers can also find more technical details and discussions of the remaining research challenges. It is suitable for new researchers who intend to carry out research on CLIR.

Table of Contents

Preface

1 Introduction 1

1.1 General IR Problems 1

1.2 General IR Approaches 2

1.2.1 IR Models 3

1.2.1.1 Boolean Models 3

1.2.1.2 Vector Space Model 4

1.2.1.3 Probabilistic Models 5

1.2.1.4 Statistical Language Models 6

1.2.2 Query Expansion 8

1.2.3 System Evaluation 10

1.3 Language Problems in IR 12

1.3.1 European Languages 12

1.3.1.1 Word Stemming 12

1.3.1.2 Decompounding 12

1.3.2 East Asian Languages 14

1.3.2.1 Chinese and Word Segmentation 14

1.3.2.2 Japanese and Korean 17

1.3.3 Other Languages 17

1.4 The Problems of Cross-Language Information Retrieval 18

1.4.1 Query Translation vs. Document Translation 19

1.4.2 Using Pivot Language and Interlingua 20

1.5 Approaches to Translation in CLIR 21

1.6 The Need for Cross-Language and Multilingual IR 23

1.7 The History of CLIR 24

2 Using Manually Constructed Translation Systems and Resources for CLIR 29

2.1 Machine Translation 29

2.1.1 Rule-Based MT 30

2.1.2 Statistical MT 32

2.2 Basic utilization of MT in CLIR 37

2.2.1 Rule-Based MT 39

2.2.2 Statistical MT 41

2.2.3 Unknown Word 41

2.3 Open the Box of MT 44

2.4 Dictionary-Based Translation for CLIR 45

2.4.1 Basic Approaches 46

2.4.2 The Term Weighting Problem 47

2.4.3 Coverage of the Dictionary 49

2.4.4 Translation Ambiguity 50

2.4.5 Selection of Translation Words 50

2.4.6 Other Related Approaches 53

3 Translation Based on Parallel and Comparable Corpora 57

3.1 Parallel Corpora 57

3.2 Paragraph/Sentence Alignment 60

3.3 Utilization of Translation Models in CLIR 63

3.4 Embedding Translation Models into CLIR Models 70

3.5 Alternative Approaches using Parallel Corpora 75

3.5.1 Exploiting a Parallel Corpus by Pseudo-Relevance Feedback 75

3.5.2 Using Latent Semantic Indexing (LSI) 76

3.5.3 Using Comparable Corpora 78

3.6 Discussions on CLIR Methods and Resources 80

3.7 Mining for Translation Resources and Relations 81

3.7.1 Mining for Parallel Texts 81

3.7.2 Transliteration 85

3.7.3 Mining Translations using Hyperlinks 88

3.7.4 Mining Translations from Monolingual Web Pages 90

4 Other Methods to Improve CLIR 95

4.1 Pre- and Post-Translation Expansion 95

4.2 Fuzzy Matching 96

4.3 Combining Translations 97

4.4 Transitive Translation 98

4.5 Integrating Monolingual and Translingual Relations 100

4.6 Discussions 103

5 A Look into the Future: Toward a Unified View of Monolingual IR and CLIR? 105

5.1 What has been Achieved? 105

5.2 Inspiring from Monolingual IR 106

5.2.1 Parallel Between Query Expansion and Query Translation 106

5.2.2 Inspiring Query Translation from Query Expansion: An Example 109

References 113

Author Biography 125
