Multilingual Natural Language Processing Applications: From Theory to Practice

Overview

Multilingual Natural Language Processing Applications is the first comprehensive single-source guide to building robust and accurate multilingual NLP systems. Edited by two leading experts, it integrates cutting-edge advances with practical solutions drawn from extensive field experience.

Part I introduces the core concepts and theoretical foundations of modern multilingual natural language processing, presenting today’s best practices for understanding word and document structure, analyzing syntax, modeling language, recognizing entailment, and detecting redundancy.

Part II thoroughly addresses the practical considerations associated with building real-world applications, including information extraction, machine translation, information retrieval/search, summarization, question answering, distillation, processing pipelines, and more.

This book contains important new contributions from leading researchers at IBM, Google, Microsoft, Thomson Reuters, BBN, CMU, University of Edinburgh, University of Washington, University of North Texas, and others.

Coverage includes

  • Core NLP problems, and today’s best algorithms for attacking them
  • Processing the diverse morphologies present in the world’s languages
  • Uncovering syntactical structure, parsing semantics, using semantic role labeling, and scoring grammaticality
  • Recognizing inferences, subjectivity, and opinion polarity
  • Managing key algorithmic and design tradeoffs in real-world applications
  • Extracting information via mention detection, coreference resolution, and events
  • Building large-scale systems for machine translation, information retrieval, and summarization
  • Answering complex questions through distillation and other advanced techniques
  • Creating dialog systems that leverage advances in speech recognition, synthesis, and dialog management
  • Constructing common infrastructure for multiple multilingual text processing applications

This book will be invaluable for all engineers, software developers, researchers, and graduate students who want to process large quantities of text in multiple languages, in any environment: government, corporate, or academic.


Product Details

  • ISBN-13: 9780137151448
  • Publisher: IBM Press
  • Publication date: 5/24/2012
  • Series: IBM Press Series
  • Edition description: New Edition
  • Edition number: 1
  • Pages: 640
  • Sales rank: 1,445,406
  • Product dimensions: 7.37 in. (w) x 9.50 in. (h) x 1.45 in. (d)

Meet the Authors

Daniel M. Bikel is a senior research scientist at Google, developing new methods for NLP and speech recognition. While at IBM, he architected the distillation system for IBM’s GALE multilingual information extraction and question-answering system. While pursuing his doctorate at Penn, he built the first extensible multilingual syntactic parsing engine.

Imed Zitouni is a senior research scientist at IBM. He has led IBM’s Arabic information extraction and data resources efforts since 2004. He previously led both DIALOCA’s Speech/NLP group and Bell Labs/Alcatel-Lucent’s language modeling and call routing activities. His work involves machine translation, NLP, and spoken dialog systems.


Table of Contents

Preface xxi

Acknowledgments xxv

About the Authors xxvii

Part I: In Theory 1

Chapter 1: Finding the Structure of Words 3

1.1 Words and Their Components 4

1.2 Issues and Challenges 8

1.3 Morphological Models 15

1.4 Summary 22

Chapter 2: Finding the Structure of Documents 29

2.1 Introduction 29

2.2 Methods 33

2.3 Complexity of the Approaches 40

2.4 Performances of the Approaches 41

2.5 Features 41

2.6 Processing Stages 48

2.7 Discussion 48

2.8 Summary 49

Chapter 3: Syntax 57

3.1 Parsing Natural Language 57

3.2 Treebanks: A Data-Driven Approach to Syntax 59

3.3 Representation of Syntactic Structure 63

3.4 Parsing Algorithms 70

3.5 Models for Ambiguity Resolution in Parsing 80

3.6 Multilingual Issues: What Is a Token? 87

3.7 Summary 92

Chapter 4: Semantic Parsing 97

4.1 Introduction 97

4.2 Semantic Interpretation 98

4.3 System Paradigms 101

4.4 Word Sense 102

4.5 Predicate-Argument Structure 118

4.6 Meaning Representation 147

4.7 Summary 152

Chapter 5: Language Modeling 169

5.1 Introduction 169

5.2 n-Gram Models 170

5.3 Language Model Evaluation 170

5.4 Parameter Estimation 171

5.5 Language Model Adaptation 176

5.6 Types of Language Models 178

5.7 Language-Specific Modeling Problems 188

5.8 Multilingual and Crosslingual Language Modeling 195

5.9 Summary 198

Chapter 6: Recognizing Textual Entailment 209

6.1 Introduction 209

6.2 The Recognizing Textual Entailment Task 210

6.3 A Framework for Recognizing Textual Entailment 219

6.4 Case Studies 238

6.5 Taking RTE Further 248

6.6 Useful Resources 252

6.7 Summary 253

Chapter 7: Multilingual Sentiment and Subjectivity Analysis 259

7.1 Introduction 259

7.2 Definitions 260

7.3 Sentiment and Subjectivity Analysis on English 262

7.4 Word- and Phrase-Level Annotations 264

7.5 Sentence-Level Annotations 270

7.6 Document-Level Annotations 272

7.7 What Works, What Doesn’t 274

7.8 Summary 277

Part II: In Practice 283

Chapter 8: Entity Detection and Tracking 285

8.1 Introduction 285

8.2 Mention Detection 287

8.3 Coreference Resolution 296

8.4 Summary 303

Chapter 9: Relations and Events 309

9.1 Introduction 309

9.2 Relations and Events 310

9.3 Types of Relations 311

9.4 Relation Extraction as Classification 312

9.5 Other Approaches to Relation Extraction 317

9.6 Events 320

9.7 Event Extraction Approaches 320

9.8 Moving Beyond the Sentence 323

9.9 Event Matching 323

9.10 Future Directions for Event Extraction 326

9.11 Summary 326

Chapter 10: Machine Translation 331

10.1 Machine Translation Today 331

10.2 Machine Translation Evaluation 332

10.3 Word Alignment 337

10.4 Phrase-Based Models 343

10.5 Tree-Based Models 350

10.6 Linguistic Challenges 354

10.7 Tools and Data Resources 356

10.8 Future Directions 358

10.9 Summary 359

Chapter 11: Multilingual Information Retrieval 365

11.1 Introduction 366

11.2 Document Preprocessing 366

11.3 Monolingual Information Retrieval 372

11.4 CLIR 378

11.5 MLIR 382

11.6 Evaluation in Information Retrieval 386

11.7 Tools, Software, and Resources 391

11.8 Summary 393

Chapter 12: Multilingual Automatic Summarization 397

12.1 Introduction 397

12.2 Approaches to Summarization 399

12.3 Evaluation 412

12.4 How to Build a Summarizer 420

12.5 Competitions and Datasets 424

12.6 Summary 426

Chapter 13: Question Answering 433

13.1 Introduction and History 433

13.2 Architectures 435

13.3 Source Acquisition and Preprocessing 437

13.4 Question Analysis 440

13.5 Search and Candidate Extraction 443

13.6 Answer Scoring 450

13.7 Crosslingual Question Answering 454

13.8 A Case Study 455

13.9 Evaluation 460

13.10 Current and Future Challenges 464

13.11 Summary and Further Reading 465

Chapter 14: Distillation 475

14.1 Introduction 475

14.2 An Example 476

14.3 Relevance and Redundancy 477

14.4 The Rosetta Consortium Distillation System 479

14.5 Other Distillation Approaches 488

14.6 Evaluation and Metrics 491

14.7 Summary 495

Chapter 15: Spoken Dialog Systems 499

15.1 Introduction 499

15.2 Spoken Dialog Systems 499

15.3 Forms of Dialog 509

15.4 Natural Language Call Routing 510

15.5 Three Generations of Dialog Applications 510

15.6 Continuous Improvement Cycle 512

15.7 Transcription and Annotation of Utterances 513

15.8 Localization of Spoken Dialog Systems 513

15.9 Summary 520

Chapter 16: Combining Natural Language Processing Engines 523

16.1 Introduction 523

16.2 Desired Attributes of Architectures for Aggregating Speech and NLP Engines 524

16.3 Architectures for Aggregation 527

16.4 Case Studies 531

16.5 Lessons Learned 540

16.6 Summary 542

16.7 Sample UIMA Code 542

Index 551

