Statistical Machine Translation
This introductory text on statistical machine translation (SMT) covers the theories and methods needed to build a statistical machine translator such as Google Language Tools or Babelfish. In general, statistical techniques allow automatic translation systems to be built quickly for any language pair using only translated texts and generic software. With increasing globalization, statistical machine translation will be central to communication and commerce. Based on courses and tutorials, and classroom-tested globally, the book is ideal for instruction or self-study, for advanced undergraduates and graduate students in computer science and/or computational linguistics, and for researchers in natural language processing. The companion website provides open-source corpora and toolkits.
by Philipp Koehn

Hardcover (New Edition)

Product Details

ISBN-13: 9780521874151
Publisher: Cambridge University Press
Publication date: 12/17/2009
Edition description: New Edition
Pages: 446
Product dimensions: 7.00(w) x 9.80(h) x 1.00(d)

About the Author

Philipp Koehn is a lecturer in the School of Informatics at the University of Edinburgh. He is the scientific co-ordinator of the European EuroMatrix project and is also involved in research funded by DARPA in the USA. He has collaborated with leading companies in the field, such as Systran and Asia Online. He implemented the widely used decoder Pharaoh and leads the development of the open-source machine translation toolkit Moses.

Table of Contents

Preface

I Foundations

1 Introduction
1.1 Overview
1.2 History of Machine Translation
1.3 Applications
1.4 Available Resources
1.5 Summary

2 Words, Sentences, Corpora
2.1 Words
2.2 Sentences
2.3 Corpora
2.4 Summary

3 Probability Theory
3.1 Estimating Probability Distributions
3.2 Calculating Probability Distributions
3.3 Properties of Probability Distributions
3.4 Summary

II Core Methods

4 Word-Based Models
4.1 Machine Translation by Translating Words
4.2 Learning Lexical Translation Models
4.3 Ensuring Fluent Output
4.4 Higher IBM Models
4.5 Word Alignment
4.6 Summary

5 Phrase-Based Models
5.1 Standard Model
5.2 Learning a Phrase Translation Table
5.3 Extensions to the Translation Model
5.4 Extensions to the Reordering Model
5.5 EM Training of Phrase-Based Models
5.6 Summary

6 Decoding
6.1 Translation Process
6.2 Beam Search
6.3 Future Cost Estimation
6.4 Other Decoding Algorithms
6.5 Summary

7 Language Models
7.1 N-Gram Language Models
7.2 Count Smoothing
7.3 Interpolation and Back-off
7.4 Managing the Size of the Model
7.5 Summary

8 Evaluation
8.1 Manual Evaluation
8.2 Automatic Evaluation
8.3 Hypothesis Testing
8.4 Task-Oriented Evaluation
8.5 Summary

III Advanced Topics

9 Discriminative Training
9.1 Finding Candidate Translations
9.2 Principles of Discriminative Methods
9.3 Parameter Tuning
9.4 Large-Scale Discriminative Training
9.5 Posterior Methods and System Combination
9.6 Summary

10 Integrating Linguistic Information
10.1 Transliteration
10.2 Morphology
10.3 Syntactic Restructuring
10.4 Syntactic Features
10.5 Factored Translation Models
10.6 Summary

11 Tree-Based Models
11.1 Synchronous Grammars
11.2 Learning Synchronous Grammars
11.3 Decoding by Parsing
11.4 Summary

Bibliography

Author Index

Index
