Statistical Methods for Speech Recognition

Paperback

$65.00

Overview

This book reflects decades of important research on the mathematical foundations of speech recognition. It focuses on underlying statistical techniques such as hidden Markov models, decision trees, the expectation-maximization algorithm, information theoretic goodness criteria, maximum entropy probability estimation, parameter and data clustering, and smoothing of probability distributions. The author's goal is to present these principles clearly in the simplest setting, to show the advantages of self-organization from real data, and to enable the reader to apply the techniques.

Bradford Books imprint

Product Details

ISBN-13:	9780262546607
Publisher:	MIT Press
Publication date:	11/01/2022
Series:	Language, Speech, and Communication
Pages:	306
Product dimensions:	6.00(w) x 9.00(h) x (d)

About the Author

Frederick Jelinek is Julian Sinclair Smith Professor in the Department of Electrical and Computer Engineering at Johns Hopkins University, where he is also Director for the Center for Language and Speech Processing.

Table of Contents

Preface
Chapter 1
The Speech Recognition Problem
1.1 Introduction
1.2 A Mathematical Formulation
1.3 Components of a Speech Recognizer
1.4 About this Book
1.5 Vector Quantization
1.6 Additional Reading
References
Chapter 2
Hidden Markov Models
2.1 About Markov Chains
2.2 The Hidden Markov Model Concept
2.3 The Trellis
2.4 Search for the Likeliest State Transition Sequence
2.5 Presence of Null Transitions
2.6 Dealing with an HMM That Has Null Transitions That Do Not Form a Loop
2.7 Estimation of Statistical Parameters of HMMs
2.8 Practical Need for Normalization
2.9 Alternative Definitions of HMMs
2.10 Additional Reading
References
Chapter 3
The Acoustic Model
3.1 Introduction
3.2 Phonetic Acoustic Models
3.3 More on Acoustic Model Training
3.4 The Effect of Context
3.5 Viterbi Alignment
3.6 Singleton Fenonic Base Forms
3.7 A Needed Generalization
3.8 Generation of Synthetic Base Forms
3.9 A Further Refinement
3.10 Singleton Base Forms for Words Outside the Vocabulary
3.11 Additional Reading
References
Chapter 4
Basic Language Modeling
4.1 Introduction
4.2 Equivalence Classification of History
4.3 The Trigram Language Model
4.4 Optimal Linear Smoothing
4.5 An Example of a Trigram Language Model
4.6 Practical Aspects of Deleted Interpolation
4.7 Backing-Off
4.8 HMM Tagging
4.9 Use of Tag Equivalence Classification in a Language Model
4.10 Vocabulary Selection and Personalization from Text Database
4.11 Additional Reading
References
Chapter 5
The Viterbi Search
5.1 Introduction
5.2 Finding the Most Likely Word Sequence
5.3 The Beam Search
5.4 Successive Language Model Refinement Search
5.5 Search versus Language Model State Spaces
5.6 N-Best Search
5.7 A Maximum Probability Lattice
5.8 Additional Reading
References
Chapter 6
Hypothesis Search on a Tree and the Fast Match
6.1 Introduction
6.2 Tree Search versus Trellis (Viterbi) Search
6.3 A* Search
6.4 Stack Algorithm for Speech Recognition
6.5 Modifications of the Tree Search
6.6 Multiple-Stack Search
6.7 Fast Match
6.8 The Cost of Search Shortcuts
6.9 Additional Reading
References
Chapter 7
Elements of Information Theory
7.1 Introduction
7.2 Functional Form of the Basic Information Measure
7.3 Some Mathematical Properties of Entropy
7.4 An Alternative Point of View and Notation
7.5 A Source-Coding Theorem
7.6 A Brief Digression
7.7 Mutual Information
7.8 Additional Reading
References
Chapter 8
The Complexity of Tasks: The Quality of Language Models
8.1 The Problem with Estimation of Recognition Task Complexity
8.2 The Shannon Game
8.3 Perplexity
8.4 The Conditional Entropy of the System
8.5 Additional Reading
References
Chapter 9
The Expectation-Maximization Algorithm and Its Consequences
9.1 Introduction
9.2 The EM Theorem
9.3 The Baum-Welch Algorithm
9.4 Real Vector Outputs of the Acoustic Processor
9.5 Constant and Tied Parameters
9.6 Tied Mixtures
9.7 Additional Reading
References
Chapter 10
Decision Trees and Tree Language Models
10.1 Introduction
10.2 Application of Decision Trees to Language Modeling
10.3 Decision Tree Example
10.4 What Questions?
10.5 The Entropy Goodness Criterion for the Selection of Questions, and a Stopping Rule
10.6 A Restricted Set of Questions
10.7 Selection of Questions by Chou's Method
10.8 Selection of the Initial Split of a Set [] into Complementary Subsets
10.9 The Twoing Theorem
10.10 Practical Considerations of Chou's Method
10.11 Construction of Decision Trees Based on Word Encoding
10.12 A Hierarchical Classification of Vocabulary Words
10.13 More on Decision Trees Based on Word Encoding
10.14 Final Remarks on the Decision Tree Method
10.15 Additional Reading
References
Chapter 11
Phonetics from Orthography: Spelling-to-Base Form Mappings
11.1 Overview of Base Form Generation from Spelling
11.2 Generating Alignment Data
11.3 Decision Tree Classification of Phonetic Environments
11.4 Finding the Base Forms
11.5 Additional Reading
References
Chapter 12
Triphones and Allophones
12.1 Introduction
12.2 Triphones
12.3 The General Method
12.4 Collecting Realizations of Particular Phones
12.5 A Direct Method
12.6 The Consequences
12.7 Back to Triphones
12.8 Additional Reading
References
Chapter 13
Maximum Entropy Probability Estimation and Language Models
13.1 Outline of the Maximum Entropy Approach
13.2 The Main Idea
13.3 The General Solution
13.4 The Practical Problem
13.5 An Example
13.6 A Trigram Language Model
13.7 Limiting Computation
13.8 Iterative Scaling
13.9 The Problem of Finding Appropriate Constraints
13.10 Weighting of Diverse Evidence: Voting
13.11 Limiting Data Fragmentation: Multiple Decision Trees
13.12 Remaining Unsolved Problems
13.13 Additional Reading
References
Chapter 14
Three Applications of Maximum Entropy Estimation to Language Modeling
14.1 About the Applications
14.2 Simple Language Model Adaption to a New Domain
14.3 A More Complex Adaption
14.4 A Dynamic Language Model: Triggers
14.5 The Cache Language Model
14.6 Additional Reading
References
Chapter 15
Estimation of Probabilities from Counts and the Back-Off Method
15.1 Inadequacy of Relative Frequency Estimates
15.2 Estimation of Probabilities from Counts Using Held-Out Data
15.3 Universality of the Held-Out Estimate
15.4 The Good-Turing Estimates
15.5 Applicability of the Held-Out and Good-Turing Estimates
15.6 Enhancing Estimation Methods
15.7 The Back-Off Language Model
15.8 Additional Reading
References
Name Index
Subject Index

What People are Saying About This

Professor Hermann Ney, Computer Science Department, RWTH Aachen, University of Technology

This book fills a long-existing gap in the scientific literature on automatic speech recognition. During the past three decades, statistical methods have had the strongest impact on the whole area of automatic speech recognition, in particular for large-vocabulary systems. This is without doubt the first book giving both a comprehensive overview and an in-depth description of these methods. The author is one of the pioneers who has been active in this field for more than 25 years.

Victor Zue, MIT Laboratory for Computer Science

For the first time, researchers in this field will have a book that will serve as the bible for many aspects of language and speech processing. Frankly, I can't imagine a person working in this field not wanting to have a personal copy.

Steve Young, Professor of Information Engineering, Engineering Department, Cambridge University, England

Frederick Jelinek is one of the few true pioneers of modern speech recognition technology. This book will be an essential reference book for all students and engineers working in the speech recognition area. More than that, it will also serve as a testament to Frederick Jelinek's own achievements in the field which span more than 25 years and which include so much that is core to modern-day speech recognition technology.
