Statistical Methods for Speech Recognition / Edition 1

by Frederick Jelinek

ISBN-10: 0262100665

ISBN-13: 9780262100663

Pub. Date: 01/16/1998

Publisher: MIT Press

Overview

This book reflects decades of important research on the mathematical foundations of speech recognition. It focuses on underlying statistical techniques such as hidden Markov models, decision trees, the expectation-maximization algorithm, information theoretic goodness criteria, maximum entropy probability estimation, parameter and data clustering, and smoothing of probability distributions. The author's goal is to present these principles clearly in the simplest setting, to show the advantages of self-organization from real data, and to enable the reader to apply the techniques.

Product Details

ISBN-13: 9780262100663
Publisher: MIT Press
Publication date: 01/16/1998
Series: Language, Speech, and Communication
Edition description: New Edition
Pages: 305
Product dimensions: 6.20(w) x 9.00(h) x 1.00(d) inches
Age Range: 18 Years

Table of Contents

Preface
Chapter 1
The Speech Recognition Problem
1.1 Introduction
1.2 A Mathematical Formulation
1.3 Components of a Speech Recognizer
1.4 About this Book
1.5 Vector Quantization
1.6 Additional Reading
References
Chapter 2
Hidden Markov Models
2.1 About Markov Chains
2.2 The Hidden Markov Model Concept
2.3 The Trellis
2.4 Search for the Likeliest State Transition Sequence
2.5 Presence of Null Transitions
2.6 Dealing with an HMM That Has Null Transitions That Do Not Form a Loop
2.7 Estimation of Statistical Parameters of HMMs
2.8 Practical Need for Normalization
2.9 Alternative Definitions of HMMs
2.10 Additional Reading
References
Chapter 3
The Acoustic Model
3.1 Introduction
3.2 Phonetic Acoustic Models
3.3 More on Acoustic Model Training
3.4 The Effect of Context
3.5 Viterbi Alignment
3.6 Singleton Fenonic Base Forms
3.7 A Needed Generalization
3.8 Generation of Synthetic Base Forms
3.9 A Further Refinement
3.10 Singleton Base Forms for Words Outside the Vocabulary
3.11 Additional Reading
References
Chapter 4
Basic Language Modeling
4.1 Introduction
4.2 Equivalence Classification of History
4.3 The Trigram Language Model
4.4 Optimal Linear Smoothing
4.5 An Example of a Trigram Language Model
4.6 Practical Aspects of Deleted Interpolation
4.7 Backing Off
4.8 HMM Tagging
4.9 Use of Tag Equivalence Classification in a Language Model
4.10 Vocabulary Selection and Personalization from Text Databases
4.11 Additional Reading
References
Chapter 5
The Viterbi Search
5.1 Introduction
5.2 Finding the Most Likely Word Sequence
5.3 The Beam Search
5.4 Successive Language Model Refinement Search
5.5 Search versus Language Model State Spaces
5.6 N-Best Search
5.7 A Maximum Probability Lattice
5.8 Additional Reading
References
Chapter 6
Hypothesis Search on a Tree and the Fast Match
6.1 Introduction
6.2 Tree Search versus Trellis (Viterbi) Search
6.3 A* Search
6.4 Stack Algorithm for Speech Recognition
6.5 Modifications of the Tree Search
6.6 Multiple-Stack Search
6.7 Fast Match
6.8 The Cost of Search Shortcuts
6.9 Additional Reading
References
Chapter 7
Elements of Information Theory
7.1 Introduction
7.2 Functional Form of the Basic Information Measure
7.3 Some Mathematical Properties of Entropy
7.4 An Alternative Point of View and Notation
7.5 A Source-Coding Theorem
7.6 A Brief Digression
7.7 Mutual Information
7.8 Additional Reading
References
Chapter 8
The Complexity of Tasks: The Quality of Language Models
8.1 The Problem with Estimation of Recognition Task Complexity
8.2 The Shannon Game
8.3 Perplexity
8.4 The Conditional Entropy of the System
8.5 Additional Reading
References
Chapter 9
The Expectation-Maximization Algorithm and Its Consequences
9.1 Introduction
9.2 The EM Theorem
9.3 The Baum-Welch Algorithm
9.4 Real Vector Outputs of the Acoustic Processor
9.5 Constant and Tied Parameters
9.6 Tied Mixtures
9.7 Additional Reading
References
Chapter 10
Decision Trees and Tree Language Models
10.1 Introduction
10.2 Application of Decision Trees to Language Modeling
10.3 Decision Tree Example
10.4 What Questions?
10.5 The Entropy Goodness Criterion for the Selection of Questions, and a Stopping Rule
10.6 A Restricted Set of Questions
10.7 Selection of Questions by Chou's Method
10.8 Selection of the Initial Split of a Set [] into Complementary Subsets
10.9 The Twoing Theorem
10.10 Practical Considerations of Chou's Method
10.11 Construction of Decision Trees Based on Word Encoding
10.12 A Hierarchical Classification of Vocabulary Words
10.13 More on Decision Trees Based on Word Encoding
10.14 Final Remarks on the Decision Tree Method
10.15 Additional Reading
References
Chapter 11
Phonetics from Orthography: Spelling-to-Base Form Mappings
11.1 Overview of Base Form Generation from Spelling
11.2 Generating Alignment Data
11.3 Decision Tree Classification of Phonetic Environments
11.4 Finding the Base Forms
11.5 Additional Reading
References
Chapter 12
Triphones and Allophones
12.1 Introduction
12.2 Triphones
12.3 The General Method
12.4 Collecting Realizations of Particular Phones
12.5 A Direct Method
12.6 The Consequences
12.7 Back to Triphones
12.8 Additional Reading
References
Chapter 13
Maximum Entropy Probability Estimation and Language Models
13.1 Outline of the Maximum Entropy Approach
13.2 The Main Idea
13.3 The General Solution
13.4 The Practical Problem
13.5 An Example
13.6 A Trigram Language Model
13.7 Limiting Computation
13.8 Iterative Scaling
13.9 The Problem of Finding Appropriate Constraints
13.10 Weighting of Diverse Evidence: Voting
13.11 Limiting Data Fragmentation: Multiple Decision Trees
13.12 Remaining Unsolved Problems
13.13 Additional Reading
References
Chapter 14
Three Applications of Maximum Entropy Estimation to Language Modeling
14.1 About the Applications
14.2 Simple Language Model Adaptation to a New Domain
14.3 A More Complex Adaptation
14.4 A Dynamic Language Model: Triggers
14.5 The Cache Language Model
14.6 Additional Reading
References
Chapter 15
Estimation of Probabilities from Counts and the Back-Off Method
15.1 Inadequacy of Relative Frequency Estimates
15.2 Estimation of Probabilities from Counts Using Held-Out Data
15.3 Universality of the Held-Out Estimate
15.4 The Good-Turing Estimates
15.5 Applicability of the Held-Out and Good-Turing Estimates
15.6 Enhancing Estimation Methods
15.7 The Back-Off Language Model
15.8 Additional Reading
References
Name Index
Subject Index
