Human Factors and Voice Interactive Systems

by Daryle Gardner-Bonneau

ISBN-10: 0792384679

ISBN-13: 9780792384670

Pub. Date: 01/28/1999

Publisher: Springer-Verlag New York, LLC

Human Factors and Voice Interactive Systems, Second Edition provides in-depth information on current topics of major interest to speech application developers, and updates material from chapters that appeared in the previous edition.

The first nine chapters of the book cover issues related to interactive voice response systems, including both mobile and multimodal device user interfaces as well as classic automated telephone systems. The remaining chapters cover special topics, including synthetic speech and the design of speech applications to enhance accessibility for people with disabilities and for the ever-growing population of older adults.

Human Factors and Voice Interactive Systems, Second Edition is a collection of applied research and scholarly synthesis contributions by seasoned professionals in the field that highlight continuing efforts to study human interaction with speech technologies.

Product Details

Publisher: Springer-Verlag New York, LLC
Publication date: 01/28/1999
Series: International Series in Engineering And
Product dimensions: 6.38(w) x 9.54(h) x 0.95(d)

Table of Contents

IVR Usability Engineering Using Guidelines and Analyses of End-To-End Calls   Bernhard Suhm     1
IVR Design Principles and Guidelines     2
A Taxonomy of Limitations of Speech User Interfaces     3
Limitations of Speech Recognition     4
Limitations of Spoken Language     7
Human Cognition     9
Towards Best Practices for IVR Design     10
A Database for Speech User Interface Design Knowledge     10
Compiling Guidelines for IVR Design     11
Applying IVR Design Guidelines in Practice     13
Best Practices for IVR Design?     18
Data-Driven IVR Usability Engineering Based on End-To-End Calls     19
The Flaws of Standard IVR Reports     20
Capturing End-to-End Data from Calls     20
Evaluating IVR Usability based on End-to-End Calls     23
Call-reason Distribution     23
Diagnosing IVR Usability using Caller-Path Diagrams     24
IVR Usability Analysis using Call-Reason Distribution and Caller-Path Diagrams     27
Evaluating IVR Cost-effectiveness     29
Defining Total IVR Benefit     30
Measuring Total IVR Benefit     31
Estimating Improvement Potential     34
Building the Business Case for IVR Redesign     35
Summary and Conclusions     38
Acknowledgements     39
References     39
User Interface Design for Natural Language Systems: From Research to Reality   Susan J. Boyce     43
Introduction     43
What is Natural Language?     43
Natural Language for Call Routing     44
Natural Language for Form Filling     45
The Pros and Cons of Natural Language Interfaces     45
What Are the Steps to Building a Natural Language Application?     46
Data Collection     46
Annotation Guide Development     47
Call Flow Development and Annotation     48
Application Code and Grammar/NL Development     49
Testing NL Applications     49
Post-Deployment Tuning     49
When Does it Make Sense to Use Natural Language?     50
Distribution of Calls     50
Characteristics of the Caller Population     51
Evidence Obtained from Data with Existing Application     53
Ease of Getting to an Agent     53
Live Caller Environment Versus IVR: What is Being Replaced?     53
The Call Routing Task     54
Design Process     54
Analysis of Human-to-Human Dialogues     55
Anthropomorphism and User Expectations     55
Anthropomorphism Experiment     56
Issues for Natural Dialogue Design     60
Initial Greeting     60
Confirmations     60
Disambiguating an Utterance     61
Reprompts     61
Turn-taking     62
When to Bail Out     62
Establishing User Expectations in the Initial Greeting     62
Initial Greeting Experiment     63
Identifying Recognition Errors Through Confirmations     66
Confirming Digit Strings in Spoken Dialogue Systems     67
Confirmation of Topic in a Spoken Natural Dialogue System     69
Repairing Recognition Errors With Reprompts     72
Reprompt Experiment     73
Turn-Taking in Human-Machine Dialogues     76
Caller Tolerance of System Delay     77
Summary     79
References     79
Linguistics and Psycholinguistics in IVR Design   Osamuyimen T. Stewart   Harry E. Blanchard     81
Introduction     82
Speech Sounds     82
Grammar     83
Words     84
Sentences     84
Meaning     85
ASR Grammars and Language Understanding     86
Morphology     87
Syntax     88
Semantics     93
Synonyms     93
Polysemy     94
Putting it All Together     94
ASR Grammars     95
Natural Language Understanding Models     97
The Semantic Taxonomy     98
Establishing Predicates     100
Dialog Design     102
Putting it All Together     105
Scenario 1     106
Scenario 2     107
Consequences of Structural Simplification     108
Semantic Specificity     111
Syntactic Specificity     112
Conclusion     113
References     113
Designing the Voice User Interface for Automated Directory Assistance   Amir M. Mane   Esther Levin     117
The Business of DA     117
The Introduction of Automation     118
Early Attempts to Use Speech Recognition     119
Issues in the Design of VUI for DA     121
Addressing Database Inadequacies     122
The Solution: Automated Data Cleaning     123
Pronunciation of Names     123
The First Question     124
Finding the Locality     124
Confirming the Locality     125
Determining the Listing Type     126
Handling Business Requests     127
Issues in Grammar Design for Business Listing Automation     127
Business Listings Disambiguation     130
Handling Residential Listings     131
General Dialogue Design Issues     133
Final Thoughts     134
References     134
Spoken Language Interfaces for Embedded Applications   Dragos Burileanu     135
Introduction     135
Spoken Language Interfaces Development     137
Overview. Current Trends     137
Embedded Speech Applications     139
Embedded Speech Technologies     141
Technical Constraints and Implementation Methods     141
Embedded Speech Recognition     143
Embedded Speech Synthesis     149
A Case Study: An Embedded TTS System Implementation     153
A Simplified TTS System Architecture     153
Implementation Issues     155
The Future of Embedded Speech Interfaces     158
References     160
Speech Generation in Mobile Phones   Geza Nemeth   Geza Kiss   Csaba Zainko   Gabor Olaszy   Balint Toth     163
Introduction     163
Speaking Telephone? What is it Good for?     165
Speech Generation Technologies in Mobile Phones     166
Synthesis Technologies     167
Limited Vocabulary Concatenation     167
Unlimited Text Reading - Text-To-Speech     168
Topic-Related Text Preprocessing     170
Exceptions Vocabulary     171
Complex Text Transformation     171
Language Identification     174
How to Port Speech Synthesis on a Phone Platform     178
Limitations and Possibilities Offered by Phone Resources     181
Implementations     183
The Mobile Phone as a Speaking Aid     183
An SMS-Reading Mobile Phone Application     186
Acknowledgements     190
References     190
Voice Messaging User Interface   Harry E. Blanchard   Steven H. Lewis     193
Introduction     193
The Touch-Tone Voice Mail User Interface     196
Common Elements of Touch-tone Transactions     197
Prompts     197
Interruptibility     198
Time-outs and Reprompts     199
Feedback     200
Feedback to Errors     200
Menu Length     200
Mapping of Keys to Options     201
Global Commands     201
Use of the "#" and "*" Keys     202
Unprompted Options     202
Voice and Personality     203
Call Answering     203
Call Answering Greetings     206
The Subscriber Interface     206
Retrieving and Manipulating Messages     206
Sending Messages     209
Voice Messaging User Interface Standards     211
Alternative Approaches to Traditional Touch-tone Design     214
Automatic Speech Recognition and Voice Mail     215
Unified Messaging and Multimedia Mail     219
Fax Messaging     220
Viewing Voice Mail     221
Listening to E-mail     223
Putting it All Together     224
Mixed Media     225
References     226
Silence Locations and Durations in Dialog Management   Matthew Yuschik     231
Introduction     231
Prompts and Responses in Dialog Management     233
Dialog Management     233
Word Selection     234
Word Lists     234
Turn-Taking Cues     236
Time as an Independent Variable - Dialog Model     236
Definition of Terms     237
Examples of Usage     238
User Behavior     238
Transactional Analysis     238
Verbal Communication     239
Directed Dialogs     239
Measurements     240
Barge-In     241
Usability Testing and Results     242
Test Results - United States (early prototype)     244
Test Results - United States (tuned, early prototype)     245
Test Results - United Kingdom     246
Test Results - Italy     247
Test Results - Denmark     249
Observations and Interpretations     250
Lateral Results     250
Learning - Longitudinal Results     251
Conclusions     252
Acknowledgement     252
References     252
Using Natural Dialogs as the Basis for Speech Interface Design   Nicole Yankelovich     255
Introduction     256
Motivation     256
Natural Dialog Studies     257
Natural Dialog Case Studies     258
Study #1: SpeechActs Calendar (speech-only, telephone-based)     259
Purpose of Application     259
Study Design     260
Software Design     262
Lessons Learned     264
Study #2: Office Monitor (speech-only, microphone-based)     264
Purpose of Application     264
Study Design     265
Software Design     267
Lessons Learned     269
Study #3: Automated Customer Service Representative (speech input, speech/graphical output, telephone-based)     269
Purpose of Application     269
Study Design     269
Software Design     275
Lessons Learned     278
Study #4: Multimodal Drawing (speech/mouse/keyboard input, speech/graphical output, microphone-based)     278
Purpose of Application     278
Study Design     279
Software Design     283
Lessons Learned     286
Discussion     286
Refining Application Requirements and Functionality     286
Collecting Appropriate Vocabulary     287
Determining Commonly Used Grammatical Constructs     287
Discovering Effective Interaction Patterns     287
Helping with Prompt and Feedback Design     288
Getting a Feeling for the Tone of the Conversations     288
Conclusion     289
Acknowledgements     289
References     290
Telematics: Artificial Passenger and Beyond   Dimitri Kanevsky     291
Introduction     291
A Brief Overview of IBM Voice Technologies     292
Conversational Interactivity for Telematics     293
System Architecture     295
Embedded Speech Recognition     297
Distributed Speech Recognition     299
Evaluating/Predicting the Consequences of Misrecognitions     300
Improving Voice and State Recognition Performance - Network Data Collection, Learning by Example, Adaptation of Language and Acoustic Models for Similar Users     303
Artificial Passenger     308
User Modeling Aspects     315
User Model     316
The Adaptive Modeling Process     317
The Control Process     318
Discussion about Time-Lagged Observables and Indicators in a History     319
Gesture-Based Command Interface     320
Summary     322
Acknowledgements     323
References     323
A Language to Write Letter-To-Sound Rules for English and French   Michel Divay     327
Introduction     327
The Historic Evolution of English and French     329
The Complexity of the Conversion for English and French     329
Rule Formalism     334
Examples of Rules for English     340
Examples of Rules for French     345
Conclusions     353
References     354
Appendices for French     356
Appendices for English     359
Virtual Sentences of Spontaneous Speech: Boundary Effects of Syntactic-Semantic-Prosodic Properties   Maria Gosy   Magdolna Kovacs     361
Introduction     361
Method and Material     364
Subjects     364
Speech Material     364
Procedure     365
Results     366
Identification of Virtual Sentences in the Normal and Filtered Speech Samples     366
Pauses of the Speech Sample     368
Pause Perception     370
F0 Patterns     372
Comprehension of the Spontaneous Speech Sample     374
The Factor of Gender     375
Conclusions     375
Acknowledgements     377
References     377
Text-to-Speech Formant Synthesis For French   Michel Divay   Ed Bruckert     381
Introduction     381
Grapheme-to-Phoneme Conversion     382
Normalization: From Grapheme to Grapheme     382
From Grapheme to Phoneme     384
Exception Dictionary     385
Prosody     385
Parsing the Text     385
Intonation     386
Phoneme Duration     391
Acoustics for French Consonants and Vowels     398
Vowels     398
Fricatives (unvoiced:F,S,Ch; voiced: V,Z,J)     400
Plosives (unvoiced:P,T,K; voiced: B,D,G)     401
Nasals (M, N, Gn, Ng)     403
Liquids (L, R)     404
Semivowels (Y, W, Wu)     405
Phoneme Transitions (coarticulation effects)     405
Frame Generation     409
Conclusions for Acoustics     409
From Acoustics to Speech Signal     410
Next Generation Formant Synthesis     412
Singing     414
Conclusions     414
References     415
Accessibility and Speech Technology: Advancing Toward Universal Access   John C. Thomas   Sara Basson   Daryle Gardner-Bonneau     417
Universal Access vs. Assistive Technology     417
Predicted Enhancements and Improvements to Underlying Technology     419
Social Network Analysis, Blogs, Wikis, and Social Computing     420
Intelligent Agents     421
Learning Objects     422
Cognitive Aids     423
Interface Flexibility and Intelligence     423
Current Assistive Technology Applications Employing Speech Technology     423
Applications Employing Automatic Speech Recognition (ASR)     424
Applications of Synthetic Speech     428
Human-Computer Interaction: Design and Evaluation     430
The Role of Technical Standards in Accessibility     433
Standards Related to Software and Information Technology User Interfaces     434
Speech Application Accessibility Standards     434
Accessibility Data and Accessibility Guidance for General Products     437
Conclusions     439
References     440
Synthesized Speech Used for the Evaluation of Children's Hearing and Speech Perception   Maria Gosy     443
Introduction     443
The Background Theory     444
The Production of the Synthesized Word Material     447
Pre-Experiments for the Application of Synthesized Words for Hearing Screening     449
Results     450
Clinical Tests     450
Screening Procedure     453
Evaluation of Acoustic-phonetic Perception     456
Children with Specific Needs     457
Conclusions     458
Acknowledgements     459
References     459
Index     461

