The Art and Business of Speech Recognition: Creating the Noble Voice

Paperback (Print)

Overview

Most people have experienced an automated speech-recognition system when calling a company. Instead of prompting callers to choose an option by entering numbers, the system asks questions and understands spoken responses. With a more advanced application, callers may feel as if they're having a conversation with another person. Not only will the system respond intelligently, its voice even has personality.

The Art and Business of Speech Recognition examines both the rapid emergence and broad potential of speech-recognition applications. By explaining the nature, design, development, and use of such applications, this book addresses two particular needs:

  1. Business managers must understand the competitive advantage that speech-recognition applications provide: a more effective way to engage, serve, and retain customers over the phone.
  2. Application designers must know how to meet their most critical business goal: a satisfying customer experience.

Author Blade Kotelly illuminates these needs from the perspective of an experienced, business-focused practitioner. Among the diverse applications he's worked on, perhaps his most influential design is the flight-information system developed for United Airlines, about which Julie Vallone wrote in Investor's Business Daily: "By the end of the conversation, you might want to take the voice to dinner."

If dinner is the analogy, this concise book is an ideal first course. Managers will learn the potential of speech-recognition applications to reduce costs, increase customer satisfaction, enhance the company brand, and even grow revenues. Designers, especially those just beginning to work in the voice domain, will learn user-interface design principles and techniques needed to develop and deploy successful applications. The examples in the book are real, the writing is accessible and lucid, and the solutions presented are attainable today.

Product Details

  • ISBN-13: 9780321154927
  • Publisher: Addison-Wesley
  • Publication date: 1/22/2003
  • Edition description: New Edition
  • Pages: 208
  • Product dimensions: 7.32 (w) x 8.89 (h) x 0.55 (d)

Meet the Author

Blade Kotelly is the Creative Director of Interface Design for SpeechWorks International, a leading provider of automated speech-recognition software products and services. In addition to United Airlines, he has worked on applications for Apple Computer, E*TRADE, McKesson, Fidelity Investments IBG, FedEx, and others. A frequent conference speaker and university lecturer, Blade has had his work and ideas featured by The New York Times, The Washington Post, The Wall Street Journal, the BBC, and National Public Radio.

Table of Contents

Preface.

I. THE BACKGROUND.

1. On Telephones, Touchtones, and Business Needs.

Speech Recognition versus Touchtone Functionality.

Problems with Touchtone, and a Speech Recognition Remedy.

What Kinds of Companies Are Using Speech Recognition?

Why Are Companies Using It?

Speech-Recognition Applications: A Typical Example.

Where We've Been-Where We're Going.

2. Technology Primer: About Speech Recognizers.

What the Recognizer Hears (and the Need for Confirmation).

When the Recognizer Listens.

Why Designing a Speech-Recognition Application Is Challenging.

Where We've Been-Where We're Going.

3. The Psychology of How People Interact with Speech-Recognition Systems.

Social-Psychological Research.

Ask “Dr.” Blade.

Where We've Been-Where We're Going.

II. THE PROCESS OF DESIGNING SPEECH-RECOGNITION SYSTEMS.

4. Research.

Clients' Objectives.

Callers' Objectives and Needs.

Aspects of Research.

Assembling a Requirements Specification.

Anticipating Change.

Where We've Been-Where We're Going.

5. Developing the Design.

Conceptualizing and Brainstorming.

Congruence of Style.

Defining the Call Flow.

Vision Clips/Sample Calls.

The Design Specification-Conveying the Details of the Design.

Constructing a Design Specification.

Following Through on the Initial Design Phase.

Where We've Been-Where We're Going.

6. Writing Effective Prompts.

The Language of Asking Questions.

The Art of Writing Perfect Prompts.

Writing Prompts for Elegance, Speed, and Value.

Getting Callers to Focus on the Essentials.

Some Subtleties of Prompt Writing.

Top Five Good Tenets for Writing Prompts.

Top Five Mistakes When Writing Prompts.

Where We've Been-Where We're Going.

7. Production and Branding.

Notes About Implementation and Programming.

Production.

Prompt Creation-Text-to-Speech and Recorded Voices.

Casting.

Directing Voice Talents.

The Art of Recording Prompts.

Other Thoughts on Directing.

Concatenative Prompt Recording.

Some Metrics and Technical Notes.

Audio Icons.

Branding.

Where We've Been-Where We're Going.

8. Usability Testing.

The Value of Usability Testing.

How We Test an Application.

Objectives of Usability Testing.

Preparing for the Test.

The Test Subjects.

How to Get Test Subjects.

The Test Environment.

Types of Tests.

The Test Is Over-Now What?

Interpreting Test Results.

Where We've Been-Where We're Going.

9. Deployment.

The Importance of Multi-Phase Deployment.

The Three Phases of Deployment.

Pilot Deployment.

Partial Deployment.

Full Deployment.

Where We've Been-Where We're Going.

III. APPLIED KNOWLEDGE.

10. Case Studies.

United Airlines: Shortcuts for Frequent Fliers.

United Airlines: Providing Extra Help for Those Who Need It.

Continental Airlines: A Different Approach to Flight Information.

A Top-Five Investment Management Company: Handling Complex Two-Choice Questions.

An Online Brokerage Firm: Managing More Complex Tasks.

An Online Brokerage Firm: Preferences and Other Rarely Used Functions.

A Regional Telephone Company: Dealing with Legal Notices and Disclaimers.

Wildfire: List Navigation.

Wildfire: Small Header, Large Body Lists.

A Top-Five U.S. Bank: Large Header, Small Body Lists.

The “Race Condition”.

FedEx: Scaffolding Prompts.

Amtrak: Implicit Confirmation and the “Ellipses/and” Question Form.

AirTran: Reducing the Information Burden.

Guessing Right.

Semantics: When “Problem” Was a Problem.

Where We've Been-Where We Must Go.

Postscript.

Suggested Reading List.

Glossary.

Index.

Preface

Speech-recognition technology increasingly is being used in a variety of over-the-phone applications in the transportation, financial, telecommunications, and other industries. Most people by now have experienced at least one automated phone system, where questions are posed by a computer to callers seeking help or information, and the callers' spoken answers are understood and acted on by that computer. In the best cases, the computer even sounds human!

Given the rapid emergence of these applications, the time seems right for a concise overview of the topic—a guide for business managers looking for new and better ways to engage and service their customers, a resource for designers getting started with voice applications (or refreshing their current knowledge), a good read for anyone interested in this exciting technology. Having built applications in many of these industries, I feel well-positioned, and eager, to explain important aspects of the art and business of speech recognition, and to illuminate the extraordinary returns a well-designed and well-deployed over-the-phone system can yield—increased revenues, lower costs, customer satisfaction and retention, and brand development. Speech-recognition technology can give a company an identifiable and welcome voice—a noble voice, even—and in this book I mean to show how.

The key to voice-application success, as I demonstrate in a variety of examples, is a well-constructed user interface—interactions between the system and its users that are both pleasing and effective. The best voice interfaces avoid the confusion or annoyance of touchtone systems and the expense of operators or customer representatives. A good voice interface, as the examples will illustrate, can solve critical business problems.

This book certainly is not meant to be exhaustive in its coverage of speech-recognition technology, or to be academically rigorous in its style. Other fine books delve more deeply and exclusively into particular elements of design, such as brainstorming techniques and usability testing principles. Rather, this book is the product of one experienced, business-focused practitioner—me—talking about what works in this domain, and what does not. I speak from the real-world perspective of a designer who has had to fight with deadlines, work around technological limitations, satisfy and excite the system's users, all while meeting contractual obligations to the client and financial expectations. (That experience will sound familiar to all practitioners!) From this perspective, I convey insights and advice about user-interface design, production, testing, and deployment that I hope will help others plan and build their own successful speech-recognition applications.

Readers familiar with other software development processes will see some similarities between the design of over-the-phone speech-recognition systems and the design of systems for other media. This book, however, focuses on that which is unique to designing speech-recognition systems. It is important for client companies and designers alike to understand these unique elements in order to produce a system that works optimally for both the client company and its calling population.

To keep the book's focus on more fundamental design concepts, I have chosen not to discuss specific technologies, algorithms, or programming methodologies that may be popular at the moment, but that likely will change with each passing year, or perhaps even soon become obsolete. Thus, I do not cover VoiceXML, SSML, and certain proprietary software packages currently in use; although important and popular today, these changing standards are less central for understanding the basic principles.

There are two principal audiences for this book.

  1. People who want to be effective speech-recognition user-interface designers
  2. People who want to understand or profit from these systems

Prospective designers should find all chapters of the book useful. Other readers—particularly call center managers, programmers/implementers, and project managers—are likely to benefit most from the chapters that address the design and deployment process, and the ideas that drive the process (Chapters 5, 6, 7, 9, and 10).

After you've read this book, you will have a fundamental understanding of what goes into the design, production, testing, and deployment of over-the-phone speech-recognition applications. You'll have learned design guidelines, tips, and techniques that help ensure an application will work well and that people will enjoy using it. Inasmuch as examples are often the best way to learn, you'll have seen how other designers have dealt with real-life issues to solve real business problems. By addressing the main principles behind the creation of speech-recognition systems, I hope to have shown you the tight connection between the process of solving business problems using speech-recognition technology and the art of designing those systems.

Philosophy

No matter how immersed they become in the minutiae, designers of over-the-phone speech-recognition systems must never lose sight of one overarching goal: these systems are made to help people do what they have to do. Whether the application is mundane (home banking) or flashy (entertainment or infotainment, such as a voice portal), the goal is the same: to help people accomplish their tasks swiftly, easily, and unobtrusively.

Every designer should be guided by a philosophy. Having a philosophy gives a designer both a starting position and a compass to point the way—a compass to navigate through the plethora of decisions that must be made along the way. At the very least, having a philosophy enables a designer to answer the "Why?" question at each point in the design process. Why am I writing this question? Why am I using a particular word here in a particular context? Why will this design solve the problem better than another design?

When I was about four years old, my mom taught me that if I was confused at all about whether or not to do something, I could simply think about the Golden Rule: "Do unto others as you would have them do unto you." Her point was to make sure that I considered other people's perspectives before I did something that might affect them. I've applied this rule both consciously and subconsciously throughout my life in a variety of situations.

When I was a teenager, I became driven by the idea that I should design things—at that point I wasn't sure just what things—that improved people's lives. Eventually, this winding road led me to become a speech-recognition system designer. And when I first approached problems and thought about how to develop the best solutions, I wandered back to the Golden Rule as a way to solve design problems on a variety of levels. I've found that this rule is especially applicable in the design of speech-recognition systems.

When designing a system, designers need to put themselves mentally in the place of the people calling into the system. If we can understand what it's like to be those people in their varying moods (happy, angry, confused, rushed, or impatient), we can design a system to accommodate them. While empathizing is not always easy to do, we can often satisfy callers by understanding who they are and why they're calling, and then deciding on the most appropriate way to handle different callers and different situations. Some situations are easy to design for, while others present problems that require more work to solve. In both cases, knowledge of psychology will aid the designer more than knowledge of technology alone.

Customer Reviews

  • Anonymous

    Posted February 23, 2003

    Very lucid; don't be scared off by the subject

    Speech recognition as a commercial product is still very new. In 1988, when I was first involved with it, the state of the art did not involve real-time capability. You had to record the utterance and then analyse it with a computer. Typically, you also had to train the software with the speaker beforehand. Now, we have commercially available real-time, speaker-independent products. Some of the largest companies, like United Airlines and AT&T, have deployed these to try to reduce call centre costs and to improve the user's experience when dialling into such a place. Are you considering installing such software? Of course, you can talk to the vendors. But where can you get objective advice? One possibility is to ask researchers in the field. But they can easily and inadvertently drown you in jargon, especially if you do not have a technical background. This book attempts to fill that need. You do not need a degree in computer science or maths to understand it. The book does not explain how speech recognition works. Rather, the emphasis is at a higher level: using it in your workplace. The author gives many lucid examples of this. Basically, he outlines a commonsensical approach that can be understood by anybody. He explains how not to overburden the user with long utterances full of information, but to take advantage of the context of the conversation to omit unnecessary details. He emphasises thorough testing, with a disciplined scaling up to a real-life deployment in a call centre, something that may well have been omitted in other deployments, leading to users gnashing their teeth in frustration at an obtuse dialog, or at busy phone lines. He also discusses why companies should regard this as part of their corporate branding, and how to choose an appropriate "noble" voice as part of that branding. I think the "noble" sounds rather pompous, actually. But that's not his fault! It is a standard phrase in this field, and you too might get used to it.
