Speech Processing for IP Networks: Media Resource Control Protocol (MRCP) / Edition 1

Hardcover (Print)

Overview

Media Resource Control Protocol (MRCP) is a new IETF protocol and a key enabling technology: it eases the integration of speech technologies into network equipment and accelerates their adoption, allowing compelling interactive services to be delivered over the telephone. MRCP leverages IP telephony and Web technologies such as SIP, HTTP, and XML (Extensible Markup Language) to deliver an open-standard, vendor-independent, and versatile interface to speech engines.

Speech Processing for IP Networks brings these technologies together into a single volume, giving the reader a solid technical understanding of the principles of MRCP, how it leverages other protocols and specifications for its operation, and how it is applied in modern IP-based telecommunication networks. The book focuses on the MRCPv2 standard developed by the IETF SpeechSC Working Group and also provides an overview of its precursor, MRCPv1.

Speech Processing for IP Networks:

  • Gives a complete background on the technologies required by MRCP to function, including SIP (Session Initiation Protocol), RTP (Real-time Transport Protocol), and HTTP (Hypertext Transfer Protocol).
  • Covers relevant W3C data representation formats including Speech Synthesis Markup Language (SSML), Speech Recognition Grammar Specification (SRGS), Semantic Interpretation for Speech Recognition (SISR), and Pronunciation Lexicon Specification (PLS).
  • Describes VoiceXML, the leading approach for programming cutting-edge speech applications and a key driver of the development of many of MRCP’s features.
  • Explains advanced topics such as VoiceXML and MRCP interworking.

This text will be an invaluable resource for technical managers, product managers, software developers, and technical marketing professionals working for network equipment manufacturers, speech engine vendors, and network operators. Advanced students on computer science and engineering courses will also find this to be a useful guide.

Product Details

  • ISBN-13: 9780470028346
  • Publisher: Wiley
  • Publication date: 5/11/2007
  • Edition number: 1
  • Pages: 368
  • Product dimensions: 6.91 in. (w) x 9.88 in. (h) x 1.04 in. (d)

Meet the Author

David Burke is Chief Technology Officer and co-founder of Voxpilot Ltd, UK.  David led Voxpilot to its current position as a leader in VoiceXML interactive services platform technology. His management duties at Voxpilot include executive management and counsel, product vision, direction and management, responsibility for all R&D activities including budgeting, engineering team selection and mentoring, and architecture and design.

He is also a member of the World Wide Web Consortium (W3C) Voice Browser Working Group and of the Internet Engineering Task Force (IETF) SpeechSC Working Group.

Table of Contents

PART I. BACKGROUND.

1. Introduction.

1.1 Introduction to Speech Applications.

1.2 The MRCP Value Proposition.

1.3 History of MRCP Standardisation.

1.3.1 Internet Engineering Task Force.

1.3.2 World Wide Web Consortium.

1.3.3 MRCP: From Humble Beginnings Toward IETF Standard.

1.4 Summary.

2. Basic Principles of Speech Processing.

2.1 Human Speech Production.

2.1.1 Speech Sounds: Phonemics and Phonetics.

2.2 Speech Recognition.

2.2.1 Endpoint Detection.

2.2.2 Mel-Cepstrum.

2.2.3 Hidden Markov Models.

2.2.4 Language Modelling.

2.3 Speaker Verification and Identification.

2.3.1 Feature Extraction.

2.3.2 Statistical Modelling.

2.4 Speech Synthesis.

2.4.1 Front-end Processing.

2.4.2 Back-end Synthesis.

2.5 Summary.

3. Overview of MRCP.

3.1 Architecture.

3.2 Media Resource Types.

3.3 Network Scenarios.

3.3.1 VoiceXML IVR Service Node.

3.3.2 IP PBX with Voicemail.

3.3.3 Advanced Media Gateway.

3.4 Protocol Operation.

3.4.1 Establishing Communication Channels.

3.4.2 Controlling a Media Resource.

3.4.3 Walkthrough Examples.

3.5 Security.

3.6 Summary.

PART II. MEDIA AND CONTROL SESSIONS.

4. Session Initiation Protocol.

4.1 Introduction.

4.2 Walkthrough Example.

4.3 SIP URIs.

4.4 Transport.

4.5 Media Negotiation.

4.5.1 Session Description Protocol.

4.5.2 Offer/Answer Model.

4.6 SIP Servers.

4.6.1 Registrars.

4.6.2 Proxy Servers.

4.6.3 Redirect Servers.

4.7 SIP Extensions.

4.7.1 Capability Discovery.

4.8 Security.

4.8.1 Transport and Network Layer Security.

4.8.2 Authentication.

4.8.3 S/MIME.

4.9 Summary.

5. Session Initiation in MRCP.

5.1 Introduction.

5.2 Initiating the Media Session.

5.3 Initiating the Control Session.

5.4 Session Initiation Examples.

5.4.1 Single Media Resource.

5.4.2 Adding and Removing Media Resources.

5.4.3 Distributed Media Source/Sink.

5.5 Locating Media Resource Servers.

5.5.1 Requesting Server Capabilities.

5.5.2 Media Resource Brokers.

5.6 Security.

5.7 Summary.

6. The Media Session.

6.1 Media Encoding.

6.1.1 Pulse Code Modulation (PCM).

6.1.2 Linear Predictive Coding (LPC).

6.2 Media Transport.

6.2.1 Real-Time Protocol (RTP).

6.2.2 DTMF.

6.3 Security.

6.4 Summary.

7. The Control Session.

7.1 Message Structure.

7.1.1 Request Message.

7.1.2 Response Message.

7.1.3 Event Message.

7.1.4 Message Bodies.

7.2 Generic Methods.

7.3 Generic Headers.

7.4 Security.

7.5 Summary.

PART III. DATA REPRESENTATION FORMATS.

8. Speech Synthesis Markup Language (SSML).

8.1 Introduction.

8.2 Document Structure.

8.3 Recorded Audio.

8.4 Pronunciation.

8.4.1 Phonemic/Phonetic Content.

8.4.2 Substitution.

8.4.3 Interpreting Text.

8.5 Prosody.

8.5.1 Prosodic Boundaries.

8.5.2 Emphasis.

8.5.3 Speaking Voice.

8.5.4 Prosodic Control.

8.6 Markers.

8.7 Metadata.

8.8 Summary.

9. Speech Recognition Grammar Specification (SRGS).

9.1 Introduction.

9.2 Document Structure.

9.3 Rules, Tokens, and Sequences.

9.4 Alternatives.

9.5 Rule References.

9.5.1 Special Rules.

9.6 Repeats.

9.7 DTMF Grammars.

9.8 Semantic Interpretation.

9.8.1 Semantic Literals.

9.8.2 Semantic Scripts.

9.9 Summary.

10. Natural Language Semantics Markup Language (NLSML).

10.1 Introduction.

10.2 Document Structure.

10.3 Speech Recognition Results.

10.3.1 Serialising Semantic Interpretation Results.

10.4 Voice Enrollment Results.

10.5 Speaker Verification Results.

10.6 Summary.

11. Pronunciation Lexicon Specification (PLS).

11.1 Introduction.

11.2 Document Structure.

11.3 Lexical Entries.

11.4 Abbreviations and Acronyms.

11.5 Multiple Orthographies.

11.6 Multiple Pronunciations.

11.7 Summary.

PART IV. MEDIA RESOURCES.

12. Speech Synthesiser Resource.

12.1 Overview.

12.2 Methods.

12.2.1 SPEAK.

12.2.2 PAUSE.

12.2.3 RESUME.

12.2.4 STOP.

12.2.5 BARGE-IN-OCCURRED.

12.2.6 CONTROL.

12.2.7 DEFINE-LEXICON.

12.3 Events.

12.3.1 SPEECH-MARKER.

12.3.2 SPEAK-COMPLETE.

12.4 Headers.

12.5 Summary.

13. Speech Recogniser Resource.

13.1 Overview.

13.2 Recognition Methods.

13.2.1 RECOGNIZE.

13.2.2 DEFINE-GRAMMAR.

13.2.3 START-INPUT-TIMERS.

13.2.4 GET-RESULT.

13.2.5 STOP.

13.2.6 INTERPRET.

13.3 Enrollment Methods.

13.3.1 START-PHRASE-ENROLLMENT.

13.3.2 ENROLLMENT-ROLLBACK.

13.3.3 END-PHRASE-ENROLLMENT.

13.3.4 MODIFY-PHRASE.

13.3.5 DELETE-PHRASE.

13.4 Events.

13.4.1 START-OF-INPUT.

13.4.2 RECOGNITION-COMPLETE.

13.4.3 INTERPRETATION-COMPLETE.

13.5 Recognition Headers.

13.6 Enrollment Headers.

13.7 Summary.

14. Recorder Resource.

14.1 Overview.

14.2 Methods.

14.2.1 RECORD.

14.2.2 START-INPUT-TIMERS.

14.2.3 STOP.

14.3 Events.

14.3.1 START-OF-INPUT.

14.3.2 RECORD-COMPLETE.

14.4 Headers.

14.5 Summary.

15. Speaker Verification Resource.

15.1 Overview.

15.2 Methods.

15.2.1 START-SESSION.

15.2.2 END-SESSION.

15.2.3 VERIFY.

15.2.4 VERIFY-FROM-BUFFER.

15.2.5 VERIFY-ROLLBACK.

15.2.6 START-INPUT-TIMERS.

15.2.7 GET-INTERMEDIATE-RESULT.

15.2.8 STOP.

15.2.9 CLEAR-BUFFER.

15.2.10 QUERY-VOICEPRINT.

15.2.11 DELETE-VOICEPRINT.

15.3 Events.

15.3.1 START-OF-INPUT.

15.3.2 VERIFICATION-COMPLETE.

15.4 Headers.

15.5 Summary.

PART V. PROGRAMMING SPEECH APPLICATIONS.

16. Voice eXtensible Markup Language (VoiceXML).

16.1 Introduction.

16.2 Document Structure.

16.2.1 Applications and Dialogs.

16.3 Dialogs.

16.3.1 Forms.

16.3.2 Menus.

16.3.3 Mixed Initiative Dialogs.

16.4 Media Playback.

16.5 Media Recording.

16.6 Speech and DTMF Recognition.

16.6.1 Specifying Grammars.

16.6.2 Grammar Scope and Activation.

16.6.3 Configuring Recognition Settings.

16.6.4 Processing Recognition Results.

16.7 Flow Control.

16.7.1 Executable Content.

16.7.2 Variables, Scopes, and Expressions.

16.7.3 Document and Dialog Transitions.

16.7.4 Event Handling.

16.8 Resource Fetching.

16.9 Call Transfer.

16.10 Summary.

17. VoiceXML and MRCP Interworking.

17.1 Introduction.

17.2 Interworking Fundamentals.

17.2.1 Play Prompts.

17.2.2 Play and Recognise.

17.2.3 Record.

17.3 Application Example.

17.3.1 VoiceXML Scripts.

17.3.2 MRCP Flows.

17.4 Summary.

Appendix A. MRCP Version 1.

A.1 Overview.

A.2 Session Management and Message Transport.

A.3 General Protocol Details.

A.4 Speech Synthesiser Resource.

A.5 Speech Recogniser Resource.

Appendix B. XML Primer.

B.1 Background.

B.2 Basic Concepts.

B.3 Namespaces.

B.4 Document Schemas.

Appendix C. HTTP Primer.

C.1 Background.

C.2 Basic Concepts.

C.2.1 GET Method.

C.2.2 POST Method.

C.3 Caching.

C.4 Cookies.

C.5 Security.

References.

Index.

Acronyms.
