Introduction to MRCP

Speech-related clients, such as a VoiceXML browser, use Media Resource Control Protocol (MRCP) to control media service resources, including Text-to-Speech (TTS) synthesizers and Automatic Speech Recognizers (ASR). Customer Interaction Center (CIC) supports MRCP version 2 (MRCPv2), which is the current standard.

To connect clients with speech processing servers and manage the sessions between them, MRCP relies on other protocols, such as Session Initiation Protocol (SIP). MRCP uses SIP to set up and tear down media and control sessions with the speech server. MRCP itself defines two kinds of messages: those that control the media service resources and those that report the status of those resources.
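As a rough illustration of these two kinds of messages, the following Python sketch classifies the first line of an MRCPv2 message as a request (control), a response (status), or an event (status). The start-line layouts reflect a reading of the MRCPv2 specification (RFC 6787). The function name and the sample values are hypothetical and are not part of CIC.

```python
def classify_start_line(line: str) -> dict:
    """Classify an MRCPv2 start line as a request, response, or event.

    Assumed layouts (per RFC 6787):
      request:  MRCP/2.0 <length> <method-name> <request-id>
      response: MRCP/2.0 <length> <request-id> <status-code> <request-state>
      event:    MRCP/2.0 <length> <event-name> <request-id> <request-state>
    """
    tokens = line.strip().split(" ")
    if len(tokens) not in (4, 5) or tokens[0] != "MRCP/2.0":
        raise ValueError(f"not an MRCPv2 start line: {line!r}")
    length = int(tokens[1])

    if len(tokens) == 4:
        # Four tokens: a request sent by the client to control a resource.
        return {"kind": "request", "length": length,
                "method": tokens[2], "request_id": int(tokens[3])}

    if tokens[2].isdigit():
        # Five tokens with a numeric third token: a response carrying status.
        return {"kind": "response", "length": length,
                "request_id": int(tokens[2]),
                "status_code": int(tokens[3]),
                "request_state": tokens[4]}

    # Five tokens with a named third token: an event raised by the resource.
    return {"kind": "event", "length": length,
            "event": tokens[2], "request_id": int(tokens[3]),
            "request_state": tokens[4]}
```

For example, `classify_start_line("MRCP/2.0 83 543257 200 IN-PROGRESS")` is classified as a response with status code 200, while a line such as `MRCP/2.0 109 SPEAK-COMPLETE 543257 COMPLETE` is classified as an event.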

Overview

Media Resource Control Protocol (MRCP) is a standard proposed by the Internet Engineering Task Force (IETF) for controlling media service resources, such as speech synthesizers, recognizers, recorders, and verifiers, that reside on servers on the network. The current standard is version 2 (MRCPv2), which is the only version that Customer Interaction Center supports.

MRCP is an application-layer protocol and relies on a separate session management protocol, typically Session Initiation Protocol (SIP), to establish a control session between the client and the server. SIP establishes not only the control channel for MRCP but also the media sessions, and their associated parameters, between the media source (or sink) and the media server.
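To make the session setup more concrete, the following Python sketch builds the kind of Session Description Protocol (SDP) offer that a SIP INVITE could carry when requesting an MRCPv2 resource. It describes one control channel and one audio stream. The attribute layout follows the examples in the MRCPv2 specification (RFC 6787) as we understand them. The function name, parameters, address, and port are hypothetical, and the offer that CIC actually sends may differ.

```python
def build_mrcp_sdp_offer(media_ip: str, rtp_port: int,
                         resource: str = "speechsynth") -> str:
    """Build an illustrative SDP offer for one MRCPv2 resource.

    media_ip and rtp_port identify where the audio is received or sent
    (assumed values; in a CIC deployment this would be a media endpoint).
    """
    # A synthesizer sends audio to the client side; a recognizer receives it.
    directions = {"speechsynth": "recvonly", "speechrecog": "sendonly"}
    if resource not in directions:
        raise ValueError(f"unsupported resource type in this sketch: {resource}")

    lines = [
        "v=0",
        f"o=- 0 0 IN IP4 {media_ip}",
        "s=-",
        f"c=IN IP4 {media_ip}",
        "t=0 0",
        # Control channel: the offer uses placeholder port 9; the server's
        # answer supplies the real TCP port for the MRCP connection.
        "m=application 9 TCP/MRCPv2 1",
        "a=setup:active",
        "a=connection:new",
        f"a=resource:{resource}",
        "a=cmid:1",              # ties this control channel to media stream 1
        # Media session: G.711 mu-law audio over RTP.
        f"m=audio {rtp_port} RTP/AVP 0",
        "a=rtpmap:0 PCMU/8000",
        f"a={directions[resource]}",
        "a=mid:1",
    ]
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    print(build_mrcp_sdp_offer("192.0.2.12", 49170))
```

The `a=cmid` and `a=mid` attributes are what link the control channel to the media stream, which mirrors the point above that SIP establishes both in one exchange.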

Once the control session is established, the MRCP exchange operates over it, allowing the client to control the media processing resources on the speech resource server.
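As a minimal sketch of what travels over the control session, the following Python function assembles an MRCPv2 request such as SPEAK. The message format (a start line that carries the total message length, followed by headers and an optional body) follows the MRCPv2 specification (RFC 6787) as we understand it. The function name, the channel identifier, and the request ID are hypothetical, and the sketch assumes that the server accepts a `text/plain` body.

```python
def build_mrcp_request(method: str, request_id: int, channel_id: str,
                       body: str = "", content_type: str = "text/plain") -> bytes:
    """Assemble an MRCPv2 request whose start line states its own length."""
    body_bytes = body.encode("utf-8")
    headers = [f"Channel-Identifier: {channel_id}"]
    if body_bytes:
        headers.append(f"Content-Type: {content_type}")
        headers.append(f"Content-Length: {len(body_bytes)}")
    header_block = ("\r\n".join(headers) + "\r\n\r\n").encode("ascii")

    def render(length: int) -> bytes:
        start_line = f"MRCP/2.0 {length} {method} {request_id}\r\n"
        return start_line.encode("ascii") + header_block + body_bytes

    # The message length counts the start line itself, so the number of
    # digits in the length affects the length. Iterate until it is stable.
    length = len(render(0))
    while len(render(length)) != length:
        length = len(render(length))
    return render(length)


if __name__ == "__main__":
    message = build_mrcp_request(
        "SPEAK", 543257, "32AECB23433802@speechsynth",  # hypothetical values
        body="Hello from the control session.")
    print(message.decode("utf-8"))
```

The loop is needed only because the length field is part of the bytes it measures; it settles after a few passes because the length can only grow as digits are added.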

CIC support for MRCP

CIC supports MRCP for Text-to-Speech (TTS) and Automatic Speech Recognition (ASR) functionality.

MRCP for TTS

MRCP in CIC supports synthesizer resources and is intended as an alternative to using Speech Application Programming Interface (SAPI) for TTS. Using MRCP for TTS has the following advantages:

  • TTS processing is off-loaded to other servers rather than competing for resources on the CIC server.

  • The MRCP subsystem provides load-balancing capabilities between servers.

  • The MRCPv2 standard is much more transparent than SAPI, and it is not based on Component Object Model (COM).

  • Most speech vendors have implemented MRCP and deprecated their SAPI integration.

  • Audio is streamed from MRCP Servers using Real-time Transport Protocol (RTP) rather than a proprietary method.

  • Support for multiple languages, voices, and vendors is more efficient with MRCP.

  • Interaction Text to Speech is a native TTS engine in Interaction Media Server and can use MRCP. For more information on Interaction Text to Speech, see the CIC Text to Speech Engines Technical Reference in the PureConnect Documentation Library.

CIC can use MRCP TTS in the following instances:

Function | Description
Name prompt generation | System handlers use TTS for user name prompts that have not been recorded.
Telephone User Interface (TUI) and Mobile Office | The TUI and Mobile Office use TTS for dynamic prompts and for email playback.
Handlers | Handlers use TTS whenever TTS prompt tools, such as Play String, Play Text File, Record String, Record Text File, and Play Prompt Phrase, or their extended versions are used.
VoiceXML | TTS prompts in VoiceXML scripts can use MRCP.
Interaction Desktop Personal Rules | Personal Rules, configured in Interaction Desktop, can take advantage of TTS.

MRCP for ASR

The following list provides some of the advantages of using MRCP for ASR.

  • Some speech vendors allow only MRCP integrations.

  • The MRCP standard allows CIC to integrate with new ASR vendors.

  • The MRCP standard is much more transparent than proprietary Application Programming Interfaces (APIs) that are used with other PureConnect ASR server implementations.

CIC can use MRCP ASR in the following instances:

Function | Description
TUI and Mobile Office | The TUI and Mobile Office use ASR to recognize callers’ utterances.
Handlers | Handlers use ASR whenever recognition tools, such as Reco Input, are used.
VoiceXML | ASR inputs in VoiceXML scripts can use MRCP.

Note:
For more information on MRCP for ASR, see the ASR Technical Reference in the PureConnect Documentation Library.

Architecture

The following diagram depicts the protocol flow between servers using MRCP and illustrates how a third-party TTS MRCP play occurs. In this flow, the MRCP Server streams all audio directly to Interaction Media Server using RTP. Interaction Media Server then streams the audio directly to the endpoint, also using RTP.

The following diagram depicts the protocol flow between servers using MRCP with Interaction Text to Speech, which is a part of Interaction Media Server. In this flow, all audio is streamed from Interaction Media Server to the endpoint using RTP.
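In both flows the audio itself travels as RTP packets. For readers who want to inspect that traffic, the following Python sketch decodes the 12-byte fixed RTP header defined in RFC 3550. It is a generic illustration and not part of CIC or Interaction Media Server. The function name and the sample packet are hypothetical.

```python
import struct


def parse_rtp_header(packet: bytes) -> dict:
    """Decode the 12-byte fixed header at the start of an RTP packet."""
    if len(packet) < 12:
        raise ValueError("RTP packet is shorter than the 12-byte fixed header")
    # Network byte order: two flag bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit synchronization source (SSRC) identifier.
    byte0, byte1, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": byte0 >> 6,
        "padding": bool(byte0 & 0x20),
        "extension": bool(byte0 & 0x10),
        "csrc_count": byte0 & 0x0F,
        "marker": bool(byte1 & 0x80),
        "payload_type": byte1 & 0x7F,   # 0 is PCMU (G.711 mu-law)
        "sequence_number": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


if __name__ == "__main__":
    # Hypothetical packet: RTP version 2, payload type 0, 160 bytes of audio.
    sample = struct.pack("!BBHII", 0x80, 0, 1, 160, 0x1234) + bytes(160)
    print(parse_rtp_header(sample))
```

Comparing the `ssrc` and `sequence_number` fields of captured packets on each leg is one way to confirm which server is the source of a given audio stream.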