Telephony¶
DialoX bots can work as interactive, smart IVR systems on traditional phone lines. Contact us for more information about this.
The dial statement¶
By using dial, you can forward the current phone call to a new destination. In the calling PBX this is usually implemented using a SIP INVITE.
The phone number given to dial is a full E.164 phone number, i.e. including the country code and the + sign.
dialog main do
  say "Please hold."
  dial "+31201234567"
end
The bot stays in control of the call. If the called party does not pick up, the dialog continues where it left off. The event.payload variable is then filled with a string that describes why the call was not connected:
dialog main do
  say "Please hold."
  dial "+31201234567"
  branch event.payload do
    "BUSY" ->
      say "The line is busy."
    "NOANSWER" ->
      say "The caller did not answer."
    true ->
      say "We are unable to connect you through due to an unknown error."
  end
end
The following values of event.payload are supported:
event.payload | Description |
---|---|
CHANUNAVAIL | Channel unavailable (for example in sip.conf, when using qualify=, the SIP channel is unavailable) |
BUSY | Returned busy |
NOANSWER | No answer (e.g. SIP 480 or 604 response) |
ANSWER | Call was answered |
CANCEL | Call attempt cancelled (e.g. user hung up before the call connected) |
DONTCALL | Privacy manager don’t call |
CONGESTION | Congestion, or anything else (some other error setting up the call) |
Dial timeout¶
The default timeout of the dial command is 10 seconds, so the call is tried for 10 seconds before control is returned to the script. It is possible to override this using the timeout: option:
dialog main do
  say "Please hold."
  dial "+31201234567", timeout: 2000
end
The timeout value is specified in milliseconds.
The refer statement¶
By using refer, you can forward the current phone call to a new destination, usually implemented using a SIP REFER. This removes the bot from the call.
The phone number given to refer is a full E.164 phone number, i.e. including the country code and the + sign.
dialog main do
  say "Please hold."
  refer "+31201234567"
end
The refer statement does not return, as the conversation is stopped once the forwarding takes place.
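Both dial and refer expect a full E.164 number. As a rough illustration of what that format means, the following Python sketch checks a string against the general E.164 shape (a + sign followed by at most 15 digits, the first one non-zero). The names `E164_PATTERN` and `is_e164` are made up for this example; the check is purely syntactic and says nothing about how the platform itself validates numbers.

```python
import re

# General E.164 shape: "+" followed by 2 to 15 digits, first digit non-zero.
# This is a format check only; it does not verify that the number exists.
E164_PATTERN = re.compile(r"\+[1-9]\d{1,14}")


def is_e164(number):
    """Return True when `number` looks like a full E.164 phone number."""
    return E164_PATTERN.fullmatch(number) is not None
```

With this sketch, `is_e164("+31201234567")` is true, while a national notation such as `"0201234567"` is rejected because the + sign and country code are missing.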
Speech adaptation¶
The phone backend uses Google Speech to Text for recognizing a user's voice.
As such, it is possible to give the recognizer hints about certain words, phrases or entities that should be recognized better. This process is called speech adaptation.
The adapter takes hints that are extracted from the expecting: part of ask.
Certain references to entities that are contained in the expecting clause are automatically converted to their corresponding class token:
Bubblescript entity | Class token |
---|---|
[amount_of_money] | $MONEY |
[phone_number] | $FULLPHONENUM |
[number] | $OOV_CLASS_DIGIT_SEQUENCE |
[date] | $FULLDATE |
[time] | $TIME |
So when you have a script like this:
dialog main do
  ask "When is your birthdate?", expecting: entity(match: "[time]")
end
The phone adapter will automatically add $TIME to its speech context.
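To make the conversion concrete, here is a small Python sketch of the mapping in the table above. It is an illustration only, not the adapter's actual code: the function name `class_tokens_for` and the regular expression used to find `[entity]` references are assumptions.

```python
import re

# Mapping taken from the table above.
CLASS_TOKENS = {
    "amount_of_money": "$MONEY",
    "phone_number": "$FULLPHONENUM",
    "number": "$OOV_CLASS_DIGIT_SEQUENCE",
    "date": "$FULLDATE",
    "time": "$TIME",
}


def class_tokens_for(match_pattern):
    """Return the class tokens for the [entity] references in a match pattern.

    Entities that have no class token are skipped.
    """
    names = re.findall(r"\[([a-z_]+)\]", match_pattern)
    return [CLASS_TOKENS[name] for name in names if name in CLASS_TOKENS]
```

For the script above, `class_tokens_for("[time]")` yields `["$TIME"]`; an entity that is not in the table produces no token at all.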
Other entities, plain text and labels are not automatically used as speech hints. For labels and text you need to define explicit speech hints; see the paragraph below.
speech_hints: option to ask¶
It is also possible to override these speech hints by providing a speech_hints: option to ask:
dialog main do
  ask "Where do you live?", speech_hints: ["i live at $POSTALCODE"]
end
In this case, you would use the class tokens directly as part of the sample phrases.
Special speech hints¶
To make the bot wait longer, so that it can capture full sentences, add "PATIENTLY" as a speech hint:
dialog main do
  ask "Please explain the issue", speech_hints: ["PATIENTLY"]
end
DTMF and quick replies¶
Whenever an ask with quick_replies is encountered, these quick replies are automatically mapped onto the DTMF numbers.
So in the following case:
dialog main do
  ask "Do you want to continue?", expecting: ["Yes", "No"]
end
Pressing 1 would automatically select "Yes" as the answer.
Note that no hint ("press 1 for yes") is spoken, nor are the actual options (the quick replies) read aloud. You would have to implement this yourself in Bubblescript.
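The mapping can be pictured with the following Python sketch. It assumes that options are numbered from 1 in the order in which they are listed, which matches the "Pressing 1 selects Yes" example but is otherwise an assumption; `reply_for_dtmf` and `spoken_menu` are hypothetical helper names, and `spoken_menu` only shows the kind of prompt text you would need to generate yourself.

```python
def reply_for_dtmf(digit, options):
    """Map a single DTMF digit onto a quick reply, or None when out of range.

    Assumes options are numbered 1, 2, 3, ... in list order.
    """
    if len(digit) != 1 or digit not in "123456789":
        return None
    index = int(digit) - 1
    if index < len(options):
        return options[index]
    return None


def spoken_menu(options):
    """Build a prompt that reads the options aloud (not done automatically)."""
    return " ".join(
        f"Press {number} for {option}."
        for number, option in enumerate(options, start=1)
    )
```

For the options `["Yes", "No"]`, pressing 2 maps to "No", and keys such as 3 or # map to nothing.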
Automatic replacements¶
In some cases the text-to-speech engine consistently mispronounces certain names and similar words. The phone adapter can be configured to automatically replace strings in the SSML output. Note that this replacement is done before any Speech Markdown processing.
To do this, create a voice_config YAML file with the following contents:
output_translate:
  $i18n: true
  en:
    - { from: "Hi", to: "Hai" }
    - { from: "Arjan", to: '(Arjan)[sub:"Aryuhn"]' }
This would replace all occurrences of the word Hi with the word Hai and, more realistically, annotate the name Arjan with a Speech Markdown sub tag, so that it is pronounced correctly.
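The effect of such a configuration can be sketched in Python as follows. This is not the adapter's implementation: the function name `apply_output_translate` is hypothetical, and matching on whole words (so that "History" is left alone) is an assumption based on the phrase "occurrences of the word" above.

```python
import re

# The "en" rules from the voice_config example above.
RULES = [
    {"from": "Hi", "to": "Hai"},
    {"from": "Arjan", "to": '(Arjan)[sub:"Aryuhn"]'},
]


def apply_output_translate(text, rules):
    """Apply each from/to rule in order, matching whole words only (assumed)."""
    for rule in rules:
        pattern = r"\b" + re.escape(rule["from"]) + r"\b"
        # A function is used as the replacement so that the "to" string is
        # inserted literally, without backslash or group-reference handling.
        text = re.sub(pattern, lambda _match, to=rule["to"]: to, text)
    return text
```

Because the replacement runs before Speech Markdown processing, the `(Arjan)[sub:"Aryuhn"]` annotation it inserts is still interpreted afterwards.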
Turn taking¶
The bot automatically starts recognizing speech as soon as it finishes its sentence.
Turn taking on voice is quite tricky and there are several timeouts involved. When speech hints are given, the timeouts are shortened because a speech hint in the prompt means that we expect a closed answer. The following table shows how the timings are configured:
When | Idle timeout | Active timeout |
---|---|---|
Open question | 8 seconds | 1 second |
Closed question | 4 seconds | 300 ms |
To make it clearer that the bot expects the user to say something, it is possible to let the user hear short "beep" tones: a low beep before the user is supposed to say something, and a high one once the bot has finished the speech recognition. This can be enabled in the settings of the phone connector in the DialoX studio.
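The timing table can be expressed as a small lookup, shown here as a Python sketch. The names `Timeouts` and `timeouts_for` are made up for illustration, and the sketch only models the rule stated above (speech hints present means a closed question); it does not model special hints or any other tuning the platform may apply.

```python
from typing import NamedTuple


class Timeouts(NamedTuple):
    idle_ms: int    # "Idle timeout" column of the table above
    active_ms: int  # "Active timeout" column of the table above


OPEN_QUESTION = Timeouts(idle_ms=8000, active_ms=1000)
CLOSED_QUESTION = Timeouts(idle_ms=4000, active_ms=300)


def timeouts_for(speech_hints):
    """Pick the timeouts: speech hints imply a closed question."""
    return CLOSED_QUESTION if speech_hints else OPEN_QUESTION
```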
Webhook API¶
After connecting a bot to the phone channel in the settings in the studio, a new webhook endpoint is available for your bot which should be called from the PBX voice adapter.
The endpoint is called:
POST https://bsqd.me/phone/webhook/<identifier>/<callerid>/<channelid>
The URL parameters are:
identifier - The bot ID (when "connect through bot ID" is chosen in the settings), or otherwise <pool>-<number>.
callerid - The E.164-formatted phone number of the person who is calling (no leading + sign). In the case of an anonymous call, it should be a random string starting with x_.
channelid - A unique string identifying the current call. The channel ID must stay the same for the duration of the call.
The webhook POST body is a JSON-formatted payload, identical to the Chat REST API chat input payload.
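A voice adapter could assemble the endpoint URL roughly as in the Python sketch below. The helper names `caller_id` and `webhook_url` are hypothetical, as is the choice of 16 random hex characters for anonymous callers; only the x_ prefix and the removal of the leading + sign come from the description above.

```python
import secrets
from urllib.parse import quote

BASE_URL = "https://bsqd.me/phone/webhook"


def caller_id(e164_number):
    """Return the callerid URL segment for a caller.

    Pass None for an anonymous call; generate the value once per call.
    """
    if e164_number is None:
        return "x_" + secrets.token_hex(8)
    return e164_number.lstrip("+")


def webhook_url(identifier, callerid, channelid):
    """Join the three URL parameters, percent-encoding each segment."""
    segments = (identifier, callerid, channelid)
    return BASE_URL + "/" + "/".join(quote(s, safe="") for s in segments)
```

For example, `webhook_url("my-bot", caller_id("+31201234567"), "chan-1")` gives `https://bsqd.me/phone/webhook/my-bot/31201234567/chan-1`, where `my-bot` and `chan-1` are placeholder values.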
Webhook API request payload examples¶
Most chatbot conversations start with an initial, empty, request:
{
}
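As an illustration, the initial request could be prepared with Python's standard library as sketched below. The helper name `build_webhook_request` is hypothetical, and sending a `Content-Type: application/json` header is an assumption; the sketch only builds the request object and leaves the actual sending (for instance with `urllib.request.urlopen`) to the caller.

```python
import json
import urllib.request


def build_webhook_request(url, body=None):
    """Build a POST request with a JSON body; an empty object by default."""
    data = json.dumps(body or {}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        method="POST",
        # Assumed header; the documentation above only states the body is JSON.
        headers={"Content-Type": "application/json"},
    )
```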
Request which includes recognized speech:
{
  "action": {
    "type": "user_message",
    "payload": {
      "text": "Hello my name is John",
      "input_type": "voice"
    }
  }
}
Request which includes recognized speech plus the user's recorded voice:
{
  "action": {
    "type": "user_message",
    "payload": {
      "text": "Hello my name is John",
      "data": "data:audio/mp3;base64,JLKjflsdjfiowajefwua8ssdu9af9uwe898ufo...",
      "input_type": "voice"
    }
  }
}
DTMF request which includes a single DTMF character (e.g. for 'press 1 to...'); only use when get_dtmf was NOT set:
{
  "action": {
    "type": "user_message",
    "payload": {
      "text": "1",
      "input_type": "dtmf"
    }
  }
}
Request as a response to the get_dtmf request:
{
  "action": {
    "type": "user_message",
    "payload": {
      "text": "1234",
      "data": "1234",
      "type": "numeric"
    }
  }
}
Request as a response to the get_audio request:
{
  "action": {
    "type": "user_attachment",
    "payload": {
      "url": "data:audio/mp3;base64,JLKjflsdjfiowajefwua8ssdu9af9uwe898ufo...",
      "type": "audio",
      "metadata": {
        "content_type": "audio/mp3"
      }
    }
  }
}
The following requests send events instead of user messages:
The $no_input event is used to indicate to the bot that no speech was captured while it was listening:
{
  "action": {
    "type": "user_event",
    "payload": {
      "name": "$no_input"
    }
  }
}
The $dial_return event is used to indicate the result of a previous dial command; e.g. when the called party does not respond or is busy:
{
  "action": {
    "type": "user_event",
    "payload": {
      "name": "$dial_return",
      "payload": "BUSY"
    }
  }
}
The $hangup event is used to indicate that the user has hung up the phone. The payload is used to indicate the reason for the hangup.
{
  "action": {
    "type": "user_event",
    "payload": {
      "name": "$hangup",
      "payload": "user hangup"
    }
  }
}
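The request bodies above share one envelope, which the following Python sketch captures in two small builder functions. The function names `user_message` and `user_event` are chosen for this example only; the dictionary shapes are copied from the JSON examples above.

```python
import json


def user_message(text, input_type="voice"):
    """Body for recognized speech ("voice") or a single DTMF key ("dtmf")."""
    return {
        "action": {
            "type": "user_message",
            "payload": {"text": text, "input_type": input_type},
        }
    }


def user_event(name, payload=None):
    """Body for events such as $no_input, $dial_return and $hangup."""
    event = {"name": name}
    if payload is not None:
        event["payload"] = payload
    return {"action": {"type": "user_event", "payload": event}}


# Example: serialize the body that reports a hangup by the user.
hangup_body = json.dumps(user_event("$hangup", "user hangup"))
```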
Webhook API response payload¶
The https://bsqd.me/phone/webhook/<identifier>/<callerid>/<channelid> endpoint has the following return value:
{
  // whether or not this is the final response
  "is_final": false,
  // the locale in which the bot speaks
  "locale": "nl",
  // the text that the bot says, as SSML.
  "ssml": "<speak><s>Hello...</s></speak>",
  // the configured Text-to-speech voice (optional)
  "voice": {
    "type": "google",
    "name": "nl-NL-Standard-A",
    "gender": "FEMALE",
    "locale": "nl"
  },
  // Text-to-speech payload only available via phone channel
  "tts": {
    // the MP3 URL to the text-to-speech file that needs to be played
    "url": "https://...",
    // Informational display text to be displayed while speech is playing
    "display_text": "Hello..."
  },
  // Any speech context elements to be passed into Google's speech-to-text `speechContexts` object (optional)
  "speechContexts": [{
    "phrases": ["I want a cookie"]
  }],
  // if call control is set, is_final is true and we need to perform something based on the contents of the call control variable.
  "call_control": {
    // see below
  },
  // when beep is true, play short beeps around user speech-to-text capturing
  "beep": true,
  // when record is true, the user's speech needs to be recorded while doing speech-to-text
  "record": true,
  // when we request DTMF input
  "get_dtmf": {
    // the maximum nr of digits
    "num_digits": 10,
    // send when pressing this char (result is sent excluding this char)
    "finish_on_key": "#"
  },
  // when we request audio recording
  "get_audio": {
    // stop recording when pressing this char
    "finish_on_key": "#"
  }
}
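One way a voice adapter might act on this response is sketched below in Python. The function name `next_step`, the step labels it returns, and the order in which the fields are checked are all assumptions made for illustration; in particular, treating a final response without call control as the end of the conversation is a guess, not documented behaviour.

```python
def next_step(response):
    """Decide what to do after the bot's speech has been played.

    Returns "dial", "refer" or "hangup" for call control, otherwise one of
    the illustrative labels "collect_dtmf", "record_audio", "end", "listen".
    """
    call_control = response.get("call_control")
    if call_control:
        return call_control["type"]
    if "get_dtmf" in response:
        return "collect_dtmf"
    if "get_audio" in response:
        return "record_audio"
    if response.get("is_final"):
        return "end"  # assumption: nothing more to do for this call
    return "listen"  # default: start speech recognition
```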
Call control - dial¶
Set up a new call leg with an invite and connect the first leg of the call to it when it is picked up. The timeout parameter tells how long to wait for the party to pick up before returning the call to the bot.
"call_control": {
  "type": "dial",
  "number": "+3164123456",
  "timeout": 10000
}
If the call is returned to the bot because it is unsuccessful or the second party hung up, a $dial_return event must be sent to the bot to indicate that the bot is back in control (see above).
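The adapter side of a dial can be sketched in Python as follows. Both helper names are hypothetical. Falling back to 10000 ms when the call control object carries no timeout is an assumption derived from the default of the dial statement, and the set of accepted results is copied from the event.payload table earlier on this page.

```python
DIAL_RESULTS = {
    "CHANUNAVAIL", "BUSY", "NOANSWER", "ANSWER",
    "CANCEL", "DONTCALL", "CONGESTION",
}


def dial_timeout_seconds(call_control, default_ms=10000):
    """Convert the millisecond timeout to seconds (default is assumed)."""
    return call_control.get("timeout", default_ms) / 1000


def dial_return_event(result):
    """Build the $dial_return body that hands control back to the bot."""
    if result not in DIAL_RESULTS:
        raise ValueError(f"unknown dial result: {result}")
    return {
        "action": {
            "type": "user_event",
            "payload": {"name": "$dial_return", "payload": result},
        }
    }
```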
Dial with announce¶
"call_control": {
  "type": "dial",
  "number": "+3164123456",
  "announce": {
    "ssml": "<s>Hello this is the announcement</s>"
  }
}
Before the called party gets connected to the calling party, the given announcement message should be played.
Call control - refer¶
Refer the current call leg (SIP REFER) to the given number. The bot goes "out of the loop" and is no longer in control of the call.
"call_control": {
  "type": "refer",
  "number": "+3164123456"
}
Call control - hangup¶
Hang up the call.
"call_control": {
  "type": "hangup"
}