# LLM / ChatGPT support
The DialoX platform has built-in support for ChatGPT and other large language models (LLMs).
## Prompt example
By creating a script of the *LLM prompt* type, it is possible to define one or more prompts that can be used at runtime in the bot.
At minimum, a prompt file looks like this:
```yaml
prompts:
  - id: rhyme
    text: |
      make a sentence that rhymes with: {{ text }}
```
In Bubblescript this exposes a constant called `@prompts.rhyme`, which can then be used like this:
```
dialog main do
  ask "Enter a sentence and I will make it rhyme for you"
  _result = LLM.complete(@prompts.rhyme, text: answer.text)
  say _result.text
end
```
resulting in a conversation like this:

```
bot:  Enter a sentence and I will make it rhyme for you
user: I want to fly away!
bot:  Today is Sunday Funday, let's go play!
```
## Prompt parameter documentation
The full specification of a prompt YAML file is as follows:
```yaml
prompts:
  - # Unique identifier for the prompt, exposed in Bubblescript as @prompts.[id]
    id: summarize
    # Human-readable label, used when the prompt is included in the CMS or Inbox widget
    label: Summarize
    # LLM provider: openai, microsoft_openai, or google_ai (Gemini)
    provider: openai
    # The LLM model used; the available models depend on the provider.
    # For Google AI, use gemini-1.5-flash, for instance.
    model: gpt-3.5-turbo
    # The actual text of the prompt. Can be a simple string or an $i18n structure for
    # translation. The prompt text is a Liquid template, so `{{ }}` bindings can be
    # specified which need to be passed in when calling `LLM.complete()`.
    text:
      $i18n: true
      nl: |
        system: Gegeven de volgende tekst, maak een korte en bondige samenvatting die
          alleen de meest noodzakelijke punten teruggeeft. Gebruik hooguit 50
          woorden:
        user: {{text}}
      en: |
        system: Given the following text, create a short summary that only highlights the
          most relevant parts of the text. Use at most 50 words:
        user: {{text}}
    # Additional request parameters passed to the API endpoint
    endpoint_params:
      some_extra_param: 1
    # Expected format of the response: text, json_object, or json_schema
    response_format: text
    # JSON schema for structured responses (ONLY when response_format is json_schema)
    response_json_schema:
      type: object
      properties:
        summary:
          type: string
          description: "A concise summary of the input text"
    # Whether to return the response log probability for each generated token
    logprobs: false
    # Maximum number of tokens for the completion
    max_completion_tokens: 100
    # Number of alternative completions to generate
    candidate_count: 1
    # Penalty for token frequency (between -2.0 and 2.0)
    frequency_penalty: 0.0
    # Penalty for token presence (between -2.0 and 2.0)
    presence_penalty: 0.0
    # Seed for deterministic completions (optional)
    seed: 42
    # Randomness of the output (0.0 to 2.0)
    temperature: 1.0
    # Nucleus sampling parameter
    top_p: 1.0
    # List of sequences where the API should stop generating (optional)
    stop:
      - "END"
    # List of tools available to the model (optional)
    tools:
      - type: function
        function:
          name: get_current_weather
          description: Get the current weather in a given location
          parameters:
            type: object
            properties:
              location:
                type: string
                description: The city and state, e.g. San Francisco, CA
              unit:
                type: string
                enum: [celsius, fahrenheit]
            required: [location]
```
This YAML structure defines all possible fields for a prompt, including advanced options like response schemas, completion parameters, and tool definitions. Not all fields are required for every prompt, and the specific fields used may depend on the provider and use case.
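For instance, a minimal sketch of a prompt that requests structured output could look like this (the `sentiment` prompt, its schema, and its field names are hypothetical illustrations, not part of the platform):

```yaml
prompts:
  - id: sentiment
    label: Sentiment
    provider: openai
    model: gpt-3.5-turbo
    # Ask the model for a JSON object conforming to the schema below
    response_format: json_schema
    response_json_schema:
      type: object
      properties:
        sentiment:
          type: string
          description: "One of: positive, neutral, negative"
    text: |
      system: Classify the sentiment of the user's message.
      user: {{ text }}
```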
## Executing prompts
Executing the `LLM.complete(prompt, bindings)` function performs a call to the LLM API with the given prompt and its bindings. The `prompt` argument typically comes from a constant defined in a prompt YAML file, for instance `@prompts.summarize`.

The `bindings` argument is a map or keyword list that must contain the bindings the prompt needs; in the summarize example only one binding is used, named `text`. A call to that prompt would therefore look like this:

```
_result = LLM.complete(@prompts.summarize, text: "this is a long article ...")
```
The full result of the `LLM.complete` call is a map which contains the following fields:

- `text`: the output text that the LLM produced
- `json`: a JSON-deserialized version of the text; the runtime detects whether JSON is present in the result and, if so, parses it. The JSON message itself can be padded with arbitrary other text.
- `usage`: the total number of tokens that were used for this API call
- `request_time`: the number of milliseconds this request took
- `raw`: the raw OpenAI API response
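As a sketch, a dialog could use the parsed `json` field like this (assuming a hypothetical prompt `@prompts.sentiment` configured with `response_format: json_schema` that returns an object like `{"sentiment": "positive"}`):

```
dialog main do
  ask "How was your experience?"
  _result = LLM.complete(@prompts.sentiment, text: answer.text)
  # The runtime has parsed the JSON in the reply; access it via the `json` field
  say _result.json.sentiment
end
```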
## User / bot / assistant roles
The prompt text can contain `user:`, `assistant:` or `system:` prefixes, which are used to determine the different parts of the prompt (e.g. constructing the messages part of the OpenAI API request payload).
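For example, a prompt text using all three roles might be written like this (the `support` prompt is a hypothetical illustration); each prefixed part would become a separate message in the request:

```yaml
prompts:
  - id: support
    text: |
      system: You are a friendly support agent. Answer briefly.
      user: How do I reset my password?
      assistant: Click "Forgot password" on the login page.
      user: {{ text }}
```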
## Automatic bindings
Some prompt bindings are done automatically.
In the case of Bubblescript `LLM.complete` calls, the following bindings are filled automatically:

- `locale`: the conversation's locale
- `transcript`: the last 5 turns of the bot / user. This is typically used to make a generic chatbot that responds to the previous conversation in a natural way.
- `bot`: the metadata of the bot; for instance, `{{ bot.title }}` is exposed.
- `conversation`: some metadata of the conversation, like `addr`, `tags`, `frontend`.
The `transcript` binding is an array binding and needs to be specified as `[[ transcript ]]`, so with square brackets, and on a line by itself!
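Putting the automatic bindings together, a generic chatbot prompt could be sketched like this (the `chat` prompt is hypothetical; note `[[ transcript ]]` on its own line):

```yaml
prompts:
  - id: chat
    text: |
      system: You are {{ bot.title }}. Answer in the conversation's locale ({{ locale }}).
      [[ transcript ]]
```

Since `locale`, `bot` and `transcript` are filled in automatically, such a prompt can presumably be called without passing these bindings explicitly, e.g. `_result = LLM.complete(@prompts.chat)`.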
## Charging
For every `LLM.complete` call, a charge event (of type `llm.complete`) is created and is taken into account in the customer's billing cycle.