# LLM Knowledge base support
It is possible to use a knowledge base to improve the accuracy of the LLM's responses.
Knowledge bases are configured in the prompts file, under the `knowledge` key.
Each knowledge base has its own retrieval strategy, which determines how its content is retrieved and used in the prompt.
## Configuring a knowledge base
WARNING: This functionality is for internal use only. It is not possible to use Assistants from a "bring your own" OpenAI account.
In an LLM prompts file, a knowledge base is configured by adding a `knowledge` key:

```yaml
prompts:
  - id: faq
    text: |
      system: use the given knowledge base to answer the user's question.
      [!kb]
      user: {{ question }}

knowledge:
  - id: "my_knowledge_base"
    label: "My Knowledge Base"
    managed_openai_assistant: {}
```
This can then be used in Bubblescript as follows:

```bubblescript
dialog main do
  ask "What is the capital of France?"
  answer = LLM.complete(@prompts.faq, kb=@knowledge.my_knowledge_base, question=answer.text)
  say answer.text
end
```
When a knowledge base is configured, the AI section of the studio will show a 'Knowledge bases' section, where you can upload files to the knowledge base.
## Retrieval strategies

There are several retrieval strategies for using a knowledge base.
### Managed OpenAI Vector store (Responses API)

The `managed_openai_responses` strategy automatically creates and manages an OpenAI Vector store for your knowledge base. Files uploaded to the bot's file system are automatically synchronized with the vector store.

```yaml
knowledge:
  - id: "my_knowledge_base"
    label: "My Knowledge Base"
    managed_openai_responses: {}
```
This strategy will:

- Automatically create an OpenAI Vector store for your bot
- Synchronize files from the bot's file system to the vector store
- Use the OpenAI Responses API with the `file_search` tool to search the vector store and incorporate the results into the LLM response
It is possible to configure the number of results to return with the `max_num_results` field:

```yaml
knowledge:
  - id: "my_knowledge_base"
    label: "My Knowledge Base"
    managed_openai_responses:
      max_num_results: 5
```
This example will retrieve up to 5 results from the vector store, instead of the default of 3.
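Using this strategy from Bubblescript follows the same pattern as the earlier example; only the knowledge base definition changes. A minimal sketch, assuming the `faq` prompt from the configuration example above:

```bubblescript
dialog main do
  ask "What would you like to know?"
  # kb= selects the knowledge base configured with managed_openai_responses
  answer = LLM.complete(@prompts.faq, kb=@knowledge.my_knowledge_base, question=answer.text)
  say answer.text
end
```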
### Managed OpenAI Assistant (to be deprecated)

The `managed_openai_assistant` strategy automatically creates and manages an OpenAI Assistant for your knowledge base. Files uploaded to the bot's file system are automatically synchronized with the assistant's vector store.
WARNING: This strategy is deprecated and will be removed in the future. From release 2.46, newly created bots will use the `managed_openai_responses` strategy instead.
```yaml
knowledge:
  - id: "my_knowledge_base"
    label: "My Knowledge Base"
    managed_openai_assistant:
      provider: "openai"
```
This strategy will:

- Automatically create an OpenAI Assistant for your bot
- Create a vector store attached to the assistant
- Synchronize files from the bot's file system to the vector store
- Use the assistant's file search capabilities to find relevant documents
The `provider` field should be set to `openai` or `microsoft_openai` and must match the provider of the LLM prompt that uses this knowledge base.
Files are automatically synchronized when:

- Files are uploaded to the bot's file system
- Files are renamed or moved
- Files are deleted
The assistant and vector store are created automatically when first needed, and are named based on your bot ID and knowledge base ID.
### Scripts collection

The `scripts_collection` strategy uses a collection of scripts as a knowledge base:

```yaml
knowledge:
  - id: "my_knowledge_base"
    label: "My Knowledge Base"
    scripts_collection:
      collection: "kb/"
```
This reads all the scripts in the `kb/` directory and uses them as a knowledge base: the content of the scripts is included directly in the prompt, at the place where the `[!kb]` tag appears in the prompt definition.
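Putting this together, a prompts file using this strategy could look as follows (the prompt id and text are illustrative, following the earlier example; the scripts from `kb/` are substituted at the `[!kb]` tag):

```yaml
prompts:
  - id: faq
    text: |
      system: use the given knowledge base to answer the user's question.
      [!kb]
      user: {{ question }}

knowledge:
  - id: "my_knowledge_base"
    label: "My Knowledge Base"
    scripts_collection:
      collection: "kb/"
```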
## Internal retrieval strategies
WARNING: These strategies are for internal use only. It is not possible to use Assistants from a "bring your own" OpenAI account.
### OpenAI Responses API (hardcoded)

The `fixed_openai_responses` strategy is used to connect to an OpenAI Vector store by hardcoding the vector store ID:

```yaml
knowledge:
  - id: "my_knowledge_base"
    label: "My Knowledge Base"
    fixed_openai_responses:
      vector_store_id: "vs_1234567890"
      max_num_results: 1
```
### OpenAI Assistant through integration

The `external_openai_assistant` strategy is used to connect to an OpenAI Assistant by configuring the assistant ID in an integration secret:

```yaml
knowledge:
  - id: "my_knowledge_base"
    label: "My Knowledge Base"
    external_openai_assistant:
      assistant_integration_alias: "my_assistant"
      # optional reference to the alias where the project ID is stored
      project_integration_alias: "my_project"
```
Then, in the integrations file, add entries like these:

```yaml
- provider: secret
  alias: my_assistant
  context: bot
  description: Assistant ID for 'My Knowledge Base'
- provider: secret
  alias: my_project
  context: bot
  description: Project ID for 'My Knowledge Base'
```
No automatic synchronization of files is done from the platform to the assistant. With this retrieval strategy, you need to manually add files to the assistant on the OpenAI platform.
### OpenAI Assistant (hardcoded)

The `fixed_openai_assistant` strategy uses a hardcoded OpenAI Assistant ID to find the most relevant documents to include in the prompt.
```yaml
knowledge:
  - id: "my_knowledge_base"
    label: "My Knowledge Base"
    fixed_openai_assistant:
      provider: openai
      assistant_id: "asst_1234567890"
```
To use this strategy, you need to provide the `assistant_id` of an OpenAI Assistant. For the `openai` provider, assistants can be created in the OpenAI dashboard; for the `microsoft_openai` provider, in the Azure OpenAI portal. The `provider` field should be set to `openai` or `microsoft_openai`; it determines which API endpoint is used. The provider of the knowledge base must match the provider of the LLM prompt that uses the knowledge base.
No automatic synchronization of files is done from the platform to the assistant. With this retrieval strategy, you need to manually add files to the assistant on the OpenAI platform.