LLM Knowledge base support

It is possible to use a knowledge base to improve the accuracy of the LLM's responses. Knowledge bases can be configured in the prompts file, under the knowledge key.

Each knowledge base is configured with a retrieval strategy, which determines how its content is retrieved and made available to the prompt.

Configuring a knowledge base

WARNING: This strategy is for internal use only. It is not possible to use Assistants from a "bring your own" OpenAI account.

In an LLM prompts file, a knowledge base is configured by adding a knowledge key:

prompts:
  - id: faq
    text: |
      system: Use the given knowledge base to answer the user's question.
      [!kb]
      user: {{ question }}
knowledge:
  - id: "my_knowledge_base"
    label: "My Knowledge Base"
    fixed_openai_assistant:
      assistant_id: "asst_1234567890"
      # optional OpenAI Project ID
      project_id: "prj_1234567890"

This can then be used in Bubblescript as follows:

dialog main do
  ask "What is the capital of France?"
  answer = LLM.complete(@prompts.faq, kb=@knowledge.my_knowledge_base, question=answer.text)
  say answer.text
end

Retrieval strategy: OpenAI Assistant through integration

WARNING: This strategy is for internal use only. It is not possible to use Assistants from a "bring your own" OpenAI account.

The external_openai_assistant strategy is used to connect to an OpenAI Assistant by configuring the assistant ID in an integration secret.

knowledge:
  - id: "my_knowledge_base"
    label: "My Knowledge Base"
    external_openai_assistant:
      assistant_integration_alias: "my_assistant"
      # optional reference to alias where the project ID is stored
      project_integration_alias: "my_project"

And then in the integrations file, add a new entry like this:

- provider: secret
  alias: my_assistant
  context: bot
  description: Assistant ID for 'My Knowledge Base'
- provider: secret
  alias: my_project
  context: bot
  description: Project ID for 'My Knowledge Base'
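
With the secrets in place, this knowledge base is used from Bubblescript exactly like the fixed variant (a sketch mirroring the earlier example; faq is the prompt defined above):

dialog main do
  ask "What is the capital of France?"
  answer = LLM.complete(@prompts.faq, kb=@knowledge.my_knowledge_base, question=answer.text)
  say answer.text
end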

No automatic synchronization of files is done from the platform to the assistant. With this retrieval strategy, you need to manually add files to the assistant on the OpenAI platform.

Retrieval strategy: OpenAI Assistant (hardcoded)

The fixed_openai_assistant strategy uses a hardcoded OpenAI Assistant ID to find the most relevant documents to include in the prompt.

knowledge:
  - id: "my_knowledge_base"
    label: "My Knowledge Base"
    fixed_openai_assistant:
      provider: openai
      assistant_id: "asst_1234567890"

To use this strategy, you need to provide the assistant_id of an OpenAI Assistant. The provider should be set to openai or microsoft_openai; it determines which API endpoint is used. For the openai provider, you create the assistant on the OpenAI platform; for the microsoft_openai provider, you create it on the Azure OpenAI service. The provider of the knowledge base must match the provider of the LLM prompt that uses the knowledge base.
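
For example, with the microsoft_openai provider the same knowledge base could be configured as follows (the assistant_id shown is a placeholder):

knowledge:
  - id: "my_knowledge_base"
    label: "My Knowledge Base"
    fixed_openai_assistant:
      provider: microsoft_openai
      assistant_id: "asst_1234567890"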

No automatic synchronization of files is done from the platform to the assistant. With this retrieval strategy, you need to manually add files to the assistant on the OpenAI platform.

Retrieval strategy: Scripts collection

The scripts_collection strategy uses a collection of scripts as a knowledge base:

knowledge:
  - id: "my_knowledge_base"
    label: "My Knowledge Base"
    scripts_collection:
      collection: "kb/"

This will read all the scripts in the kb/ directory and use them as a knowledge base, including the content of the scripts directly in the prompt, at the place where the [!kb] tag appears in the prompt definition.
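
Putting it together, a minimal prompts file using a scripts collection could look like this, reusing the faq prompt from the first example:

prompts:
  - id: faq
    text: |
      system: Use the given knowledge base to answer the user's question.
      [!kb]
      user: {{ question }}
knowledge:
  - id: "my_knowledge_base"
    label: "My Knowledge Base"
    scripts_collection:
      collection: "kb/"

At completion time, the contents of all scripts under kb/ are substituted for the [!kb] tag.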