Credit: (Brookhaven National Laboratory/Digital Discovery)
Workflow for generating a domain-tailored chatbot response. The user query is first sent to an ML embedding model, which computes an embedding vector that captures the semantic content of the input. This vector is used to query a pre-computed database of text chunks. Text snippets that are similar to the query (“close” in the embedding space) are prepended to the user query to construct a prompt. The prompt is sent to a large language model (LLM), which generates a text response for the user.
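The retrieval and prompt-construction steps in this workflow can be illustrated with a minimal Python sketch. It is not the implementation used by the researchers: the `embed` function is a toy bag-of-words stand-in for a real ML embedding model, the text chunks are invented for illustration, and all function names are hypothetical. The final step, sending the prompt to an LLM, is omitted.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an ML embedding model (assumption): word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_database(chunks):
    """Pre-compute an embedding for every text chunk."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(query, database, k=2):
    """Return the k chunks closest to the query in embedding space."""
    query_vector = embed(query)
    ranked = sorted(
        database,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(query, database, k=2):
    """Prepend the retrieved snippets to the user query."""
    snippets = retrieve(query, database, k)
    context = "\n".join(f"- {snippet}" for snippet in snippets)
    return f"Context:\n{context}\n\nQuestion: {query}"


# Invented example chunks; a real system would index domain documents.
CHUNKS = [
    "Small-angle X-ray scattering probes nanoscale structure in materials.",
    "Block copolymers self-assemble into ordered nanoscale patterns.",
    "The cafeteria serves lunch between noon and two.",
]
DATABASE = build_database(CHUNKS)

if __name__ == "__main__":
    question = "How does X-ray scattering probe nanoscale structure?"
    print(build_prompt(question, DATABASE, k=1))
```

In a real system the word-count vectors would be replaced by dense vectors from a trained embedding model, and the sorted scan by a vector database lookup; the overall flow of embed, retrieve, and prepend stays the same.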