LLM chatState Magic Predicate Introduction
The chatState predicate implements a chatbot with a long-term memory of conversation history. The LLM process defined here is called 'Retrieval Augmented Generation with Feedback', or 'RAG with Feedback'. The term 'Feedback' in this case refers to storing short-term memories of the recent dialog in the long-term memory of a vector database.
When responding to a human input, the chatState predicate draws upon two sources of information:
- The long- and short-term memory of the conversation with that individual, called the History
- All other facts, memories and experiences the bot knows, called the Expertise.
The chatState predicate retrieves this content from two vector databases, referred to as the Expertise Vector Database and the History Vector Database, respectively. As the dialog progresses with each subsequent execution of chatState, RAG with Feedback embeds windows of conversation history into the History Vector Database in real time.
We query the vector database with a string called the match string. The match string may contain arbitrary text, optionally including the bot's name, the human's name or role, the most recent human input, and an ingredient called the feed or dialog feed.
The feed is simply a transcript of the last few lines of dialog. It may help to think of the feed in terms of texting on a phone: the feed is analogous to what you can see on the screen before it scrolls off the top. The number of dialog lines in the feed equals windowSize (see below).
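For example, with windowSize set to 4, a feed for the Chomsky bot used later in this document might look like this sketch (illustrative, not actual output):
Interviewer: Define Universal Grammar in 10 words or less.
Noam Chomsky: Universal Grammar: genetic basis allowing infants to acquire language.
Interviewer: Do you believe animals can understand language?
Noam Chomsky: Animals show remarkable communication abilities, but human language, with its complexity and generality, is uniquely human.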
With 8 inputs and 14 outputs, the general form of the chatState predicate looks complicated:
(?response
?score
?citationId
?citedText
?source
?feed
?story
?expertiseMatchString
?expertiseMatches
?expertiseHints
?historyMatchString
?historyMatches
?historyHints
?prompt)
llm:chatState
(?text
?expertiseRepoSpec
?expertiseMaxMatches
?expertiseMinScore
?historyRepoSpec
?historyMaxMatches
?historyMinScore
?botId).
This complexity diminishes when a query uses only the 3 required variables:
?response llm:chatState (?text ?expertiseRepoSpec).
We'll come back to the meaning of that simple statement in a moment, but first let us review the meaning of each chatState variable.
On the input side, the required variable ?text binds the text of the human input. The required variable ?expertiseRepoSpec binds the name of the Expertise vector database.
The following table describes each of the input variables:
Variable | Optional | Description | Default Value |
---|---|---|---|
?text | No | The natural language text input query. | N/A |
?expertiseRepoSpec | No | Repo spec of Expertise vector database. | N/A |
?expertiseMaxMatches | Yes | The maximum number of Expertise matches. | 4 |
?expertiseMinScore | Yes | The minimum Expertise matching score. | 0.8 |
?historyRepoSpec | Yes | The name of the History vector database. | nil |
?historyMaxMatches | Yes | The maximum number of History matches. | 4 |
?historyMinScore | Yes | The minimum History matching score. | 0.8 |
?botId | Yes | The ID of a bot if more than one bot profile exists in the same repo. | nil |
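For example, a query supplying every input positionally might look like the following sketch (the repo names and bot ID here are hypothetical):
PREFIX llm: <http://franz.com/ns/allegrograph/8.0.0/llm/>
SELECT ?response {
  ?response llm:chatState ("Hello" "my-expertise-vec" 4 0.8 "my-history-vec" 4 0.8 "my-bot")
}
Optional inputs may be omitted from the end of the list, in which case the order of precedence below determines each value.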
For each optional variable in the input table above, the effective value is determined by this order of precedence:
Order | Value |
---|---|
First choice | The value bound to the input variable, e.g. ?expertiseMinScore, if provided in the query. |
Second choice | The value of Profile Configuration property, e.g. the object of the Profile triple whose predicate is <http://franz.com/chatState/expertiseMinScore>, if the triple exists. |
Third choice | The default value from the Profile Configuration Properties table below, e.g. the value 0.8. |
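For example, suppose the repo contains this Profile triple (a sketch; the xsd:double datatype is an assumption, following the pattern of the windowSize example later in this document):
<http://franz.com/chatState/profile> <http://franz.com/chatState/expertiseMinScore> "0.7"^^<http://www.w3.org/2001/XMLSchema#double> .
A query that binds ?expertiseMinScore, say to 0.9, uses 0.9 (first choice). A query that leaves it unbound uses 0.7 from the triple (second choice). If the triple did not exist, the default 0.8 would apply (third choice).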
On the output side, the predicate binds the LLM response to the required variable ?response.
The optional variables provide the SPARQL query author or repo admin with useful feedback on the construction of the LLM prompt. A typical query in production may include only ?response, ?score, ?citationId, ?citedText and ?source.
The complete set of output-side variables:
Variable | Optional | Description |
---|---|---|
?response | No | The natural language response from the LLM. |
?score | Yes | The matching score of the History or Expertise text. |
?citationId | Yes | The ID of a citation, from either History or Expertise, that actually contributed to the response. |
?citedText | Yes | The text associated with the citation. |
?source | Yes | The name of the source repo (History or Expertise) containing the matching text. |
?feed | Yes | The dialog feed used in the prompt. |
?story | Yes | Statement about the feed, used in the prompt. |
?expertiseMatchString | Yes | A string used to query the Expertise vector database. |
?expertiseMatches | Yes | Citation IDs and associated text matched in Expertise vector database. |
?expertiseHints | Yes | Statement about the Expertise matches, to be included in the prompt. |
?historyMatchString | Yes | A string used to query the History vector database. |
?historyMatches | Yes | Citation IDs and associated text matched in History vector database. |
?historyHints | Yes | Statement about the History matches, to be included in the prompt. |
?prompt | Yes | The fully formed LLM prompt. |
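For example, the typical production query mentioned above lists the first five output variables positionally (a sketch; the Expertise repo name is hypothetical):
PREFIX llm: <http://franz.com/ns/allegrograph/8.0.0/llm/>
SELECT ?response ?score ?citationId ?citedText ?source {
  (?response ?score ?citationId ?citedText ?source) llm:chatState ("What is Universal Grammar?" "my-expertise-vec")
}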
In order to achieve RAG with Feedback, the optional variables come into play: ?historyRepoSpec names the History vector database, while ?expertiseMaxMatches, ?expertiseMinScore, ?historyMaxMatches and ?historyMinScore tune the retrieval from each database.
Let's return to the simpler expression using chatState, from above.
?response llm:chatState (?text ?expertiseRepoSpec).
What does this expression mean? The vector database ?expertiseRepoSpec provides the Expertise, but this query specifies no repo for the History vector database ?historyRepoSpec. Therefore, the 'Feedback' part of 'RAG with Feedback' does not come into play, and the query behaves the same as plain RAG.
In fact, the query
(?response ?score ?citationId ?citedText) llm:askMyDocuments (?text ?expertiseRepoSpec ?topN ?minScore)
is equivalent to
(?response ?score ?citationId ?citedText) llm:chatState (?text ?expertiseRepoSpec ?topN ?minScore)
as long as the stateFmtString property value (see the Profile Configuration Properties table below) is the same as the askMyDocuments prompt:
Here is a list of citation IDs and content related to the query '{QUERY}': {EXPERTISE_MATCHES}.
Respond to the query '{QUERY}' as though you wrote the content. Be brief. You only have 20 seconds to reply.
Place your response to the query in the 'response' field.
Insert the list of citations whose content informed the response into the 'citation_ids' array.
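To put chatState into this askMyDocuments-equivalent mode, you could load a Profile triple such as the following sketch, which collapses the prompt above into a single N-Triples literal:
<http://franz.com/chatState/profile> <http://franz.com/chatState/stateFmtString> "Here is a list of citation IDs and content related to the query '{QUERY}': {EXPERTISE_MATCHES}. Respond to the query '{QUERY}' as though you wrote the content. Be brief. You only have 20 seconds to reply. Place your response to the query in the 'response' field. Insert the list of citations whose content informed the response into the 'citation_ids' array." .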
chatState Configuration
In any repo where you use the chatState predicate in a query, you may include a set of triples to configure the bot and design a template for the LLM prompt, overriding the default Profile properties.
These triples, called Profile triples, define properties and values used to build what might be a very long prompt. Several of the Profile property values specify substrings of that prompt, so the full prompt can be built up in smaller steps.
The property values may also contain string replacement expressions to be populated with the values of other, previously instantiated properties. For example, the expression {BOT} appearing in a property value is replaced with the value set by the bot property. Each property value may contain only the string replacements set before it. The May Include column in the table below lists the replacement expressions permitted in the corresponding property value.
The Profile triples all use the namespace <http://franz.com/chatState/>. The subject of each is <http://franz.com/chatState/profile> or, if there is more than one profile in the repo, <http://franz.com/chatState/profile#bot-id>, where bot-id is the name of the bot. The predicate names consist of the property name appended to the namespace.
Some examples from the table below, in N-Triples format:
<http://franz.com/chatState/profile> <http://franz.com/chatState/windowSize> "8"^^<http://www.w3.org/2001/XMLSchema#integer> .
<http://franz.com/chatState/profile> <http://franz.com/chatState/historyMatchFmtString> " {FEED} {HUMAN}: {QUERY}" .
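When a repo hosts more than one bot profile, each profile uses its own subject, and the ?botId input selects among them. An illustrative sketch, assuming a hypothetical bot ID interview-bot that matches the fragment of the profile subject:
<http://franz.com/chatState/profile#interview-bot> <http://franz.com/chatState/bot> "Noam Chomsky" .
A query would then select this profile by passing "interview-bot" as the final ?botId input:
?response llm:chatState ("Hello" "chomsky47-vec" 4 0.8 "chomsky47-history" 4 0.8 "interview-bot")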
When a profile triple exists, the value of the Profile property in the triple takes precedence over the default value.
The following table describes the Profile Configuration properties.
Profile Property | Description | May Include | Default Value | Binding |
---|---|---|---|---|
bot | Name or role of the bot | | Bot | {BOT} |
human | Name or role of the human | | Human | {HUMAN} (Note: {QUERY} is the input from {HUMAN}) |
windowSize | Number of dialog lines in an embedded memory window | | 8 | Determines the size of {FEED}, a window over the last few lines of dialog. |
windowOverlap | Number of lines overlapping between adjacent embedded windows | | 2 | Determines the overlap between adjacent {FEED} windows. |
storyFmtString | Statement about the feed. | {HUMAN}, {BOT}, {FEED}, {QUERY} | Here is a transcript of the current conversation so far: {FEED} | {STORY} when feed exists. |
noStoryString | Statement about an empty feed. | {HUMAN}, {BOT}, {FEED}, {QUERY} | (blank) | {STORY} when feed is empty |
historyMatchFmtString | Match string for History vector lookup | {HUMAN}, {BOT}, {FEED}, {QUERY} | {FEED} {HUMAN}: {QUERY} | Bind matches to {HISTORY_MATCHES} |
historyHintsFmtString | Statement about the history matches. | {HUMAN}, {BOT}, {FEED}, {HISTORY_MATCHES} | Here is a list of citation IDs and hints about how {BOT} should respond, based on his previous conversation with {HUMAN}: {HISTORY_MATCHES}. When appropriate {BOT} should mention recollections of these prior conversations. | {HISTORY_HINTS} when history matches found |
noHistoryHintsString | Statement when no history matches found | {HUMAN}, {BOT}, {FEED}, {STORY}, {QUERY} | The current conversation is unrelated to any prior conversations with {HUMAN}. | {HISTORY_HINTS} when no history matches found |
expertiseMatchFmtString | Match string for expertise vector lookup | {HUMAN}, {BOT}, {FEED}, {STORY}, {QUERY} | {FEED} {HUMAN}: {QUERY} | Bind matches to {EXPERTISE_MATCHES} |
expertiseHintsFmtString | Statement about the expertise matches. | {HUMAN}, {BOT}, {FEED}, {STORY}, {QUERY}, {EXPERTISE_MATCHES} | Here is a list of citation IDs and hints about how {BOT} should respond, based on his knowledge and experience. Base your response on the style of the {BOT}'s responses in these hints: {EXPERTISE_MATCHES} | {EXPERTISE_HINTS} when expertise matches found |
noExpertiseHintsString | Statement when no expertise matches found | {HUMAN}, {BOT}, {FEED}, {STORY}, {QUERY} | No Expertise Found. | {EXPERTISE_HINTS} when no expertise matches found |
stateFmtString | Complete LLM prompt for chatState. 'citation_ids' refers to a value in the JSON returned by the LLM | {HUMAN}, {BOT}, {FEED}, {QUERY}, {STORY}, {EXPERTISE_MATCHES}, {EXPERTISE_HINTS}, {HISTORY_MATCHES}, {HISTORY_HINTS} | {HUMAN} is conducting an interview with {BOT}. {EXPERTISE_HINTS}, {HISTORY_HINTS}. {STORY}. Now {HUMAN} says '{QUERY}'. State an appropriate response for {BOT} in the first person. Use any available information from the transcript in the response. Include only {BOT}'s utterance in the response. Be brief. You only have 20 seconds to reply. Insert the list of citations whose hints informed the response into the 'citation_ids' array. | LLM response binds chatState's ?response |
noResponseString | When the chatState query returns no rows, create a row with this value bound to ?response. | {HUMAN}, {BOT}, {FEED}, {QUERY}, {STORY}, {HISTORY_MATCHES}, {HISTORY_HINTS}, {EXPERTISE_MATCHES}, {EXPERTISE_HINTS} | I have no answer for that based on my expertise or experience. | Binds chatState's ?response |
Note 1: The value of {FEED} is a string of concatenated exchanges, where each exchange is a string of the form {HUMAN}: {QUERY} {BOT}: {RESPONSE}
where {QUERY} is the input from {HUMAN} and {RESPONSE} is the reply from {BOT}.
Note 2: The value of {EXPERTISE_MATCHES} and {HISTORY_MATCHES} is a string of concatenated matches, where each match has the form citation_id: {CITATION_ID} Hint: {TEXT}
where {CITATION_ID} is the URI of the match from the vector database and {TEXT} is the text of the match.
Note 3: In Note 1 and Note 2, "concatenated" means the strings are appended together with a newline inserted between them. These values must all be strings, rather than data structures, because they become part of the LLM prompt.
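For illustration, a {HISTORY_MATCHES} value built this way might look like the following sketch (the citation URIs and hint text are hypothetical):
citation_id: <http://example.org/chomsky47-history#exchange-1> Hint: Interviewer: My name is Lex. Noam Chomsky: Pleasure to meet you, Lex.
citation_id: <http://example.org/chomsky47-history#exchange-4> Hint: Interviewer: What about humpback whales? Noam Chomsky: Humpback whales, fascinating creatures, have complex communication methods.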
Note 4: llm:chatState may utilize keyword syntax.
chatState Example
Follow these directions to try chatState with a sample configuration and vector databases.
Follow the instructions in Chomsky RAG Example: Download the chomsky47.nq file and create the repos chomsky47 and chomsky47-vec. chomsky47-vec is your Expertise repo. Note that the AllegroGraph.cloud server has the chomsky and chomsky-vec repos already loaded and set up (the names do not include the 47 but are otherwise identical), allowing you to run the example here with much less initial work.
Create a vector database called chomsky47-history, your History repo. It will be empty when created. You can use agtool repo create --vector-store ... or the New WebView interface.
You can use all the default values from the Profile Configuration Properties above, with two exceptions:
Load these two profile triples into chomsky47:
<http://franz.com/chatState/profile> <http://franz.com/chatState/bot> "Noam Chomsky" .
<http://franz.com/chatState/profile> <http://franz.com/chatState/human> "Interviewer" .
These values, specific to the Chomsky bot, will override the default values for bot and human.
Now, in the repo chomsky47, execute the following SPARQL query:
PREFIX llm: <http://franz.com/ns/allegrograph/8.0.0/llm/>
# When using the chomsky/chomsky-vec store in AllegroGraph.cloud
# uncomment the following line with a valid openaiApiKey:
# PREFIX franzOption_openaiApiKey: <franz:sk-XXvalid key hereXX>
SELECT ?response {
?response llm:chatState ("Hello" "chomsky47-vec" 4 0.75 "chomsky47-history" 4 0.0)
# In AllegroGraph.cloud the vector store is named chomsky-vec. Modify
# the above line.
}
The Chomsky bot's response to "Hello" will be bound to ?response.
Now repeat the same query, replacing "Hello" with a series of different queries:
"My name is Lex. I was born in 1983 in the Soviet Union. I graduated with a PhD from Drexel University"
"Define Universal Grammar in 10 words or less."
"Do you believe animals can understand language?"
"What about my cat. I say 'Come here, Kitty' and she responds."
"What about humpback whales?"
Next, ask
"What do you know about me?"
The Chomsky bot should respond with something like
"I recall you mentioned, Lex, that you were born in 1983 in the Soviet Union and later pursued a PhD at Drexel University."
Now, if we query the repo chomsky47-history with
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?human ?bot
{
?s rdf:type <http://franz.com/chatState/Exchange>.
?s <http://franz.com/chatState/user> ?human.
?s <http://franz.com/chatState/bot> ?bot.
?s <http://franz.com/chatState/exchange-index> ?index
} ORDER BY ?index
we can retrieve a full transcript of the interview:
Sample Transcript
human | bot |
---|---|
Hello | Hello. It's indeed remarkable that it's been thirty years since our first interview. Over the years, I've tried to inject critical thinking into these conversations and challenge assumptions. What's on your mind today? |
My name is Lex. I was born in 1983 in the Soviet Union. I graduated with a PhD from Drexel University. | Pleasure to meet you, Lex. You've had quite a journey from the Soviet Union to earning a PhD in Drexel. How can I assist you today? |
Define Universal Grammar in 10 words or less. | Universal Grammar: genetic basis allowing infants to acquire language. |
Do you believe animals can understand language? | Animals show remarkable communication abilities, but human language, with its complexity and generality, is uniquely human. |
What about my cat. I say 'Come here, Kitty' and she responds. | Your cat responds to cues, but that's not the same as understanding language in its full complexity. It's more about associative learning than grammar. |
What about humpback whales? | Humpback whales, fascinating creatures, have complex communication methods. Yet, it doesn't equate to the abstract human language capacity. |
What do you know about me? | I recall you mentioned, Lex, that you were born in 1983 in the Soviet Union and later pursued a PhD at Drexel University. |
Because of the random nature of LLMs, your transcript may not be exactly the same, but generally the conversation should follow the same flow.
The same questions posed to the chomsky bot described in the Chomsky RAG Example, which does not maintain a history, yield similar answers up to the last question; the answer to that final question, however, is completely different, with much less useful content.