ToC

DocOverview

CGDoc

RelNotes

FAQ

Index

PermutedIndex

Allegro CL version 11.0

gpt operators

ask-chat

Function, gpt package

Arguments: prompt-or-messages &key frequency-penalty function-call functions logit-bias max-tokens model n output-format presence-penalty stop temperature timeout top-p user verbose An interface to the OpenAI Chat Completions API.

See the OpenAI API Reference document for descriptions of the arguments to this function, in places like https://platform.openai.com/docs/api-reference/chat/create. A general introduction can be found at OpenAI documentation.

frequency-penalty - Avoid using the same tokens. Defaults to Defaults to the value of *openai-default-frequency-penalty*.
function-call - JSON object (in st-json form) to force OpenAI to respond with an instantiated function. Defaults to the value of *openai-default-function-call*.
functions - JSON object (in st-json form) describing function signatures utilized by the OpenAI function-calling API. Defaults to the value of *openai-default-functions*.
logit-bias - Defaults to the value of *openai-default-logit-bias*.
max-tokens - Maximum number of tokens in prompt + response. Defaults to the value of *openai-default-max-tokens*.
model - OpenAI GPT model to use. Defaults to the value of *openai-default-ask-chat-model*.
n - Number of responses. Defaults to the value of *openai-default-n*.
output-format - When equal to :text, return the text of the first response (n has no effect). Otherwise, return the list of n responses. Defaults to the value of *openai-default-output-format*.
presence-penalty - Encourage using different tokens. Defaults to the value of *openai-default-presence-penalty*.
prompt-or-messages - the input text or association list.
stop - Completion stop sequence. Defaults to the value of *openai-default-stop*.
temperature - Control randomness of generated text. Defaults to the value of *openai-default-temperature*.
timeout - API call timeout. Function throws an exception when API times out. Defaults to the value of *openai-default-timeout*.
top-p - Defaults to the value of *openai-default-top-p*.
user - The API user. Defaults to the value of *openai-default-user*.
verbose - when true, display some debugging information. Defaults to nil.

Use this function to interact with GPT-3.5 and 4 models. Model should be one of: "gpt3-3.5-turbo", "gpt3-3.5-turbo-0301" or "gpt-4".

prompt-or-messages can be either a simple string or a transcript in the form of an alist ((role . content) ...) where role is one of "user", "system", "assistant" or "function".

Simple chatbot functionality

Examples:

gpt> (ask-chat "Hello!")
"Hello! How can I assist you today?"

Complex chatbot functionality

The API for GPT-3.5 and -4 API models works a little differently than it did for earlier models. This function still accepts a simple string input, but it also supports a prompt in the form of a transcript. For example, you can send a transcript alternating between "system" and "user" roles, where the last line is is a "user" input.

gpt> (ask-chat
  '(("user" . "Maine")
    ("assistant" . "Augusta")
    ("user" . "California")
    ("assistant" . "Sacramento")
    ("user" . "Pennsylvania")))
"Harrisburg"

gpt> (ask-chat
  '(("user" . "Fill in the blank with one possible appropriate verb phrase or preposition: Gravity ________ Justice.")
    ("assistant" . "is unrelated to")
    ("user" . "Fill in the blank with one possible appropriate verb phrase or preposition: A tall woman ________ a short man.")
    ("assistant" . "standing beside")
    ("user" . "Fill in the blank with one possible appropriate verb phrase or preposition: Solar energy ________ clean energy.")
    ("assistant" . "is")
    ("user" . "Fill in the blank with one possible appropriate verb phrase or preposition: Empire State Building ________ Eiffel Tower.")
    ("assistant" . "is taller than")
    ("user" . "Fill in the blank with one possible appropriate verb phrase or preposition: An excited Jane ________ an apologetic Tom.")))
"confronted"

Error Handling

ask-chat makes every effort to pass through API error messages into the results. This helps upstream applications easily understand configuration errors.

For example:

gpt> (ask-chat "Hello")
"Incorrect API key provided: missing. You can find your API key at
https://platform.openai.com/account/api-keys."

gpt> (ask-chat "Hello" :n 0 :output-format :list)
("0 is less than the minimum of 1 - 'n'")

See llm-api.html for general information on support for large language models in Allegro CL.

ask-embedding

Function, gpt package

Arguments: text &key log-progress model timeout verbose

text - The text string to be embedded.
log-progress - When true print the text item + time for API call.
model - Specifies the OpenAI model used to calcluate the embedding (defaults to "text-embedding-ada-002")
timeout - API call timeout. Function returns a JSON object containing an error message when API times out. Defaults to the value of *openai-default-timeout*.
verbose - Passed down to the call to call-openai.

Called by embed.

Calls call-openai to retrieve a JSON object that includes the array representing the embedding of text.

Normally returns an object of type st-json::jso. (Note: st-json::jso is an internal Lisp implementation of a JSON object).

See llm-api.html for general information on support for large language models in Allegro CL.

ask-for-list

Function, gpt package

Arguments: prompt-or-messages &key frequency-penalty logit-bias max-tokens model presence-penalty temperature timeout top-p user verbose

Return a list of strings from OpenAI model.

frequency-penalty - Avoid using the same tokens. Defaults to Defaults to the value of *openai-default-frequency-penalty*.
logit-bias - Defaults to the value of *openai-default-logit-bias*.
max-tokens - Maximum number of tokens in prompt + response. Defaults to the value of *openai-default-max-tokens*.
model - OpenAI GPT model to use. Defaults to the value of *openai-default-ask-chat-model*.
presence-penalty - Encourage using different tokens. Defaults to the value of *openai-default-presence-penalty*.
prompt-or-messages - the input text or association list.
temperature - Control randomness of generated text. Defaults to the value of *openai-default-temperature*.
timeout - API call timeout. Function throws an exception when API times out. Defaults to the value of *openai-default-timeout*.
top-p - Defaults to the value of *openai-default-top-p*.
user - The API user. Defaults to the value of *openai-default-user*.
verbose - when true, display some debugging information. Defaults to nil.

Use this function to interact with GPT-3.5 and 4 models. Model should be one of: "gpt3-3.5-turbo", "gpt3-3.5-turbo-0301" or "gpt-4".

prompt-or-messages can be either a simple string or a transcript in the form of an alist ((role . content) ...) where role is one of "user", "system", "assistant" or "function".

This function calls the OpenAI API with a request body that includes these attribute-value pairs in the JSON:

functions:
    [{"name":"array_of_strings",
      "description":"function to list an array of specified items","parameters":
      {"type":"object",
       "properties":
       {"array":
        {"description":"the list of items",
         "type":"array",
         "items":{"type":"string"}}}}}]

function_call: {"name":"array_of_strings"}

This function signature forces the API to return a JSON list of strings, which the Lisp function then parses into a Lisp list.

Examples

gpt> (ask-for-list "List 2 colors.")
("Blue" "Red")
gpt> (ask-for-list "Suggest 5 ways to cook shrimp.")
("Grilling" "SautÃ©ing" "Steaming" "Boiling" "Baking")
gpt> (ask-for-list "Name the New England States in alphabetical order.")
("Connecticut" "Maine" "Massachusetts" "New Hampshire" "Rhode Island"
 "Vermont")

ask-for-map

Function, gpt package

Arguments: prompt-or-messages &key frequency-penalty logit-bias max-tokens model presence-penalty temperature timeout top-p user verbose

Generate a list of key-value pairs.

frequency-penalty - Avoid using the same tokens. Defaults to Defaults to the value of *openai-default-frequency-penalty*.
logit-bias - Defaults to the value of *openai-default-logit-bias*.
max-tokens - Maximum number of tokens in prompt + response. Defaults to the value of *openai-default-max-tokens*.
model - OpenAI GPT model to use. Defaults to the value of *openai-default-ask-chat-model*.
presence-penalty - Encourage using different tokens. Defaults to the value of *openai-default-presence-penalty*.
prompt-or-messages - the input text or association list.
temperature - Control randomness of generated text. Defaults to the value of *openai-default-temperature*.
timeout - API call timeout. Function throws an exception when API times out. Defaults to the value of *openai-default-timeout*.
top-p - Defaults to the value of *openai-default-top-p*.
user - The API user. Defaults to the value of *openai-default-user*.
verbose - when true, display some debugging information. Defaults to nil.

This function calls the OpenAI API with a request body that includes these attribute-value pairs in the JSON:



functions: [
    {'name':'array_of_key_val',
    'description':'function to list an array of key-value pairs.',
    'parameters':
      {'type':'object',
       'properties':
        {'array':
          {'description':'the list of key-value pairs',
           'type':
           'array',
           'items':
            {
            'type': 'object',
            'properties': {
              'key': {
                'type': 'string',
                'description': 'Unique identifier of the object.'
             },
              'value': {
                'type': 'string',
                'description': 'Value associated with the object.'
              }
            }
    }}}}}]
function_call: {'name':'array_of_key_val'}

Use this function to interact with GPT-3.5 and 4 models. Model should be one of: "gpt3-3.5-turbo", "gpt3-3.5-turbo-0301" or "gpt-4".

prompt-or-messages can be either a simple string or a transcript in the form of an alist ((role . content) ...) where role is one of "user", "system", "assistant" or "function".

Returns a list of pairs where the first element is the key and the second element of each pair is the value.

gpt> (ask-for-map "Arabic to Roman Numerals")
(("1" "I") ("2" "II") ("3" "III") ("4" "IV") ("5" "V") ("6" "VI")
 ("7" "VII") ("8" "VIII") ("9" "IX") ("10" "X"))
gpt> (ask-for-map "The capitols of the New England states.")
(("Connecticut" "Hartford") ("Maine" "Augusta")
 ("Massachusetts" "Boston") ("New Hampshire" "Concord")
 ("Rhode Island" "Providence") ("Vermont" "Montpelier"))
gpt> (ask-for-map "The leading female character in each of Shakespeare's tragedies.")
(("Romeo and Juliet" "Juliet Capulet") ("Macbeth" "Lady Macbeth")
 ("Hamlet" "Ophelia") ("Othello" "Desdemona") ("King Lear" "Cordelia")
 ("Antony and Cleopatra" "Cleopatra") ("Julius Caesar" "Portia")
 ("Timon of Athens" "Flavia") ("Troilus and Cressida" "Cressida")
 ("Coriolanus" "Volumnia") ("Titus Andronicus" "Lavinia"))

See llm-api.html for general information on support for large language models in Allegro CL.

ask-for-table

Function, gpt package

Arguments: prompt-or-messages &key frequency-penalty logit-bias max-tokens model presence-penalty temperature timeout top-p user verbose

Generate tabular data as a list-of-lists.

frequency-penalty - Avoid using the same tokens. Defaults to Defaults to the value of *openai-default-frequency-penalty*.
logit-bias - Defaults to the value of *openai-default-logit-bias*.
max-tokens - Maximum number of tokens in prompt + response. Defaults to the value of *openai-default-max-tokens*.
model - OpenAI GPT model to use. Defaults to the value of *openai-default-ask-chat-model*.
presence-penalty - Encourage using different tokens. Defaults to the value of *openai-default-presence-penalty*.
prompt-or-messages - the input text or association list.
temperature - Control randomness of generated text. Defaults to the value of *openai-default-temperature*.
timeout - API call timeout. Function throws an exception when API times out. Defaults to the value of *openai-default-timeout*.
top-p - Defaults to the value of *openai-default-top-p*.
user - The API user. Defaults to the value of *openai-default-user*.
verbose - when true, display some debugging information. Defaults to nil.

This function calls the OpenAI API with a request body that includes these attribute-value pairs in the JSON:

functions:
    [
    {'name':'table',
    'description':'function to return tabular data.',
    'parameters':
      {'type':'object',
       'properties':
        {'rows':
          {'description':'the list of table rows',
           'type':
           'array',
           'items':
            {
            'type': 'array',
            'items': {
               'type': 'string',
               'description' : 'the value in each column'}
    }}}}}]
function_call: {'name':'table'}

Use this function to interact with GPT-3.5 and 4 models. Model should be one of: "gpt3-3.5-turbo", "gpt3-3.5-turbo-0301" or "gpt-4".

prompt-or-messages can be either a simple string or a transcript in the form of an alist ((role . content) ...) where role is one of "user", "system", "assistant" or "function".

Returns a list of lists where each list contains a row of tablular data.

gpt> (ask-for-table "A 3x3 matrix")
(("1" "2" "3") ("4" "5" "6") ("7" "8" "9"))
gpt> (ask-for-table "The New England states, their capitals, area, population and year admitted to the Union.  Omit headers.")
(("Connecticut" "Hartford" "5,543 sq mi" "3,605,944" "1788")
 ("Maine" "Augusta" "35,385 sq mi" "1,344,212" "1820")
 ("Massachusetts" "Boston" "10,555 sq mi" "6,892,503" "1788")
 ("New Hampshire" "Concord" "9,350 sq mi" "1,359,711" "1788")
 ("Rhode Island" "Providence" "1,034 sq mi" "1,059,361" "1790")
 ("Vermont" "Montpelier" "9,616 sq mi" "623,989" "1791"))

See llm-api.html for general information on support for large language models in Allegro CL.

ask-my-documents

Function, gpt package

Arguments: prompt-or-messages &key frequency-penalty logit-bias max-tokens min-score model presence-penalty temperature timeout top-n top-p user vector-database-name verbose

frequency-penalty - Avoid using the same tokens. Defaults to Defaults to the value of *openai-default-frequency-penalty*. Defaults to the value of *openai-default-functions*.
logit-bias - Defaults to the value of *openai-default-logit-bias*.
max-tokens - Maximum number of tokens in prompt + response. Defaults to the value of *openai-default-max-tokens*.
min-score - Minimum score for nearest-neighbor matching. Defaults to the value of *openai-default-min-score*.
model - OpenAI GPT model to use. Defaults to the value of *openai-default-ask-chat-model*.
presence-penalty - Encourage using different tokens. Defaults to the value of *openai-default-presence-penalty*.
prompt-or-messages - the input text or association list.
temperature - Control randomness of generated text. Defaults to the value of *openai-default-temperature*.
timeout - API call timeout. Function throws an exception when API times out. Defaults to the value of *openai-default-timeout*.
top-n - Maximum number of nearest neigbors emeddings. Defaults to the value of *openai-default-top-n*.
top-p - Defaults to the value of *openai-default-top-p*.
user - The API user. Defaults to the value of *openai-default-user*.
vector-database-name - Name of vector database resource providing text embeddings.
verbose - when true, display some debugging information. Defaults to nil.

Use this function to interact with GPT-3.5 and 4 models. Model should be one of: "gpt3-3.5-turbo", "gpt3-3.5-turbo-0301" or "gpt-4".

prompt-or-messages can be either a simple string or a transcript in the form of an alist ((role . content) ...) where role is one of "user", "system", "assistant" or "function".

An imlementation of Retrieval-Augmented Generation (RAG).

Returns a list of items where each item lists the response, the matching citation-id, a score repressting match confidence, and the original text of the matching citation.

The explanation of this function only makes sense when you already have a vector database. The vector database contains a set of ids, called citation ids, and a fragment of text associated with that id. In addition, the vector database stores an embeddeing vector, a high-dimensional array of floating point numbers, representing the semantic meaning of the text in the corresponding vector space.

The source of text for the vector database, in real applications, is expected to include large databases of books, articles, documents, contracts, transcripts or other text-based material. But for our expository purposes here, let's start with a very small toy example.

It's actually quite easy to create a toy vector database with the tools included in this package. Excute this form to generate and store a trivial vector database of historical figures:

(let* ((vector-database (make-vector-database :name "historicalFigures" :embedder
'embed))
       (key-val (ask-for-map
                 "List the numbers from 1 to 100 and associate a different
historical figure with each one.")))
  (dolist (item key-val)
    (let* ((key (car item))
           (val (cadr item))
           (embedding (embed val)))
      (push (list key val) (vector-database-property-vectors vector-database))
      (push embedding (vector-database-embedding-vectors vector-database))))
  (write-vector-database vector-database))

Now you can query the database with ask-my-documents:

gpt> (ask-my-documents "A funny name for a cat based on a portmanteau of two famous
scientsists.  Respond with the cat name only."
                  :vector-database-name "historicalFigures"
                  :min-score 0.0 :top-n 10))
(("Meowrie Purrstein" 0.77628237 "32" "Marie Curie")
 ("Meowrie Purrstein" 0.7667899 "8" "Albert Einstein"))

Behind the scenes, the function format-ask-my-documents-prompt created a big prompt based on the value of query:

Here is a list of citation IDs and content related to the query 'A funny
name for a cat based on a portmanteau of two famous scientists.  Respond with the
cat name only.':

citation-id:32 content:'Marie Curie'
citation-id:45 content:'Plato'
citation-id:14 content:'Charles Darwin'
citation-id:17 content:'Christopher Columbus'
citation-id:21 content:'Socrates'
citation-id:63 content:'Louis Pasteur'
citation-id:8 content:'Albert Einstein'
citation-id:9 content:'Galileo Galilei'
citation-id:10 content:'Leonardo da Vinci'
citation-id:59 content:'Benjamin Franklin'

Respond to the query 'A funny name for a cat based on a portmanteau of two famous
scientists.  Respond with the cat name only.' as though you wrote the content.
Be brief.  You only have 20 seconds to reply.
Place your response to the query in the 'response' field.
Insert the list of citations whose content informed the repsonse into the
'citation_ids' array.

The nearest-neighbor function nn provides the top 10 citation ids and associated content related to the original query. The last sentence in the prompt specifically asks for only the citations that contributed to the response. From these ten, the ask-my-documents prompt selects two.

Not only does the function provide a response, "Meowrie Purrstein", it also cites the sources of that response selected from the matching data: "Marie Curie" and "Albert Einstein".

Error Handling

ask-my-documents makes every effort to pass through API error messages into the results. This helps upstream applications easily understand configuration errors.

For example,

gpt>  (ask-my-documents "A funny name for a cat based on a portmanteau of two famous
scientsists.  Respond with the cat name only." :vector-database-name "emptyDatabase")
(("Vector store emptyDatabase is empty." 0.0 "error" "error"))

See llm-api.html for general information on support for large language models in Allegro CL.

ask-serp

Function, gpt package

Arguments query-phrase &key verbose top-n ordered-include-fields ordered-error-fields exclude-fields path-regex

top-n - Maximum number of results to return. Defaults to the value of *serp-default-top-n*.
exclude-fields - JSON key names whose values should be ignored. Defaults to the value of *serp-exclude-fields*.
ordered-error-fields - JSON key names whose value holds an error message. Defaults to '("error").
ordered-include-fields - JSON key names whose value should be included in the results, ordered by key preference. Defaults to the value of *serp-ordered-include-fields*.
path-regex - A regular expression applied to the JSON paths to filter the results. Defaults to ".".

SERP API provides a REST service to scrape search engine results pages (SERPs) in real-time and return structured data. It supports various search engines like Google, Bing, Yahoo, Baidu, and Yandex. The data is returned in JSON format, allowing access to natural language text information like answers, definitions, prices, organic results, descriptions, snippets and titles.

ask-serp provides a Lisp API to return natural language results, the source citations (where available), and the JSON path of the corresponding result. Behind the scenes, the function traverses the JSON object returned by the API call, selecting values to include in the results based on ordered-error-fields, ordered-include-fields, and exclude-fields.

In this example the first result comes from an "answer" key, and the next two come from "snippet", becauase in the ordered-include-fields the key "answer" comes before the key "snippet":

gpt> (ask-serp "Who discovered relativity?" :top-n 3)
(("Einstein"
  "https://www.smithsonianmag.com/innovation/theory-of-relativity-then-and-now-..."
  "[answer_box][answer]")
 ("In 1905 Einstein discovered the special theory of relativity, establishing the famous dictum that nothingno object or signalcan travel faster than the speed of light."

"https://www.smithsonianmag.com/innovation/theory-of-relativity-then-and-now-..."
  "[answer_box][snippet]")
 ("In 1915, Einstein visited Hilbert in Gottingen, and Hilbert convinced him that the goal of a fully general relativistic theory was achievable, ..."
  "https://press.princeton.edu/ideas/was-einstein-the-first-to-discover-general-relativity"
  "[organic_results][0][snippet]"))

Error Handling

ask-serp makes every effort to pass through API error messages into the results. This helps upstream applications easily understand configuration errors.

gpt> (set-serp-api-key nil)
nil
gpt> (ask-serp "Hello")
(("Invalid API key. Your API key should be here: https://serpapi.com/manage-api-key"
  "https://serpapi.com" "[error]"))

call-openai

Function, gpt package

Arguments: cmd &key method content timeout content-type extra-headers query retries delay verbose

Note that all of the https://api.openai.com/v1/ API commands are documented in the OpenAI API Reference.

cmd - openAI API function name following the URL prefix. The OpenAI API URL ishttps://api.openai.com/v1/<cmd>. Example values of cmd include "models" (have it list models), "files" (have it list files), "completions" or "chat/completions" (for chat).
method - HTTP method. Default value is :get.
timeout - HTTP request timeout. Function throws an exception when API times out. Defaults to the value of *openai-default-timeout*.
content-type - HTTP content-type header. Default value is "application/json".
extra-headers - Association list of any additional HTTP headers and values. Default is nil.
query - This is a query alist of the form suitable for query-to-form-urlencoded.
retries - number of times to retry HTTP request when the server returns a 429 status. Defaults to the value of *openai-default-retries*.
delay - initial exponential backoff delay in seconds for first retry, doubled on each retry. Defaults to the value of *openai-default-initial-delay*.
verbose - when true, display some debugging information.

call-openai is a generic interface to all openai API functions call-openai attempts to return an object of type st-json::jso.

Because call-openai calls net.aserve:do-http-request it may throw an exception (for example if the API connection times out) and it is up to the caller to wrap a suitable handler around the call (see the example at the end of the description of ask-chat).

Define a function for a new API

A developer can use call-openai to access new OpenAI API v1 endpoints as they become available. For instance, here is a simplified version of our function to request text embeddings:

(defun ask-embedding (text &key (model "text-embedding-ada-002") (timeout 120))
  "May throw an exception if API call fails.  Suggest using long timeout."
  (let* ((jso (jso)))
    (gpt::pushjso "input" text jso)
    (gpt::pushjso "model" model jso)
    (call-openai "embeddings" :method :post :timeout timeout
                 :content (json-string jso))))

The unexported function GPT::PUSHJSO pushes a key-value pair onto a JSON object. If the key already exists, it adds value to the list of values associated with key (or creates a list of 2 items if the key is associated with only a single atomic value). If the key does not exist in the object, it adds the key value pair to the object. So:

(progn
(setf *json* (jso))
(pushjso "key" "A" *json*)
(format t "~a~%" (json-string *json*))
(pushjso "key" "B" *json*)
(format t "~a~%" (json-string *json*))
)

prints

{"key":"A"}
{"key":["B","A"]}

See llm-api.html for general information on support for large language models in Allegro CL.

cancel-fine-tune

Function, gpt package

Arguments: ftid

ftid is the ID of the fine tune process. This function cancels a running or pending fine tune.

Returns an object of type st-json::jso subject to the exception condition described for call-openai.

See llm-api.html for general information on support for large language models in Allegro CL.

chat

Function, gpt package

Arguments: prompt &key best-of echo frequency-penalty logit-bias max-tokens model presence-penalty stop suffix temperature top-p user verbose

An interface to the OpenAI Legacy Completions API.

best-of - Defaults to the value of *openai-default-best-of*.
echo - Defaults to the value of *openai-default-echo*.
frequency-penalty - Avoid using the same tokens. Defaults to the value of *openai-default-frequency-penalty*.
logit-bias - Defaults to the value of *openai-default-logit-bias*.
logprobs - When verbose is true and logprobs is an integer, prints debugging information from the OpenAI logprobs attribute. Defaults to the value of *openai-default-logprobs*.
max-tokens - Maximum number of tokens in prompt + response. Defaults to the value of *openai-default-max-tokens*.
model - OpenAI GPT model to use. Defaults to the value of *openai-default-chat-model*.
n - Number of responses. Defaults to the value of *openai-default-n*.
output-format - when equal to :text, return the text of the first response (n has no effect). Otherwise, return the list of n responses. Defaults to the value of *openai-default-output-format*.
presence-penalty - Encourage using different tokens. Defaults to the value of *openai-default-presence-penalty*.
prompt - the input text.
stop - Completion stop sequence. Defaults to the value of *openai-default-stop*.
suffix - Defaults to the value of *openai-default-suffix*.
temperature - Control randomness of generated text. Defaults to the value of *openai-default-temperature*.
timeout - API call timeout. Function throws an exception when API times out. Defaults to the value of *openai-default-timeout*.
top-p - Defaults to the value of *openai-default-top-p*.
user - The API user. Defaults to the value of *openai-default-user*.
verbose - when true, display some debugging information. Defaults to nil.

Use this function to interact with OpenAI models ada, babbage, and davinci.

Simple chatbot functionality

Example:

gpt> (chat "Hello, robot.")
"Hello, human! How can I help you?"

The function attempts to return either a text string or a list of strings, depending on the value of output-format.

See llm-api.html for general information on support for large language models in Allegro CL.

delete-fine-tuned-model

Function, gpt package

Arguments: model

Delete a fine-tuned model from your OpenAI account. model should be the name of the fine-tuned model.

Returns an object of type st-json::jso subject to the exception condition described for call-openai.

See llm-api.html for general information on support for large language models in Allegro CL.

embed

Function, gpt package

Arguments: text &key log-progress model timeout verbose

text - The text string to be embedded.
log-progress - When true, print the text item + time for API call. Defaults to t.
model - Specifies the OpenAI model used to calcluate the embedding (defaults to "text-embedding-ada-002")
timeout - API call timeout. Function throws an exception when API times out. Defaults to the value of *openai-default-timeout*.
verbose - Passed down to the call to call-openai.

Return an embedding vector for the text text. An embedding vector is a normalized unit vector of N dimensions represented as an array of single precision floats where N is determined by the model.

model should be the name of a model that supports embedding.

The function embed calls ask-embedding and converts the embedding vector in the JSON object into a Lisp array of single floats.

In case the call to ask-embedding fails, the embed function returns a vector of size N filled with all 0s.

See the description the nearest-neighbor function nn for an example using embed.

See llm-api.html for general information on support for large language models in Allegro CL.

format-ask-my-documents-prompt

Function, gpt package

Arguments: query id-content

This function works behind the scenes with ask-my-documents. The function returns a big prompt contaning a smaller prompt, query plus a collection of background info given by id-content.

The query argument of ask-my-documents supplies the query argument here. The source of the id-content data the output of a call that ask-my-documents makes to nn.

Because the user may wish to customize the language of the prompt, we present the full implementation here:

(defun format-ask-my-documents-prompt (query id-content)
  (let* ((formatted-content
           (mapcar (lambda (u)
                     (format nil "citation-id:~a content:'~a'"
                             (car u) (cadr u))) id-content))
         (prompt (format nil "Here is a list of citation IDs and content related to the
query '~a':~%~{~a~%~}.
Respond to the query '~a' as though you wrote the content.
Be brief.  You only have 20 seconds to reply.
Place your response to the query in the 'response' field.
Insert the list of citations whose content informed the repsonse into the 'citation_ids'
array." query formatted-content query)))
    (setf prompt (remove-if (lambda (ch) (> (char-code ch) 127)) prompt))
    prompt))

The function format-ask-my-documents-prompt may be redefined in any fashion, so long as it accepts the two arguments query and id-content and returns a text string.

See llm-api.html for general information on support for large language models in Allegro CL.

delete-openai-file

Function, gpt package

Arguments: file

Delete the file in your OpenAI directory named file.

Returns an object of type st-json::jso subject to the exception condition described for call-openai.

See llm-api.html for general information on support for large language models in Allegro CL.

fine-tune

Function, gpt package

Arguments: file

Start a fine-tuning process. file should be the name of the uploaded fine-tuning file.

Returns an object of type st-json::jso subject to the exception condition described for call-openai.

See llm-api.html for general information on support for large language models in Allegro CL.

fine-tune-report

Function, gpt package

Arguments: &key ftid

Prints a report after a fine-tuning process.

ftid should be the ID of the fine tune process. It defauls to the value returned by (fine-tune-status).

Returns an object of type st-json::jso subject to the exception condition described for call-openai.

See llm-api.html for general information on support for large language models in Allegro CL.

fine-tune-status

Function, gpt package

Arguments: &key full

Get the status of fine-tuning processes. full can be true or nil. When full true, a more detailed status is generated. full defaults to nil.

Returns multiple (two) values: the fine-tuining process ID and the status.

See llm-api.html for general information on support for large language models in Allegro CL.

list-openai-files

Function, gpt package

Arguments: &key sort-key stream

sort-key specifies the sort by column key. Possible values are "id", "filename", "created-at" and "bytes". Default value is "created-at".

stream specifies the output stream for file listing. Default is t (standard output).

Prints a list of files in your OpenAI directory by calling (call-openai "files").

Returns nil.

See llm-api.html for general information on support for large language models in Allegro CL.

list-openai-models

Function, gpt package

Arguments:

Prints a human readable list of file names and returns nil.

The source of this function illustrates processing a JSON object into a list of file names:

(defun list-openai-models ()
 "Simple Lisp example of calling openai API to list available models."
 (let* ((jso (call-openai "models")))
  (mapcar
   'print
   (sort
   (remove-if
    'null
    (mapcar
    (lambda (u)
     (let ((id (cdr (assoc "id" (st-json::jso-alist u) :test 'string=))))
      id
      ))
    (cdr (assoc "data" (st-json::jso-alist jso) :test 'string=))))
   'string<))
  nil))

See llm-api.html for general information on support for large language models in Allegro CL.

make-vector-database

Function, gpt package

Arguments: &key name embedder properties property-vectors embedding-vectors

name - The name of the vector database.
embedder - An embedding function.
properties - A optional property list of metdata.
property-vectors - A list of the properties of each embedding.
embedding-vectors - A list of embedding vectors.

This is the constructor function for the vector-database class.

See the description the nearest-neighbor function nn for an example using a make-vector-database.

See llm-api.html for general information on support for large language models in Allegro CL.

nn

Function, gpt package

Arguments: vector-database text &key min-score top-n

vector-database - The vector database to be searched for nearest neighbor matches.
text - The text string to be matched.
min-score - Minimum cosine similarity score to include a match in the result. Defaults to 0.8.
top-n - The maximum number of matches to return, where each match scores higher than min-score. Defaults to 10.

Select the nearest-neighbor matches from a vector database. Returns a list of matching items up to length top-n.

Embeddings are an interesting way to represent fragments of natural language text as high dimensional vectors. At first it may seem counterintuitive that a large vector can capture semantic information, but recall the very simple example of word2vec: the vector difference between the embeddings of "King" and "Queen" equals the difference between "Man" and "Woman". Word2vec was an early example of vector based embeddings. Modern LLMs have expanded the length of the text fragments, up to paragraphs and entre documents, and increased the size of the emeddings, up to thousands of dimensions.

The function sample-vector-database (defined below) demonstrates the use of embeddings to match items from a simple set, as well as some interesting ways to combine embeddings with other LLM functions. Let's say we want to generate embeddings for a set of mammals. We first ask our chat function to pluralize the type name "Mammals". The ask-chat function returns "Mammals", so we can then ask the robot, in a grammatically correct way, to "List 100 members of the set of all Mammals."

(defun sample-vector-database ()
  (declare (special *default-vector-database-name*))
  (let ((name *default-vector-database-name*)
        (dim *ada-002-dimensions*))
    (let* ((vector-database (make-vector-database :name *default-vector-database-name*
                                                  :embedder #'embed
                                                  :properties `(("dim" . ,dim)
                                                                ("name" . ,name))))
           (type-name "Mammal")
           (type-plural (ask-chat (format nil "pluralize ~a" type-name)))
           (elements (ask-for-list (format nil "List 100 members of the set of all ~a" type-plural))))
      (setf elements (remove-duplicates elements :test 'string-equal))
      (format t "~a:~%" type-plural)
      (dolist (elt elements)
        (let* ((vec (embed elt))
               (id (gentemp "id-"))
               (properties (list id elt type-name)))
          (format t "~a~%" elt)
          (push properties (vector-database-property-vectors vector-database))
          (push vec (vector-database-embedding-vectors vector-database))))
      (write-vector-database vector-database)
      vector-database)))

For each element of the list returned, we ask for an embedding of that element. The embedding is vector of single precision floats, whose dimensionality depends on the LLM model. For OpenAI embeddings, the embed function runs with a default values for the model name ("text-embedding-ada-002") We also associate a list of property values with each embedding. Only 2 properties are required: the ID of the matching element, and the source text of the embedded element. The remaining items of the list may be populated in an arbitrary way. In the test function the items chosen are: a predicate name, the type name and pluralized type name. In our upstream RDF database application, the properties are always: a predicate URI, and a class type.

The loop finishes by pushing each embedding and property vector into the associated field of the vector database structure. The test function concludes by writing the vector database, utilizing fast vector writing.

gpt> (setf *sample-database* (sample-vector-database))
Human
Chimpanzee
Gorilla
Orangutan
Gibbon
Old World Monkey
New World Monkey
Tarsier
Lemur
Loris
Aye-aye
Sloth
Anteater
...
#<vector-database sample-vector-database ((dim . 1536)
                                          (name
                                           . sample-vector-database)) 100 100>

After creating the database, we are now in a position to query it using the nearest-neighbor function nn, which by default selects the top 10 matches with cosine similarity above a minimum score of 0.8:

gpt> (nn *sample-database* "Lives in water")                                                    
((id-188 0.8317959 "Sea Lion" "Mammal")                                                         
 (id-186 0.82970023 "Porpoise" "Mammal")                                                        
 (id-190 0.82740366 "Manatee" "Mammal")                                                         
 (id-185 0.82036865 "Dolphin" "Mammal")                                                         
 (id-184 0.81799465 "Whale" "Mammal")                                                           
 (id-161 0.8152693 "Platypus" "Mammal")                                                         
 (id-193 0.81464326 "Harbor Seal" "Mammal")                                                     
 (id-189 0.8144374 "Walrus" "Mammal")                                                           
 (id-197 0.8116767 "Otter" "Mammal")                                                            
 (id-199 0.80988634 "Hippopotamus" "Mammal"))

In real applications, the set of items might be very large, the length of the embedded text very long, and the query string for nearest neighbor very lengthy and descriptive.

Error Handling

nn makes every effort to pass through API error messages into the results. This helps upstream applications easily understand configuration errors.

For example,

gpt> (nn (make-vector-database :embedder 'embed) "Lives in water")
(("error" 0.8 "Vector store sample-vector-database is empty."))
gpt> (set-openai-api-key nil)
"nil"
gpt> (nn *sample-database* "Lives in water")
(("error" 0.8
  "Incorrect API key provided: nil. You can find your API key at https://platform.openai.com/account/api-keys."))

See llm-api.html for general information on support for large language models in Allegro CL.

Function, gpt package

read-vector-database

Arguments: name &key dir

The value of dir defaults to *default-vector-database-dir*.

Read a vector database named name from the directory dir and return it as a vector-database.

If the database does not exist, return one by calling the constructor make-vector-database.

See llm-api.html for general information on support for large language models in Allegro CL.

set-openai-api-key

Function, gpt package

Arguments: key

Returns key.

You need an OpenAI API key to use most of the operators in this package. See https://platform.openai.com/overview for instructions on obtaining a key (start with the Quickstart Tutorial and follow the links there to get a key).

This function takes a valid key as its argument and returns it after setting it as the value of *openai-api-key*. The various functions in this package get the value from that variable. Note the validity of the supplied key is not checked by this function. If the key is invalid, a call to another function will fail.

THIS CALL USES AN INVALID KEY:

cl-user(21): (set-openai-api-key "sk-U01ABc2defGHIJKlmnOpQ3RstvVWxyZABcD4eFG5jiJKlmno")
"sk-U01ABc2defGHIJKlmnOpQ3RstvVWxyZABcD4eFG5jiJKlmno"
cl-user(22): *openai-api-key*
"sk-U01ABc2defGHIJKlmnOpQ3RstvVWxyZABcD4eFG5jiJKlmno"
cl-user(23): (ask-chat "Hello")
"Incorrect API key provided:
sk-U01AB.......................................lmno. You can find your
API key at https://platform.openai.com/account/api-keys."
cl-user(24):

See llm-api.html for general information on support for large language models in Allegro CL.

upload-openai-file

Function, gpt package

Arguments: filename

Upload filename to your opeanai user directory.

Returns a string of text relayed from executing an underlying shell command.

See llm-api.html for general information on support for large language models in Allegro CL.

Function, gpt package

vector-database-embedding-vectors

Arguments: vector-database

Accessor to get or set the list of embedding vectors in a vector database struct.

See the description the nearest-neighbor function nn for an example using vector-database-embedding-vectors.

See llm-api.html for general information on support for large language models in Allegro CL.

vector-database-name

Function, gpt package

Arguments: vector-database

Accessor to get or set the name of the vector database. The name must be a string compatible with the naming conventions of the file system.

See llm-api.html for general information on support for large language models in Allegro CL.

vector-database-property-vectors

Function, gpt package

Arguments: vector-database

Accessor to get or set the list of property vectors in a vector database struct.

See the description the nearest-neighbor function nn for an example using vector-database-property-vectors.

See llm-api.html for general information on support for large language models in Allegro CL.

write-vector-database

Function, gpt package

Arguments: vector-database &key dir

Write the contents of the database bound to vector-database to a pair of files with name (vector-database-name *vector-database*) and file extensions .vec and .dat respectively in the directory dir, in a format suitable to be read back by read-vector-database.

The value of dir defaults to *default-vector-database-dir*.

Returns the pathname of the vector file written.

See the description the nearest-neighbor function nn for an example using write-vector-database.

See llm-api.html for general information on support for large language models in Allegro CL.

ToC

DocOverview

CGDoc

RelNotes

FAQ

Index

PermutedIndex

Allegro CL version 11.0