http://franz.com/ns/allegrograph/8.0.0/llm/nearestNeighbor

Find the top N best matching embeddings above a minimum match score.

Namespace:

PREFIX llm: <http://franz.com/ns/allegrograph/8.0.0/llm/> 

General forms:

(?uri ?score ?originalText) llm:nearestNeighbor (?text ?vectorDatabase ?topN ?minScore ?selector)
(?uri ?score) llm:nearestNeighbor (?text ?vectorDatabase ?topN ?minScore ?selector)
?uri llm:nearestNeighbor (?text ?vectorDatabase ?topN ?minScore ?selector)

For example, the pattern

?uri llm:nearestNeighbor ("Famous Scientist" "historicalFigures" 10 0.8) 

will bind ?uri to each of up to 10 subject nodes in the vector database historicalFigures whose match score, computed between the embedding vector of "Famous Scientist" and the embedding of the node's original text, is at least 0.8.

When the subject is written as a list, the predicate binds the optional second element ?score to the match score and the optional third element ?originalText to the original text.
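
As an illustration of the three-element form, here is a minimal sketch of a complete query built from the example pattern above. The database name historicalFigures and the search text come from that example; the ORDER BY clause is an addition for readability and is not required by the predicate.

```sparql
PREFIX llm: <http://franz.com/ns/allegrograph/8.0.0/llm/>

# Return up to 10 matches scoring at least 0.8, with score and source text.
SELECT ?uri ?score ?originalText
WHERE {
  (?uri ?score ?originalText) llm:nearestNeighbor ("Famous Scientist" "historicalFigures" 10 0.8) .
}
# Sorting is optional; it simply lists the best matches first.
ORDER BY DESC(?score)
```

Dropping ?originalText, or both ?originalText and ?score, from the subject list gives the two shorter general forms.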

The ?selector argument is optional. If given, it should be the body of a SPARQL query whose results are bindings for ?id, where each ?id is a resource in the vector database with an rdf:type of vdb:Object. The default value for ?selector is

"{?id rdf:type vdb:Object}" 

In the SPARQL expression, the namespaces vdb and vdbprop are defined:

prefix vdb: <http://franz.com/vdb/gen/>
prefix vdbprop: <http://franz.com/vdb/prop/>
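
A custom selector might look like the following sketch, which narrows the candidate objects before matching. The property vdbprop:era and its value 'modern' are hypothetical, chosen only for illustration; substitute a property that actually exists on the objects in your vector database.

```sparql
PREFIX llm: <http://franz.com/ns/allegrograph/8.0.0/llm/>

# The fifth argument is the selector: a query body that binds ?id.
# vdbprop:era is an assumed property, not one documented for AllegroGraph.
SELECT ?uri ?score
WHERE {
  (?uri ?score) llm:nearestNeighbor
      ("Famous Scientist" "historicalFigures" 10 0.8
       "{?id rdf:type vdb:Object . ?id vdbprop:era 'modern'}") .
}
```

The selector keeps the rdf:type vdb:Object pattern from the default so that ?id is still restricted to vector database objects.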

API Key

You need an OpenAI API key to utilize this predicate. See https://platform.openai.com/overview for instructions on obtaining a key (start with the Quickstart Tutorial and follow the links there to get a key). There are three ways to configure your API key: as a query option prefix, in the AllegroGraph configuration file, or in the file data/settings/default-query-options.

As a query option prefix, write:

PREFIX franzOption_openaiApiKey: <franz:sk-U01ABc2defGHIJKlmnOpQ3RstvVWxyZABcD4eFG5jiJKlmno> 
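
For context, a minimal sketch of a full query that supplies the key this way is shown below. The key is the placeholder value used on this page, not a working key, and the pattern is the earlier historicalFigures example.

```sparql
PREFIX llm: <http://franz.com/ns/allegrograph/8.0.0/llm/>
# Placeholder key; replace it with your own OpenAI API key.
PREFIX franzOption_openaiApiKey: <franz:sk-U01ABc2defGHIJKlmnOpQ3RstvVWxyZABcD4eFG5jiJKlmno>

SELECT ?uri
WHERE {
  ?uri llm:nearestNeighbor ("Famous Scientist" "historicalFigures" 10 0.8) .
}
```

A key given as a query prefix is visible to anyone who can read the query text, so the configuration-based options may be preferable for shared queries.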

Syntax for config file:

QueryOption openaiApiKey=<franz:sk-U01ABc2defGHIJKlmnOpQ3RstvVWxyZABcD4eFG5jiJKlmno> 

In the file data/settings/default-query-options:

(("franzOption_openaiApiKey" "<franz:sk-U01ABc2defGHIJKlmnOpQ3RstvVWxyZABcD4eFG5jiJKlmno>")) 

API Options

The proprietary OpenAI API exposes many options and parameters for interacting with its models. Currently, the AllegroGraph magic predicates and functions take an opinionated approach and hide most of these options behind the scenes. Specifically, we set

API endpoint: https://api.openai.com/v1/chat/completions 

Endpoint parameters:

min-score: gpt:openai-default-min-score
model: "text-embedding-ada-002"
top-n: gpt:openai-default-top-n
verbose: nil

Additionally we impose an API timeout of 10 seconds.

Finally, if the OpenAI API times out or returns an error, if the magic predicate implementation fails to parse the response, or if any other error occurs, the magic predicate displays an informative message in Webview. Please contact AllegroGraph Support if you require different API options or customization.

Notes

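The following namespace abbreviations are used:

llm: <http://franz.com/ns/allegrograph/8.0.0/llm/>
vdb: <http://franz.com/vdb/gen/>
vdbprop: <http://franz.com/vdb/prop/>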

The SPARQL magic properties reference has additional information on using AllegroGraph magic properties and functions.