Introduction
A Large Language Model (LLM) is a neural network that can mimic human linguistic, visual, and audio intelligence. The provider may be a for-profit company, a non-profit entity, an open source project, or any other group or organization that provides an API for an LLM. AllegroGraph connects with LLMs as well as with other neural network architectures. By combining these neural network features with its existing symbolic reasoning capabilities, AllegroGraph balances traditional symbolic AI techniques with modern neural approaches.
The symbolic processing complements the neural processing by enabling:
- logical reasoning
- hallucination detection
- natural language queries
- validation of LLM results
- generation of synthetic data
- and more
This document introduces and defines the terms used in discussing how AllegroGraph implements LLMs and provides tools for their use. Links to documents with more specific information appear throughout.
Models and Vendors
When working with LLMs, it is important to distinguish between vendors and models.
A model is a particular large language model. GPT-4, Llama3.1, Mistral, and Claude 3 are all examples of models.
A vendor is an entity that provides an API to LLM models. The vendor may be a for-profit entity (a company) or a non-profit entity. Because one organization may offer more than one API, the term vendor really refers to a single API specification. At the time of writing, AllegroGraph supports the vendors OpenAI and Ollama. Vendors we may support in the future include Anthropic, Microsoft, Mistral, and others.
The holy grail of LLM adoption is a standalone LLM running completely inside a company's own environment, as most organizations would prefer not to send their private data through an API controlled by another company. To keep their data secure, companies may host LLMs within a data center under their control. AllegroGraph offers both options: vendors accessed over the internet (such as OpenAI) or vendors and models hosted locally.
Vendors and models are specified in various WebView dialogs associated with LLMs. See the LLM Embedding document.
Chat vs Embedding
Generally, LLMs offer two complementary but different functions: Chat and Embedding. The LLM feature most familiar to the public is called chat, or continuations. (The term continuations refers to the LLM's ability to read a prompt and generate text that would naturally continue the text in the prompt.) Because this process may simulate a human conversation, a continuation is also referred to as a chat.
An embedding, on the other hand, provides a way to represent text as a high-dimensional vector such that vectors near each other in the vector space represent similar concepts. For example, the vector representing the phrase "A good pet for an elderly person" is near the vectors representing the words "cat" and "parakeet". We refer to a search for the nearest vectors as Embedding Based Matching (EBM).
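Nearness here is a geometric notion. A common measure for comparing embedding vectors (and the usual basis for the similarity scores discussed later in this document) is cosine similarity:

    similarity(u, v) = (u · v) / (|u| |v|)

Two texts with similar meanings produce vectors whose cosine similarity is close to 1; unrelated texts score closer to 0.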
AllegroGraph stores these semantic vectors in a vector repository, a kind of vector database implemented as an AllegroGraph triple-store.
Because the vector repo is also an ordinary triple-store, embedding based matching is easily combined with AllegroGraph's symbolic reasoning features.
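As a rough illustration of how the two combine, the sketch below runs an EBM match with the llm:nearestNeighbor magic predicate and then joins the matched objects against ordinary RDF triples in the same repository. The vector repo name ("medical-terms"), the ex: vocabulary, and the exact argument order shown (query text, repo name, a topN limit, and a minimum similarity score) are assumptions for this example; see the LLM Embed Specification for the authoritative details.

```sparql
PREFIX llm: <http://franz.com/ns/allegrograph/8.0.0/llm/>
PREFIX ex:  <http://example.org/>   # hypothetical vocabulary

SELECT ?term ?score ?label ?category WHERE {
  # Neural step: bind the objects whose embeddings best match the text.
  # Assumed arguments: query text, vector repo name, topN, minimum score.
  (?term ?score ?label) llm:nearestNeighbor
      ("a drug for high blood pressure" "medical-terms" 5 0.8) .
  # Symbolic step: join the matched objects with ordinary RDF data.
  ?term ex:category ?category .
}
```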
In the various WebView dialogs that set up chats and embeddings, the available models will be listed once the embedder (or vendor) is specified. See the LLM Embedding document.
Magic Predicates
AllegroGraph extends the SPARQL standard by adding predicates that implement special computations, such as communication with LLMs. These predicates are known as magic predicates. AllegroGraph defines magic predicates that generate LLM responses, create lists and tables of data, perform embedding based matching, implement retrieval augmented generation (RAG), implement chatbots, and process database queries in natural language.
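For instance, a simple chat request can be issued directly from SPARQL with the llm:response magic predicate. A minimal sketch, assuming OpenAI credentials are supplied through the franzOption_openaiApiKey query option (the key shown is a placeholder):

```sparql
PREFIX franzOption_openaiApiKey: <franz:sk-YOUR-KEY-HERE>  # placeholder key
PREFIX llm: <http://franz.com/ns/allegrograph/8.0.0/llm/>

SELECT ?response WHERE {
  # Binds ?response to the LLM's continuation of the prompt.
  ?response llm:response "Name three uses of a knowledge graph." .
}
```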
AllegroGraph also offers a magic predicate to connect to the SERP API, making it possible to link search engine results with LLM responses, another example of neural-symbolic AI.
See LLMagic integration for a list of LLM-related magic predicates.
Vector Repositories
A vector store or vector database (VDB) is an object that stores embedding vectors, along with their associated text (words, phrases, paragraphs, or any other fragment of text) and other metadata associated with embeddings. In AllegroGraph, an ordinary triple store can function as a vector database, so we refer to it as a vector repository or vector repo. The vector repo may contain embeddings as well as symbolic RDF data. See Vector Storage Features in AllegroGraph for more information.
Selectors and Clustering
The time it takes to search a vector repo for the best match is proportional to the number of embeddings. For example, the time to search a vector repo of 10M objects is 1000 times longer than the time to search one of 10K objects. AllegroGraph offers two approaches that make matching faster and more efficient.
A selector is a special SPARQL query designed to limit the matching to a subset of the embeddings. The selected objects may be linked to any other relations in the repository. For example, a vector repo of medical terminology may be partitioned into subsets of terms for diseases, observations, procedures, and medications. If the number of terms in each category is about equal, a selector choosing the subset of medications will run four times faster than a search of all terms in the vector repo.
Clustering partitions the embedding vectors into subsets, called clusters, such that vectors nearby in space are grouped together. Because a cluster contains only a subset of the embedding vectors, searching a cluster is much faster than searching the full vector repo. The cost is some loss of accuracy, because the best-matching embeddings do not necessarily all appear in the same cluster.
See Using the ?selector and ?useClustering arguments with LLM Magic Predicates for more information.
Natural Language Queries
AllegroGraph provides four distinct flavors of natural language queries:
Nearest Neighbor EBM matching: Given a fragment of natural language text, find the embedding vectors most closely matching the embedding of that text. Nearest neighbor is described in the LLM Embed Specification document.
Retrieval Augmented Generation (RAG): This method retrieves background information related to a query, either from a vector repo or some other data source, and uses it to create a prompt that combines the background information with the query. The RAG technique often avoids the problem of LLM hallucinations. See our Chomsky RAG example, and the query sketch below.
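A minimal RAG sketch using the llm:askMyDocuments magic predicate, assuming a vector repo named "chomsky" built from document text; the binding order shown (response, score, citation, content) and the trailing arguments (top 10 passages, minimum score 0.8) are assumptions for this example, so consult the magic predicate documentation for the exact signature:

```sparql
PREFIX llm: <http://franz.com/ns/allegrograph/8.0.0/llm/>

SELECT ?response ?citation WHERE {
  # Retrieves the best-matching passages from the vector repo and asks
  # the LLM to answer the question using those passages as background.
  (?response ?score ?citation ?content) llm:askMyDocuments
      ("What did Chomsky think about innate grammar?" "chomsky" 10 0.8) .
}
```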
Chatbot with Long-term Memory: A feature lacking in pre-LLM chatbots was statefulness, that is, the ability of the bot to remember the state and history of the conversation. The chatStream feature of AllegroGraph extends the RAG concept to include fragments of conversational history, as well as a brief transcript of the recent dialog, to form a prompt that enables the bot to respond using both short-term and long-term memories. (So if earlier in the conversation the user gave their age as 23, the bot could say "you were not born in the 20th century".) See the example at the end of the Building a Stateful Chatbot document, where the bot summarizes the questions that have been asked.
Natural Language SPARQL queries (NLQ): AllegroGraph provides an NLQ feature (currently in pre-release but available for trial and testing) that allows direct querying of an RDF triple store through natural language. NLQ converts a natural language request into a SPARQL query, which then retrieves information from the RDF repository. See the natural-language-sparql-queries document.
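To illustrate the kind of translation NLQ performs, a request such as "Which employees work in the Boston office?" might be converted into a query along the following lines. The ex: vocabulary is hypothetical; the actual generated query depends on the shapes of the data in the repository.

```sparql
PREFIX ex: <http://example.org/>   # hypothetical vocabulary

SELECT ?employee ?name WHERE {
  ?employee a ex:Employee ;
            ex:name ?name ;
            ex:worksIn ?office .
  ?office ex:locatedIn "Boston" .
}
```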