AllegroGraph Reference Guide

Table of Contents

Managing AllegroGraph

triple-store management

Reports

Loading triples in bulk

TriX

Saving triples to a file

SPARQL support

Manipulating triples

Triple Parts: Resources, Literals, and Blank nodes.

Datatype and predicate mapping

Freetext indexing

Freetext Query Expressions

The !-reader macro and future-parts

Indexing

When to index

Querying the triple-store

Cursors

AllegroGraph and Prolog

The q functor

the select macros

Prolog and range queries

REPL User Interface

Miscellaneous Functions and Variables

Server for Lisp and Java Clients

Sesame

the AllegroGraph Reasoner

Clustering: indexing with multiple processors

Notes on Thread Safety

CVS/ontology management tools

Requirements

update_ontology command line arguments

sync_ontology command line arguments

Indices

Function index

Variable index

This is the reference guide for AllegroGraph 2.2.5. A tutorial is also available, and the AllegroGraph Introduction gives a high-level overview of all of AllegroGraph's features.

Managing AllegroGraph

AllegroGraph provides several utilities to help you manage your triple-stores and AllegroGraph itself. Most of the management operations are carried out automatically but there may be occasions when you will want to take greater control.

The following functions let you set and get properties that control AllegroGraph, find the version you are using and operate on all of the triple-stores that you have open.

ag-property name
function
Returns the value of the property associated with name.

ag-property-names
function
Return a list of the names of all defined AllegroGraph properties.

allegrograph-version &optional extended
function
Returns the version number of AllegroGraph as a string. If the optional extended argument is true, then it also returns the build date.

collect-open-triple-stores
function
Returns a list of the triple-stores that are currently open. See also map-open-triple-stores.

load-properties &key filename errorp
function
Load AllegroGraph properties from a file that was previously created using save-properties.

map-open-triple-stores fn
function
Apply fn to each open triple-store. See also collect-open-triple-stores. Returns (values).

reset-properties
function
Restore all AllegroGraph properties to their default settings.

save-properties &key filename
function
Save the current set of AllegroGraph properties to a file. You can reload these properties later using load-properties.
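
As a minimal sketch of how the property functions fit together, the following prints the version, lists every defined property with its value, and round-trips the settings through a file. It assumes that AllegroGraph is already loaded and that its symbols are accessible from the current package; the file name is hypothetical.

```lisp
;; Assumes AllegroGraph is loaded and its API symbols are accessible
;; from the current package. The file name below is hypothetical.
(format t "AllegroGraph version: ~a~%" (allegrograph-version))

;; Print every defined property together with its current value.
(dolist (name (ag-property-names))
  (format t "~s = ~s~%" name (ag-property name)))

;; Save the current settings, restore the defaults, then reload them.
(save-properties :filename "agraph-properties.txt")
(reset-properties)
(load-properties :filename "agraph-properties.txt")
```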

find-triple-store designator
function

Search for an open triple-store named by designator. Designator can be one of the following:

  • An object of type triple-db
  • A pathname which is the directory of a triple-store (cf. data-directory)
  • A string which can be coerced into a pathname which is the directory of a triple-store
  • A string which is the name of an open triple-store.

Note that it is possible for two or more open triple-stores to have the same name. In this case find-triple-store will signal an error of type ambiguous-triple-store-designator.

In addition to several special variables that can be let-bound to control its behavior dynamically, AllegroGraph has the following properties:

:agraph-cluster-code-pathname
property
When set, this is used to help locate the cluster/grid support code for AllegroGraph.

:blank-node-string-prefix
property
Used as the prefix when printing blank nodes.

:default-metaindex-skip-size
property
Used to control how often AllegroGraph records extra index information to speed up triple access in queries.

:error-on-redefine-namespace
property
The default value for the errorp keyword argument to register-namespace. If true and errorp is not supplied, then calling register-namespace with a namespace-prefix that already has a namespace mapping will signal an error. Otherwise the namespace mapping will be changed silently.

:default-print-triple-format
property
The default value for the format keyword argument to print-triple, print-triples, and triple->string. Defaults to :ntriple.

:display-cleanup-messages
property
If set to true, then AllegroGraph will display additional messages to *error-output* during certain maintenance operations such as closing triple-stores or exiting. This property defaults to nil.

:include-standard-parts
property
If true, then AllegroGraph will automatically intern a set of standard part strings into each triple-store it creates. These parts consist of the RDFS and OWL strings used by the reasoner.

:server-port
property
The default port for the AllegroGraph server.

:standard-indices
property
A list of indices that will be added to each newly created triple-store (unless it is overridden in the call to create-triple-store). It is also used by the function add-standard-indices.

:suppress-cleanup-messages
property
If nil, then AllegroGraph will print information regarding its internal maintenance to *standard-output*. If true, these messages will not be printed.

:temporary-directory
property
Specifies the location of a directory that AllegroGraph will use for some of its temporary files. Defaults to the return value of sys:temporary-directory.

:verbose
property
This is used as the default value of verbose in AllegroGraph functions that have a verbose keyword argument. Examples include load-ntriples, load-xml/rdf and index-all-triples.

:verbose-prepare-reasoning
property
If true, then prepare-reasoning will output information messages as it runs inferences.

Finally, AllegroGraph periodically runs a variety of background checks to help optimize your triple-stores. These are controlled by the AllegroGraph manager. You can set how often AllegroGraph runs these checks using manager-period.

manager
function
Returns the AllegroGraph-Manager of the AllegroGraph application.

manager-period excl::struct
function
Controls how often AllegroGraph runs its internal housekeeping functions.

pause-agraph-manager
function
Pause the AllegroGraph manager (see resume-agraph-manager).

resume-agraph-manager
function
Resume the AllegroGraph manager (see pause-agraph-manager).
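
If you want to keep the background checks from running during a time-sensitive operation, the pause and resume functions can be paired as in the sketch below. The helper name call-with-manager-paused is hypothetical and not part of AllegroGraph; the code assumes AllegroGraph is loaded and its symbols are accessible.

```lisp
;; CALL-WITH-MANAGER-PAUSED is a hypothetical helper, not part of AllegroGraph.
(defun call-with-manager-paused (thunk)
  "Call THUNK with the AllegroGraph manager paused, resuming it afterwards."
  (pause-agraph-manager)
  (unwind-protect
       (funcall thunk)
    ;; Runs even if THUNK signals an error or exits non-locally.
    (resume-agraph-manager)))
```

The unwind-protect form ensures that the manager is resumed even when the body exits abnormally.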

triple-store management

These functions are used to create, delete, and examine a triple-store. All of a triple-store's data is kept in a single directory whose name it shares. For convenience, many operations act by default on the current triple-store which is kept in a special variable named *db*. The with-triple-store macro makes it easy to call other code with a particular store marked current. There are also several reports that describe the triple-store in detail.

Each triple-store has two parameters that help you to manage its indices automatically: unindexed-triple-count-threshold and unmerged-chunk-count-threshold. The first controls the maximum number of unindexed triples in the store; the second controls the maximum number of index chunks or fragments. By default, both parameters are set to zero (0) for every triple-store, which means that no automatic indexing or merging will take place; everything is left to your control. Here are some of the things that you will want to consider when setting these values.

We suggest that you experiment with variations on the parameters. Franz is very open to hearing your feedback on how you would like to manage your triple-store.

One final point is that the automatic indexing and merging is subject to checks by the AllegroGraph manager process (see manager-period elsewhere in this documentation).

*db*
variable
The default triple store instance. API functions that take a db keyword argument default to the value of this variable.

close-all-triple-stores &key wait
function
Close all open triple-stores. The :wait keyword parameter is passed along to close-triple-store. It defaults to nil.

close-triple-store &key db if-closed verbose wait
function

Close the triple store, after saving all persistent data to disk as if by sync-triple-store. Close-triple-store has the following keyword arguments:

  • :db - defaults to the value of *db* and specifies which triple-store to close. *db* is set to nil after the triple store is closed. A triple-store only needs to be closed once regardless of how many times open-triple-store has been called on it.

  • :if-closed - controls the behavior when the db is either not open or nil. If if-closed is :error then an error will be signaled in this situation. If it is :ignore, then close-triple-store will just return without signaling an error. The argument defaults to :ignore.

  • :verbose - if true, then a message will be printed to *debug-io* before the triple store is closed.

  • :wait - if true, then close-triple-store will not return until all of the triple-store's concurrent activity is complete. This includes synchronizing its data, building indices and running queries. The value of :wait will be true unless overridden. If it is false, then close-triple-store will return immediately but the triple-store will no longer be available for use. Its in-memory structures will not be available to the garbage collector until all of its background activity is complete and it is finally closed.

create-triple-store name &rest args &key directory if-exists with-indices
function

Create a new triple store with the given name. This is both the name of the triple-store and its location on disk. It can be the full path to a directory or just a directory name. If it is a simple name, then the directory argument is used to convert it to a complete path.

For example, to create a triple-store named 'animals' in the directory 'c:/datafiles/biology', you could use either

(create-triple-store  
 "c:/datafiles/biology/animals") 

or

(create-triple-store "animals"  
 :directory "c:/datafiles/biology/") 

The directory defaults to the directory component of *default-pathname-defaults*. You cannot use a name containing a directory path and the directory argument simultaneously.

Create-triple-store takes numerous keyword arguments:

  • :if-exists - controls what to do if the data directory already exists. The default value, :supersede, will cause create-triple-store to delete the old triple store and create a new one; :error causes create-triple-store to signal an error; finally, the value :open will simply open the existing triple store as if you had called open-triple-store.

  • :expected-unique-resources - sets the initial size of the triple-store. It determines the number of unique names (e.g., resources and literals) that the triple store is expected to hold and defaults to the value of *default-expected-unique-resources*. The triple-store will grow if this number is exceeded but setting the value initially will provide better performance because no resizing will be required.

  • :with-indices - adds indices to the newly created triple-store. It is equivalent to calling add-index once for each index in the list of indices. It defaults to the value of the AllegroGraph property :standard-indices.

  • :include-standard-parts - if true, then AllegroGraph will add the following strings to the triple-store at creation time: rdf:type, owl:sameAs, owl:inverseOf, rdfs:subPropertyOf, rdfs:subClassOf, rdfs:range, rdfs:domain, owl:transitiveProperty. The value defaults to (ag-property :include-standard-parts).

Create-triple-store returns a triple-store object and sets the value of the variable *db* to that object.
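
The following sketch shows one way these arguments might be combined: it refuses to overwrite an existing store, inspects the new store, and closes it again. The store name and directory are illustrative only, and the code assumes AllegroGraph is loaded and its symbols are accessible.

```lisp
;; The store name and directory are illustrative only.
(let ((store (create-triple-store "animals"
                                  :directory "c:/datafiles/biology/"
                                  :if-exists :error)))
  ;; create-triple-store also sets *db* to the object it returns.
  (format t "Created store in ~a~%" (data-directory store))
  (format t "It holds ~d triples~%" (triple-count :db store))
  ;; Closing the store saves its persistent data to disk.
  (close-triple-store :db store))
```

Passing :if-exists :error instead of relying on the default (:supersede) avoids deleting an existing store by accident.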

data-directory triple-db
function
Returns the directory in which the triple-store lives. This is specified using the :directory parameter to create-triple-store or by using a full pathname as the name argument to create-triple-store.

delete-triple-store db &key directory if-does-not-exist if-open
function

Delete an existing triple store. Returns t if the deletion was successful and nil if it was not. The db argument can be either a triple store instance or the name of a triple store. If it is an instance, then the triple store associated with the instance will be deleted. If it is a name, then the triple store of that name in the directory designated by the :directory keyword argument will be deleted. The directory parameter defaults to the directory component of *default-pathname-defaults*.

The :if-does-not-exist keyword argument specifies what to do if the data directory does not exist. The default value, :error, causes delete-triple-store to signal an error. The value :ignore will cause delete-triple-store to do nothing and return nil.

The :if-open keyword argument specifies the behavior if the designated triple store is currently open. The default value, :error, causes delete-triple-store to signal an error. The value :close causes delete-triple-store to close the triple store, as if by close-triple-store and then delete it. If AllegroGraph is unable to close the triple store, an error may still be signaled.
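
As a hedged illustration, the call below deletes a store by name, closing it first if it is open and doing nothing if it does not exist. Following the signature above, the store is passed as the first argument; the name and directory are illustrative only.

```lisp
;; The store name and directory are illustrative only.
(if (delete-triple-store "animals"
                         :directory "c:/datafiles/biology/"
                         :if-does-not-exist :ignore
                         :if-open :close)
    (format t "Store deleted.~%")
    (format t "Nothing was deleted.~%"))
```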

make-tutorial-store &optional temporary-directory
function

Close any current triple-store and create a new empty one in a temporary directory.

The directory used can be passed in as an optional parameter. If it is not supplied, then it will get its value from (ag-property :temporary-directory).

The new triple-store will be bound to *db* and is also returned by make-tutorial-store.

open-triple-store name &key directory if-does-not-exist if-exists with-indices read-only-p
function

Open an existing triple store (previously created using create-triple-store) with the given name. Name is both the name of the triple-store and its location on disk. It can be the full path to a directory or just a directory name. If it is a simple name, then the directory argument is used to convert it to a complete path.

For example, to open a triple-store named 'animals' in the directory 'c:/datafiles/biology', you could use either

(open-triple-store  
 "c:/datafiles/biology/animals")  
 

or

(open-triple-store "animals"  
 :directory "c:/datafiles/biology/")  
 

The directory defaults to the directory component of *default-pathname-defaults*. You cannot use a name containing a directory path and the directory argument simultaneously.

The :if-does-not-exist keyword argument specifies what to do if the data directory does not exist. The default value, :error, causes open-triple-store to signal an error. The value :create will cause open-triple-store to create the triple store as if by a call to create-triple-store.

You can use the :with-indices parameter to add additional indices to the triple-store. (Adding an index multiple times has no effect). This will not remove any indices that already exist (see drop-index and drop-indices if you need to do that).

Returns a triple store object and sets the value of the variable *db* to that object. If the named triple store is already open, open-triple-store returns the same object.
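
A minimal sketch of opening a store that may not exist yet, and making sure it is closed again afterwards, might look like this. The store name and directory are illustrative only, and the code assumes AllegroGraph is loaded and its symbols are accessible.

```lisp
;; The store name and directory are illustrative only.
(let ((store (open-triple-store "animals"
                                :directory "c:/datafiles/biology/"
                                :if-does-not-exist :create)))
  (unwind-protect
       (format t "The store holds ~d triples~%" (triple-count :db store))
    ;; Close the store even if the body signals an error.
    (close-triple-store :db store)))
```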

prepare-reasoning &key db verbose force show-progress
function

This function has to be called before any inferences are made. It creates internal hashtables to speed up the reasoner. In normal operation, AllegroGraph will call prepare-reasoning as necessary. You can see diagnostic messages by using the parameter verbose which defaults to the value of (ag-property :verbose-prepare-reasoning). You can also force the hashtables to be regenerated using the force parameter. Finally, the show-progress keyword argument can be used to cause prepare-reasoning to print a message for each of the hashtables it builds.

The function prepare-reasoning returns no value.

sync-triple-store &key db
function
Ensure that all persistent data needed by the triple store is saved to disk. Called automatically by close-triple-store and the bulk loading operations like load-ntriples.

triple-count &key db
function
Returns the number of triples in a triple store. The :db keyword argument specifies the triple store to use, either by name or a triple store object. It defaults to the value of *db*.

triple-store-exists-p name &key directory
function

Returns true if the triple-store with the given name exists. Name can be the full path to a directory or just a directory name. If it is a simple name, then the directory argument is used to convert it to a complete path.

For example, to verify the existence of a triple-store named 'animals' in the directory 'c:/datafiles/biology', you could use either

(triple-store-exists-p  
 "c:/datafiles/biology/animals")  
 

or

(triple-store-exists-p "animals"  
  :directory "c:/datafiles/biology/")  
 

The directory defaults to the directory component of *default-pathname-defaults*. You cannot use a name containing a directory path and the directory argument simultaneously.

triple-store-id triple-db
function
This returns the ID of the triple-store. The ID is generated randomly when the triple-store is created. It is an array of four octets.

unindexed-triple-count-threshold db
function
Whenever there are at least unindexed-triple-count-threshold unindexed triples, an indexing task will be started to index them. If an indexing host has been added, then the task will run on it (see add-indexing-host for details). Otherwise, indexing will run as a background task of the main AllegroGraph process.

unmerged-chunk-count-threshold db
function
A complete merge will happen automatically whenever there are at least this many index chunks.

with-triple-store (var store &key state read-only-p errorp) &body body
macro

Binds both var and *db* to the triple-store designated by store. The following keyword arguments can also be used:

  • errorp - controls whether or not with-triple-store signals an error if the specified store cannot be found.

  • read-only-p - if specified then with-triple-store will signal an error if the specified triple-store is writable and read-only-p is nil or if the store is read-only and read-only-p is t.

  • state - can be :open, :closed or nil. If :open or :closed, an error will be signaled unless the triple-store is in the same state.
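
The sketch below shows how with-triple-store might be used to run code against a particular open store. It assumes that store accepts the same designators as find-triple-store (here, the name of an open store); the store name and directory are illustrative only.

```lisp
;; Assumes the store designator may be the name of an open store,
;; as described for find-triple-store. Names are illustrative only.
(open-triple-store "animals" :directory "c:/datafiles/biology/")

(with-triple-store (store "animals" :state :open)
  ;; Inside the body, both STORE and *db* refer to the "animals" store.
  (format t "~d triples~%" (triple-count :db store)))
```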

*default-expected-unique-resources*
variable
The default size of the hash table used to map resource names to numeric ids.

*synchronize-automatically*
variable

This variable controls when changes to the triple store are written to disk. If t, then changes will be written after every call to add-triple. If nil (which is the default), then changes will be written only when the triple store is closed, indexed, or synchronized. A setting of nil is more efficient but can lead to surprises: for example, the results of queries may not contain recently added triples.

Note that functions such as load-ntriples and load-rdf/xml dynamically bind *synchronize-automatically* to nil in their inner loops for efficiency's sake. You may want to use a similar practice in any of your code that adds many triples at once.
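
One way to follow the same practice in your own code is sketched below: bind the variable to nil around the work and synchronize once at the end. The helper name call-with-deferred-sync is hypothetical and not part of AllegroGraph.

```lisp
;; CALL-WITH-DEFERRED-SYNC is a hypothetical helper, not part of AllegroGraph.
(defun call-with-deferred-sync (thunk)
  "Call THUNK with automatic synchronization off, then synchronize once."
  (multiple-value-prog1
      (let ((*synchronize-automatically* nil))
        ;; THUNK is expected to add many triples to the current store.
        (funcall thunk))
    ;; Write everything to disk in one step after the batch completes.
    (sync-triple-store)))
```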

Reports

db-room &optional db
function
Print a summary of the disk and memory usage for db (which defaults to *db*).

estimate-required-space triple-count index-count &key skip-size unique-strings (average-string-size 30)
function
Print a report that estimates the amount of disk space and memory that an AllegroGraph database will require.

index-status-report &key db stream verbose summary indices
function
Prints a summary of index information to stream.
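
As a brief sketch, the report functions might be called as follows. The numbers passed to estimate-required-space are made up for illustration, and the code assumes a triple-store is open and bound to *db*.

```lisp
;; Summarize disk and memory usage of the current store.
(db-room)

;; Estimate the space needed for one million triples and three indices.
;; The figures here are illustrative only.
(estimate-required-space 1000000 3 :average-string-size 40)

;; Print index information for the current store to standard output.
(index-status-report :stream *standard-output*)
```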

Loading triples in bulk

Triples stored in files using the N-Triples and RDF/XML notations can be loaded into the triple-store with the following functions.

load-ntriples source &key db default-graph verbose always-save-string-literals
function

Add triples from source to the triple store.

  • :db - specifies the triple-store to load into and defaults to the value of *db*.

  • :default-graph - defaults to nil, which is interpreted as db's default graph. If supplied, it can be:

    • a string representing a URIref, or a UPI encoding a URIref, which adds the triples in source to a graph named by that URI

    • the keyword :source (in which case the source argument will be interned as a URI and the loaded triples added to a graph named by that URI)

  • :verbose - specifies whether or not progress information is printed to the listener. It defaults to the value of (ag-property :verbose).

  • :always-save-string-literals - determines whether or not to save the strings of a triple's object field when the object can be encoded directly into the triple. If true (the default), the strings will be saved. If false, only the encoded values will be preserved (this may prevent exact round-trips if data is coerced during the encoding process).

  • source - can be a stream, a pathname to an N-Triples file or a string that can be coerced into a pathname to an N-Triples file.

load-ntriples returns the number of triples added, and the default-graph used, as multiple values.
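
For instance, a call that loads a file into a graph named after the file and then reports both return values could be sketched as follows. The file path is illustrative only, and a triple-store is assumed to be open and bound to *db*.

```lisp
;; The file path is illustrative only.
(multiple-value-bind (count graph)
    (load-ntriples "c:/datafiles/biology/animals.nt"
                   :default-graph :source
                   :verbose t)
  ;; COUNT is the number of triples added; GRAPH is the default graph used.
  (format t "Loaded ~d triples into graph ~s~%" count graph))
```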

load-ntriples-from-string string &key db default-graph verbose always-save-string-literals
function

Add all of the triples from the string to the triple store. The following keyword parameters can be used to control the loading process:

  • :db - specifies the triple-store to load into and defaults to the value of *db*.

  • :default-graph - defaults to nil, which is interpreted as db's default graph. If supplied, it can be:

    • a string representing a URIref, or a UPI encoding a URIref, which adds the triples in source to a graph named by that URI

    • the keyword :source (in which case the source argument will be interned as a URI and the loaded triples added to a graph named by that URI)

  • :verbose - specifies whether or not progress information is printed to the listener. It defaults to the value of (ag-property :verbose).

  • :always-save-string-literals - determines whether or not to save the strings of a triple's object field when the object can be encoded directly into the triple. If true (the default), the strings will be saved. If false, only the encoded values will be preserved (this may prevent exact round-trips if data is coerced during the encoding process).

load-ntriples-from-string returns the number of triples added, and the default-graph used, as multiple values.
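
A small sketch of loading two N-Triples statements from a string into a named graph is shown below. The URIs are made up for illustration, and a triple-store is assumed to be open and bound to *db*. Note that double quotes inside the Lisp string must be escaped.

```lisp
;; All URIs below are illustrative only.
(load-ntriples-from-string
 "<http://example.org/item01> <http://example.org/name> \"Widget\" .
<http://example.org/item02> <http://example.org/name> \"Grommet\" .
"
 :default-graph "http://example.org/graphs/items")
```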

load-ntriples* filenames &key db default-graph verbose always-save-string-literals
function
Add triples from a list of N-Triples files to the triple-store. The :db keyword argument specifies the triple store to load into and defaults to the value of *db*. See load-ntriples for more information about the other keyword arguments.

db.agraph.serializer:load-rdf-manifest manifest destination-db &key verbosep
function
Load the file named by manifest, loading the triples from the graphs it references.

load-rdf/xml filename &key db base-uri default-graph use-rapper-p
function

Add triples from the named RDF/XML file to the triple-store. The additional arguments are:

  • :db - specifies the triple-store into which triples will be loaded; defaults to the value of *db*.

  • :base-uri - defaults to the name of the file from which the triples are loaded. It is used to resolve relative URI references during parsing. To use no base-uri, use the empty string "".

  • :default-graph - defaults to nil, which is interpreted as db's default graph. If supplied, it can be:

    • a string representing a URIref, or a UPI encoding a URIref, which adds the triples in source to a graph named by that URI

    • the keyword :source (in which case the source argument will be interned as a URI and the loaded triples added to a graph named by that URI)

  • :use-rapper-p - if true, then the RDF/XML file will be piped through the rapper program using run-shell-command. rapper must be both installed and in your path for this to work. If rapper is not in your path, you can supply its location explicitly as the value of use-rapper-p.

load-rdf/xml* filenames &rest args &key db verbose base-uri default-graph
function
Add triples from the named RDF/XML files to the triple-store. See load-rdf/xml for details on the parser and the other arguments to this function.
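
The two RDF/XML file loaders might be used as in the following sketch. All file paths and URIs are illustrative only, and a triple-store is assumed to be open and bound to *db*.

```lisp
;; All paths and URIs below are illustrative only.

;; Load one file, resolving relative URIs against an explicit base
;; and placing the triples in a named graph.
(load-rdf/xml "c:/datafiles/biology/animals.rdf"
              :base-uri "http://example.org/animals/"
              :default-graph "http://example.org/graphs/animals")

;; Load several files in one call.
(load-rdf/xml* (list "c:/datafiles/biology/mammals.rdf"
                     "c:/datafiles/biology/birds.rdf"))
```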

load-rdf/xml-from-string string &key db base-uri default-graph
function

Treat string as an RDF/XML data source and add it to the triple-store. For example:

   (load-rdf/xml-from-string
    "<?xml version=\"1.0\"?>
<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\"
         xmlns:ex=\"http://example.org/stuff/1.0/\">
  <rdf:Description rdf:about=\"http://example.org/item01\">
    <ex:prop rdf:parseType=\"Literal\"
             xmlns:a=\"http://example.org/a#\"><a:Box required=\"true\">
         <a:widget size=\"10\" />
         <a:grommit id=\"23\" /></a:Box>
    </ex:prop>
  </rdf:Description>
</rdf:RDF>
")

See load-rdf/xml for details on the parser and the other arguments to this function.

TriX

RDF in the form of named graphs can also be loaded using load-trix, which understands the TriX format (see http://www.w3.org/2004/03/trix/).

We have implemented a few extensions to TriX to allow it to represent richer data:

Saving triples to a file

There are several methods by which you can create a textual representation of your triple-store:

The print-triples function provides a simple mechanism to output triples to *standard-output* or a stream. It's easy to build an export function on top of it:

(defun export-triples (triples file)  
  (with-open-file (output file  
                          :direction :output  
                          :if-does-not-exist :create  
                          :if-exists :error)  
    (print-triples triples  
                   :limit nil :stream output :format :ntriple))) 
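
As a usage sketch, the export-triples function defined above could be called on the current store like this. The output path is illustrative only; because the function opens the file with :if-exists :error, the call signals an error if the file already exists.

```lisp
;; Export every triple in the current store as N-Triples.
;; The output path is illustrative only.
(export-triples *db* "c:/datafiles/biology/animals-export.nt")
```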

The other techniques provide more control over the output format.

print-triple triple &key format stream
function
Print a triple returned by next-row or get-triples-list. The keyword argument :format, which defaults to the value of (ag-property :default-print-triple-format), specifies how the triple should be printed. The value :ntriple specifies that it should be printed in N-Triples syntax. The value :long indicates that the string value of the part should be used. And the value :concise causes it to use a more concise, but possibly ambiguous, human-readable format.

print-triples triple-container &key limit format stream
function
Display the triples in triple-container which can be either a triple store object, a list of triples such as is returned by get-triples-list, or a cursor such as is returned by get-triples. If the keyword argument :limit is supplied, then at most that many triples will be displayed. The :format keyword argument controls how the triples will be displayed, in either :ntriple, :long, or :concise format. It defaults to (ag-property :default-print-triple-format). The stream argument can be used to send output to the stream of your choice. If left unspecified, then output will go to *standard-output*.

serialize-rdf/xml from to &key memoize-abbrev-lookups-p prepare-namespaces-p output-types-p nestp error-on-invalid-p indent if-exists if-does-not-exist
function

Write from, which should be a triple store, a list, or a cursor, to the destination to, which should be a stream, a file path, t (print to *standard-output*), or nil (return a string).

If from is a triple-store or a list of triples, and prepare-namespaces-p is t, it is first traversed to build a hash of namespaces to prefixes for all properties in the store. The value of db.agraph::*namespaces* is used as a seed.

If you can ensure that every property used in the triple-store has a defined prefix, you can pass nil for prepare-namespaces-p to gain a speed improvement from omitting this phase.

If error-on-invalid-p is t, the serializer will signal an error if it encounters a type or predicate that it cannot abbreviate for RDF/XML.

If a namespace prefix table is built, it will be returned as the second value.

If from is a cursor, it cannot be traversed multiple times, so prepare-namespaces-p is ignored. If a property is encountered that cannot be abbreviated with the current available prefixes, an error will be signaled, unless error-on-invalid-p is nil. You should be aware of this before using this serializer: serializing the output of a cursor can fail if you do not first prepare namespace mappings, or specify that invalid output is acceptable.

if-exists and if-does-not-exist are arguments to the Common Lisp open function which apply when to is a filename.

If memoize-abbrev-lookups-p is t, a upi-hash-table is built to store the mappings between resources in the store and string abbreviations. This hash-table will contain as many entries as there are types and properties in the data to be serialized. For some datasets disabling this caching will yield a significant reduction in space usage in exchange for a possible loss of speed.

If indent is non-nil, then it specifies the initial indentation of elements.

If nestp is t, then (subject to the order of triples in the triple store) some nesting of RDF/XML elements will be applied. nestp of nil will cause a flat tree to be produced, where each resource is its own top-level element.

If output-types-p is t, then additional queries will be performed for each resource to decorate the RDF/XML with types. If nil, then rdf:type elements alone will be used.

Please note that the RDF/XML serializer is not guaranteed to work correctly with value-encoded literals.
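
The two most common destinations, a file and a string, can be sketched as follows. The output path is illustrative only, and a triple-store is assumed to be open and bound to *db*.

```lisp
;; Write the current store to a file, replacing any existing file.
;; The output path is illustrative only.
(serialize-rdf/xml *db* "c:/datafiles/biology/animals.rdf"
                   :if-exists :supersede)

;; Passing nil as the destination returns the RDF/XML as a string.
;; With :nestp nil each resource becomes its own top-level element.
(let ((xml (serialize-rdf/xml *db* nil :nestp nil)))
  (format t "Produced ~d characters of RDF/XML~%" (length xml)))
```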

db.agraph.serializer:serialize-rdf-manifest source directory &key verbosep single-stream-p
function

Serialize source according to exchange-of-named-rdf-graphs. Serialization will probably open at least as many files as there are graphs in the source.

Returns the manifest path and the number of graphs saved.

If single-stream-p is true, only a single file is open at any one time. This is slower, but guaranteed not to fail with large numbers of graphs.

db.agraph.serializer:serialize-rdf-n3 from to &key indent if-exists if-does-not-exist
function

Write from, which should be a triple store, a list, or a cursor, to the destination to, which should be a stream, a file path, t (print to *standard-output*), or nil (return a string).

if-exists and if-does-not-exist are arguments to the Common Lisp open function which apply when to is a filename.

If indent is non-nil, then it specifies the initial indentation of elements.

SPARQL support

AllegroGraph includes twinql, an implementation of the powerful SPARQL query language. You can learn more about twinql and SPARQL in the twinql reference and tutorial.

db.agraph.sparql:get-variable-list variables query ordered &key check-validity-p
function

Take a variable list (which may be null), a query as s-expressions, and an optional ORDER BY s-expression, and return:

  • the variables that need to be collected in order to run the ORDER BY query
  • an optional function to trim the results
  • the variables that should be projected to.

Signals an error of type sparql-variables-error if variables is provided and is not a subset of the variables in the query.

sparql.parser:parse-sparql string &optional default-prefixes default-base
function

parse-sparql takes a string and parses it into an s-expression format. A parse error will result in a sparql-parse-error being signaled.

This function is useful for three reasons: validation and inspection of queries, manual manipulation of query expressions without text processing, and performing parsing at a more convenient time than during query execution.

You do not need an open triple store in order to parse a query.

default-base and default-prefixes allow you to provide BASE and PREFIX arguments to the parser without inserting them textually into the query.

default-base should be nil or a string, and default-prefixes can either be a hash-table (string prefix to string expansion) or a list similar to db.agraph:*standard-namespaces*.

parse-sparql returns the s-expression representation of the query string.
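
The following sketch shows one way to parse a query once and reuse the parsed form, which is the third use mentioned above. The names *last-name-query* and last-names are hypothetical, the query text is only an illustration, and the sketch assumes that a triple store is open in *db* when last-names is called.

```lisp
;; Parse once. No triple store needs to be open at this point.
(defparameter *last-name-query*
  (sparql.parser:parse-sparql
   "SELECT ?s ?name WHERE { ?s <http://www.franz.com/simple#lastName> ?name }"))

;; Run the parsed form as often as needed, skipping the parser each time.
(defun last-names (&key (limit 10))
  (db.agraph.sparql:run-sparql *last-name-query*
                               :results-format :alists
                               :limit limit))
```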

db.agraph.sparql:run-sparql db.agraph.sparql::query &rest db.agraph.sparql::args &key db.agraph.sparql::from db.agraph.sparql::from-named db.agraph.sparql::limit db.agraph.sparql::offset db.agraph.sparql::default-base db.agraph.sparql::default-prefixes db.agraph.sparql::db db.agraph.sparql::rdf-format db.agraph.sparql::results-format db.agraph.sparql::with-variables db.agraph.sparql::default-dataset-behavior db.agraph.sparql::output-stream db.agraph.sparql::extendedp db.agraph.sparql::memoizep db.agraph.sparql::memos db.agraph.sparql::load-function db.agraph.sparql::verbosep
function

run-sparql takes a SPARQL query as input and returns bindings or new triples as output.

SELECT and ASK query results will be presented in results-format; the RDF output of DESCRIBE and CONSTRUCT will be serialized according to rdf-format. If the format is programmatic, any results will be returned as the first value, and nothing will be printed on output-stream.

  • query can be a string, which will be parsed by parse-sparql, or an s-expression as produced by parse-sparql. If you expect to run a query many times, you can avoid some parser overhead by parsing your query once and calling run-sparql with the parsed representation.

  • If query is a string, then default-base and default-prefixes are provided to parse-sparql to use when parsing the query. See the documentation for that function for details. Parser errors can be signaled within parse-sparql, and will be propagated onwards by run-sparql.

  • Results or new triples will be serialized to output-stream. If a programmatic format is chosen for output, the stream is irrelevant. An error will be signaled if output-stream is not a stream, t, or nil.

  • If limit, offset, from, or from-named are provided, they override the corresponding values specified in the query string itself. As FROM and FROM NAMED together define a dataset, and the SPARQL Protocol specification states that a dataset specified in the protocol (in this case, the programmatic API) overrides that in the query, if either from or from-named are non-nil then any dataset specifications in the query are ignored. You can specify that the contents of the query are to be partially overridden by providing t as the value of one of these arguments. This is interpreted as 'use the contents of the query'. from and from-named should be lists of URIs.

  • default-dataset-behavior controls how the query engine builds the dataset environment if FROM or FROM NAMED are not provided. Valid options are :all (ignore graphs; include all triples) and :default (include only the triple store's default graph).

  • with-variables should be an alist of symbols and values. Before the query is executed, the variables named after symbols will be bound to the provided values. This allows you to use variables in your query which are externally imposed, or generated by other queries. The format expected by with-variables is the same as that used for each element of the list returned by the :alists results-format.

  • db (*db* by default) specifies the triple store against which queries should run.

  • If verbosep is non-nil, status information is written to *sparql-log-stream* (*standard-output* by default).

Three additional extensions are provided for your use.

  • If extendedp is true (or *use-extended-sparql-verbs-p* is true, and the argument omitted) some additional SPARQL verbs become available. SUM, AVERAGE, MEDIAN, STATS, CORRELATION, and COUNT can all be used in place of SELECT. These verbs are still experimental and undocumented.

  • If memoizep is true (or *build-filter-memoizes-p* is true, and the argument omitted) calls to SPARQL query functions (such as STR, fn:matches, and extension functions) will be memoized for the duration of the query. For most queries this will yield speed increases when FILTER or ORDER BY are used, at the cost of additional memory consumption (and consequent GC activity). For some queries (those where repetition of function calls is rare) the cost of memoization will outweigh the benefits. In large queries which call SPARQL functions on many values, the size of the memos can grow large.

Memoization also requires that your extension functions do not depend on side-effects. The standard library is correct in this regard.

  • You can achieve substantial speed increases by sharing your memos between queries. Create a normal eql hash-table with (make-hash-table), passing it as the value of the memos argument to run-sparql. This hash-table will gradually fill with memos for each used query function.

If you wish to globally enable memoization, set the variables as follows:

(progn  
  (setf *build-filter-memoizes-p* t)  
  (setf *sparql-sop-memos* (make-hash-table))) 

Be aware that the size of *sparql-sop-memos* could grow very large indeed. You might consider using a weak hash-table, or periodically discarding the contents of the hash-table.

  • load-function is a function with signature (uri db &optional type) or nil. If it is a function, it is called once for each FROM and FROM NAMED parameter making up the dataset of the query. The execution of the query commences once each parameter has been processed. The type argument is either :from or :from-named, and the uri argument is a part (ordinarily a future-part) naming a URI. The default value is taken from *dataset-load-function*. You can use this hook function to implement loading of RDF before the query is executed.

The values returned by run-sparql are dependent on the verb used. The first value is typically disregarded in the case of results being written to output-stream. If output-stream is nil, the first value will be the results collected into a string (similar to the way in which cl:format operates).

The second value is the query verb: one of :select, :ask, :construct, or :describe. Other values are possible in extended mode.

The third value, for SELECT queries only, is a list of variables. This list can be used as a key into the values returned by the :arrays and :lists results formats, amongst other things.

Individual results formats are permitted to return additional values.
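
As an illustration of the return values and of sharing memos between queries, here is a minimal sketch. The names *shared-memos* and select-with-memos are hypothetical, and the sketch assumes that :alists is a programmatic results format, so that the rows come back as the first value.

```lisp
;; A plain eql hash-table, shared by every query run through the helper.
(defparameter *shared-memos* (make-hash-table))

(defun select-with-memos (query)
  "Run QUERY (a string or a parsed query) and return rows, verb, and variables."
  (multiple-value-bind (rows verb variables)
      (db.agraph.sparql:run-sparql query
                                   :results-format :alists
                                   :memoizep t
                                   :memos *shared-memos*)
    ;; VERB is :select, :ask, :construct, or :describe.
    ;; VARIABLES is only meaningful for SELECT queries.
    (values rows verb variables)))
```

Because the table only grows, a long-running process might want to replace or clear it from time to time.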

db.agraph.sparql:run-sparql-ask db.agraph.sparql::query format &key db.agraph.sparql::from db.agraph.sparql::from-named db.agraph.sparql::with-variables db.agraph.sparql::output-stream
function
Run the provided query, serializing the results as specified by the format argument. See the documentation for run-sparql for more information.

db.agraph.sparql:run-sparql-construct db.agraph.sparql::construct-pattern db.agraph.sparql::query-pattern db.agraph.sparql::rdf-format &key db.agraph.sparql::ordered db.agraph.sparql::from db.agraph.sparql::from-named db.agraph.sparql::limit db.agraph.sparql::offset db.agraph.sparql::with-variables db.agraph.sparql::output-stream
function

Run the provided query, using the results to construct a new graph. The new graph is either returned or serialized onto output-stream as specified by the rdf-format argument. construct-pattern must be a list of triple patterns. query-pattern is used as input to run-sparql-select.

See the documentation for run-sparql for more information.

UPI coercion depends on the current value of db!

db.agraph.sparql:run-sparql-describe db.agraph.sparql::targets db.agraph.sparql::query db.agraph.sparql::rdf-format &key db.agraph.sparql::variables db.agraph.sparql::with-variables db.agraph.sparql::from db.agraph.sparql::from-named db.agraph.sparql::output-stream
function

Run the provided query, using the results to construct a new graph. The new graph is either returned or serialized onto output-stream as specified by the rdf-format argument. targets is a list of resources to describe. query is used as input to run-sparql-select to yield further resources to describe.

See the documentation for run-sparql for more information.

db.agraph.sparql:run-sparql-select db.agraph.sparql::query format &key db.agraph.sparql::ordered db.agraph.sparql::distinct db.agraph.sparql::from db.agraph.sparql::from-named db.agraph.sparql::limit db.agraph.sparql::offset db.agraph.sparql::with-variables db.agraph.sparql::output-stream db.agraph.sparql::variables
function

Run the provided query, serializing the results as specified by the format argument. :distinct can be t or :distinct, meaning full DISTINCT processing, or :reduced, which optionally discards duplicates.

See the documentation for run-sparql for more information.

The :alists format returns a list of alists, each of which contains the values for variables in order. If variables is not provided, you can find the variables as the second return value.

The :lists format returns a list of lists, each of which contains the values for variables in order. If variables is not provided, you can find the variables as the second return value.

Manipulating triples

You can add triples to a triple-store programmatically with the function add-triple. The three required arguments, representing the subject, predicate, and object of the triple to be added, can be expressed either as strings in the N-Triples syntax for URI references and literals or as UPIs such as are returned by the functions intern-resource, intern-literal, intern-typed-literal, and new-blank-node.

The functions subject, predicate, object, and graph provide access to the part UPIs of triples returned by the cursor functions like row and next-row and collected by get-triples-list.

add-triple s p o &key db g triple
function

Add a new triple to the triple-store with the given subject, predicate, object, and graph, specified either as UPIs or strings in N-Triples format. The :db keyword argument specifies the triple store to which the triple will be added and defaults to the value of *db*. Returns the numeric id of the new triple.

Note that duplicate triples can be added to a triple-store but indices will only refer to the same triple once.

copy-triple triple
function
Copy a triple. This function is useful if you want to keep a reference to a triple obtained from a cursor returned by query functions such as get-triples, since the cursor reuses the triple data structure for efficiency reasons.
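
A minimal sketch of why the copy matters when collecting triples from a cursor. The helper name is hypothetical, iterate-cursor is used as in the freetext examples later in this guide, and the :p keyword of get-triples is assumed to work like the :p keyword of delete-triples.

```lisp
(defun collect-triples-with-predicate (predicate)
  "Return a list of fresh copies of every triple whose predicate is PREDICATE."
  (let ((result '()))
    (iterate-cursor (triple (get-triples :p predicate))
      ;; The cursor reuses TRIPLE, so keep a copy rather than the object itself.
      (push (copy-triple triple) result))
    (nreverse result)))
```

Without the call to copy-triple, every element of the list could end up being the same, repeatedly overwritten, structure.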

delete-triple id &key db
function

Marks the triple whose id is id as deleted.

See also undelete-triple and deleted-triple-p.

The :db keyword argument specifies the triple store in which the triple should be marked as deleted.

delete-triples &rest args &key s p o g s-end p-end o-end g-end filter db
function

Delete triples matching the given subject, predicate, object, and graph, specified either as part IDs, strings in N-Triples format, or the wildcard nil. Returns the number of triples deleted.

The :db keyword argument specifies the triple store to query, defaulting to the value of *db*.

deleted-triple-p id &key db
function

Returns a boolean indicating whether the triple whose ID is id is deleted.

The :db keyword argument specifies the triple store in which to check.

get-triple-by-id id &key db triple
function
Locate the triple whose triple-id is id and return it. The keyword argument db can be used to specify the triple-store in which to search. It defaults to the current triple-store, *db*. Get-triple-by-id allocates a new triple (using make-triple). You can prevent this by passing in your own triple using the keyword argument :triple. The data in the triple you pass in will be overwritten.
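
The sketch below shows how the :triple argument might be used to avoid allocating a new triple for every lookup. The helper name print-subjects is hypothetical, and ids is assumed to be a list of triple ids from the store in *db*.

```lisp
(defun print-subjects (ids)
  "Print the subject of each triple named in IDS, reusing one triple structure."
  (let ((scratch (make-triple)))
    (dolist (id ids)
      ;; The contents of SCRATCH are overwritten on every call.
      (get-triple-by-id id :triple scratch)
      (print (part->string (subject scratch))))))
```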

graph triple &optional upi
function

Sets and gets the graph UPI of a triple.

In addition to being returned, the optional upi argument will be filled in with the graph's UPI. If not passed in, a new UPI will be created.

make-triple
function
Creates and returns a new triple data structure. The subject, predicate, object and graph slots will have UPIs of type +rdf-unencoded-upi+ and the triple-id will be zero.

make-upi
function
Creates and returns a new UPI data structure.

object triple &optional upi
function

Sets and gets the object UPI of a triple.

In addition to being returned, the optional upi argument will be filled in with the object's UPI. If not passed in, a new UPI will be created.

part->concise part &optional use-namespaces-p
function
Return a concise, human-readable string representation of the part which can be either a UPI or a future-part.

part->string part &key format
function
Return a string representation of part (which can be a UPI or a future-part). The :format keyword argument controls the format and can be one of :ntriple, :long, or :concise, as with the :format argument to print-triple. The default is the value of (ag-property :default-print-triple-format).

part-value part
function
Return the value of the part. Part can be a UPI or a future-part.

predicate triple &optional upi
function

Sets and gets the predicate UPI of a triple.

In addition to being returned, the optional upi argument will be filled in with the predicate's UPI. If not passed in, a new UPI will be created.

subject triple &optional upi
function

Sets and gets the subject UPI of a triple.

In addition to being returned, the optional upi argument will be filled in with the subject's UPI. If not passed in, a new UPI will be created.

triple->string
nil
No documentation found

triple= triple1 triple2
function
Compares triple1 and triple2 and returns true if they have the same contents and false otherwise.

undelete-triple id &key db
function

Unsets the deleted flag from the triple whose id is id.

The :db keyword argument specifies the triple store in which the triple should be marked as undeleted.
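
As a small sketch of how delete-triple, deleted-triple-p, and undelete-triple fit together, the form below adds a triple, marks it as deleted, and then restores it. It assumes that a triple store is open in *db*; the URIs and the literal are placeholders.

```lisp
(let ((id (add-triple "<http://www.franz.com/simple#jans>"
                      "<http://www.franz.com/simple#lastName>"
                      "\"Aasman\"")))
  (delete-triple id)
  (print (deleted-triple-p id))    ; the triple is now marked as deleted
  (undelete-triple id)
  (print (deleted-triple-p id)))   ; the deleted flag has been cleared
```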

upi future-part &key errorp
function

Returns the UPI associated with the future-part future-part.

If the future-part uses namespaces, then calling upi will resolve the namespace mapping. An error will be signaled if upi is called and there is no namespace mapping defined. You can use the errorp keyword parameter to disable the error and return nil instead.

upi->value upi &key db
function

Decodes UPI and returns the value, the type-code and any extra information as multiple values. These three things can be interpreted as follows:

  • value - a string representing the contents of the UPI
  • type-code - an integer corresponding to one of the defined UPI types (see supported-types for more information). You can use type-code->type-name to see the English name of the type.
  • extra - Some encoded UPIs contain additional information which will be placed in extra if it is there. Examples include the language code or datatype of a literal.

Examples:

> (upi->value (value->upi 22 :byte))  
==> 22  
==> 18  
==> nil  
  
> (upi->value (upi (literal "hello" :language "en")))  
==> "hello"  
==> 3  
==> "en" 

Triple Parts: Resources, Literals, and Blank nodes.

Each triple has five parts (!): a subject, a predicate, an object, a graph, and a unique, AllegroGraph-assigned ID. In RDF, the subject must be a "resource", i.e., a URI or a blank node. The predicate must be a URI. The object may be a URI, a blank node, or a "literal". Literals are represented as strings with an optional type indicated by a URI or with a (human) language tag such as en or ja. "Blank nodes" are anonymous parts whose identity is only meaningful within a given triple-store.

Resources and literals can be denoted with plain Lisp strings in the syntax used in N-Triples files. However, this isn't entirely convenient, since the N-Triples syntax for literals requires quotation marks which then need to be escaped when writing a Lisp string. For instance, the literal whose value is foo must be written as the Lisp string "\"foo\"". Similarly -- though not quite as irksome -- URIs must be written enclosed in angle brackets. The string "http://www.franz.com/simple#lastName", passed as an argument to add-triple, will be interpreted as a literal, not as the resource indicated by the URI. To refer to the resource in N-Triples syntax you must write "<http://www.franz.com/simple#lastName>". Finally, literals with datatypes or language codes are even more cumbersome to write as strings, requiring both escaped quotation marks and other syntax.

To ease the pain of producing correctly formatted N-Triples strings we provide two functions, resource and literal, that take care of building a syntactically correct N-Triples string. (The ! reader macro, discussed below, can also be used to produce syntactically correct N-Triples strings.) Both functions return future-parts, which are described below.

literal thing &key language datatype
function

Create a new future-part with the provided values.

If provided, the language argument should be a valid RDF language tag.

If provided, the datatype can be a string or future-part specifying a resource. An overview of RDF datatypes can be found in the W3C's RDF concepts guide.

Only one of datatype and language can be used at any one time.

resource thing
function
Return the provided URI as a future-part.

Some examples (we will describe and explain the ! notation below):

(resource "http://www.franz.com/") =>  
  !<http://www.franz.com/>  
(literal "Peter") => !"Peter"  
(literal "10" :datatype  
  "http://www.example.com/datatypes#Integer") =>  
  !"10"^^<http://www.example.com/datatypes#Integer>  
(literal "Lisp" :language "EN") => !"Lisp"@en 

Another issue with using Lisp strings to denote literals and resources is that the strings must, at some point, be translated to the UPIs used internally by the triple-store. This means that if you are going to add a large number of triples containing the same resource or literal and you pass the resource or literal value as a string, add-triple will have to repeatedly convert the string into its UPI.

The functions intern-resource, intern-literal, and intern-typed-literal can be used to compute the UPI of a resource or literal, which can then be passed to add-triple, saving the cost of having to compute the UPI each time add-triple is called. Similarly, the function new-blank-node can be used to create the UPI of a new anonymous node for use as the subject or object of a triple.

blank-node-p upi
function

Returns true if upi is a blank node and nil otherwise. For example:

> (blank-node-p (new-blank-node))  
t  
> (blank-node-p (literal "hello"))  
nil 

intern-resource uri &key db upi
function
Intern a URI and return its ID.

intern-literal value &key language db upi
function
Intern an untyped literal, possibly with a language tag, and return its UPI.

intern-typed-literal value type &key db upi
function
Intern a typed literal and return its ID.

new-blank-node &key db upi
function
Create a new blank-node in the triple-store db and return the UPI. If a upi is not passed in with the :upi parameter, then a new UPI structure will be created.

with-blank-nodes blank-nodes &body body
macro

This convenience macro binds one or more variables to new blank nodes within the body of the form. For example:

(with-blank-nodes (b1 b2)  
  (add-triple b1 !rdf:type !ex:Person)  
  (add-triple b1 !ex:firstName "Gary")  
  (add-triple b2 !rdf:type !ex:Dog)  
  (add-triple b2 !ex:firstName "Abbey")  
  (add-triple b2 !ex:petOf b1)) 

The following function demonstrates the use of these functions. First we use intern-resource to avoid repeatedly translating the URIs used as predicates into numeric IDs and then use new-blank-node to create a blank node representing each employee and intern-literal and intern-typed-literal to translate the strings in the list employee-data into IDs. (We could also use literal to translate the strings to N-Triples strings which would then be parsed and interned by add-triple but this is more efficient.)

(defun add-employees (company employee-data)  
  (let ((first-name (intern-resource  
                      "http://www.franz.com/simple#firstName"))  
        (last-name (intern-resource  
                      "http://www.franz.com/simple#lastName"))  
        (salary (intern-resource  
                      "http://www.franz.com/simple#salary"))  
        (employs (intern-resource  
                      "http://www.franz.com/simple#employs"))  
        (employed-by (intern-resource  
                      "http://www.franz.com/simple#employed-by")))  
    (loop for (first last sal) in employee-data do  
         (let ((employee (new-blank-node)))  
           (add-triple company employs employee)  
           (add-triple employee employed-by company)  
           (add-triple employee first-name (intern-literal first))  
           (add-triple employee last-name (intern-literal last))  
           (add-triple employee salary  
                       (intern-typed-literal  
                        sal  
                        "http://www.franz.com/types#dollars"))))))  
    

Finally, you can compare triples and UPIs using the following functions:

future-part= part-1 part-2
function
Test if two future-parts are equal. Future-parts are equal if they both resolve to the same UPI or, if they cannot yet be resolved, if their strings are string=.

part= part-1 part-2
function
Compare any two things that can be coerced into UPIs as UPIs.

triple= triple1 triple2
function
Compares triple1 and triple2 and returns true if they have the same contents and false otherwise.

upi= upi-1 upi-2
function
Test if two UPIs are equal. UPIs are equal if they are eq (i.e., the same object) or if they are equal as octet arrays.

upip thing
function
Returns true if thing appears to be a UPI. Recall that every UPI is an octet array 12 bytes in length but not every length 12 octet array is a UPI. It is possible, therefore, that upip will return true even if thing is not a UPI.

upi-type-code upi
function
Returns the type of a UPI. The type is a one-byte tag that describes how the rest of the bytes in the UPI should be interpreted. Some UPIs are hashed and their representation is stored in a string-table. Other UPIs encode their representation directly (see upi->value and value->upi for additional details).

Datatype and predicate mapping

Most triple-stores work internally only with strings. AllegroGraph, however, can store a wide range of datatypes directly in its triples. This ability not only allows for huge reductions in triple-store size but also lets AllegroGraph execute range queries remarkably quickly. Use the supported-types function to see a list of datatypes that AllegroGraph can encode directly. The datatypes are specified as keywords or integers (with the integers being used internally). You can translate in either direction using type-code->type-name and type-name->type-code. You can add encoded-triples (i.e., triples some of whose parts are encoded UPIs rather than references to strings) either directly, using add-triple, or by setting up mappings between particular predicates or specific datatypes and then using one of the bulk loading functions. In the former case, you use value->upi; in the latter, you use datatype-mapping and predicate-mapping.

For example, we can add a triple that directly points to another by using the :triple-id AllegroGraph datatype. This provides a simple and light-weight reification mechanism but note that it is outside the boundaries of RDF. Here we add one triple that describes Gary's birthdate and another that says the first is wrong.

> (let ((new-id (add-triple !o:gary
                            !o:birthdate
                            (value->upi "1924-03-13" :date))))
    ;; new-id is the numeric id returned by add-triple
    (add-triple !o:data !o:incorrect (value->upi new-id :triple-id)))

As a second example, here is how to specify that the datatype xsd:double maps to an AllegroGraph :double-float and the predicate http://www.example.com/predicate#age maps to an :unsigned-byte:

> (setf (datatype-mapping "<http://www.w3.org/2001/XMLSchema#double>")  
        :double-float)  
:double-float  
  
> (setf (predicate-mapping "<http://www.example.com/predicate#age>")  
        :unsigned-byte)  
:unsigned-byte 

Now when you load an RDF file, AllegroGraph will examine each triple to see if it satisfies the mappings. When it does, an encoded-triple will be added to the triple-store. Depending on your needs, you can even tell AllegroGraph to only load encoded-triples and not worry about strings at all. This can provide tremendous space savings and also gives you the benefit of range queries.

datatype-mapping string &optional db
function

Returns the type encoding for string. String should be an XML Schema type designator. For example:

> (datatype-mapping "http://www.w3.org/2001/XMLSchema#unsignedByte")
:unsigned-byte

See (setf datatype-mapping) for more information.

predicate-mapping part &optional db
function

Returns the type encoding for part. Part should be the URI reference of a property. For example:

> (predicate-mapping !<http://www.example.org/property/height>)  
:double-float 

See (setf predicate-mapping) for more information.

supported-types
function

Returns a list of type names that can be used in value->upi to encode a string into an encoded UPI. To see the corresponding type code for a name, use the type-name->type-code function.

See the section on type mapping.

type-code->type-name code
function
Returns the type name associated with code. See type-name->type-code and supported-types for more details. The numbers can come from the type-names returned by supported-types or can correspond to RDF node types like blank nodes, resources and literals. The function returns nil if the code does not correspond to any type.

type-name->type-code name
function

Returns the type code associated with the name. See type-code->type-name and supported-types. The name can be one of the supported-types or one of:

  • :blank-node
  • :resource
  • :literal
  • :literal-language
  • :literal-short
  • :literal-typed

The function returns nil if there is no type-code corresponding to name.

upi->value upi &key db
function

Decodes UPI and returns the value, the type-code and any extra information as multiple values. These three things can be interpreted as follows:

  • value - a string representing the contents of the UPI
  • type-code - an integer corresponding to one of the defined UPI types (see supported-types for more information). You can use type-code->type-name to see the English name of the type.
  • extra - Some encoded UPIs contain additional information which will be placed in extra if it is there. Examples include the language code or datatype of a literal.

Examples:

> (upi->value (value->upi 22 :byte))  
==> 22  
==> 18  
==> nil  
  
> (upi->value (upi (literal "hello" :language "en")))  
==> "hello"  
==> 3  
==> "en" 

value->upi value encode-as &optional upi
function
Encodes value into a UPI of the type given by encode-as. The encode-as argument can be a type code or a type name (see supported-types and type-name->type-code for details). If the optional upi argument is not supplied, then a new UPI will be created. See upi->value for information on retrieving the original value back from an encoded UPI.
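
To make the round trip concrete, here is a minimal sketch that encodes a value, decodes it again with upi->value, and turns the numeric type code back into a name. The helper name describe-encoded is hypothetical.

```lisp
(defun describe-encoded (value type-name)
  "Encode VALUE as TYPE-NAME, decode it again, and return a readable summary."
  (multiple-value-bind (decoded type-code extra)
      (upi->value (value->upi value type-name))
    (list :value decoded
          :type (type-code->type-name type-code)
          :extra extra)))

;; Example call, using a type name that appears in the examples above.
(describe-encoded 22 :byte)
```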

Freetext indexing

AllegroGraph 2.2 and beyond support freetext indexing on the objects of triples whose predicates have been registered for indexing. In version 2.2, only triples added after a predicate has been registered will be indexed. Once indexed, triples can be found using a simple but robust query language. Freetext indexing support includes functions to register predicates and see which predicates are registered:

register-freetext-predicate predicate &key db
function

Register the predicate predicate for freetext indexing in the triple-store db. Once registered, any triples with this predicate will have the string of their object indexed by AllegroGraph's built-in freetext indexer. Use the db keyword argument to specify the triple-store in which to register the predicate. If left unspecified, the predicate will be registered in triple-store *db*.

 (register-freetext-predicate  
      !<http://www.w3.org/2000/01/rdf-schema#comment>) 

Note that you have to register a predicate before you add triples with that predicate. Older triples will not get indexed.

freetext-registered-predicates &key db
function
Return a list of predicates that have been registered for freetext indexing for the triple-store db. See register-freetext-predicate for more information.

freetext-predicate-p predicate &key db
function
Returns true if the predicate predicate is registered for freetext indexing in the triple-store db. See register-freetext-predicate for more information.
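
Since only triples added after registration are indexed, a helper along the following lines can be called before loading data. This is a sketch: the name ensure-freetext-predicate is hypothetical, the subject URI and comment text are placeholders, and a triple store is assumed to be open in *db* with the !-reader macro enabled.

```lisp
(defun ensure-freetext-predicate (predicate)
  "Register PREDICATE for freetext indexing unless it is already registered."
  (unless (freetext-predicate-p predicate)
    (register-freetext-predicate predicate))
  ;; Return the full list so the caller can check what is registered.
  (freetext-registered-predicates))

;; Register first, then add the triples that should be indexed.
(ensure-freetext-predicate
 !<http://www.w3.org/2000/01/rdf-schema#comment>)
(add-triple !<http://www.franz.com/simple#jans>
            !<http://www.w3.org/2000/01/rdf-schema#comment>
            (literal "Lives in Amsterdam"))
```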

Freetext indexing support also includes several functions to query a triple-store for triples that match an expression:

freetext-get-ids expression &key db
function

Returns a list of ids of the triples whose object contains text matching the expression.

(freetext-get-ids "amsterdam")  
(freetext-get-ids "\"Good girls go to heaven,  
                     bad girls go to Amsterdam\"")  
(freetext-get-ids '(and "amsterdam" "usa"))  
(freetext-get-ids '(and "amst?r*" (or "us?" "neth*")))
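
The ids can be turned back into triples with get-triple-by-id. The sketch below prints the id and the matching object text for each hit; it assumes that a triple store with registered freetext predicates is open in *db*.

```lisp
(dolist (id (freetext-get-ids '(and "amsterdam" "usa")))
  (let ((triple (get-triple-by-id id)))
    ;; part->string renders the object UPI as readable text.
    (format t "~a: ~a~%" id (part->string (object triple)))))
```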

freetext-get-triples expression &key db
function

Returns a cursor that iterates over the triples whose objects contain text that matches the expression.

(iterate-cursor (triple (freetext-get-triples "amsterd*"))  
    (print triple)) 

freetext-get-triples-list expression &key db
function
Returns all the triples that satisfy the expression. Be careful: if many triples match the pattern, then the list may be very large. See freetext-get-triples for the cursor variant.

freetext-get-unique-subjects expression &key db
function

Returns all the unique subjects in triples whose objects contain text matching expression. This is a useful function in Prolog queries. The following example is included in the tutorial.

(select (?person)  
  (lisp ?list  
     (freetext-get-unique-subjects '(and "collection" "people")))  
  (member ?person ?list)  
  (q ?person !rdfs:subClassOf !c:AsianCitizenOrSubject)) 

Freetext Query Expressions

Here is the informal grammar used to build query expressions:

pattern ::= string-pattern | composite-pattern
string-pattern ::= string | phrase-string
string ::= "char*"
char ::= ? (a wild card that matches any single character)
       | * (a wild card that matches any sequence of characters)
       | \" (an escaped ")
       | any other character (denotes itself)
phrase-string ::= "\"this is a phrase\"" (no ? and * allowed)
composite-pattern ::= (and pattern*) | (or pattern*)

The !-reader macro and future-parts

When working with the triple-store at the REPL (the lisp listener) it's nice to have a more concise way to refer to resources and literals than with calls to resource, literal or the part interning functions. It's also handy to be able to abbreviate the many long URIs with a common prefix such as http://www.w3.org/2000/01/rdf-schema#. Namespaces and the !-reader macro provide a concise syntax for both resources and literals.

The first thing the !-reader macro allows you to do is write N-Triples strings without quotation marks (except for those required by the N-Triples syntax itself!). Thus instead of writing:

"<http://www.franz.com/>"  
"\"foo\""  
"\"foo\"^^<http://www.w3.org/2000/01/rdf-schema#integer>"  
"\"foo\"@en" 

you can simply write:

!<http://www.franz.com/>  
!"foo"  
!"foo"^^<http://www.w3.org/2000/01/rdf-schema#integer>  
!"foo"@en 

In addition, the !-reader macro uses namespaces to abbreviate long URIs. Use the register-namespace function to assign an abbreviation to any prefix used in URIs. For instance you can register s as an abbreviation for the URI prefix http://www.franz.com/simple# like this:

(register-namespace "s" "http://www.franz.com/simple#") 

Then you can use that prefix with the !-reader macro to write URIs starting with that prefix:

!s:jans => !<http://www.franz.com/simple#jans> 

You have probably noticed that the !-reader macro does not seem to be doing anything:

!"hello" => !"hello" 

This is because ! is converting the string "hello" into what AllegroGraph calls a future-part and the future-part prints itself using the !-notation. If we describe the future-part, then we will see all of the additional structure:

agraph> (describe !"hello")  
!"hello" is a structure of type future-part.  It has these slots:  
 type               :literal  
 value-prefix       nil  
 value-fragment     "hello"  
 extra-prefix       nil  
 extra-fragment     nil  
 value              "hello"  
 extra              nil  
 extra-done-p       t  
 upi                #(5 0 0 0 0 0 111 108 108 101 104 7)  
 triple-db  
    #<triple-db  
      /repository/other/agraph/lisp/tests/scratch/butter, open @  
      #x128f7922> 

Now it's clear that AllegroGraph has parsed the string and performed the computations to determine the part's UPI.

future-parts are called future-parts because they cache some of the information (e.g., the namespace prefix) and wait to resolve until the namespace mappings are available, which may be in the future. Before we finish describing resolution, however, here are some examples. First, literals:

The story for resources is very similar:

Future-parts are resolved when it is necessary to determine their UPI. If the part uses a namespace (e.g., is something like !a:b), then the namespace will be resolved first. It is an error to try to determine a part's UPI if the necessary namespace mapping has not been registered. Once the namespace of a part is resolved, it will not be resolved again (during the current Lisp session). After namespace resolution, a future-part is equivalent to a particular string which can be interned into a triple-store.
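
The resolution rules above can be modelled in a few lines of plain Lisp. This is only a sketch of the behaviour described here, not AllegroGraph's implementation: *toy-namespaces*, toy-future-part and resolve-toy-part are hypothetical names, and the model resolves to a URI string rather than to a UPI.

```lisp
;; A toy model of deferred namespace resolution (all names hypothetical).

(defvar *toy-namespaces* (make-hash-table :test #'equal)
  "Stand-in for the table of registered namespace mappings.")

(defstruct toy-future-part
  prefix     ; namespace abbreviation, e.g. "s"
  fragment   ; local name, e.g. "jans"
  value)     ; full URI string once resolved, otherwise NIL

(defun resolve-toy-part (part)
  "Return PART's full URI, resolving and caching it on first use."
  (or (toy-future-part-value part)
      (let ((uri (gethash (toy-future-part-prefix part) *toy-namespaces*)))
        (unless uri
          ;; mirrors the rule that an unregistered namespace is an error
          (error "No namespace registered for prefix ~s"
                 (toy-future-part-prefix part)))
        (setf (toy-future-part-value part)
              (concatenate 'string uri (toy-future-part-fragment part))))))
```

In this model a part can be created before its prefix is registered, resolving it too early signals an error, and once it has been resolved a later change to the mapping does not affect it.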

Future-parts make working with AllegroGraph much simpler but they do contain some machinery and can be confusing. Remember that you can always tell what is happening by using the Lisp describe function.

The following are the functions used for enabling the ! reader macro and for managing namespaces.

clear-namespaces &key keep-standard-namespaces
function
Delete all existing namespace mappings. If :keep-standard-namespaces is true (the default) then the namespace mappings in *standard-namespaces* will not be removed.

collect-namespaces &optional filter
function
Returns a list of namespace mappings.

display-namespaces &optional filter
function
Print out all namespace mappings in a human-readable format.

dump-namespaces &optional file
function
Dump all registered namespaces as calls to register-namespace that can be used to recreate the original namespaces if needed later. If the optional file argument is provided, output is written to that file. Otherwise the output is sent to *standard-output*.

namespace-redefinition-error
condition
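Signaled as a continuable condition by register-namespace when a mapping is defined for an existing prefix and errorp is true.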

lookup-namespace namespace-prefix
function
Returns the uriref associated with namespace-prefix or nil if there is no association.

map-namespaces fn &optional filter
function
Applies fn to each namespace mapping. Fn must be a function of two arguments: the prefix and the uri.

register-namespace namespace-prefix uri-reference &key errorp
function
Create a mapping between namespace-prefix and uri-reference, both of which should be strings. If the errorp keyword argument, which defaults to the value of (ag-property :error-on-redefine-namespace), is true, then defining a mapping for an existing prefix will signal a continuable namespace-redefinition-error condition.

register-standard-namespaces
function
Add standard namespaces such as rdf, rdfs and owl. See the variable *standard-namespaces* for more details.

*standard-namespaces*
variable

*standard-namespaces* is a list of (namespace-prefix uri-reference) pairs representing namespace mappings. For example:

'(("rdf" "http://www.w3.org/1999/02/22-rdf-syntax-ns#")  
  ("rdfs" "http://www.w3.org/2000/01/rdf-schema#")  
  ("owl" "http://www.w3.org/2002/07/owl#")) 

It is used by the function register-standard-namespaces to create standard mappings.

Indexing

When triples are first added to the triple-store, queries are performed by simple linear scanning. To avoid the cost of linear scans, triples can be indexed so that queries will be very fast. An AllegroGraph triple-store can have up to six different index flavors. Whether you need all six will depend on the sort of queries that you need to run. Each flavor is named by the order in which the triples are sorted. For example, if triples are sorted first on predicate, then object, subject, graph, and finally id, then the index flavor will be posgi. Here are the flavors AllegroGraph uses and the sorts of queries that they help optimize:

spogi    get-triples _s_, ---, ---, ---  
         get-triples _s_, _p_, ---, ---  
         get-triples _s_, _p_, _o_, ---  
  
posgi    get-triples ---, _p_, ---, ---  
         get-triples ---, _p_, _o_, ---  
  
ospgi    get-triples ---, ---, _o_, ---  
         get-triples _s_, ---, _o_, ---  
  
gspoi    get-triples _s_, ---, ---, _g_  
         get-triples _s_, _p_, ---, _g_  
         get-triples _s_, _p_, _o_, _g_  
  
gposi    get-triples ---, _p_, ---, _g_  
         get-triples ---, _p_, _o_, _g_  
  
gospi    get-triples ---, ---, _o_, _g_  
         get-triples _s_, ---, _o_, _g_ 

Unless you tell it otherwise, AllegroGraph will assume that each new triple-store should have all six indices. You can change this with the :with-indices argument to create-triple-store and by using add-index and drop-index to manage indices explicitly.
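
The flavor table above can be read as a lookup from the bound positions of a query to the flavor that serves it. The following sketch encodes that table in plain Lisp. *flavor-table* and flavor-for are hypothetical names used only for illustration, and the sketch does not describe how AllegroGraph actually chooses an index at query time.

```lisp
;; Hypothetical lookup mirroring the table of flavors and query patterns.

(defparameter *flavor-table*
  '((:spogi (s) (s p) (s p o))
    (:posgi (p) (p o))
    (:ospgi (o) (s o))
    (:gspoi (s g) (s p g) (s p o g))
    (:gposi (p g) (p o g))
    (:gospi (o g) (s o g)))
  "Each entry is a flavor followed by the sets of bound positions it serves.")

(defun flavor-for (&key s p o g)
  "Return the flavor listed for the bound positions, or NIL if none is listed."
  (let ((bound (append (and s '(s)) (and p '(p)) (and o '(o)) (and g '(g)))))
    (first (find-if (lambda (entry)
                      (member bound (rest entry) :test #'equal))
                    *flavor-table*))))
```

For example, (flavor-for :p t :o t) yields :posgi, so a store that is only ever queried by predicate and object would be a candidate for dropping the other flavors. Patterns that the table does not list, such as a query that binds only the graph, yield nil in this sketch.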

When to index

Regardless of which flavors you are using, you must still reckon with when and how to build indices. The function index-new-triples builds indices of just the currently unindexed triples and takes time proportional to the number of triples to be indexed. However at query time each index built with index-new-triples must be queried so calling index-new-triples too often will reduce the benefits of building the indices.

To get rid of the performance drag of having too many indices, you can use index-all-triples to build a unified index of all the triples in the triple-store. While this function can take time proportional to the total number of triples in the store, it can take advantage of the indices already built by index-new-triples to speed up indexing. In general the best strategy is to use index-new-triples to build indices after each large chunk of triples is added to the store and to periodically merge the indices with index-all-triples.
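
The strategy just described might be sketched as follows. The function load-in-batches, its add-batch argument and the merge-every interval of 10 are all hypothetical. The sketch assumes it runs in a package where index-new-triples and index-all-triples are accessible and that calling them with no arguments (so that they act on *db*) is what you want. The two indexing steps are passed as parameters so that the control flow can be tried out with stand-in functions.

```lisp
;; A sketch of "index after each chunk, merge periodically".
;; LOAD-IN-BATCHES and its parameters are illustrative only.

(defun load-in-batches (batches add-batch
                        &key (merge-every 10)
                             (index-new 'index-new-triples)
                             (index-all 'index-all-triples))
  "Call ADD-BATCH on each element of BATCHES, indexing as we go.
Returns the number of batches loaded."
  (let ((since-merge 0)
        (loaded 0))
    (dolist (batch batches)
      (funcall add-batch batch)
      (funcall index-new)              ; index only the new triples
      (incf loaded)
      (when (>= (incf since-merge) merge-every)
        (funcall index-all)            ; periodic full merge
        (setf since-merge 0)))
    ;; Merge whatever was indexed since the last full merge.
    (when (plusp since-merge)
      (funcall index-all))
    loaded))
```

The final merge is a design choice of this sketch rather than a requirement; if further batches are expected soon, it may be cheaper to skip it and let the next periodic merge do the work.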

add-index flavor &key db location
function

Adds a new index flavor to a triple-store.

The actual index data structure will be created the next time that AllegroGraph indexes the triple-store (either by a call to index-new-triples or index-all-triples or automatically through the management policy of the triple-store.)

The :db keyword argument specifies the triple-store to which to add the index and will be *db* unless otherwise specified.

The :location keyword argument controls the physical location of the index. It can be left unspecified or set to the location of a directory. If left unspecified, the index will be located in the same directory as the triple-store (see data-directory). If used, it should name an empty directory.

Flavor should be a valid index name (e.g., :posgi). See index-flavors in the reference guide for additional details.

add-indices flavors &key db location
function

Adds several indices to a triple-store at once.

Flavors should be a list of valid index flavors. See add-index for additional details.

add-standard-indices &key db
function
Adds the indices in the ag-property :standard-indices to a triple-store using add-indices.

average-index-fragment-count &key db
function

Returns the average index-fragment-count for a triple-store. This is the total number of index chunks divided by the number of indices. The higher the chunk count, the more work each query must do to find results.

The keyword argument :db can be used to specify the triple-store to use. It defaults to *db*.

drop-index flavor &key db
function

Remove an index from a triple-store.

The flavor argument should be a valid index flavor name that is currently an index of the triple-store.

The keyword argument :db controls the triple-store from which the index will be removed. It will default to *db* unless otherwise specified.

Flavor should be a valid index name (e.g., :posgi). See index-flavors in the reference guide for additional details.

drop-indices flavors &key db
function

Remove several indices from a triple-store at once.

Flavors should be a list of valid index flavors that are indices of the triple-store. See drop-index for more details.

index-all-triples &key db wait
function

Create a unified index of all the triples in the triple store, including triples previously indexed. The time taken to index all triples includes the time taken by index-new-triples plus the time to merge all index chunks. The first step is proportional to the number of unindexed triples multiplied by the logarithm of the same number (i.e., if N is the number of new triples, then the time is O(N log N)). The merge step is proportional to the total number of triples multiplied by the logarithm of the number of chunks. If you have a very large triple store to which you have added a fairly small number of new triples, you can use index-new-triples to get most of the benefits of indexing at much lower time cost.

Indexing can run as a background task in either the same Lisp process or on multiple remote Lisp processes (see the clustering documentation for more details). Index-all-triples returns the task id of the indexing task that it creates.

The wait keyword argument controls whether index-all-triples returns immediately or waits for all indexing and merging to complete. It defaults to true. If it is false, you can use the id of the task returned to see if indexing has completed.

See the variable *maximum-indexing-sort-chunk-size* for information on controlling the indexing process.

index-coverage-percent &key db
function
Returns the average proportion of triples that are indexed.

index-new-triples &key db wait
function

Index the triples that have been added to the triple store since the last time indices were built. The time taken to index new triples is proportional to the number of unindexed triples multiplied by the logarithm of the same number (i.e., if N is the number of new triples, then the time is O(N log N)).

However, the new indices will not be merged with the older indices which will cause queries to execute more slowly. To maximize query performance you will want to use index-all-triples to build a unified index.

Indexing can run as a background task in either the same Lisp process or on multiple remote Lisp processes (see the clustering documentation for more details).

The wait keyword argument controls whether index-new-triples returns immediately or waits for all indexing to complete. It defaults to true. If it is false, then index-new-triples returns the id of the indexing task that is created.

See the variable *maximum-indexing-sort-chunk-size* for information on controlling the indexing process.

index-status-report &key db stream verbose summary indices
function
Prints a summary of index information to stream.

indexing-status &key db
function

Returns the status of the current triple-store as regards indexing. This can be one of:

  • :scheduled

  • :running

  • :needed

  • :idle

indexing-needed-p &key db
function
Returns true if there are unindexed triples in the triple store.

*maximum-indexing-sort-chunk-size*
variable
This controls the maximum number of records that are sorted at a time during index merging. For example, suppose that you loaded 20 million new triples. The indexes for these triples must be merged with your existing indexes. Due to memory constraints, it may not be possible to do this merge in one step, so instead it is done in chunks of *maximum-indexing-sort-chunk-size*. The best value depends on how much memory is available. The initial value is good for machines with 1-2GB of RAM. If you have significantly more memory than this, you should consider larger values, such as (expt 2 24) for a machine with 16GB of RAM.

merge-new-triples &key db wait
function
Merge-new-triples combines smaller index chunks into larger ones without trying to merge all of the chunks into one. Suppose, for example, you have loaded a triple-store with 100 million triples and then called index-all-triples. Now suppose you add several batches of a few thousand triples and call index-new-triples after each batch. At this point each index flavor will consist of many chunks: one big one for the initial load and many small ones. A full merge will take significant time and computational resources, but all of these chunks are inefficient for querying. This is when merge-new-triples is a useful half-way measure. It will merge the small chunks into one (which should be quite fast). This will improve query performance without requiring a complete merge.

triple-count &key db
function
Returns the number of triples in a triple store. The :db keyword argument specifies the triple store to use, either by name or a triple store object. It defaults to the value of *db*.

triple-store-indices &key db verbose
function

Returns a list of information on the indices of a triple-store.

The :db keyword argument specifies the triple-store on which to report.

The :verbose keyword argument controls how much information to return. If verbose is nil, then only the flavor-names are returned. If it is true, then a list of lists is returned where each sublist starts with an index object and continues with ranges of the unindexed triples.

unschedule-indexing &key db
function
Unschedules any pending indexing related tasks (i.e., index building or index merging tasks). Returns true if the tasks were unscheduled and nil if unscheduling was not possible. In the case where indexing could not be unscheduled, the reason will be returned as a second value. This can be one of nil (there were no tasks) or :running (the tasks are already running and cannot be stopped).

Querying the triple-store

You can get triples out of a triple-store as a list or a cursor. The list structure is convenient but unwieldy if your query returns millions of triples (since every triple must be returned before you will see any of them). A cursor is like a database cursor from the RDBMS-world. It lets you navigate through the results of your query one at a time.

Cursors

Cursors supply the functions next, row, and next-p for basic forward iteration. For convenience we include next-row which advances the cursor and returns the next row immediately. Cursors reuse the triple data-structure as they move through the result set so if you want to accumulate triples, make sure to use the copy-triple function.
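
To see why copying matters, here is a toy model of a cursor that reuses one data structure for every row. It is plain Lisp and does not use AllegroGraph at all: make-toy-cursor and collect-toy are hypothetical names, rows are modelled as three-element vectors, and copy-seq plays the role that copy-triple plays for real triples.

```lisp
;; A toy cursor that overwrites one shared vector on every step
;; (all names hypothetical; a model only).

(defun make-toy-cursor (rows)
  "Return a closure that yields each of ROWS through one shared vector."
  (let ((shared (make-array 3))
        (remaining rows))
    (lambda ()
      (when remaining
        (replace shared (pop remaining))   ; overwrite in place
        shared))))

(defun collect-toy (cursor &key (transform #'copy-seq))
  "Collect every row of CURSOR, applying TRANSFORM to each one."
  (loop for row = (funcall cursor)
        while row
        collect (funcall transform row)))
```

Collecting with :transform #'identity returns a list whose elements are all the same vector, holding only the last row. Collecting with the default copy-seq returns one distinct vector per row, which is the effect you want from copy-triple.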

copy-triple triple
function
Copy a triple. This function is useful if you want to keep a reference to a triple obtained from a cursor returned by query functions such as get-triples since the cursor reuses the triple datastructure for efficiency reasons.

next-p cursor
function
Return true if and only if cursor has at least one more triple in it. Note that row, next and next-p are lower-level cursor manipulation routines. You may be better served by using collect-cursor, count-cursor, map-cursor and iterate-cursor.

next cursor
function
Moves cursor forward to the next triple in the collection. Note that row, next and next-p are lower-level cursor manipulation routines. You may be better served by using collect-cursor, count-cursor, map-cursor and iterate-cursor.

next-row cursor
function

Returns the next triple from the cursor. The actual triple object returned is always the same (eql) object. If you want to hold onto a triple for use after you advance the cursor, use the function copy-triple to make a copy of the value returned by next-row.

Note that row, next, next-row and next-p are lower-level cursor manipulation routines. You may be better served by using collect-cursor, count-cursor, iterate-cursor and map-cursor.

row cursor
function
Returns the triple that cursor is currently pointing at. Note that row, next and next-p are lower-level cursor manipulation routines. You may be better served by using collect-cursor, count-cursor, map-cursor and iterate-cursor.

*cursor-time-to-live*
variable

Cursors are guaranteed to be available at least *cursor-time-to-live* seconds after each row operation. If left unused for more than this time, the AllegroGraph manager will reclaim the resource used by the cursor. Once reclaimed, the cursor will expire and any reference you have to it will be unusable.

*cursor-time-to-live* defaults to a value of most-positive-fixnum because it can be unsettling to have cursors disappear out from under you. If you are in a situation where you are doing much simultaneous loading and querying, then you may want to experiment with setting it to a smaller value such as 300 (i.e., 5 minutes).

There are several natural cursor idioms; the following functions handle many of them. We suggest building your own functions using these as building blocks since there may be internal optimizations possible only through these.

collect-cursor cursor &key transform
function
Iterate over the cursor and collect a list of its triples. The :transform keyword can be used to modify the triples as they are collected. It defaults to the copy-triple function but you can use any function that takes one triple as an argument. (collect-cursor reuses the row argument so make sure that you use copy-triple in your own code if necessary).

count-cursor cursor
function
Returns the number of triples remaining in a cursor and exhausts the cursor in the process.

iterate-cursor (var cursor &key count) &body body
macro

Iterate over the triples in cursor binding var to each triple. Use the :count keyword to limit the maximum number of triples that iterate-cursor visits. The binding to var is the same EQ triple in each iteration. Make sure to use copy-triple if you are retaining any of the triples that you visit.

If the iteration finishes normally, iterate-cursor returns the number of triples visited. If the body returns prematurely, iterate-cursor returns whatever is returned.

map-cursor count fn cursor &rest args
function

Iterate over the triples in cursor while applying the function fn to each one. Use the count argument to limit the maximum number of triples that map-cursor visits. In each iteration fn is applied to the current triple and to the list of arguments in args. map-cursor reuses the triple argument as it iterates so make sure to use copy-triple if you are retaining any of the triples that you visit.

The function returns the number of triples visited.

The 2.0 function do-cursor has been deprecated in 2.0.1 in favor of the function map-cursor and the macro iterate-cursor.

The basic query function is get-triples, which takes any or all of the subject, predicate, object, and graph you want to match and returns a cursor that can be used with the functions next-p and next-row to iterate over the matching triples. Get-triples can also do range queries over encoded triples and apply additional filters. For efficiency the cursor reuses the data structure used to hold the current triple; thus you must use the function copy-triple if you want to hold on to a triple for use after you advance the cursor with next-row.

The function delete-triples deletes triples from the triple store, using the same query syntax as get-triples.

count-query &key s p o g s-end p-end o-end g-end db
function
Return the count of the number of triples in db using only the information stored in the indices. Unindexed triples will not be included.

estimated-count-query &key s p o g s-end p-end o-end g-end db
function
Return an estimated count of the number of triples in db using only the information stored in the indices. Unindexed triples will not be included. The estimate can be off by as much as twice the triple store's metaindex-skip-size for each index chunk involved.

get-triple &rest args &key s p o g db return-encoded-triples return-non-encoded-triples include-deleted filter triple
function
Returns the first triple found matching the search pattern specified by s, p, o and g. You may also specify that only encoded or unencoded triples be searched, whether or not to search deleted triples, and whether or not a filter should be used. A new triple will be created unless you pass one in using the triple keyword parameter.

get-triples &key s p o g s-end p-end o-end g-end db include-deleted return-encoded-triples return-non-encoded-triples filter indexed-triples-only-p use-reasoner
function

Query a triple store for triples matching the given subject, predicate, object, and graph. These can be specified either as UPIs, future-parts, strings in N-Triples format, or the wildcard nil. Returns a cursor object that can be used with next-p and next-row.

The following example finds every triple whose subject is !ub:Kevin.

> (add-triple !ub:Kevin !ub:isa !"programmer")  
8523645  
> !ub:Kevin  
!<http://www.w3.org/1999/02/22-rdf-syntax-ns#Kevin>  
> (get-triples :s !ub:Kevin)  
#<row-cursor #<triple-record-file @ #x13c1a87a> 2019 [1 - 2018] @  
  #x14942bba>  
> (print-triples *)  
<http://www.w3.org/1999/02/22-rdf-syntax-ns#Kevin>  
  <http://www.w3.org/1999/02/22-rdf-syntax-ns#isa>  
  "programmer" . 

The function get-triples takes the following arguments:

  • s, p, o, g - controls the actual query pattern. Use nil as a wildcard.

  • s-end, p-end, o-end, g-end - Allows for range queries over encoded triples (triples whose parts are encoded UPIs). Each ?-end parameter may only be used in conjunction with its corresponding starting value parameter.

  • The :db keyword argument specifies the triple store to query, defaulting to the value of *db*.

  • filter - if supplied, the filter should be a predicate of one parameter, a triple. If the filter function returns nil, then the triple will not be included in the result set.

  • include-deleted - if set to true, then triples that are flagged as deleted will not be filtered out by the cursor.

  • return-encoded-triples - If true, then get-triples returns triples with encoded parts; i.e., triples that use directly encoded UPIs rather than strings stored in the dictionary. This is set to true unless overridden.

  • return-non-encoded-triples - if true, then get-triples will return triples all of whose UPIs are stored as strings. This is set to true unless overridden.

The return value is a cursor object. The functions next-row and next-p operate on this cursor.

get-triples-list &rest args &key s p o g s-end p-end o-end g-end limit db return-encoded-triples return-non-encoded-triples cursor if-fewer if-more use-reasoner
function

Query a triple store for triples matching the given subject, predicate, object, and graph, specified either as part IDs (UPIs), future-parts, strings in N-Triples format, or the wildcard nil. Returns a list of matching triples. The get-triples-list function supports a multitude of options:

  • db - This keyword argument specifies the triple store to query, defaulting to the value of *db*.

  • s, p, o, g - controls the actual query pattern. Use nil as a wildcard.

  • s-end, p-end, o-end, g-end - Allows for range queries over encoded triples (triples whose parts are encoded UPIs). Each ?-end parameter may only be used in conjunction with its corresponding starting value parameter.

  • cursor - if cursor is supplied then AllegroGraph will use it to return more triples rather than building a new cursor.

  • if-fewer - controls the behavior of get-triples-list when fewer than limit triples match the query. Possible values are (defaults to nil):

    • nil - return list of actual results

    • :error - signal error if fewer results than limit

    • other -- return other instead of short list

  • if-more - controls what happens when more results than limit are available. Possible values are (defaults to :cursor):

    • nil - return list of limit results

    • :error - signal error if more results than limit

    • :cursor - return the list of limit results and a second value of a cursor that will yield the remaining results

    • other - return other instead of truncated list

  • limit - This keyword argument can be used to place a cap on the maximum number of triples returned. It defaults to the value of the special variable *get-triples-list-limit*. If set, then get-triples-list will return no more than limit triples. If it is nil, then get-triples-list will return all of the triples found by the query. Warning: setting limit to nil can cause get-triples-list to return every triple in the triple-store; this can be a very bad thing over a serial connection.

  • return-encoded-triples - If true, then get-triples-list returns triples with encoded parts; i.e., triples that use directly encoded UPIs rather than strings stored in the dictionary. This is set to true unless overridden.

  • return-non-encoded-triples - if true, then get-triples-list will return triples all of whose UPIs are stored as strings. This is set to true unless overridden.

  • use-reasoner - if true, then the RDFS++ reasoner will be used to return inferred triples. If left unspecified, this will take on the value of *use-reasoner*. Note that most of the arguments to get-triples-list do not make sense when reasoning is turned on. AllegroGraph will signal an error if you try to combine reasoning with other parameters that it cannot use.
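
The interaction of limit, if-fewer and if-more can be easier to follow as code. The following is a plain-Lisp model over an ordinary list of matches, written from the option descriptions above. It is not AllegroGraph's implementation: toy-triples-list is a hypothetical name, the default limit of 3 is arbitrary, and the "cursor" returned as a second value is simply the remaining tail of the list.

```lisp
;; A model of the LIMIT / IF-FEWER / IF-MORE options (hypothetical name).

(defun toy-triples-list (matches &key (limit 3) if-fewer (if-more :cursor))
  "Apply the limit rules described above to the plain list MATCHES."
  (let ((n (length matches)))
    (cond ((or (null limit) (= n limit))
           matches)                                ; no cap, or exact fit
          ((< n limit)
           (case if-fewer
             ((nil) matches)                       ; the short list itself
             (:error (error "Only ~d results, fewer than limit ~d" n limit))
             (t if-fewer)))                        ; caller-supplied value
          (t
           (case if-more
             ((nil) (subseq matches 0 limit))      ; silently truncate
             (:error (error "More than ~d results" limit))
             (:cursor (values (subseq matches 0 limit)
                              (nthcdr limit matches)))
             (t if-more))))))                      ; caller-supplied value
```

With five matches and the defaults shown, the model returns the first three matches and, as a second value, the two that remain, which corresponds to continuing with the returned cursor.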

pprint-object part &key maximum-depth format db
function
Print information about part down to a maximum depth of maximum-depth using the format format. Triples for which part is an object and their children will be printed. See part->string and (ag-property :default-print-triple-format) for information about part printing. See pprint-subject to display information based on subjects rather than objects.

pprint-subject part &key maximum-depth format db
function
Print information about part down to a maximum depth of maximum-depth using the format format. Triples for which part is a subject and their children will be printed. See part->string and (ag-property :default-print-triple-format) for information about part printing. See pprint-object to display information based on objects rather than subjects.

triple-exists-p s p o &key g db filter
function
Returns true if a triple matching s, p and o (and optionally g) can be found in the designated triple-store. If left unspecified, the triple-store designated by *db* will be searched. This is handy when you care only about the presence of a triple and not its contents. If you want to use the triple, then use get-triple instead.

*get-triples-list-limit*
variable
The default number of triples to return from get-triples-list. If nil, then all triples will be returned.

AllegroGraph and Prolog

With pure Lisp as the retrieval language, you use a combination of functional and procedural approaches to query the database. With Prolog, you can specify queries in a much more declarative manner. Allegro CL Prolog and AllegroGraph work very well together. The tutorial provides an introduction to using Prolog and AllegroGraph together.

The q functor

The main interface to Prolog is the functor q (for query). It is analogous to get-triples. In its simplest form it has three arguments for subject, predicate and object. It iterates over all of the possible matches in the triple-store, binding its logic variables to each result in turn.

> (?- (q ?x ?y ?z))  
?X = 0  
?Y = 1  
?Z = 2     
... 

As is usual with Prolog, you can stop the iteration by typing a period (' . ').

the select macros

You can combine multiple q clauses into a query using one of the family of select macros. These evaluate a list of Prolog clauses and then do something with the solutions found. There are three dimensions in which this family of macros differ:

  • whether Prolog variables are replaced by their string values (select) or by their UPIs (the select0 variants);

  • whether all solutions or only distinct solutions are produced (the -distinct variants);

  • whether the solutions are returned as a list or passed one at a time to a callback (the /callback variants).

Each macro takes as its first argument a template. Often the template is a single var or a simple list of vars. But in general the template may be a tree composed of Lisp data and Prolog variables.

The select macros will find each solution to the list of supplied Prolog clauses. They will then build up a fresh instance of the template tree with each Prolog var in the tree replaced by the variable's value in the solution. For example:

> (select ((:name ?name) (:age ?age))  
    (q ?x ?isa !ex:person)  
    (q ?x ?hasname ?name)  
    (q ?x ?hasage ?age)) 

would return a list of results like

(((:name "Gary") (:age 12))  
 ((:name "Xaviar") (:age 56))  
 ...  
) 

You can find other examples of using select in the Using Prolog with AllegroGraph tutorial.
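
The template step can be pictured with a small stand-alone sketch. Here each solution is represented as an association list from Prolog variable symbols to values, which is an assumption made purely for illustration; instantiate-template and instantiate-all are hypothetical names and this is not how the select macros are implemented.

```lisp
;; A model of template instantiation (hypothetical names).

(defun instantiate-template (template solution)
  "Return a fresh copy of TEMPLATE with each variable in the
association list SOLUTION replaced by its value."
  (sublis solution (copy-tree template)))

(defun instantiate-all (template solutions)
  "Instantiate TEMPLATE once for each solution in SOLUTIONS."
  (mapcar (lambda (solution)
            (instantiate-template template solution))
          solutions))
```

Given the template ((:name ?name) (:age ?age)) and a solution that binds ?name to "Gary" and ?age to 12, the model produces ((:name "Gary") (:age 12)), matching the shape of the results shown above.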

select template &body clauses
macro
Evaluate the Prolog clauses and return a list of all solutions. Each solution is a copy of the template (which is a tree) where each Prolog variable is replaced by its string value. See select0 if you want UPIs rather than strings.

select0 template &body clauses
macro
Evaluate the clauses and return a list of all solutions. Solutions consist of fresh copies of the template (which is a tree) where each Prolog variable in the tree is replaced by its UPI value in the solution. If you are interested in part-names rather than raw UPIs, see select.

select/callback (&rest vars) callback &body clauses
macro
Evaluate the Prolog clauses and call callback once for each solution. Solutions consist of a fresh copy of the template with each var replaced by the string value of the Prolog variable in the solution. The callback argument is evaluated. See select0/callback if you want UPIs rather than strings.

select0/callback (&rest vars) callback &body clauses
macro
Evaluate the Prolog clauses and call callback once for each solution. Each solution is a copy of the template with each Prolog variable replaced by its UPI. The callback argument is evaluated. See select/callback if you want strings rather than UPIs.

select-distinct template &body clauses
macro
Evaluate the clauses and return a list of all distinct solutions. Each solution is a copy of the template where each Prolog variable in the tree is replaced by its string value. See select0-distinct if you want UPIs rather than strings.

select0-distinct template &body clauses
macro
Evaluate the Prolog clauses and return a list of all distinct solutions. Each solution is a copy of the template (which is a tree) where each Prolog variable is replaced by its UPI value. See select-distinct if you want strings rather than UPIs.

select-distinct/callback (&rest vars) callback &body clauses
macro
Evaluate the Prolog clauses and call callback once for each distinct solution. Each solution is a fresh copy of the template with each var replaced by the string value of the Prolog variable in the solution. The callback argument is evaluated. See select0-distinct/callback if you want UPIs rather than strings.

select0-distinct/callback (&rest vars) callback &body clauses
macro
Evaluate the Prolog clauses and call callback once for each distinct solution. Each solution is a fresh copy of the template with each var replaced by the UPI value of the Prolog variable in the solution. The callback argument is evaluated. See select-distinct/callback if you want strings rather than UPIs.

Prolog and range queries

You can make range queries using the same Prolog q functor that you use for non-range queries. Range queries are supported on either the object or the graph field (but not on both simultaneously in the same clause). What follows is a list of all nine possibilities for q. Each token in the functor may be a UPI literal, a Prolog variable (possibly anonymous) or nil. We use a - prefix on a token to indicate that it is a placeholder that cannot be used as a variable, a + prefix to indicate that the token must be supplied, and a ? prefix to indicate that the token may be either.

(q ?s ?p ?o)  
(q ?s ?p ?o ?g)  
(q ?s ?p ?o ?g ?i)  
(q ?s ?p (-o +o1 +o2))  
(q ?s ?p (-o +o1 +o2) ?g)  
(q ?s ?p (-o +o1 +o2) ?g ?i)  
(q ?s ?p ?o (-g +g1 +g2))  
(q ?s ?p ?o (-g +g1 +g2) ?i)  
(q ?s ?p ?o (-g +g1 +g2) ?i ?triple) 

Suppose, for example, that we use AllegroGraph to record the velocities during a mythical 1000-km trip:

;;;;;;;;;;;;;;;;;;;; A 1000 KM trip.  
  
(in-package :triple-store-user)  
  
(create-triple-store "My1000KmTrip" :if-exists :supersede)  
  
(register-namespace "t" "http://www.me.disorg#"  
                    :errorp nil)  
  
;; add triples describing a 1000-km trip  
(loop with km = 0           ; distance in km  
    with time = 0           ; time in seconds  
    as kmh = (random 80.0)  ; velocity in km/hr  
    do (add-triple  
        !t:me  
        !t:distance  
        (value->upi km :double-float)  
        :g (value->upi time :unsigned-long))  
       (add-triple  
        !t:me  
        !t:velocity  
        (value->upi kmh :double-float)  
        :g (value->upi time :unsigned-long))  
       (incf km (/ kmh 60))  
       (incf time 60)  
    count 1             ; Number of data points, which is  
                        ; also the duration in minutes.  
    while (< km 1000))  
  
(index-all-triples) 

We can then use the following select query to find the 1-minute periods in which the average speed was between 75 and 80 km/h, together with the distance at which each one occurred.

(loop for (time speed distance)  
    in (sort (select (?time ?v ?distance)  
           (lisp ?v75 (value->upi 75 :double-float))  
           (lisp ?v80 (value->upi 80 :double-float))  
           (q !t:me !t:velocity (?v ?v75 ?v80) ?time)  
           (q !t:me !t:distance ?distance ?time))  
         (lambda (x y) (< (first x) (first y))))  
    do (format t "~7d ~5f ~5f~%" (/ time 60) speed distance)) 
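
A range can be placed on the graph field in the same way. The sketch below assumes the "My1000KmTrip" store built above is open and indexed, and it mirrors the variable-in-range pattern of the previous query. The bounds of 0 and 600 seconds are arbitrary values chosen for illustration, and whether the end points are inclusive is not stated here.

```lisp
;; Sketch only: velocities recorded during roughly the first ten
;; minutes of the trip. The graph field holds the time in seconds.
(select (?time ?v)
  (lisp ?t0 (value->upi 0 :unsigned-long))
  (lisp ?t1 (value->upi 600 :unsigned-long))
  (q !t:me !t:velocity ?v (?time ?t0 ?t1)))
```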

REPL User Interface

In addition to the ! reader macro, AllegroGraph provides a handful of functions to make it easier to interact with the triple-store from the REPL.

enable-print-decoded boolean &key pprint-dispatch
function

Triples and parts are represented by (simple-array (unsigned-byte 8) 56) and (simple-array (unsigned-byte 8) 12) respectively. By default the Lisp printer prints these as vectors of bytes, which is not very informative when using AllegroGraph interactively. This function modifies a pprint-dispatch table to print triples and parts interpretively if this can be done in the current environment. Specifically, the value of *db* must be an open triple store, and the vector being printed must print without error in the manner of print-triple or upi->value. If any of these conditions does not hold, the vector is printed normally.

If the boolean argument is true, the informative printing is enabled; otherwise it is disabled. The :pprint-dispatch argument specifies the dispatch table to modify, by default the value of *print-pprint-dispatch*. Once enable-print-decoded has been turned on, you can also use the special variable *print-decoded* for fine-grained control of triple and UPI printing.

*print-decoded*
variable
When true and enable-print-decoded has been called with a true argument, triples and UPIs are printed with interpretation.
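
The interplay between the function and the variable might look like the following sketch. It assumes an open triple-store and that the pretty printer is active in the REPL. The value 42 is arbitrary.

```lisp
;; Turn on decoded printing in the default pprint-dispatch table.
(enable-print-decoded t)

;; Temporarily fall back to raw byte-vector printing for one form.
(let ((*print-decoded* nil))
  (print (value->upi 42 :unsigned-long)))

;; Restore the Lisp printer's normal behavior.
(enable-print-decoded nil)
```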

print-triple triple &key format stream
function
Print a triple returned by next-row or get-triples-list. The keyword argument :format, which defaults to the value of (ag-property :default-print-triple-format), specifies how the triple should be printed. The value :ntriple specifies that it should be printed in N-Triples syntax. The value :long indicates that the string value of the part should be used. And the value :concise causes it to use a more concise, but possibly ambiguous, human-readable format.

print-triples triple-container &key limit format stream
function
Display the triples in triple-container, which can be a triple store object, a list of triples such as is returned by get-triples-list, or a cursor such as is returned by get-triples. If the keyword argument :limit is supplied, then at most that many triples will be displayed. The :format keyword argument controls how the triples will be displayed, in either :ntriple, :long, or :concise format. It defaults to (ag-property :default-print-triple-format). The :stream argument can be used to send output to the stream of your choice. If left unspecified, output goes to *standard-output*.

*print-triples-list-limit*
variable
The default number of triples to print in calls to print-triples. If nil, then all triples will be printed.
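
A short sketch of typical REPL use follows. It assumes that *db* holds an open triple store. The limit of 5 is arbitrary.

```lisp
;; Show at most five triples from the current store in concise form.
(print-triples *db* :limit 5 :format :concise)

;; Capture the same listing as a string, in N-Triples syntax,
;; by directing the output to a string stream.
(with-output-to-string (out)
  (print-triples *db* :limit 5 :format :ntriple :stream out))
```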

triple->string
nil
No documentation found

part->string part &key format
function
Return a string representation of part (which can be a UPI or a future-part). The :format keyword argument controls the format and can be one of :ntriple, :long, or :concise, as with the :format argument to print-triple. The default is the value of (ag-property :default-print-triple-format).

:default-print-triple-format
property
The default value for the format keyword argument to print-triple, print-triples, and triple->string. Defaults to :ntriple.

part-value part
function
Return the value of the part. Part can be UPI or a future-part.

pprint-subject part &key maximum-depth format db
function
Print information about part down to a maximum depth of maximum-depth using the format format. Triples for which part is a subject and their children will be printed. See part->string and (ag-property :default-print-triple-format) for information about part printing. See pprint-object to display information based on objects rather than subjects.

pprint-object part &key maximum-depth format db
function
Print information about part down to a maximum depth of maximum-depth using the format format. Triples for which part is an object and their children will be printed. See part->string and (ag-property :default-print-triple-format) for information about part printing. See pprint-subject to display information based on subjects rather than objects.
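
For example, with the "My1000KmTrip" store from the range-query section open, the resource !t:me could be explored as sketched below. The depth of 2 is an arbitrary choice.

```lisp
;; Sketch only: print the triples whose subject is !t:me,
;; following their children down to depth 2.
(pprint-subject !t:me :maximum-depth 2 :format :concise)

;; Return the string form of the same part.
(part->string !t:me :format :concise)
```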

Miscellaneous Functions and Variables

*agraph-version*
variable
Holds the current version of AllegroGraph.

Server for Lisp and Java Clients

AllegroGraph supports a client/server environment for clients implemented in Lisp or in Java. The Lisp client API is described in Lisp Client Reference and the Java client API is described in Java Tutorial and Javadocs.

The Lisp server is managed with the following functions exported from db.agraph.

start-ag-server &key debug db.agraph.servers::verbose db.agraph.servers::root db.agraph.servers::users db.agraph.servers::limit db.agraph.servers::ender db.agraph.servers::port db.agraph.servers::nanny db.agraph.servers::jport db.agraph.servers::indexing db.agraph.servers::timeout db.agraph.servers::idle-life
function

Start a server that allows client programs to access AllegroGraph triple stores.

  • The port argument (default 4567) is the port number where the server listens for connections.

  • The jport argument is unused but reserved for situations where two ports are required.

  • The users argument specifies the number of simultaneous connections allowed. The default is 50.

  • The limit argument specifies the total number of connections allowed over the life of the server. The default is nil to specify unlimited connections.

  • The root argument, when non-nil, specifies a default pathname for the entire Lisp image running the server.

  • The nanny argument may be nil or a number n of seconds. When non-nil, all connections are scanned every n seconds and triple stores opened by dead connections are closed.

  • The ender argument specifies a function of no arguments. This function is called just before the triple store associated with a dead connection is closed.

  • The timeout argument specifies a time interval in seconds. If a triple store cannot be accessed within this time interval, an error is signalled.

  • The idle-life argument specifies a time interval in minutes. If a triple store is open but idle for this interval, it is closed automatically (and re-opened if needed).

  • The indexing argument is currently ignored.

  • The verbose and debug arguments may be non-nil to trigger status and progress messages.

The function returns two values: a server port instance that becomes the default server port, and the port number where the server is listening.
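
A minimal sketch of starting a server and capturing both return values follows. The values given for :users and :nanny are arbitrary, and the keywords are assumed to be the plain keyword forms of the argument names listed above.

```lisp
;; Sketch only: start a server on the documented default port.
(multiple-value-bind (server port)
    (start-ag-server :port 4567   ; the documented default
                     :users 10    ; at most 10 simultaneous connections
                     :nanny 60)   ; scan for dead connections every 60 s
  (format t "Server ~a is listening on port ~a~%" server port))
```

The server instance returned as the first value also becomes the default server port, so it can later be stopped with stop-ag-server.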

start-agj-server &rest db.agraph.servers::keys
function
This function is deprecated. The recommended function is start-ag-server.

stop-ag-server db.agraph.servers::s &optional (db.agraph.servers::clients-too t)
function

Method (agj-server)

Stop the connection identified by the argument.

Method (integer)

Stop the connection identified by integer.

Method (agj-server)

Stop the listening server. If the optional argument is non-nil, stop any running connections as well. The default is t.

Method (null)

Stop the default listening server. If the optional argument is non-nil, stop any running connections as well. The default is t.

stop-agj-server db.agraph.servers::s &optional (db.agraph.servers::clients-too t)
function
This function is deprecated. The recommended function is stop-ag-server.

ag-server-trace &optional db.agraph.servers::onoff
function

This function enables or disables progress messages from client calls. The arguments and results of each call from the client are printed.

The onoff argument may be omitted, nil, :toggle, :close, a string, or any other non-nil value.

Two values are returned, the previous trace state and the new trace state, in that order.

  • If the argument is omitted, the function returns the current trace state.

  • If the argument is nil, a previous trace state is terminated.

  • If the argument is :toggle, the current trace state is reversed.

  • If the argument is :close, a previous trace state is terminated.

  • If the argument is a string, it should be a pathname to a writable file. The file is opened and supersedes any existing file. Trace output is initiated, and sent to the file.

  • Any other non-nil value initiates trace output to *standard-output*.
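
The cases above can be sketched as follows. The file name is a placeholder, and any writable path would do.

```lisp
;; Send trace output to a file; "/tmp/ag-trace.log" is a
;; hypothetical path. An existing file is superseded.
(ag-server-trace "/tmp/ag-trace.log")

;; Query the current trace state without changing it.
(ag-server-trace)

;; Terminate tracing.
(ag-server-trace :close)
```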

agj-trace &rest db.agraph.servers::args
function
This function is deprecated. The recommended function is ag-server-trace.

ag-server-db &optional db.agraph.servers::tsx db.agraph.servers::setdb
function

This function maps triple store integer ids to the actual triple store instance.

The client code references each triple store by an integer identifier. This integer is printed in the ag-server-trace output.

  • If the tsx argument is nil or unspecified, the function returns an alist of all known triple stores.

  • If the tsx argument is an integer, the function returns the specified triple store instance.

  • If the setdb argument is non-nil, then the current (global) value of *db* is set to the triple store instance.
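
When debugging a client session from the server's REPL, these cases might be used as in the sketch below. The id 1 is hypothetical. Use an integer that actually appears in the alist or in the ag-server-trace output.

```lisp
;; List all triple stores known to the server.
(ag-server-db)

;; Look up the store with (hypothetical) id 1 and make it the
;; current value of *db*.
(ag-server-db 1 t)
```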

agj-db &rest db.agraph.servers::args
function
This function is deprecated. The recommended function is ag-server-db.

Sesame

A sesame-server is an instance of the sesame-server class. It represents a server at a given host and port that can handle Sesame 2.0 HTTP requests. See the HTTP protocol reference for additional details.

start-sesame-server db.agraph.servers::server &rest db.agraph.servers::start-args
function

When the server argument is an instance of sesame-server, start that Sesame server instance.

When the server argument is nil, start a Sesame server using the default server instance.

stop-sesame-server db.agraph.servers::server &key db.agraph.servers::stop-wserver
function

When the server argument is an instance of sesame-server, stop that Sesame server instance.

When the server argument is nil, stop the default Sesame server instance.

make-sesame-server &key class net.aserve:wserver db.agraph.servers::indexing db.agraph.servers::path db.agraph.servers::timeout db.agraph.servers::idle-life net.aserve:start db.agraph.servers::output-stream
function

This function creates a sesame-server instance and optionally starts the HTTP server. The returned value is the server instance. This value also becomes the default server.

  • The path argument (default is /sesame) specifies the initial part of the server URL.

  • The timeout argument is the default timeout value for all registered triple stores. If a triple store is not available within that number of seconds, a bad request return code is sent back to the client. The default is 15 seconds.

  • The idle-life argument is the default idle interval in minutes for an open triple store. If a triple store stays open but idle for this interval, it is closed automatically. The default is 10 minutes.

  • The start argument, if non-nil, ensures that the HTTP server is running. If the value is a list, it is passed to the AllegroServe start function. The most common value is (list :port nnn).

  • The wserver argument may be an AllegroServe wserver instance.

  • The class argument (default sesame-server) allows applications to provide a sub-class.

  • If the output-stream argument is specified, it must be a stream. This stream will be bound to *standard-output*, *terminal-io*, and *debug-io* in worker threads running Sesame HTTP requests. Binding this argument to *initial-terminal-io* may be needed when running under eli.

export-to-sesame db.agraph.servers::server &rest db.agraph.servers::args &key db.agraph.servers::db name directory db.agraph.servers::id (db.agraph.servers::readable t) (db.agraph.servers::writable t) db.agraph.servers::title db.agraph.servers::timeout db.agraph.servers::indexing db.agraph.servers::idle-life
function

When the server argument is an instance of sesame-server, make a triple store available as a Sesame 2.0 Remote Repository on that server.

If db is specified, name and directory must be omitted or consistent. If name and directory are specified, db must be omitted or consistent.

The specified database will be opened and closed as required.

Id defaults to name; it is the id by which the database will be known to clients. If id is not unique, an error is signalled.

When the server argument is nil, make a triple store available as a Sesame 2.0 Remote Repository. The default server instance is used for the operation.
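
Putting these functions together, a read-only export might look like the following sketch. The port number, directory, and id are hypothetical values, and the store name is borrowed from the range-query example.

```lisp
;; Sketch only: create the default Sesame server and make sure the
;; HTTP server is running. Port 8080 is a hypothetical choice.
(make-sesame-server :start (list :port 8080))

;; Export a store through the default server (server argument nil).
(export-to-sesame nil
                  :name "My1000KmTrip"
                  :directory "/tmp/agraph/"   ; hypothetical directory
                  :id "trip"                  ; id seen by clients
                  :writable nil)              ; read-only access

;; Stop the default Sesame server when finished.
(stop-sesame-server nil)
```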

the AllegroGraph Reasoner

See the Reasoner tutorial for more details on using AllegroGraph's RDFS++ reasoner. It works with get-triples, get-triples-list, SPARQL and the Prolog q functor. The following additional functions and variables are also part of the reasoner interface:

*use-reasoner*
variable
If true, then the RDFS++ reasoner will be used in calls to get-triples. This will affect the output of SPARQL and Prolog queries as well.

inferred-triple-p triple
function
Returns true if triple is the result of inference (as opposed to being physically present in the triple-store).
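
A sketch of combining the two follows. It assumes an open store containing RDFS data, that the rdf namespace is registered, and that get-triples-list accepts a :p keyword argument for the predicate.

```lisp
;; Sketch only: with the reasoner enabled for the duration of the
;; query, keep just the rdf:type triples that were inferred rather
;; than physically stored.
(let ((*use-reasoner* t))
  (remove-if-not #'inferred-triple-p
                 (get-triples-list :p !rdf:type)))
```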

Clustering: indexing with multiple processors

AllegroGraph can use multiple processors (on the same machine or different machines) to dramatically increase indexing speed. This clustering ability is built on a more generic core from the net.cluster package, which is itself built upon Allegro Common Lisp's RPC mechanisms. In practice, all you need to do is make sure that Allegro Common Lisp is installed on each machine that you want to use, set up your environment to tell AllegroGraph where the machines are located and then start indexing.

The details of setting up a network for clustering are outside of the scope of this reference though Franz will be including more information on-line in the coming months. You will want to make sure that all of the machines can access the data quickly over a shared drive. The easiest setup, of course, is to use a single computer with multiple processors or processor cores.

Use add-indexing-host to tell AllegroGraph what machines to use (and the net.cluster functions machines, remove-machine and remove-all-machines to manage them). You can optimize start-up time by calling start-remote-clients or start-all-remote-clients yourself. This starts another Lisp instance over the connection and prepares that instance to run tasks.

All of the functions that use background processes (such as indexing, merging and (eventually) querying) have a :wait parameter to let you specify whether the call should wait until all processing is complete. You can learn more about the details of a task by using find-task on the return value of a function like index-all-triples to retrieve the task instance associated with a particular ID. Finally, you can use the indexing-status and unschedule-indexing functions to see the current state of a triple-store's indexing tasks.

add-indexing-host hostname &rest args &key max-tasks lisp remote-command verbose
function
Adds hostname as an additional processor for AllegroGraph to use when building indices. See the reference-guide for additional details.

find-cluster-code
function
Looks for the AllegroGraph cluster fasl and returns its path. Looks in (ag-property :agraph-cluster-code-pathname) and in sys:agraph;agraph-cluster.fasl.

net.cluster:find-task net.cluster::identifier
function
Search for a task matching identifier in the task-manager. The search will occur in pending tasks, running tasks and failed tasks. The kind of search depends on the type of identifier. If it is a number, then the match will be performed on the id of the task; if it is a symbol or a string, then the match will be performed on task-name; finally, if it is a task, then the task is just returned.

net.cluster:machines &key net.cluster::active-only
function
Returns a list of the current machines.

net.cluster:remove-machine net.cluster::name &key (net.cluster::wait :wait)
function
Removes the machine named name.

net.cluster:remove-all-machines &key net.cluster::wait
function
Remove all machines one at a time by calling remove-machine.

net.cluster:start-remote-clients net.cluster:machine &key net.cluster::force-restart-p
function
Start one Lisp instance for each of the tasks on machine.

net.cluster:start-all-remote-clients &key net.cluster::force-restart-p
function
Start one Lisp instance for each of the tasks on all connected machines.
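
For the simplest setup, a single multi-core machine, the workflow described in this section might be sketched as follows. The host name and the :max-tasks value are illustrative assumptions.

```lisp
;; Sketch only: let AllegroGraph run up to two indexing tasks on
;; this machine. "localhost" and 2 are hypothetical values.
(add-indexing-host "localhost" :max-tasks 2)

;; Optional: start the remote Lisp instances ahead of time so that
;; the first indexing call does not pay the start-up cost.
(net.cluster:start-all-remote-clients)

;; Index, and do not return until all processing is complete.
(index-all-triples :wait t)

;; Inspect the machines currently known to the cluster.
(net.cluster:machines)
```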

Notes on Thread Safety

CVS/ontology management tools

Ontology management is often done by non-programmers. It's a little much to ask non-programmers to use one of the APIs for AllegroGraph just to get data into a database. For these users, we have created a tool for reading entire ontologies, a collection of triples, into a database. This tool assumes:

Since ontologies must have a unique name, it is recommended that users of this tool maintain a consistent naming scheme. One such scheme is to use the CVS module name followed by the path of the ontology file within the module, for example: ontology_module/groupA/ontology123.owl. Here ontology_module is the CVS module name and groupA/ontology123.owl is the file name of the ontology in the CVS module.

Key to the operation of these tools is the tagging, in CVS, of versions of the ontology to be added to the database. This is done with the CVS tag command. For example, in a working copy of the ontology_module module, you could execute this command:

$ cvs tag -F jason_beta groupA/ontology123.owl 

jason_beta is the name of the CVS tag and groupA/ontology123.owl is the name of the file in the module. -F means to forcibly move the tag to the current version, if the tag has been used before.

Now, we could use the update_ontology command, like so:

$ update_ontology ontology_name nasa.db jason_beta \  
	      ontology_module/groupA/ontology123.owl 

This would unload, or delete, all previous triples in nasa.db that have the ontology name ontology_name, and then load the triples from the file groupA/ontology123.owl with the version given by jason_beta in the CVS module ontology_module.

Requirements

update_ontology command line arguments

$ update_ontology [-v] ontology_name database cvs_tag module_path 

where

The update_ontology command will retrieve from CVS the exact version of the ontologies specified by the tag User1_beta and execute the sync_ontology command:

$ sync_ontology ontology3 module:ontology3:User1_beta database 

where

sync_ontology command line arguments

update_ontology uses a lower-level command line tool, sync_ontology. The arguments to it are:

$ sync_ontology [--src format] ontology_name file ag_database 

where

Other uses (mainly for testing):

$ sync_ontology -p [-n limit | -n none]  
                [-f { concise | long | ntriple } ]  
                database [ontology_name] 

where

To delete triples:

$ sync_ontology -d database ontology_name 

where -d is used to delete triples for an ontology

To initialize a database:

$ sync_ontology -Z [-r expected-unique-resources] database 

where

Indices

All of the functions and variables in AllegroGraph's public interface are indexed below.

Function index

Variable index


  1. For information on the N-Triples format, see both the W3C description and the information in the RDF test cases. Additionally the file w3c-ntriples-tests.nt in the sample-inputs directory of the AllegroGraph distribution contains a set of test cases and provides a good overview of what can be represented in the N-Triples format.
  2. Details of the RDF/XML syntax can be found in the RDF/XML-primer and related documentation.
  3. Language tags are defined in RFC 3066.
  4. This facility will become more flexible in future versions.
  5. To be precise, AllegroGraph uses the :standard-indices property to determine which indices newly created triple-stores should have. The default value for this property is to use all possible indices.
  6. Remember that select is a macro and that using it in the REPL will produce interpreted code. This means that selects run in the REPL will be significantly slower than those that you write inside of compiled functions. (Note also that both the HTTP and the Java interfaces to AllegroGraph ensure that any select calls get compiled before they run). Whether compiled queries execute significantly faster depends on whether the generated code performs a lot of backtracking or other repeated computation.