Introduction

This document describes AllegroGraph's SPARQL implementation. Each of the following functions is exported from the db.agraph.sparql package. This package is also nicknamed sparql.

For notes on AllegroGraph's conformance to the W3C specification please see this document.

As of version 4.4, AllegroGraph supports all of SPARQL 1.1 Query, including:

AllegroGraph also provides partial support for SPIN.

Conceptually, SPARQL has three layers:

Currently, input and output from each of these layers are limited (for example, the query plan is not available to user code, but parsed output is). This may change in a future release.

Using SPARQL versus using Prolog

Prolog is an alternative query mechanism for AllegroGraph. The Prolog tutorial provides an introduction to using Prolog and AllegroGraph together. Prolog is further described here in the Lisp Reference (where further links are provided). This section is a brief note on the differences between our SPARQL query engine and the Prolog select query engine. The main differences are:

Choosing a Query Execution Engine

AllegroGraph currently has two query execution modes: Single Set and Chunk at a Time (CaaT). The former tends to be more time efficient and more memory hungry whereas CaaT takes more time but uses less memory.

Consider a query like the following:

select * {  
  ?x :p1 ?o .  
  ?o :p2 ?y .  
} 

In Single Set mode, the query engine iterates over all triples whose predicate is :p1 (call this the p1 cursor) and creates pairs of bindings for each ?x and ?o. It will then create a p2 cursor to iterate over all of the triples whose predicate is :p2 and merge the data from these triples with the bindings that have matching values for ?o. In short, the engine proceeds through the query plan one step at a time and accumulates all of the results immediately.

The CaaT mode is similar but it processes large chunks of results rather than trying to work on the query in its entirety. In the example above, the CaaT engine would gather up a group of results from the p1 cursor and then make a p2 cursor to find answers. Once p2 was exhausted, the engine would go back to the p1 cursor and build up the next chunk of results. The size of the chunks is based on the amount of memory allocated to the query using the chunkProcessingMemory query option.
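The difference between the two modes can be sketched outside AllegroGraph. The following Python is only an illustration, not the engine's implementation: the triple list, the cursor and join_p2 helpers, and the row-based chunk size are all assumptions made for the example (the real engine sizes chunks by memory, via chunkProcessingMemory).

```python
from itertools import islice

# Toy triple store: (subject, predicate, object). All data here is made up.
TRIPLES = [
    ("a", ":p1", "m"), ("b", ":p1", "n"), ("c", ":p1", "m"),
    ("m", ":p2", "x"), ("n", ":p2", "y"),
]

def cursor(predicate):
    """Yield every triple with the given predicate (a stand-in for an index cursor)."""
    return (t for t in TRIPLES if t[1] == predicate)

def join_p2(bindings):
    """Extend each (?x, ?o) binding with every ?y such that ?o :p2 ?y."""
    return [(x, o, y) for (x, o) in bindings
            for (s, _, y) in cursor(":p2") if s == o]

def single_set():
    # Materialize every ?x/?o binding first, then join once.
    bindings = [(s, o) for (s, _, o) in cursor(":p1")]
    return join_p2(bindings)

def chunk_at_a_time(chunk_rows):
    # Pull a bounded chunk from the p1 cursor, join it, then fetch the next chunk.
    p1 = cursor(":p1")
    results = []
    while True:
        chunk = [(s, o) for (s, _, o) in islice(p1, chunk_rows)]
        if not chunk:
            break
        results.extend(join_p2(chunk))
    return results
```

Both functions return the same rows. The chunked version never holds more than chunk_rows intermediate bindings at once, at the cost of opening a p2 cursor once per chunk.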

Valid result formats

There are three possible outputs from a SPARQL query:

AllegroGraph provides a number of different ways to serialize these results to a stream, provided as keyword symbols to the query functions. The results-format argument controls how ASK and SELECT query results are serialized; some possible formats are :sparql-xml, which serializes the result into the SPARQL XML result format, and :sparql-json, which uses the JSON format.

For CONSTRUCT and DESCRIBE, the value of the rdf-format argument applies.

The default formats are :sparql-xml and :rdf/xml respectively. Providing an unrecognized format will signal an error.

You can find out which formats are allowed for a particular verb by using get-allowed-results-formats and get-allowed-rdf-formats.

Exported functions

parse-sparql string  &optional  default-prefixes  default-base
function

Parse a SPARQL query string and return it as an s-expression.

Calling run-sparql repeatedly on an already parsed query avoids the parsing overhead.

You do not need an open triple-store in order to parse a query. Any parse errors will signal a sparql-parse-error.

The optional arguments provide BASE and PREFIX arguments to the parser, as if they were inserted at the front of string.

default-base
A string to use as the BASE for the SPARQL query
default-prefixes
A mapping from string prefix to namespace expansion; it can be a hash-table mapping strings to strings, or a list of (prefix expansion) elements like:
(("rdf" "http://www.w3.org/1999/02/22-rdf-syntax-ns#")  
 ("rdfs" "http://www.w3.org/2000/01/rdf-schema#")  
 ("owl" "http://www.w3.org/2002/07/owl#")  
 ("xsd" "http://www.w3.org/2001/XMLSchema#")  
 ("xs" "http://www.w3.org/2001/XMLSchema#")  
 ("fn" "http://www.w3.org/2005/xpath-functions#")  
 ("err" "http://www.w3.org/2005/xqt-errors#")) 

This list uses the same format as db.agraph:standard-namespaces.
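Since the optional arguments behave as if BASE and PREFIX declarations were inserted at the front of the query string, their effect can be sketched as plain string assembly. The Python below is a hypothetical illustration only; with_prologue is not part of AllegroGraph, and the parser is not claimed to work this way internally.

```python
def with_prologue(query, default_prefixes=None, default_base=None):
    """Return the query text with BASE and PREFIX declarations placed in front.

    default_prefixes may be a dict or a list of (prefix, expansion) pairs,
    mirroring the two shapes accepted by parse-sparql.
    """
    if isinstance(default_prefixes, dict):
        pairs = list(default_prefixes.items())
    else:
        pairs = list(default_prefixes or [])
    lines = []
    if default_base is not None:
        lines.append(f"BASE <{default_base}>")
    lines.extend(f"PREFIX {prefix}: <{expansion}>" for prefix, expansion in pairs)
    lines.append(query)
    return "\n".join(lines)
```

For example, passing [("rdf", "http://www.w3.org/1999/02/22-rdf-syntax-ns#")] lets a query use rdf:type without declaring the prefix itself.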

run-sparql query  &rest  args  &key  db  default-base  default-prefixes  output-stream  if-exists  if-does-not-exist  results-format  rdf-format  limit  offset  from  from-named  default-dataset-behavior  default-graph-uris  using-named-graph-uri  remove-graph-uri  using-graph-uri  insert-graph-uri  with-variables  destination-db  cancel-query-on-warnings-p  host  basic-authorization  timeout  uuid  load-function  verbosep  primary?  parent-executor  service-db  user-attributes-prefix-permission-p  engine  &allow-other-keys
function

Execute a SPARQL query, then print or return the results.

query can be:

  • a string like: "select ?s ?p ?o { ?s ?p ?o }". It will be parsed by parse-sparql providing default-base and default-prefixes and then executed;

  • a previously parsed query, meaning the s-expression returned by parse-sparql (default-base and default-prefixes are ignored).

The repository against which the parsed query is executed is determined as follows:

  • If host is provided, it must be the URL of a SPARQL endpoint. The query will be sent to the endpoint over HTTP, and the results collected. The results are returned in the format specified by results-format or rdf-format (see below);

  • Otherwise the query is run against the repository given by db which defaults to *db*. Both create-triple-store and open-triple-store set *db*.

  • If neither host nor db are provided, and *db* is not set, an error is signaled.

There are two parameters that control the format of the results, and the relevant one depends on the query verb:

  • For SELECT, ASK and UPDATE queries the output format is given by results-format which must be one of get-allowed-results-formats (supplying the verb), for example :sparql-xml, :alists or :cursor.

  • For DESCRIBE and CONSTRUCT queries the output format is given by rdf-format which must be one of get-allowed-rdf-formats (supplying the verb), for example :turtle, :rdf/xml, :arrays or :cursor.

The query results are either printed, returned as data structure, or returned as cursor, according to the specified format:

  • for text-based formats (like :turtle, :rdf/xml or :sparql-xml) the results are written to output-stream which can be one of:
    • t, indicating *standard-output*;
    • an open stream, e.g. an open file;
    • a pathname, which will be opened providing if-exists and if-does-not-exist;
    • nil, indicating that output should be collected in a string, which will be returned as first value.
  • for programmatic formats (like :alists or :arrays) the results are collected in a data structure (like a list of alists, or a list of arrays) that will be returned as first value;

  • for the cursor format (:cursor) a cursor will be returned as first value. Operators like iterate-cursor can be used to iterate over the results. All the cursor handling code must be wrapped inside with-query-environment to ensure the cursor resources are properly released.

Function run-sparql returns four values: result, verb, variables, and metadata.

The first value, result, is one of:

  • an error code (keyword) if the query execution failed, like :failed, :memory-exhausted, :timeout, :service-failure;
  • the query results as data structure (for programmatic formats);
  • the output as string (for text-based formats with :output-stream nil);
  • a cursor (for the :cursor format);
  • or nil (otherwise).

The second value is the query verb: one of :select, :ask, :construct, :describe, or :update.

The third value, variables, depends on the query verb:

  • for SELECT queries: the list of query variables;
  • for ASK queries: the list (?result);
  • for DESCRIBE queries: nil
  • for UPDATE queries:
    • If a SILENT clause failed: the error object. For example if LOAD SILENT <http://example.com/non-existing.ttl> fails, the first return value, result, will indicate success, and the third value will be the "HTTP error" object.

    • Otherwise: nil.

  • for CONSTRUCT queries: the list of variables (?s ?p ?o).

The fourth value is a structure containing query execution metadata. In its printed representation it includes among others:

  • any warnings that occurred during query execution;
  • how long the parsing, execution, and output of the query took;
  • the number of rows returned.

The execution of the query can be influenced by the following options:

  • The limit parameter acts as if the query contained that value as LIMIT. If the query already contains a LIMIT, the minimum of the two will be used.

  • The offset parameter acts as if the query contained that value as OFFSET. If the query already contains an OFFSET, the sum of the two will be used.

  • with-variables must be an alist of variable names and values. The variable names can be strings (which will be interned in the package in which the query is parsed) or symbols (which should be interned in the package in which the query is to be, or was, parsed). The variable names can include or omit a leading '?'. Note that a query literal in code might be parsed at compile time. Using strings is the most reliable method for naming variables.

  • Before the query is executed, the variables named in with-variables will be bound to the provided values.

    This allows you to use variables in your query which are externally imposed, or generated by other queries. The format expected by with-variables is the same as that used for each element of the list returned by the :alists results-format.
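The way the limit and offset parameters combine with values already present in the query can be written as two small functions. This is a minimal sketch of the rules stated above; the function names are hypothetical, and None stands for "not specified".

```python
def effective_limit(query_limit, param_limit):
    """The smaller of the two limits wins; None means no limit was given."""
    given = [v for v in (query_limit, param_limit) if v is not None]
    return min(given) if given else None

def effective_offset(query_offset, param_offset):
    """Offsets add up; a missing offset counts as zero."""
    return (query_offset or 0) + (param_offset or 0)
```

So a query containing LIMIT 100 OFFSET 20, run with a limit of 10 and an offset of 5, behaves like LIMIT 10 OFFSET 25.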

The handling of graphs is influenced by the following options:

  • If from or from-named are provided, they override the corresponding values specified in the query string itself. As FROM and FROM NAMED together define a dataset, and the SPARQL Protocol specification states that a dataset specified in the protocol (in this case, the programmatic API) overrides that in the query, if either from or from-named are non-nil then any dataset specifications in the query are ignored. You can specify that the contents of the query are to be partially overridden by providing t as the value of one of these arguments. This is interpreted as 'use the contents of the query'. from and from-named should be lists of URIs: future-parts, UPIs, or strings.

  • load-function must be nil or a function with signature (uri db type). If it is a function, it is called once for each FROM and FROM NAMED parameter making up the dataset of the query. The execution of the query commences once each parameter has been processed. The type argument is either :from or :from-named, and the uri argument is a part (ordinarily a future-part) naming a URI. The default value is taken from *dataset-load-function*. You can use this hook function to implement loading of RDF before the query is executed.

  • The using-named-graph-uri and using-graph-uri parameters are used similarly to the from and from-named parameters except that they are relevant only to SPARQL UPDATE commands.

  • The remove-graph-uri and insert-graph-uri parameters are also used only in SPARQL update. The first specifies a graph or list of graphs from which each triple to be deleted should be removed whereas the second specifies a graph or list of graphs into which each triple to be inserted should be added.

  • default-dataset-behavior controls how the query engine builds the dataset environment if FROM or FROM NAMED are not provided. Valid options are :all (ignore graphs; include all triples) and :default (include only the store's default graph).

  • default-graph-uris allows you to specify a list of resources which, when encountered in the SPARQL dataset specification, are to be treated as the default graph of the store. Each resource can be a resource UPI, resource future-part, or a URI string. For example, specifying '("http://example.com/default") will cause a query featuring

    FROM <http://example.com/default>  
    FROM <http://example.com/baz> 
    to execute against the union of the contents of the named graph <http://example.com/baz> and the store's default graph, as determined by (default-graph-upi db).
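The interaction between from, from-named, and default-graph-uris can be sketched as follows. This Python reflects one reading of the rules above and is not taken from the implementation: resolve_dataset and expand_from are hypothetical names, None stands for nil, and USE_QUERY stands for passing t.

```python
USE_QUERY = True  # stands in for Lisp t: "use the contents of the query"

def resolve_dataset(query_from, query_from_named, arg_from=None, arg_from_named=None):
    """Return the (FROM, FROM NAMED) lists the query will actually use."""
    if not arg_from and not arg_from_named:
        # Nothing supplied through the API: the query's own dataset applies.
        return list(query_from), list(query_from_named)

    def pick(arg, from_query):
        # t keeps the query's clauses; a list replaces them; nil means none.
        return list(from_query) if arg is USE_QUERY else list(arg or [])

    return pick(arg_from, query_from), pick(arg_from_named, query_from_named)

def expand_from(from_uris, default_graph_uris):
    """Split FROM URIs into (include_store_default_graph, other_graph_uris)."""
    marked = set(default_graph_uris)
    others = [u for u in from_uris if u not in marked]
    return any(u in marked for u in from_uris), others
```

With the example above, expand_from reports that the store's default graph is included and that <http://example.com/baz> remains as an ordinary graph in the union.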

Relevant for UPDATE queries:

  • destination-db (db by default) specifies the triple store against which Update modifications should take place. This is primarily of use when db is a read-only wrapper around a writable store, such as when reasoning has been applied.

Miscellaneous options:

  • If verbosep is t, a log of the query execution is written to *sparql-log-stream* (*standard-output* by default).

  • basic-authorization is used when making SPARQL SERVICE calls. It must be a cons cell whose car is the user name and whose cdr is the password. For example: '("user" . "password")

  • If cancel-query-on-warnings-p is true, then any warning found during query planning will immediately cancel the query. Setting this to true can help in query debugging.

  • timeout specifies the number of seconds after which query execution will be canceled. Timeout checks occur regularly during query computation, but not during result generation. A value of nil (the default) means no timeout.

  • user-attributes-prefix-permission-p must be true in order to run queries that include "prefix franzOption_userAttributes"; otherwise such queries signal an error.

The following parameters are used internally by AllegroGraph and should not be used: primary?, parent-executor, uuid, service-db and engine.

get-allowed-results-formats &optional  verb  engine
function

Returns a list of keyword symbols that are valid when applied as values of results-format to a query with the given verb. If verb is not provided, the intersection of the formats valid for :ask and :select (the two permitted values) is returned. With AllegroGraph 3.0, an additional engine argument is available. In a similar manner to verb, omitting this restricts the returned values to those that apply to all built-in query engines.

get-allowed-rdf-formats &optional  verb  engine
function

Returns a list of keyword symbols that are valid when applied as values of rdf-format to a query with the given verb. If verb is not provided, the intersection of the formats valid for :construct and :describe (the two permitted values) is returned. With AllegroGraph 3.0, an additional engine argument is available. In a similar manner to verb, omitting this restricts the returned values to those that apply to all built-in query engines.

Example:

  • Get formats for CONSTRUCT queries executed by the algebra query engine.

    (get-allowed-rdf-formats :construct :algebra)


Extension functions

SPARQL allows for query engines to associate extension functions with URIs, and call them from within queries.

You can define your own URI functions through defurifun, or associate existing functions with a URI through associate-function-with-uri. defurifun does some manipulation of the arguments, so you should use it whenever possible.

associate-function-with-uri function  uri  &key  cache-now-p  db
function

Assert a mapping between uri, which is a string or a valid part, and the provided function, which is a symbol or a function. If cache-now-p is true and function is a symbol, its function binding is stored instead of the symbol itself.

print-function-uri-mappings &key  stream  db
function

Print all mappings between URIs and functions to stream (*standard-output* by default).

defurifun name  uri  args  &body  body
macro

Define a new function, name, and associate it with uri as with associate-function-with-uri. args is not evaluated, exactly as with defun.

Here's an example: a function that will do an HTTP HEAD request against the provided URL, returning the HTTP status code as an integer literal, or 0 if there's a problem.

(The built-in functions are quite robust, so a Lisp integer will be treated as an RDF literal with data type xsd:integer.)

(defurifun ex-head-request !<http://example.com/fn/head> (uri)  
  (or  
    (when uri  
      (ignore-errors  
        (format t "~&Performing HTTP HEAD request on <~A>...~%"  
                  (upi->value uri))  
        (second  
          (multiple-value-list  
            (net.aserve.client:do-http-request (upi->value uri)  
                                               :method :head)))))  
    0)) 

You can use this function in a query exactly as you would a built-in function.

Using this data as an example:

<http://ex.com/a> <http://ex.com/foo> "200"^^<http://www.w3.org/2001/XMLSchema#integer> . 

we can run a query like so:

sparql(54): (run-sparql  
"  
PREFIX f: <http://example.com/fn/>  
SELECT ?x {  
  ?x <http://ex.com/foo> ?y .  
  FILTER ( ?y = f:head(\"http://franz.com\") )  
}"  
  :results-format :count) 

which produces this output:

Performing HTTP HEAD request on <http://franz.com>...  
1  
:select  
(?x) 

… we know, then, that franz.com is returning a 200 status code.

Note that these filter functions can be called an arbitrary number of times during the execution of a query. It's not a good idea to actually perform expensive operations like HTTP requests in your queries.

SELECT bindings and ASK results

run-sparql allows you programmatic access to results in a number of ways.

Any of the following results-formats are suitable as arguments to SELECT or ASK queries:

The following results-formats are suitable as arguments to SELECT queries:

The following results-formats are suitable as arguments to ASK queries:

Returning triples from CONSTRUCT and DESCRIBE queries

Any of the following rdf-formats are suitable as arguments to CONSTRUCT or DESCRIBE queries:

The following rdf-format is suitable for DESCRIBE queries:

The following rdf-format is suitable for CONSTRUCT queries:

Finally, SPARQL can return results from CONSTRUCT and DESCRIBE queries as in-memory triple stores, using the :in-memory format. These triple-stores support the full AllegroGraph API and can therefore be queried and serialized just like a regular triple-store. When no references to them remain, they will be garbage collected just like any other Lisp data-structure.

You can use get-allowed-results-formats and get-allowed-rdf-formats to access these allowed values dynamically at run-time.

Variables

Programmatic results associate values with variables. Variables are parsed into symbols by the query parser.

The mapping from variables to symbols is straightforward, and best illustrated by example:

If you provide variables in a with-variables argument, a leading ? is prepended to the variable name. Your queries will run correctly if you provide them as s-expressions and do not prepend ?, but:

All variables created by the parser are interned in the current package, as if by a call to cl:intern. You should adhere to these rules when processing results or providing bindings using with-variables.

SPARQL and first-class triples

AllegroGraph permits you to make assertions about triple IDs (UPIs of type triple-id). SPARQL offers no support for this: only named graphs are supported. First-class triples are entirely outside the scope of both RDF and SPARQL.

SPARQL queries against stores using first-class triples are not supported. AllegroGraph's SPARQL engine makes only limited provisions for such queries:

It bears repeating that SPARQL is not intended to work with first-class triples; any queries that run successfully are little more than accidents, and named graphs are a better choice in all cases.

Datasets

Dataset loading

It is sometimes useful to be able to process the SPARQL dataset — the set of URIs provided as FROM and FROM NAMED parameters — when a query is executed. AllegroGraph provides a dataset load hook for your convenience.

You may bind a function to *dataset-load-function* to specify a default, or pass one as the :load-function argument to run-sparql. Passing nil disables the hook for that query. The argument list of the function is described in *dataset-load-function*.

Default dataset handling

When no dataset clauses (FROM and FROM NAMED) are provided to a query, the actual dataset against which the query is run is not defined by the SPARQL specification. AllegroGraph provides you with two options: :default, meaning that the default part of the dataset contains only the default graph of the store; and :all, whereby both the default and named parts of the dataset contain every graph in the store.

You can control the default behavior by setting *default-dataset-behavior* (formerly *sparql-default-graph-behavior*), and set the behavior for specific queries by passing the :default-dataset-behavior argument to run-sparql.

Verbose output

Logging output when queries are run in verbose mode is written to db.agraph.query.sparql:*sparql-log-stream*. This is *standard-output* by default.

SPARQL and encoded values

AllegroGraph can encode a range of literal values (numbers, geospatial values, and more) directly within a UPI, without the overhead of a string representation as an RDF literal. Whenever these encoded values are encountered by AllegroGraph's printing functions, and in many other situations, they are seamlessly treated as RDF literals, but with significant time and space savings.

AllegroGraph's implementation of most SPARQL and XQuery operators also handles encoded values transparently.

SPARQL Query Options

AllegroGraph provides control over a number of internal settings by extending the SPARQL PREFIX notation. Options are changed by prepending a PREFIX of the form:

PREFIX franzOption_optionName: <franz:optionValue> 

where optionName and optionValue are replaced by the name and value of the option being changed.
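Because each option is an ordinary PREFIX line, client code can assemble the lines mechanically. The sketch below is a hypothetical Python helper, not an AllegroGraph API; it only formats text in the pattern shown above and does not check that an option name or value is valid.

```python
def franz_option(name, value):
    """Format one query option as a PREFIX declaration."""
    return f"PREFIX franzOption_{name}: <franz:{value}>"

def with_options(query, options):
    """Prepend one PREFIX line per option (a dict of name -> value) to a query."""
    header = [franz_option(name, value) for name, value in options.items()]
    return "\n".join(header + [query])
```

For instance, with_options("SELECT * { ?s ?p ?o }", {"logQuery": "no"}) yields the query preceded by PREFIX franzOption_logQuery: <franz:no>.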

Options can also be specified in the configuration file, which is described in the Server Configuration document. See here in that document for how options are specified.

The available options are subject to change as some of them are experimental. The following is a list of the currently available options:

authorizationBasic
query-option

The username and password to use for basic authorization

This is used when making a SPARQL SERVICE call.

Example of specifying the query option value via a PREFIX:

PREFIX franzOption_authorizationBasic: <franz:user:password> 

The default value is no authorization setting

cancelQueryOnWarnings
query-option

If true, then warnings found during query parsing, planning and execution will cause a query to fail immediately rather than continuing.

Warnings include things like unknown variables in an ORDER BY clause or FILTER expression, constants in the query that cannot be in the store, and so on. The possible values are:

  • yes - turn the option on
  • no - turn the option off

Example of specifying the query option value via a PREFIX:

PREFIX franzOption_cancelQueryOnWarnings: <franz:no> 

The default value is: no

chunkProcessingAllowed
query-option

Controls whether to use Chunk at a Time (CaaT) processing.

It can be:

  • possibly - use CaaT for unordered queries with small limits and use the single-set approach otherwise. Note that this works best when solutions are found in the first several chunks processed which means that the query can finish quickly. If a large portion of the search space must be scanned, then the single-set approach can be faster.

  • yes - always use CaaT when possible (some query clauses like EXISTS filters and SPIN magic properties do not yet support CaaT).

  • no - always use the single set approach and never use CaaT.

The default value is possibly which means that AllegroGraph is optimizing for speed rather than space. The no option is focused on speed at the possible cost of higher memory use whereas the yes option is more constrained in memory use at the cost of slower queries.

Example of specifying the query option value via a PREFIX:

PREFIX franzOption_chunkProcessingAllowed: <franz:possibly> 

chunkProcessingMemory
query-option

Specifies the maximum amount of memory used by a single chunk.

Controls the size (in bytes) of the chunks used by the CaaT executor. This option takes precedence over the deprecated chunkProcessingSize option.

The minimum allowed value is 200M.

See the chunkProcessingAllowed option for additional query control.

Example of specifying the query option value via a PREFIX:

PREFIX franzOption_chunkProcessingMemory: <franz:4294967296> 

The default value is: 4,294,967,296

chunkProcessingSize
query-option

(Deprecated) Specifies the chunk processing size in rows

Deprecated in favor of the chunkProcessingMemory option.

Control the size (in rows of answers) of the chunks used by the CaaT executor. The higher the number, the larger the chunks processed will be which is both more efficient and more memory intensive. A typical value is 400000 or 1000000.

See the chunkProcessingAllowed option for additional control.

Example of specifying the query option value via a PREFIX:

PREFIX franzOption_chunkProcessingSize: <franz:400000> 

The default value is: 400,000

clauseReorderer
query-option

The strategy used to reorder triple patterns in a query.

This option controls how the triple patterns in a single Basic Graph Pattern (BGP) are reordered.

The available strategies will depend on the query engine being used but will always include identity which tells the query planner to not reorder the triple patterns of the BGPs. Another common choice is statistical which uses the statistics of the triple-store to try to reorder clauses most efficiently.

Note that other query planning algebraic manipulations may cause BGPs in your query to be merged and that reordering does not extend to larger query structures (like UNION or OPTIONAL).

Example of specifying the query option value via a PREFIX:

PREFIX franzOption_clauseReorderer: <franz:statistical> 

The clause reorderer defaults to statistical

defaultAttributes
query-option

Specify the default attributes to assign to any triple created by a SPARQL update command.

The attributes must be specified in URL-encoded JSON format. The example below uses the URL-encoded form of {"rank": "High" }.

Example of specifying the query option value via a PREFIX:

PREFIX franzOption_defaultAttributes: <franz:%7B%22rank%22%3A%20%22High%22%20%7D> 

The default value is: no attributes
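The encoded value in the example above can be produced with any URL-encoding routine. As a minimal sketch, the Python standard library reproduces it exactly when every reserved character is escaped (safe=""); the variable names are illustrative only.

```python
import json
from urllib.parse import quote, unquote

raw = '{"rank": "High" }'

# Escape everything that is not an unreserved character, including ':' and spaces.
encoded = quote(raw, safe="")
prefix_line = f"PREFIX franzOption_defaultAttributes: <franz:{encoded}>"

# Decoding and parsing recovers the original attribute mapping.
decoded = json.loads(unquote(encoded))
```

Here encoded is %7B%22rank%22%3A%20%22High%22%20%7D, matching the PREFIX shown above.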

defaultDatasetBehavior
query-option

In a SPARQL query the FROM clause specifies the default graph of the dataset, and the FROM NAMED clause specifies the named graphs of the dataset (see Specifying RDF Datasets in the SPARQL 1.1 standard).

If a SPARQL query does not specify FROM or FROM NAMED then it is up to the implementation to choose a behaviour. AllegroGraph offers three possibilities. The table below indicates which triples are present in which part of the dataset for each possible behaviour:

                                        +--------------------------+  
             default dataset behaviour: |   all  |   rdf  | default|  
                                        + - - - -| - - - -| - - - -|  
dataset's default graph / named graphs: | DG  NG | DG  NG | DG  NG |  
                                        |        |        |        |  
       triple (s,p,o) in default graph: |  x     |  x     |  x     |  
       triple (s,p,o,g) in named graph: |  x  x  |     x  |        |   "x" means present  
                                        +--------+--------+--------+ 

The possible values are:

  • all - All triples are present in the dataset's default graph. Triples in a named graph are present in the dataset in that named graph. Triples in a named graph thus occur twice in the dataset: once in the default graph, and once in the named graph.
  • rdf - Triples in the default graph are present in the dataset's default graph. Triples in a named graph are present in the dataset in that named graph.
  • default - Triples in the default graph are present in the dataset's default graph. The dataset has no named graphs.
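The table and the three descriptions can be restated as a small function. This Python is a sketch of the rules only, under the assumption that a quad whose graph is None lives in the store's default graph; build_dataset is a hypothetical name and does not correspond to any AllegroGraph function.

```python
DEFAULT = None  # marker for "stored in the store's default graph"

def build_dataset(quads, behaviour):
    """Return (default_graph_triples, {graph: triples}) for one behaviour."""
    if behaviour not in ("all", "rdf", "default"):
        raise ValueError(f"unknown behaviour: {behaviour}")
    default_graph, named = [], {}
    for s, p, o, g in quads:
        in_named_graph = g is not DEFAULT
        # Default-graph triples are always visible; named ones only under "all".
        if not in_named_graph or behaviour == "all":
            default_graph.append((s, p, o))
        # Named graphs exist in the dataset under "all" and "rdf" only.
        if in_named_graph and behaviour in ("all", "rdf"):
            named.setdefault(g, []).append((s, p, o))
    return default_graph, named
```

Under all, a triple stored in a named graph shows up twice, once in each part of the dataset, which matches the two x marks in the table's first column.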

Example of specifying the query option value via a PREFIX:

PREFIX franzOption_defaultDatasetBehavior: <franz:all> 

The default value is: all

diskChunkRowCount
query-option

Specifies the number of solutions to keep in memory before writing temporary files.

This should be a number like 500000 or 100m. The larger the value, the more memory AllegroGraph will use during query processing. Smaller values can be more memory efficient but can also perform more slowly because there will be more I/O activity.

Note that this setting controls the memory used to hold completed solutions, not the memory used to hold intermediate solutions. See the chunkProcessingMemory option for more details.

Example of specifying the query option value via a PREFIX:

PREFIX franzOption_diskChunkRowCount: <franz:500000> 

The default value is: 500,000

enableDuplicateReduction
query-option

Controls whether or not the SPARQL engine reduces the number of duplicate triples it processes.

If the store has no duplicate triples (based on subject, predicate, object, and graph) then this option has no effect.

SPARQL is defined on sets of triples (i.e., no duplicates) but AllegroGraph can store duplicates which can lead to duplicate rows of bindings returned by a query.

If this option is set to true, then AllegroGraph will ignore consecutive duplicate triples returned from the storage layer. Since the storage layer makes no guarantees on the order of triples it returns, this can lead to different results from the same query.

Example of specifying the query option value via a PREFIX:

PREFIX franzOption_enableDuplicateReduction: <franz:true> 

Duplicate reduction defaults to off
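Why dropping only consecutive duplicates is order-dependent can be seen in a few lines. The Python below is an illustration of the idea, not the storage layer's code; the function name and the sample triples are made up.

```python
from itertools import groupby

def drop_consecutive_duplicates(triples):
    """Keep one triple from each run of identical neighbours."""
    return [triple for triple, _run in groupby(triples)]

a = ("s", "p", "o1")
b = ("s", "p", "o2")

adjacent = drop_consecutive_duplicates([a, a, b])     # duplicates are neighbours
interleaved = drop_consecutive_duplicates([a, b, a])  # duplicates are separated
```

When the two copies of a arrive next to each other, one is dropped; when another triple comes between them, both survive. The same stored data can therefore yield different row counts depending on the order in which triples are returned.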

logLineLength
query-option

Controls the length of query log lines.

The logLineLength query option limits the maximum length of each line of the query log. This can make the log easier to read at the cost of removing some information. Use zero to print the entirety of every log message.

See the logQuery option for more details.

Example of specifying the query option value via a PREFIX:

PREFIX franzOption_logLineLength: <franz:512> 

The default value is: 512

logQuery
query-option

Controls whether or not query execution details are logged.

logQuery can be 'no', 'yes', or 'onFailure'.

The length of the log lines can be limited by using the logLineLength query log option.

If logging is on, the query engine prints additional information to the AllegroGraph log file as it plans and executes a query. If logging is onFailure, then query log information is gathered but not emitted unless there is a query failure.

Logging on failure has a small cost especially when the amount of data logged is high (e.g., when chunkProcessingAllowed is turned on). We recommend setting the value to onFailure during development and then turning it to no for production.

Example of specifying the query option value via a PREFIX:

PREFIX franzOption_logQuery: <franz:no> 

The default value is: no

maximumSolutionsSize
query-option

Specifies an upper limit on the number of solutions that are allowed during query processing before a warning is logged.

Queries run best when the solution space is kept small. This warning is an indication that a query is generating many intermediate results. This is a normal part of query processing but can indicate that a query should be optimized.

Example of specifying the query option value via a PREFIX:

PREFIX franzOption_maximumSolutionsSize: <franz:100k> 

The default is to warn when the intermediate solution space is larger than 100,000,000 solutions

maximumValuesCountForService
query-option

This option limits the number of VALUES that AllegroGraph will send to a SPARQL endpoint when executing a SERVICE clause. Sending partial results to the endpoint can help it answer the query more quickly but if the number of partial results is very large, the cost of data transfer can offset the help of supplying the data.

If the number of VALUES exceeds the limit, the query will be sent to the endpoint with no VALUES supplied.

Example of specifying the query option value via a PREFIX:

PREFIX franzOption_maximumValuesCountForService: <franz:1048576> 

The default is to have no limit.
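
The all-or-nothing rule described for maximumValuesCountForService can be modelled in a few lines of Python. This is only a sketch of the documented behavior, not AllegroGraph's implementation; the function name values_to_send is hypothetical, and None is used here to stand for "no limit".

```python
def values_to_send(bindings, limit=None):
    """Return the bindings to ship with a SERVICE request.

    Models the documented rule: if the number of bindings exceeds the
    limit, none are sent. limit=None stands for "no limit".
    """
    if limit is not None and len(bindings) > limit:
        return []
    return list(bindings)


print(values_to_send(["a", "b", "c"], limit=2))  # over the limit
print(values_to_send(["a", "b"], limit=2))       # within the limit
```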

memoryExhaustionWarningPercentage
query-option

Specify how much system memory must be free for a query to continue.

If the query process is using more than this setting's percentage of total physical memory, then the query will be canceled. The default value is 90%.

Example of specifying the query option value via a PREFIX:

PREFIX franzOption_memoryExhaustionWarningPercentage: <franz:90.0> 

The default value is: 90.0

memoryLimit
query-option

Specifies the memory limit per query.

If a query tries to use more than this, it will be canceled.

Example of specifying the query option value via a PREFIX:

PREFIX franzOption_memoryLimit: <franz:8G> 

The default value is 85% of the physical memory on the server.

presentationTimeZone
query-option

The timezone in which xsd:dateTimes and xsd:times are serialized.

For example, if presentationTimeZone is "-02:00", then "2013-10-01T15:21:23+03:00" is serialized as "2013-10-01T10:21:23-02:00". Zoneless xsd:dateTimes and xsd:times are always presented without a timezone. This option has no effect on what is stored in the database. The allowed values are strings representing the timezone. The format of these strings is the same as in xsd:dateTimes. The special value "none" means that no conversion will take place.

Example of specifying the query option value via a PREFIX:

PREFIX franzOption_presentationTimeZone: <franz:-5:00> 

The default is set to none.
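
The conversion described for presentationTimeZone can be reproduced with Python's standard datetime module. The sketch below is illustrative only: the function names are hypothetical, it handles xsd:dateTime values but not xsd:time, and it assumes offsets are written as a sign followed by hours and minutes (for example "-02:00").

```python
from datetime import datetime, timedelta, timezone


def parse_offset(text):
    """Turn an offset such as "-02:00" into a datetime.timezone."""
    sign = -1 if text.startswith("-") else 1
    hours, minutes = text.lstrip("+-").split(":")
    return timezone(sign * timedelta(hours=int(hours), minutes=int(minutes)))


def present(value, presentation_time_zone):
    """Re-serialize an xsd:dateTime lexical form in the requested zone."""
    parsed = datetime.fromisoformat(value)
    # Zoneless values and the special value "none" are left untouched.
    if parsed.tzinfo is None or presentation_time_zone == "none":
        return value
    return parsed.astimezone(parse_offset(presentation_time_zone)).isoformat()


print(present("2013-10-01T15:21:23+03:00", "-02:00"))
# → 2013-10-01T10:21:23-02:00
```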

queryTimeout
query-option

Specifies a query timeout value in seconds.

Note that the timeout is not an interrupt; AllegroGraph checks for query timeout relatively infrequently so that a query can run for many seconds longer than the specified timeout. This is especially true for operations involving reasoning or non-triple-pattern based queries like free-text indexing or SNA path planning operators.

Setting the timeout to zero is the same as having no timeout.

Example of specifying the query option value via a PREFIX:

PREFIX franzOption_queryTimeout: <franz:30> 

The default is to have no query timeout. I.e., queries will run until complete.

reorderDuringExecution
query-option

Controls whether or not AllegroGraph interleaves query execution and triple-pattern selection.

If no, then AllegroGraph will perform all reordering during query planning. If yes, then AllegroGraph will defer reordering until query execution time. In many cases, the additional information available at execution time can enhance query performance.

Note that interleaving reordering is not always a win because performing all ordering at query planning time allows for the query engine to introduce joins which can sometimes enhance query performance.

See the clauseReorderer option for additional information.

The possible values are:

  • yes - turn the option on
  • no - turn the option off

Example of specifying the query option value via a PREFIX:

PREFIX franzOption_reorderDuringExecution: <franz:no> 

The default value is: no

serviceTimeout
query-option

The number of seconds to wait before a remote query times out.

This will also have an effect on SPARQL Federated query (i.e., using the SERVICE clause).

Example of specifying the query option value via a PREFIX:

PREFIX franzOption_serviceTimeout: <franz:120> 

The default value is: 120

slowQueryLogThreshold
query-option

Specifies query duration threshold in milliseconds that triggers slow query logging.

If a query's runtime exceeds the threshold, it will be logged either to the file specified by the slowQueryLogFile configuration setting or to agraph.log.

Example of specifying the query option value via a PREFIX:

PREFIX franzOption_slowQueryLogThreshold: <franz:1000> 

The default is to not log slow queries.

solrQueryLimit
query-option

Specifies the maximum number of results to return from a given SOLR query.

Example of specifying the query option value via a PREFIX:

PREFIX franzOption_solrQueryLimit: <franz:100> 

The default value is: 100

temporaryFilesystemSpaceLimit
query-option

Specifies the maximum amount of temporary file space that may be used by a query.

If a query tries to use more file space than this, it will be canceled.

Queries write intermediate results to the filesystem when they will not fit in memory. With a huge query it is possible for such temporary files to fill the filesystem. In order to prevent this, the temporaryFilesystemSpaceLimit query option may be set.

The minimum allowable value for this setting is 2 gigabytes.

Example of specifying the query option value via a PREFIX:

PREFIX franzOption_temporaryFilesystemSpaceLimit: <franz:974780416> 

The default value is the smaller of 8 gigabytes and one quarter of the filesystem space available at the time the query begins.
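
As a worked illustration of the default for temporaryFilesystemSpaceLimit, the Python sketch below computes the smaller of 8 gigabytes and one quarter of the available space. It is an assumption-laden model rather than AllegroGraph's code: the function name is hypothetical, a gigabyte is taken to be 1024^3 bytes, and "available filesystem space" is approximated by the free space that shutil.disk_usage reports for the current directory.

```python
import shutil

GIGABYTE = 1024 ** 3  # assumption: binary gigabytes


def default_temp_space_limit(available_bytes):
    """Smaller of 8 gigabytes and one quarter of the available space."""
    return min(8 * GIGABYTE, available_bytes // 4)


# With 20 gigabytes free the quarter rule applies: 5 gigabytes.
print(default_temp_space_limit(20 * GIGABYTE) // GIGABYTE)

# Applied to the filesystem holding the current directory.
print(default_temp_space_limit(shutil.disk_usage(".").free))
```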

trustEncodedDatatypesForRangeQueries
query-option

If yes, then range queries will not scan typed literal triples.

This means that only encoded triples will be considered. The only reason to set this option to no is if your triple-store contains typed literals that are not encoded (i.e., that are in the string-table), which could happen if you disabled AllegroGraph's datatype mapping.

The possible values are:

  • yes - turn the option on
  • no - turn the option off

Example of specifying the query option value via a PREFIX:

PREFIX franzOption_trustEncodedDatatypesForRangeQueries: <franz:yes> 

The default value is: yes

trustPredicateTypeMappingsForRangeQueries
query-option

If yes, then predicate type mappings will be used for range queries.

This means that any triples whose encoded data-type does not match their predicate mapping will be ignored. This could happen only if a predicate mapping was added or changed after triples had been added.

The possible values are:

  • yes - turn the option on
  • no - turn the option off

Example of specifying the query option value via a PREFIX:

PREFIX franzOption_trustPredicateTypeMappingsForRangeQueries: <franz:yes> 

The default value is: yes

usePredicateConstrainedUpiTypeInformation
query-option

Use subject and object UPI type-codes to improve constraint inference.

If yes, then the query engine will gather information about the subjects and objects associated with particular predicates. This can be used in constraint analysis and query transformations. As an example, suppose we have a query like:

?one ex:date ?date1 .  
?two ex:date ?date2 .  
filter( ?date1 > ?date2 ) 

If there is no predicate type-mapping, then the query engine cannot make any assumptions about the range comparison. If there is a predicate type-mapping and trustPredicateTypeMappingsForRangeQueries is yes, then the engine knows that the filter can be treated as a date comparison. If usePredicateConstrainedUpiTypeInformation is yes, then the query engine will check the triple-store to determine which UPI type-codes the subjects and objects associated with ex:date can take on. If the objects of ex:date only have, e.g., UPI type-code +rdf-date+, then the filter will be handled more efficiently.

The type-code information is cached, but if the store is changing rapidly, then the cache will often be invalid and this computation will slightly add to the cost of queries.

The possible values are:

  • yes - turn the option on
  • no - turn the option off

Example of specifying the query option value via a PREFIX:

PREFIX franzOption_usePredicateConstrainedUpiTypeInformation: <franz:yes> 

The default value is: yes

usePredicateConstrainedXsdTypeInformation
query-option

Use typed-literal XSD types to improve constraint inference.

Similar to usePredicateConstrainedUpiTypeInformation but involves a scan of all typed-literals (which can be expensive). This is currently not cached!

The possible values are:

  • yes - turn the option on
  • no - turn the option off

Example of specifying the query option value via a PREFIX:

PREFIX franzOption_usePredicateConstrainedXsdTypeInformation: <franz:no> 

The default value is: no

userAttributes
query-option

Specify the user attributes to use while evaluating the query.

The attributes must be specified in URL encoded JSON format. So the example below is using the URL encoded form of {"access-level": "medium", "department": "hr"}.

Example of specifying the query option value via a PREFIX:

PREFIX franzOption_userAttributes: <franz:%7B%22access-level%22%3A%20%22medium%22%2C%20%22department%22%3A%20%22hr%22%7D> 

The default is empty.
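
The URL-encoded form used for userAttributes can be produced with Python's standard library. This is a minimal sketch: it assumes the attributes are serialized with json.dumps using its default separators and that every reserved character is percent-encoded, which reproduces the encoded string shown in the example for this option.

```python
import json
from urllib.parse import quote

attributes = {"access-level": "medium", "department": "hr"}

# safe="" percent-encodes every reserved character, including "/".
encoded = quote(json.dumps(attributes), safe="")
prefix = f"PREFIX franzOption_userAttributes: <franz:{encoded}>"
print(prefix)
```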

Query Warnings

SPARQL is relatively lax when it comes to accepting and evaluating queries. For example, this query is valid SPARQL but it is probably not what was intended, because ?variableIsNotBound is not actually bound by the triple patterns in the rest of the query:

select ?variableIsNotBound {  
  ?s ?p ?o .  
} 

A more typical example could be caused by a typo. For example, changing the case of a variable:

select ?subjectOne ?class {  
 ?subjectone rdf:type ?Class .  
} 

Neither ?subjectOne nor ?class will be bound since the query uses ?subjectone and ?Class. When AllegroGraph plans and executes a query, it will detect problems like the above and generate query warnings. The cancelQueryOnWarnings query option can be used to stop execution immediately when any warnings are found.
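
To make the case-sensitivity problem concrete, the Python sketch below lists projected variables that never occur in the body of a query. It is a deliberately crude illustration and not how AllegroGraph detects the problem: it uses a regular expression instead of a SPARQL parser, treats everything before the first "{" as the projection, and ignores subqueries, SELECT *, and projected expressions. The function name is hypothetical.

```python
import re

# SPARQL variables start with "?" or "$"; names are case-sensitive.
VARIABLE = re.compile(r"[?$](\w+)")


def unbound_projected_variables(query):
    """Return projected variable names that never appear in the query body."""
    head, _, body = query.partition("{")
    projected = VARIABLE.findall(head)
    bound = set(VARIABLE.findall(body))
    return [name for name in projected if name not in bound]


typo_query = """select ?subjectOne ?class {
 ?subjectone rdf:type ?Class .
}"""
print(unbound_projected_variables(typo_query))
# → ['subjectOne', 'class']
```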

The following is a list of the currently defined warnings:

warn-bgp-cross-product
query-warning

This warning indicates that one or more basic graph patterns (BGPs) in the query create a cross product. I.e., there are patterns in the query that have disjoint sets of variables which will cause the SPARQL engine to find all possible matches between the sets which can lead to very large solution sets. For example:

SELECT * {  
  ?a ?b ?c .  
  ?d ?e ?f .  
} 

Since the first triple-pattern and the second triple-pattern share no variables, the above query will find all possible combinations of each pair of triples in the underlying repository.
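
The idea of "disjoint sets of variables" can be illustrated by grouping triple patterns that are connected through shared variables; more than one group means a cross product. The Python sketch below is only a model of that idea under simplifying assumptions (patterns are tuples of strings and variables are the terms that start with "?"). The function names are hypothetical and this is not AllegroGraph's planner logic.

```python
def variable_groups(patterns):
    """Group triple patterns that are connected through shared variables."""
    groups = []  # each entry is a (variables, patterns) pair
    for pattern in patterns:
        variables = {term for term in pattern if term.startswith("?")}
        if not variables:
            # A fully constant pattern binds nothing, so it cannot
            # multiply the number of solutions.
            continue
        merged_vars, merged_patterns = set(variables), [pattern]
        remaining = []
        for group_vars, group_patterns in groups:
            if group_vars & variables:
                # The new pattern links to this group: merge them.
                merged_vars |= group_vars
                merged_patterns = group_patterns + merged_patterns
            else:
                remaining.append((group_vars, group_patterns))
        groups = remaining + [(merged_vars, merged_patterns)]
    return [group_patterns for _, group_patterns in groups]


def has_cross_product(patterns):
    """True when the patterns fall into more than one disjoint group."""
    return len(variable_groups(patterns)) > 1


print(has_cross_product([("?a", "?b", "?c"), ("?d", "?e", "?f")]))    # → True
print(has_cross_product([("?x", ":p1", "?o"), ("?o", ":p2", "?y")]))  # → False
```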

warn-dataset-graph-not-found
query-warning

This warning is signaled when a query specifies a DATASET and one or more graphs in its FROM or FROM NAMED portions are not in the repository.

For example, if <http://example#22> is not in the repository, then these queries will signal the warning:

SELECT *  
FROM <http://example#1>  
FROM <http://example#22> {  
  ?s ?p ?o .  
}  
 
SELECT *  
FROM NAMED <http://example#22> {  
  ?s ?p ?o .  
} 
warn-differing-limits
query-warning
This warning is signaled when the query string contains a LIMIT and a limit is also specified in the call to SPARQL (e.g., from the parameters to run-sparql or in the HTTP request), and these two values do not match. When this happens, the smaller of the two values is used.
warn-differing-offsets
query-warning
This warning is signaled when the query string contains an OFFSET and an offset is also specified in the call to SPARQL (e.g., from the parameters to run-sparql or in the HTTP request). In this case, the sum of the two offsets will be used.
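
The rules described for warn-differing-limits and warn-differing-offsets (the smaller LIMIT wins, the OFFSETs are added) can be written down as a small Python sketch. The function names are hypothetical, and None is used here to mean "not specified"; this models the documented outcome only.

```python
def effective_limit(query_limit, call_limit):
    """Smaller of the two limits; None means the limit was not given."""
    limits = [v for v in (query_limit, call_limit) if v is not None]
    return min(limits) if limits else None


def effective_offset(query_offset, call_offset):
    """Sum of the two offsets; a missing offset counts as zero."""
    return (query_offset or 0) + (call_offset or 0)


print(effective_limit(100, 10))  # → 10
print(effective_offset(20, 5))   # → 25
```
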
warn-empty-bind-clause
query-warning

This warning is signaled when a SPARQL query uses BIND in such a way that no solutions are available to be processed. For example, the BIND in this query appears before anything else so there is nothing for it to do and ?X will always be unbound:

SELECT * {  
  BIND(?object as ?X)  
  ?s ?p ?object .  
} 
warn-graph-not-in-dataset
query-warning

This warning is signaled when a query uses a DATASET and the graph of a GRAPH clause is not in its FROM NAMED portion. For example:

SELECT *  
FROM NAMED <http://graph1>  
FROM NAMED <http://graph2> {  
  GRAPH <http://example#22> {  
    ?s ?p ?o  
  }  
} 
warn-impossible-atomic-constraint
query-warning
This warning occurs when a FILTER expression can be determined to be impossible at plan time. For example, FILTER ( false ) can never succeed.
warn-impossible-freetext-query
query-warning
This warning is signaled for queries that use AllegroGraph's freetext search Magic Properties when it can be determined that such a query cannot succeed. For example, if the search expression contains only stop words (which are not indexed), then the expression cannot match.
warn-impossible-in-constraint
query-warning
This warning is signaled when an IN expression has an empty list. For example, FILTER( ?x IN ( ) ).
warn-impossible-language-literal-constraint
query-warning

This warning indicates that a FILTER expression puts invalid constraints on the language of a variable. For example:

SELECT * { ?s ?p ?o . FILTER( LANG(?o) = 'es' && LANG(?o) = 'en') } 

cannot succeed because LANG(?o) cannot be both 'es' and 'en'.

warn-impossible-membership-constraint
query-warning

This warning is signaled when a query has a filter with multiple alternatives, and it is known at plan time that none of the alternatives will succeed.

Alternatives can occur in various forms:

{ ?s (<ex:foo> | <ex:bar>) ?o }      -- matches <ex:foo> or <ex:bar>  
FILTER(?var in (<ex:foo>, <ex:bar>)) -- matches <ex:foo> or <ex:bar>  
FILTER(str(?var) = 'ex:foo')         -- matches 'ex:foo' or <ex:foo> 
warn-impossible-range-constraint
query-warning
This warning is signaled when a range constraint cannot succeed because the set of possible values is empty. For example, this FILTER can never succeed: FILTER( ?o > 3 && ?o < 0 ).
warn-impossible-str-literal-constraint
query-warning

This warning indicates that a FILTER expression cannot succeed because it is comparing a string with a non-string. E.g.,

SELECT * { ?s ?p ?o . FILTER( STR(?o) = 45 ) } 
warn-invalid-triples-generated
query-warning

This warning is signaled at query time when a CONSTRUCT or UPDATE template generates invalid RDF triples. For example, this query will emit no triples because ?s takes on only literal bindings and these are not valid in the subject position:

CONSTRUCT {  
  ?s a example:Car .  
} WHERE {  
  VALUES ?s { 'mazda 3' 'ford pinto' 'bmw 300i' }  
} 
warn-invalid-triples-in-template
query-warning
This warning is signaled when a CONSTRUCT or UPDATE template has triple patterns in it that are not valid RDF. For example, a triple pattern may not have a literal in the subject position.
warn-join-cross-product
query-warning

This warning indicates that the query algebra contains one or more cross products. I.e., there are portions of the algebra that have disjoint sets of variables which will cause the SPARQL engine to find all possible matches between the sets which can lead to very large solution sets. For example:

SELECT * {  
  ?a a ?type .  
  VALUES ?FOO { <ex://a> <ex://b> }  
} 

Since the triple-pattern and the VALUES clause share no variables, the above query will find all possible combinations from the two sets.

warn-limit-in-ask
query-warning
This warning is signaled when an ASK query specifies a LIMIT different from 1.
warn-literal-graph-clause
query-warning

This warning is signaled if a GRAPH clause specifies a literal for the GRAPH. For example:

SELECT * {  
  GRAPH ?g {  
    ?s ?p ?o .  
  }  
  BIND( 'literal' as ?g )  
} 
warn-literal-variable-required
query-warning
This warning is signaled when a FILTER expression cannot succeed because a variable in it is known to be a resource or blank node and the expression requires it to be a literal. For example ?s ?p ?o . FILTER( ?s = 34 ) must fail because ?s is bound to the subject of a triple and subjects cannot be literals.
warn-offset-in-ask
query-warning
This warning is signaled when an ASK query specifies an OFFSET.
warn-order-by-in-ask
query-warning
This warning is signaled when an ASK query specifies ORDER BY.
warn-range-constraint-cannot-be-satisfied
query-warning
This warning is signaled when a range constraint cannot succeed because AllegroGraph has determined that the repository does not contain any matching triples.
warn-range-filter-predicate-mapping-mismatch
query-warning

This warning is signaled when a predicate has a type mapping which does not match the datatype of the values used in a range FILTER. For example, if <http://example#age> has a mapping to xsd:byte, then this query will signal the warning:

SELECT * {  
  ?s <http://example#age> ?age .  
  FILTER( ?age > '2001-10-15'^^xsd:date )  
} 

because the FILTER cannot succeed.

warn-repeated-projected-variables-ignored
query-warning
warn-service-variable-is-unbound
query-warning
This warning indicates that the query contains a SERVICE clause with a variable HOST and the HOST variable is not bound by the rest of the query.
warn-sparql-type-errors
query-warning
This warning is signaled when SPARQL type errors are encountered during query evaluation. Examples of type errors include trying to add a number to a date, trying to compare a string with a time, etc.
warn-unknown-constants
query-warning

This warning is signaled when a query contains a constant value (e.g., an IRI or literal value) that is not present in the repository. For example, if <ex:pred> is not used as a predicate:

SELECT * { ?s <ex:pred> ?o }  -- gives: "No such predicate <ex:pred>" 
warn-unknown-variables
query-warning

This warning is signaled when it can be determined that a variable used in an expression is not bound anywhere in the query. Examples include:

SELECT * { ?s ?p ?o . } ORDER BY ?missing  
 
SELECT * { ?s ?p ?o . FILTER( ?missing > 5 ) }  
 
CONSTRUCT { ?missing ?p ?o } WHERE { ?s ?p ?o } 

and so on.

warn-unused-bind-variable
query-warning
This warning is signaled when a variable in a BIND expression is not used elsewhere in the query.

SPARQL functions

AllegroGraph supports the entire standard set of functions specified in the W3C SPARQL reference. It also supports several XPath constructor functions, XPath mathematical functions, and a number of custom functions designed to help with using AllegroGraph's extensions such as Geospatial and Social Network Analysis.

XPath Constructor Functions

AllegroGraph supports the standard SPARQL casting operations. For details, refer to the SPARQL reference.

Functions on Dates and Times

Hash Functions

Functions on Numerics

SPARQL Operators

These functions are described in detail in the Operator Mappings section of the W3C SPARQL reference.

Functions on RDF terms

Functional Forms

Functions on Strings

Supported XPath functions

AllegroGraph also supports several XPath functions:

Functions on Dates and Times

XPath Mathematical Functions

For additional details on the XPath mathematical functions, see https://www.w3.org/TR/xpath-functions-3/.

Miscellaneous functions

Functions on Numerics

Functions on Strings

AllegroGraph extensions

AllegroGraph supports several functions above and beyond those defined by the SPARQL standard. The additional functions are named by URI (or prefixed name) and can appear anywhere in a SPARQL expression (e.g., in a BIND, a FILTER, an ORDER BY, etc.).

nD Geospatial

These functions are useful in working with AllegroGraph nD-geospatial facilities:

2D Geospatial

These functions are useful in working with the older AllegroGraph 2D Geospatial facilities:

2D Geospatial (deprecated)

These deprecated 2D Geospatial functions are still supported but you should change your queries to use the above functions instead.

Social Networking (SNA) Related functions

These functions can help when using AllegroGraph's SNA extensions.

Miscellaneous other functions

These functions help connect SPARQL to various other AllegroGraph features.

Variables

*sparql-log-stream*
variable
The log stream to which SPARQL verbose output is written.
*sparql-table-width*
variable
This variable specifies how wide to draw the results table in characters.
*default-dataset-behavior*
variable

Controls how SPARQL behaves when no dataset is specified.

If nil, then the default value of the defaultDatasetBehavior query property will be used. Otherwise, this value will be used.

See the defaultDatasetBehavior query option for details.

*dataset-load-function*
variable
Set this to a function of two or three arguments, (uri db &optional type), to load dataset parameters before a query is executed.

Function index


Footnotes