Introduction
The Lisp client's run-sparql function can accept a pre-built query plan, a string in the SPARQL query syntax, or a Lisp s-expression that encodes the query. The s-expression can be created via a call to parse-sparql or by hand. This document describes the s-expression format.
Note that the s-expression syntax is stable but is still subject to change as the query engine it supports improves and the SPARQL language continues to mature. Franz intends to provide backward compatibility for all existing features.
Example
As an example, the s-expression for query 8 from the SP2 benchmark:
SELECT DISTINCT ?name
WHERE {
?erdoes rdf:type foaf:Person .
?erdoes foaf:name 'Paul Erdoes'^^xsd:string .
{
?document dc:creator ?erdoes .
?document dc:creator ?author .
?document2 dc:creator ?author .
?document2 dc:creator ?author2 .
?author2 foaf:name ?name
FILTER (?author!=?erdoes &&
?document2!=?document &&
?author2!=?erdoes &&
?author2!=?author)
} UNION {
?document dc:creator ?erdoes.
?document dc:creator ?author.
?author foaf:name ?name
FILTER (?author!=?erdoes)
}
}
is
(sparql.parser:sparql :select :vars (?name) :where
(:join
(:join nil
(:bgp #(?erdoes !rdf:type !foaf:Person)
#(?erdoes !foaf:name !"Paul Erdoes"^^<http://www.w3.org/2001/XMLSchema#string>)))
(:union
(:filter
(:join nil
(:bgp
#(?document !dc:creator ?erdoes)
#(?document !dc:creator ?author)
#(?document2 !dc:creator ?author)
#(?document2 !dc:creator ?author2)
#(?author2 !foaf:name ?name)))
(and (sparql.sop:!= ?author ?erdoes)
(and (sparql.sop:!= ?document2 ?document)
(and (sparql.sop:!= ?author2 ?erdoes)
(sparql.sop:!= ?author2 ?author)))))
(:filter
(:join nil
(:bgp
#(?document !dc:creator ?erdoes)
#(?document !dc:creator ?author)
#(?author !foaf:name ?name)))
(sparql.sop:!= ?author ?erdoes))))
:distinct :distinct)
This is the raw output of AllegroGraph's parse-sparql function [1]. There are many possible s-expressions for a given query. Before execution, the query engine does additional analysis and simplification. This means that you can be flexible in the way you generate queries and let the query engine convert them into its preferred form.
Every SPARQL s-expression starts with a description of the query type and verb:
(query-type query-verb &key)
where query-type can be either sparql.parser:sparql-update or sparql.parser:sparql and query-verb must be one of:
- :ask
- :construct
- :describe
- :select
- :update
The rest of the s-expression is a keyword list whose permitted values vary depending on the verb. In the example above, the :vars argument specifies which variables to project and the :where argument describes the query plan to be executed. It shows that the first step of the query is a :join between another :join and a :union.
Note that query-verb and query-type are redundant but both are kept for historical reasons.
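The head-plus-keyword-list shape described above can be taken apart with ordinary list operations. The following is a minimal sketch in portable Common Lisp and is not part of the AllegroGraph API: the name split-query-form is hypothetical, and the query type is compared by symbol name so that the sketch does not depend on the sparql.parser package being loaded.

```lisp
(defparameter *query-verbs* '(:ask :construct :describe :select :update)
  "The query verbs listed above.")

(defun split-query-form (form)
  "Return the verb and the keyword list of a SPARQL s-expression.
Hypothetical helper for illustration; signals an error on a malformed head."
  (destructuring-bind (query-type query-verb &rest parameters) form
    (unless (member (symbol-name query-type) '("SPARQL" "SPARQL-UPDATE")
                    :test #'string-equal)
      (error "Unknown query type: ~s" query-type))
    (unless (member query-verb *query-verbs*)
      (error "Unknown query verb: ~s" query-verb))
    (values query-verb parameters)))

;; The plain symbol SPARQL stands in for sparql.parser:sparql here.
(multiple-value-bind (verb parameters)
    (split-query-form '(sparql :select :vars (?name) :distinct :distinct))
  ;; Individual parameters can be read from the keyword list with GETF.
  (list verb (getf parameters :vars) (getf parameters :distinct)))
```

Because the tail is an ordinary keyword list, getf is enough to read individual parameters such as :vars or :where.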
Filters and Expressions
Many query clauses involve expressions that calculate values or that are used to filter the result set. These are also s-expressions that use a restricted set of functions and constants. For example, the SPARQL string filter "?a + ?b < 34" becomes the filter expression:
(< (+ ?a ?b) !"34"^^<http://www.w3.org/2001/XMLSchema#integer>)
The complete list of expressions is outlined below.
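To make the prefix form of filter expressions concrete, here is a small sketch of an evaluator for such expressions. It is an illustration only and makes several assumptions: evaluate-expression is a hypothetical name, bindings are modeled as an alist of (variable . value) pairs, plain Lisp numbers stand in for typed literals, and only a handful of operators are handled. It does not reproduce the query engine's actual evaluation or SPARQL's error semantics.

```lisp
(defun evaluate-expression (expression bindings)
  "Evaluate EXPRESSION against BINDINGS, an alist of (variable . value).
Illustrative only: supports a few operators on plain Lisp numbers."
  (cond ((numberp expression) expression)
        ((symbolp expression)
         ;; A symbol is treated as a query variable and looked up.
         (let ((binding (assoc expression bindings)))
           (unless binding
             (error "Unbound variable: ~s" expression))
           (cdr binding)))
        (t
         (destructuring-bind (operator &rest arguments) expression
           ;; Arguments are evaluated eagerly, left to right.
           (let ((results (mapcar (lambda (argument)
                                    (evaluate-expression argument bindings))
                                  arguments)))
             (ecase operator
               (+ (apply #'+ results))
               (< (apply #'< results))
               (> (apply #'> results))
               (and (every #'identity results))
               (or (some #'identity results))
               (not (not (first results)))))))))

;; ?a + ?b < 34 with ?a = 10 and ?b = 20 is true.
(evaluate-expression '(< (+ ?a ?b) 34) '((?a . 10) (?b . 20)))
```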
Packages
Note that the SPARQL parser places all query variables into the query.variables package for consistency during parsing. Your s-expressions do not need to do this; you are free to use any package you like.
ASK, CONSTRUCT, DESCRIBE and SELECT
These four SPARQL verbs share the same basic query algebra (though some verbs have their own unique parameters).
The following keyword parameters and arguments are allowed in these verbs' s-expressions (these will be described in more detail below):
- aggregates aggregate-assignment-list
- construct-pattern triple-template
- default-base nil | URI
- distinct nil | :distinct | :reduced
- from dataset-list
- group expression-list
- having filter-expression
- in-line-data in-line-data
- limit number
- offset number
- order expression-list
- query-options option-list
- targets URI-list
- vars nil | variable-list
- where query-algebra
distinct, from, limit, offset, order
These forms control the number and order of the solutions as well as the SPARQL dataset used for the query and whether or not the solutions should be distinct (or reduced). For example a query like:
select distinct * { ?A ?b ?c } order by ?c desc(?a) limit 10 offset 20
would be represented by the s-expression:
(sparql.parser:sparql :select :vars (?A ?b ?c) :where (:bgp #(?A ?b ?c))
:distinct :distinct
:order ((:asc ?c) (:desc ?a))
:limit 10
:offset 20)
- :distinct
- This can be nil or one of the keywords :reduced or :distinct. It controls whether the query returns all results, is allowed to omit duplicate results, or must omit duplicates.
- :from dataset-list
- from describes the SPARQL dataset to use in the query. It takes the form of a dataset-list, which is a list of pairs where each pair has :from or :from-named as its car and the name of the graph to include in the dataset (either a UPI or a future-part) as its cdr.
- :limit number
- limit indicates the maximum number of solutions to emit.
- :offset number
- offset indicates how many solutions to skip before beginning to return them.
- :order expression-list
- order controls the order of the results returned from the query. It is a list of lists. The first element of each sub-list must be either :asc or :desc (for ascending or descending). The second element of the sub-list is a SPARQL expression.
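The interaction of offset and limit can be sketched in a few lines: solutions are skipped first and the limit is applied to what remains. This is an illustration over a plain Lisp list, not the engine's implementation, and apply-offset-and-limit is a hypothetical name.

```lisp
(defun apply-offset-and-limit (solutions &key (offset 0) limit)
  "Skip OFFSET solutions, then return at most LIMIT of the remainder.
A LIMIT of nil means no limit."
  (let* ((total (length solutions))
         (start (min offset total))
         (end (if limit
                  (min (+ start limit) total)
                  total)))
    (subseq solutions start end)))

;; Skips the first two solutions and returns the next three.
(apply-offset-and-limit '(s1 s2 s3 s4 s5 s6) :offset 2 :limit 3)
```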
aggregates, group, having
These forms control aggregation. As an example, this SPARQL query:
select ?account (count(?order) as ?count) (sum(?total) as ?grand) {
?account :hasOrder ?order .
?order :hasTotal ?total .
} group by ?account
having (?grand > 10000)
could be expressed as:
(sparql.parser:sparql :select :vars (?account ?count ?grand)
:where
(:bgp
#(?account !<eh://hasOrder> ?order)
#(?order !<eh://hasTotal> ?total))
:aggregates
((?count (:count ?order))
(?grand (:sum ?total))
(?account (:identity ?account)))
:group (?account)
:having ((> ?grand !"10000"^^<http://www.w3.org/2001/XMLSchema#integer>)))
- :aggregates aggregate-expression-list
- Aggregation is described by an ordered list of instructions. Each instruction consists of a list in the form (variable-name (instruction expression &key distinct)). The instruction can be one of :_assign, :avg, :count, :group_concat, :identity, :max, :min, :sample, or :sum. Most of these correspond to the obvious SPARQL aggregation operator. :identity is used for variables from the group by expression and means that the value will come from the group. :_assign is used to handle temporary aggregation variables introduced by the parser.
- :group expression-list
- This is simply a list of expressions that will be evaluated to determine the grouping.
- :having filter-expression
- This is a filter expression (which can include aggregation operators) used to filter out solutions.
Consider a modified version of the query above where we now use an aggregate expression in the having clause:
select (sum(?total) as ?grand) { ?account :hasOrder ?order . ?order :hasTotal ?total . } group by ?account having (sum(?total) > 10000)
This will parse as:
(sparql.parser:sparql :select :vars (?grand) :where
(:bgp
#(?account !<eh://hasOrder> ?order)
#(?order !<eh://hasTotal> ?total))
:aggregates
((?@aggregate-1 (:sum ?total))
(?account (:identity ?account))
(?grand (:_assign ?@aggregate-1)))
:group (?account)
:having ((> ?@aggregate-1 !"10000"^^<http://www.w3.org/2001/XMLSchema#integer>)))
Here the expression in the having clause has been replaced by a newly introduced variable, and :_assign is used to bind that variable to the query variable ?grand.
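The effect of the :aggregates, :group and :having parameters in the first example of this section can be sketched over plain data. This is an illustration of the semantics only, under stated assumptions: rows are modeled as alists of (variable . value) pairs, plain Lisp values stand in for UPIs and typed literals, and the function names are hypothetical.

```lisp
(defun value-of (variable row)
  "Look up VARIABLE in ROW, an alist of (variable . value)."
  (cdr (assoc variable row)))

(defun group-rows (rows variable)
  "Return an alist mapping each value of VARIABLE to the rows that share it."
  (let ((groups '()))
    (dolist (row rows (nreverse groups))
      (let* ((key (value-of variable row))
             (group (assoc key groups :test #'equal)))
        (if group
            (push row (cdr group))
            (push (list key row) groups))))))

(defun summarize-orders (rows threshold)
  "Group ROWS by ?account, compute ?count and ?grand per group,
and keep the groups whose ?grand exceeds THRESHOLD (the having filter)."
  (loop for (account . members) in (group-rows rows '?account)
        for order-count = (length members)
        for grand = (reduce #'+ members
                            :key (lambda (row) (value-of '?total row)))
        when (> grand threshold)
          collect (list (cons '?account account)
                        (cons '?count order-count)
                        (cons '?grand grand))))

;; Account A has two orders totalling 11000; account B totals only 700.
(summarize-orders '(((?account . a) (?order . o1) (?total . 6000))
                    ((?account . a) (?order . o2) (?total . 5000))
                    ((?account . b) (?order . o3) (?total . 700)))
                  10000)
```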
construct-pattern, targets
- :construct-pattern triple-pattern
- construct-pattern is used only by SPARQL CONSTRUCT queries and is a list of triple-pattern arrays (as in a BGP). For example, this construct query:
construct { ?s rdf:type <eh://fruit> . ?s ?p ?o } where { ?s a <eh://fruityThing> . ?s ?p ?o . }
would be equivalent to the following s-expression:
(sparql.parser:sparql :construct :vars
(?o ?p ?s)
:where
(:bgp #(?s !rdf:type !<eh://fruityThing>) #(?s ?p ?o))
:construct-pattern
(#(?s !rdf:type !<eh://fruit>)
#(?s ?p ?o)))
- :targets list
- targets is used only by SPARQL DESCRIBE. It must be a list of IRIs (strings, UPIs or future-parts). This s-expression, for example, would be used to DESCRIBE two URIs:
(sparql.parser:sparql :describe :vars nil :where (:bgp) :targets (!<http://www.example.com#one> !<http://www.example.com#two>))
in-line-data
The in-line-data parameter provides the data from the outermost VALUES expression. It consists of two lists. The first list is an ordered list of the variables involved and the second is a list of lists of the values for these variables. nil is used for an undefined value. For example, the values clause:
values (?account ?order) { (undef 2) (3 4) }
would be represented by the s-expression:
:in-line-data
((?account ?order)
((nil
!"2"^^<http://www.w3.org/2001/XMLSchema#integer>)
(!"3"^^<http://www.w3.org/2001/XMLSchema#integer>
!"4"^^<http://www.w3.org/2001/XMLSchema#integer>)))
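One way to read the two-list structure is as a compact table: the first list gives the column names and each inner list is one row. The sketch below expands such a structure into one binding alist per row, leaving out undefined (nil) values. It is an illustration only; in-line-data-bindings is a hypothetical name and plain integers stand in for the typed literals shown above.

```lisp
(defun in-line-data-bindings (in-line-data)
  "Convert (VARIABLES ROWS) into a list of alists, one per row.
Variables whose value is nil (undefined) are left unbound."
  (destructuring-bind (variables rows) in-line-data
    (loop for row in rows
          collect (loop for variable in variables
                        for value in row
                        when value
                          collect (cons variable value)))))

;; The first row binds only ?order; the second binds both variables.
(in-line-data-bindings '((?account ?order)
                         ((nil 2)
                          (3 4))))
```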
query-options
query-options is an association list of values that control various switches during query execution. These include the query engine, whether or not to use Chunk-at-a-Time processing, and more. The reference guide includes more information on the available query options.
vars
A list of the variables projected by the query. If omitted, then all variables will be returned in an unspecified order [2].
default-base
Used as the default-base by the SPARQL uri function when constructing URIs.
where
The where clause describes the body of the query and is the most complex parameter. It is a tree of commands where each command is named by the head of its list and obtains its parameters from the tail. The commands are:
- :bgp triple-pattern
- :bind assignments query
- :exists left-query right-query
- :filter query filter
- :geo options triple-pattern &key where
- :graph graph query
- :in-line-data in-line-data query
- :join left-query right-query &optional filter
- :left-join left-query right-query &optional filter
- :minus left-query right-query
- :not-exists left-query right-query
- :select &key vars distinct where order limit offset from group having bind aggregates
- :service &key host query-string silent
- :union left-query right-query
SPARQL queries work by obtaining bindings for variables and then combining these bindings in different ways. The :bgp, :bind, and :in-line-data clauses generate bindings and all the rest combine them or remove them from consideration.
- :bgp triple-pattern
- The BGP (or basic graph pattern) form has a single argument which must be a list of arrays where each array has three elements corresponding to a subject, predicate, object pattern. Each element in the array can be a symbol denoting a variable, a gensym denoting a blank node or a UPI or future-part denoting a constant. For example:
(:bgp #(?inproc !rdf:type !bench:Inproceedings) #(?inproc !dc:creator #:?bnode_0) #(?inproc !bench:booktitle ?booktitle) #(#:?bnode_0 !foaf:name ?name))
- :bind assignments query
- The :bind form has two arguments: a list of assignments and the rest of the query. Each assignment in the assignments list is a list of two elements whose first element is the variable being assigned and whose second element is the expression to evaluate in order to make the assignment. For example:
(:bind ((?b !"2"^^<http://www.w3.org/2001/XMLSchema#integer>) (?a !"1"^^<http://www.w3.org/2001/XMLSchema#integer>)) (:bgp #(?d ?e ?f)))
query is evaluated before the assignments are made, so the above would first evaluate the BGP and then bind ?b to 2 and ?a to 1.
- :in-line-data data query
- This is the in-query counterpart to the :in-line-data parameter described above. The syntax is the same: it consists of two lists. The first is an ordered list of the variables involved and the second is a list of lists of the values for these variables. nil is used for an undefined value. For example, a query like
select ((?total * 1.056) as ?markup) { values (?account ?order) { (undef 2) (3 4) } ?account :hasOrder ?order . ?order :hasTotal ?total . }
would produce an :in-line-data clause like this:
(sparql.parser:sparql :select :vars (?markup) :where
(:bind
((?markup
(* ?total !"1.056"^^<http://www.w3.org/2001/XMLSchema#decimal>)))
(:join
(:join nil
(:in-line-data :data
((?account ?order)
((nil !"2"^^<http://www.w3.org/2001/XMLSchema#integer>)
(!"3"^^<http://www.w3.org/2001/XMLSchema#integer>
!"4"^^<http://www.w3.org/2001/XMLSchema#integer>)))
:rhs (:bgp)))
(:bgp
#(?account !<eh://hasOrder> ?order)
#(?order !<eh://hasTotal> ?total)))))
In this case, the query portion of the :in-line-data clause is the empty :bgp. Note that the positioning of the :in-line-data clause is significant as it alters how the join is processed.
joining groups of bindings
You can combine two sets of bindings (created from other query terms) using :join, :left-join and :union. You can remove a set of bindings based on the contents of another set using :exists, :minus, and :not-exists. In the descriptions below, left-query and right-query are sub-trees with the same structure as the main :where clause and filter is a filter expression.
- :exists left-query right-query
- Evaluates left-query and right-query and keeps bindings in left-query if and only if there is a corresponding binding in right-query.
- :join left-query right-query &optional filter
- join evaluates left-query and right-query and then joins the two results using the variables that the two sides have in common. If there are no common variables, then join becomes a cross-product (which is usually a bad thing) [3]. The optional filter is evaluated for each row produced by the join and can be used as short-hand for (:filter (:join A B) filter).
- :left-join left-query right-query &optional filter
- left-join evaluates left-query and right-query and performs a left join using the optional filter to further remove results.
- :minus left-query right-query
- Evaluates left-query and right-query and removes any bindings in left-query if and only if there is a corresponding binding in right-query.
- :not-exists left-query right-query
- Evaluates left-query and right-query and keeps bindings in left-query if and only if there is not a corresponding binding in right-query.
- :union left-query right-query
- union evaluates each query piece and joins them with the current set of bindings in turn.
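The join behavior described above, including the degenerate cross-product case, can be sketched over plain data. This is a naive nested-loop illustration of the semantics and not the engine's algorithm: each set of bindings is modeled as a list of alists of (variable . value) pairs, and all function names are hypothetical.

```lisp
(defun compatible-p (left right)
  "True when LEFT and RIGHT agree on every variable they share."
  (loop for (variable . value) in left
        for other = (assoc variable right)
        always (or (null other) (equal value (cdr other)))))

(defun merge-bindings (left right)
  "Combine two compatible binding alists without repeating shared variables."
  (append left
          (remove-if (lambda (pair) (assoc (car pair) left)) right)))

(defun join-bindings (left-rows right-rows)
  "Pair every row of LEFT-ROWS with every compatible row of RIGHT-ROWS."
  (loop for left in left-rows
        append (loop for right in right-rows
                     when (compatible-p left right)
                       collect (merge-bindings left right))))

;; The two sides share ?y, so only the rows that agree on ?y survive.
(join-bindings '(((?x . 1) (?y . 2)) ((?x . 3) (?y . 4)))
               '(((?y . 2) (?z . 5))))

;; No shared variables: every pairing is compatible (a cross-product).
(join-bindings '(((?a . 1)) ((?a . 2)))
               '(((?b . 3)) ((?b . 4))))
```

With no shared variables the second call yields one row per pairing, four in total, which is why cross-products grow so quickly.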
Miscellaneous operations
- :filter query filter
- The :filter clause removes bindings based on an expression. It first evaluates query and then iterates over the bindings applying the filter to each one.
- :geo options triple-pattern &key where
- The :geo pattern corresponds to AllegroGraph's SPARQL geospatial extensions. It consists of a list of options, the triple-pattern against which the geospatial match is made and a where clause for the rest of the query. options must have two lists like ((field _ subtype) (geo-type &rest args)).
  - field - whether to query the object or the graph. This can be nil, :object or :graph. If nil, then :object will be assumed.
  - _ - the second item in the first list is ignored.
  - subtype - specifies the geospatial subtype.
  - geo-type - specifies the kind of query. It can be :haversine, :radius or :boundingbox.
  - args - these are the arguments to the geospatial query and vary depending on the geo-type.
    - :haversine - args should be in the form (point (unit radius)).
    - :radius - args should be in the form (point radius).
    - :boundingbox - args should be in the form (point-min point-max).
- :graph graph query
- The graph clause changes the current graph used during query evaluation and then evaluates query. graph can be a variable, an IRI (string, UPI or future-part), or one of the keywords :from, :from-named, or :default. These last three mean:
  - :from - iterate over the graphs in the default-graph portion of the dataset, i.e., the ones that come from the FROM clauses in the SPARQL query.
  - :from-named - iterate over the named graphs in the dataset, i.e., the ones that come from the FROM NAMED clauses in the SPARQL query.
  - :default - use the current graph.
- :select query
- :select is used for SPARQL 1.1 sub-queries. The :select clause takes all of the arguments that the main SELECT query supports except for :from (sub-queries cannot change the dataset). For example, the (rather contrived) query:
select * { { select * { ?a ?b ?c } order by ?c limit 1000 }}
is equivalent to the s-expression
(sparql.parser:sparql :select
:vars (?a ?b ?c)
:where (:join nil
(:select :vars (?a ?b ?c)
:where (:bgp #(?a ?b ?c))
:order ((:asc ?c))
:limit 1000)))
(In passing, note that the inner ORDER BY will be used to order the results of the sub-query before applying the limit, but there is no guarantee that the final result set will retain the ordering.)
- :service &key host query-string silent
- The service clause sends the query-string to host and merges the result into the rest of the query. If there is an error and silent is false, then the entire query will fail [4].
Update
A SPARQL update s-expression has the following keywords: steps, default-base and query-options [5].
As in the other SPARQL query forms, default-base is used by the SPARQL uri function when constructing URIs and query-options is an association list of values that control various switches during query execution. These include the query engine, whether or not to use Chunk-at-a-Time processing, and more. They are described in more detail in the reference guide.
steps is a list of actions to take. Each action is a list whose head names the action and whose tail specifies any parameters. The actions are broken down into graph manipulation, data manipulation and loading from external data sources.
All of the non data-manipulation update steps use keyword arguments in their specification. They all have a silent argument which controls whether or not errors are signaled back to the caller.
Graph manipulation
The graph manipulation commands allow you to alter sets of triples contained in particular named graphs.
- :add &key first second silent
- This is the add command.
ADD ( SILENT )? ( ( GRAPH )? IRIref_from | DEFAULT) TO ( ( GRAPH )? IRIref_to | DEFAULT)
first and second correspond to from and to and can be an IRI or the keyword :default. Unless silent is true, :add will fail if the first graph is not present in the triple-store.
- :clear &key graph silent
- This is the clear command.
CLEAR ( SILENT )? (GRAPH IRIref | DEFAULT | NAMED | ALL )
graph can be an IRI (UPI or future-part) or one of the keywords :default, :named or :all. Obviously, care should be taken before executing this command! The :clear command cannot fail (i.e., it is not an error to clear a graph that does not exist) so the silent parameter is ignored.
- :copy &key first second silent
- This is the copy command.
COPY ( SILENT )? ( ( GRAPH )? IRIref_from | DEFAULT) TO ( ( GRAPH )? IRIref_to | DEFAULT )
first and second correspond to from and to and can be an IRI or the keyword :default. If the source graph does not exist, then the :copy command will fail unless silent is true.
- :create &key graph silent
- This is the create command.
CREATE ( SILENT )? GRAPH IRIref
graph must be an IRI (string, future-part or UPI). AllegroGraph does not record the existence of empty graphs so the create operation will never change the contents of a triple-store. If the graph already exists in the store, however, then it will fail (unless silent is specified).
- :drop &key graph silent
- This is the drop command.
DROP ( SILENT )? (GRAPH IRIref | DEFAULT | NAMED | ALL )
graph can be an IRI (string, future-part or UPI) or one of the keywords :default, :named, or :all. Because AllegroGraph does not keep track of empty graphs, the :drop command is equivalent to the :clear command.
- :move &key first second silent
- This is the move command.
MOVE (SILENT)? ( ( GRAPH )? IRIref_from | DEFAULT) TO ( ( GRAPH )? IRIref_to | DEFAULT)
first and second correspond to from and to and can be an IRI or the keyword :default. As in :copy, if the source graph does not exist, then the command will fail unless silent is true.
Data manipulation
The data manipulation commands let you add and remove triples from the triple-store. Each uses quad-templates to specify what to add and remove. A quad-template is like a bgp (i.e., a list of arrays) extended with additional syntax to allow you to specify the graph(s) that should be altered. I.e., it looks like:
(#(s1 p1 o1)
#(s2 p2 o2)
(:graph g1 (#(s3 p3 o3)))
#(s4 p4 o4))
Note that the logic for determining the graph(s) to alter is complex and depends on graphs specified in the templates, the graphs specified in the USING or WITH clauses and various parameters passed to the run-sparql function. Please contact support if you need additional information.
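To show only the syntactic nesting of a quad-template, the sketch below flattens one into (graph . triple) pairs, using nil for triples that sit outside any :graph form. It deliberately does not model the graph-selection logic mentioned above (USING, WITH or run-sparql parameters), and flatten-quad-template is a hypothetical name rather than an AllegroGraph function.

```lisp
(defun flatten-quad-template (template &optional graph)
  "Return a list of (graph . triple) pairs for TEMPLATE.
GRAPH is the graph named by the enclosing :graph form, or nil at top level."
  (loop for item in template
        append (if (vectorp item)
                   ;; A bare array is a triple in the current graph.
                   (list (cons graph item))
                   ;; Otherwise the item must look like (:graph g (triples...)).
                   (destructuring-bind (tag inner-graph triples) item
                     (unless (eq tag :graph)
                       (error "Unexpected quad-template item: ~s" item))
                     (flatten-quad-template triples inner-graph)))))

;; The third triple is paired with g1; the others are paired with nil.
(flatten-quad-template '(#(s1 p1 o1)
                         #(s2 p2 o2)
                         (:graph g1 (#(s3 p3 o3)))
                         #(s4 p4 o4)))
```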
- :delete quad-template
- This corresponds to the DELETE DATA command. Note that quad-template needs to be in an additional list. The template may contain neither variables nor blank nodes [6].
- :insert quad-template
- This corresponds to the INSERT DATA command. Note that quad-template needs to be in an additional list. This template is not allowed to include variables [7].
- :modify &key graph delete insert where using
- This corresponds to the DELETE/INSERT command. The delete and insert parameters must be quad-templates (or nil). The where clause uses the same grammar as it does in the query language above. using is like from (above), i.e., it is a dataset-list: a list of pairs where each pair has :from or :from-named as its car and the name of the graph to include in the dataset (either a UPI or a future-part) as its cdr. The graph parameter represents the graph specified in the WITH clause of the SPARQL grammar.
Loading from external sources
- :load &key from graph silent
- The :load command tries to read data from the URI from (which can be a file://, http://, or https:// URI) and load it into the graph graph. graph can be nil or a URI. If it is nil, the data is loaded into the default graph of the destination triple-store. The format of the data at the other end of the from URI is determined via its extension (treating it as if it names a file). If the type cannot be guessed, then RDF/XML is assumed. If there is a problem during loading, then an error will be signaled unless silent is true.
Filter Summary
Filter expressions can be combined with the boolean operators and and or, and negated with not. We support all of the functions in the SPARQL standard. We will augment this document with a complete listing in a future version. In the meantime, please contact support if you need additional information.
Footnotes
1. The query can be further simplified; this occurs during the query planning process.
2. You can use the third return value of run-sparql to obtain a list of the variables in the order they occur in the result set.
3. Cross products almost always indicate that a query is asking two different and unrelated things. Because there is no join, the answer will contain a row for every pair of elements on the right-hand side and the left-hand side.
4. Note that the service keyword requires the query to be specified as a string. At some point, AllegroGraph may allow s-expressions to be used here too.
5. Note that this document describes the s-expression syntax for the SPARQL 1.1 engine; the SPARQL 1.0 engine supports an older version of the UPDATE language and is not compatible with the newer syntax.
6. At this time the check for variables and blank nodes happens in the parser and the behavior is unspecified if templates using them are passed in as parameters to the delete data command.
7. At this time the check for this happens in the parser and the behavior is unspecified if templates with variables are used.