Introduction
This tutorial assumes that you already know what SPARQL Magic Properties are and have seen them used in other places in AllegroGraph.
Recall that a Magic Property associates a predicate URI with Lisp code that executes during a SPARQL query.
In SPARQL syntax, a Magic Property looks somewhat like a function call with multiple values on the subject side and object side. For example, the freetext index match Magic Property does a freetext search for a phrase (with the search optionally restricted to a particular index) and binds the ?subject variable to the subject of each matching triple:
?subject fti:match ( 'phrase' 'indexName' ) .
To define a Magic Property, we need to specify the arguments it takes on the subject side and the object side and the Lisp code that it should execute. As a toy example, suppose we wanted a Magic Property that would iterate over the letters in a string. The SPARQL query would look like:
select ?letter {
?letter <http://example.org/lettersOf> 'abc' .
}
with the expected output being:
------------
| letter |
============
| a |
| b |
| c |
------------
This Magic Property has one argument on the subject side and one argument on the object side. Here is how to define it:
(ag.sbqe:defmagic-property !<http://example.org/lettersOf>
  :subject-arguments (letter)
  :object-arguments (word)
  :body (loop for element across word collect element))
Note that the subject-argument letter is still used even though body does not reference it. AllegroGraph uses letter both to determine the output of this magic property and to potentially filter the results (see Body output handling below for details on automatic filtering).
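To see what the body hands back to AllegroGraph, the same form can be evaluated as ordinary Common Lisp outside of any query. This is only a sketch: the function name letters-of-body is hypothetical, and it assumes that word reaches the body as a Lisp string.

```lisp
;; Hypothetical stand-alone version of the :body form above.
;; Assumes WORD is a Lisp string, as it would be for a plain literal.
(defun letters-of-body (word)
  ;; LOOP ... ACROSS walks the characters of the string and
  ;; COLLECT gathers them into a list, one entry per result row.
  (loop for element across word collect element))

(letters-of-body "abc") ; a list of the characters #\a, #\b and #\c
```

Each character in the returned list becomes one candidate binding for the subject argument.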
Details on Arguments
Both subject-arguments and object-arguments have the form
(argument* [&rest variable-name] | [&key argument* [&allow-other-keys]])
Each argument can be a symbol naming the argument or a list consisting of
(argument-name [optional] [input] [:type <TYPE>])
where optional indicates that the argument can be left off and input indicates that the value must be bound when the Magic Property executes (i.e., it is only an input). By default, all object-arguments will be treated as input (see below for how to change this). Either &rest or &key may be specified but not both.
Here is an example of a Magic Property with an optional argument:
(ag.sbqe:defmagic-property !<http://example.org/iterate>
  :subject-arguments ((value :type numeric))
  :object-arguments (start end (step optional))
  :body (loop for index from start to end by (or step 1) collect index))
We've also indicated that the type of the subject argument is :numeric. We'll discuss datatypes in more detail below. This property can be used like this:
select ?elt { ?elt <http://example.org/iterate> (1 4) . }
Which will output:
---------
| elt |
=========
| 1 |
| 2 |
| 3 |
| 4 |
---------
It can also be used like this:
select ?elt { ?elt <http://example.org/iterate> (1 4 2) . }
Which will output:
---------
| elt |
=========
| 1 |
| 3 |
---------
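The two result tables can be reproduced by running the body on its own. The following is a minimal sketch in plain Common Lisp; iterate-body is a hypothetical name, and it assumes that step is nil when the optional argument is left off, which is what the (or step 1) idiom in the definition suggests.

```lisp
;; Hypothetical stand-alone version of the iterate :body.
;; STEP defaults to NIL here, mirroring an omitted optional argument
;; (an assumption about how the macro passes missing arguments).
(defun iterate-body (start end &optional step)
  (loop for index from start to end by (or step 1) collect index))

(iterate-body 1 4)   ; => (1 2 3 4)
(iterate-body 1 4 2) ; => (1 3)
```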
Note that filtering happens automatically too. Look at the following query:
# We use BIND so that the variable `?elt` will still appear
# in the output
select ?elt {
bind(1 as ?elt)
?elt <http://example.org/iterate> (1 4 2) .
}
Which will output:
---------
| elt |
=========
| 1 |
---------
The property still generated 3 as a binding for ?elt, but because ?elt was already bound to 1, this binding was dropped.
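The effect of this filtering can be imitated on a plain list. The helper below is purely illustrative: filter-generated-values is a hypothetical function, not part of AllegroGraph, and the real engine works on cursors and UPIs, not on Lisp lists.

```lisp
;; Illustration only: mimics the automatic filtering on a plain list.
;; FILTER-GENERATED-VALUES is a hypothetical helper, not an AllegroGraph API.
(defun filter-generated-values (generated &optional (bound nil bound-p))
  "Return GENERATED unchanged when no value is already bound;
otherwise keep only the entries that are EQUAL to BOUND."
  (if bound-p
      (remove-if-not (lambda (value) (equal value bound)) generated)
      generated))

(filter-generated-values '(1 3))   ; ?elt unbound     => (1 3)
(filter-generated-values '(1 3) 1) ; ?elt bound to 1  => (1)
```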
Magic Property Machinery
The SPARQL query engine uses UPIs internally and bases most of its processing on the AllegroGraph cursor machinery. The defmagic-property macro hides most of these details in two ways:
1. Depending on the datatype of the argument, it converts UPIs into Lisp values. For example, an encoded long or double-float UPI will be converted into a number and a date-time UPI will be converted into a cons of a Lisp universal time and the timezone information (see Footnotes).
2. The output of the body is converted back into a cursor that the query engine can use.
Datatypes
Each value may be coerced from a UPI to a Lisp datatype depending on the datatype of the variable. If no datatype is specified, then the default behavior is based on the UPI's type-code:
- :numeric - output is a Lisp number,
- :boolean - output is Lisp t or nil,
- :resource, :blank, :default-graph, :gensym - output remains a UPI,
- :literal - output is a string (using upi->value),
- Other - output is via upi->value.
Note that the behavior is currently unspecified if the datatype and the actual value clash, e.g., if you declare that an argument has :type :numeric and a non-numeric value is encountered during query processing.
Body output handling
body should return something that AllegroGraph can use to create bindings for the subject-arguments. The result can be one of these five things:
- a single value,
- a list of values,
- a list of lists (of values),
- a generator function (advanced usage and described only briefly in this tutorial),
- a cursor (advanced usage and described only briefly in this tutorial).
In the first four cases, the output will be transformed into a cursor that can be used by the query engine. Lisp values will be converted back into UPIs as follows:
- nil : the boolean false UPI,
- t : the boolean true UPI,
- fixnum : integer UPI,
- other number : double-float UPI,
- string : literal UPI,
- character : literal UPI,
- UPI : will stay a UPI,
- :undef : the nil UPI, which is treated by the SPARQL engine as unbound.
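The conversion rules can be restated as a small classifier in plain Common Lisp. This is a sketch for reading the rules, not the conversion code AllegroGraph actually runs: output-upi-kind and the keywords it returns are hypothetical names, and since a stand-alone Lisp has no UPI type, any value not covered by an earlier rule is simply assumed to be a UPI already.

```lisp
;; Hypothetical restatement of the rules above; returns a keyword
;; naming the kind of UPI a body result would be converted into.
(defun output-upi-kind (value)
  (cond ((eq value :undef) :unbound)          ; the nil UPI, i.e. unbound
        ((null value) :boolean-false)
        ((eq value t) :boolean-true)
        ((typep value 'fixnum) :integer)
        ((numberp value) :double-float)       ; any non-fixnum number
        ((stringp value) :literal)
        ((characterp value) :literal)
        (t :upi)))                            ; assumed to be a UPI already

(output-upi-kind 42)   ; => :INTEGER
(output-upi-kind 1.5)  ; => :DOUBLE-FLOAT
(output-upi-kind #\a)  ; => :LITERAL
```

The order of the clauses matters: :undef, nil and t are all symbols and must be tested before the general cases.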
If you use a generator function, each call to it should return a row of results. This row will be treated as above where the values in the row will be coerced into UPIs and placed into the appropriate subject arguments.
Note that any supplied subject arguments will act as filters on the output -- you do not need to worry about this yourself (see the <http://example.org/allPairs> example below).
The output of body should be able to supply bindings for every subject-argument. This means that when there are multiple subject-arguments, body should output either a cursor or a list of lists with each sublist having the same length as the number of subject-arguments. The entries in the sublists will be bound to each subject argument in order. E.g., if the property has subject arguments entry, score, and weight, then the output of body should look like:
;; entry           score  weight
((!<http://test1>  1.1    34.7)
 (!<http://test2>  2.3    4.5)
 ...
 )
and these lists will create bindings for the variables in their column.
If these sublists are too short, then some subject-arguments will be unbound. If the sublists are too long, then items beyond the number of subject variables will be ignored.
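The positional pairing, including the too-short and too-long cases, can be sketched in plain Common Lisp. bind-row is a hypothetical helper written for this illustration, and the :unbound marker is an assumption standing in for an unbound SPARQL variable.

```lisp
;; Hypothetical helper: pairs subject-argument names with row entries
;; by position, as described above.
(defun bind-row (subject-arguments row)
  "Return an alist of (NAME . VALUE). Names without a matching entry
in ROW get :UNBOUND; entries beyond the last name are ignored."
  (loop for name in subject-arguments
        for index from 0
        collect (cons name
                      (if (< index (length row))
                          (nth index row)
                          :unbound))))

;; Row is too short: WEIGHT stays unbound.
(bind-row '(entry score weight) '(10 20))
;; Row is too long: the fourth entry is ignored.
(bind-row '(entry score weight) '(10 20 30 40))
```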
More advanced customization
These parameters can generally be left off.
pattern-size-estimator and set-size-estimator
You may supply pattern-size-estimator or set-size-estimator to defmagic-property. These can either be numbers or symbols naming functions that take
db implementor method pattern
as arguments. The pattern-size-estimator should return an estimate of how much work the magic property will require to evaluate pattern in terms of the number of rows a triple cursor would need to examine. The set-size-estimator should return an estimate of how large the resulting set will be. These values are used during query planning when reordering the clauses of a SPARQL BGP. It is often hard to provide good estimates of the work involved.
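As a rough sketch, an estimator function only needs to accept the four arguments listed above and return a number. The function name and the figure of 1000 below are made up for illustration, and nothing here inspects the arguments because their exact structure is not described in this tutorial.

```lisp
;; Hypothetical estimator with the argument list given above.
;; It ignores its inputs and returns a fixed guess; a real estimator
;; would presumably inspect PATTERN to produce a better number.
(defun constant-pattern-size-estimator (db implementor method pattern)
  (declare (ignore db implementor method pattern))
  1000)
```

Since a plain number is also accepted, a function like this is only worth writing once the estimate actually depends on the pattern.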
validation-function
You may supply a validation-function to defmagic-property. If supplied, it must be a symbol naming a function with signature:
magic-property arguments
This function will be called at plan time and may signal an error if the property cannot function with the supplied arguments. For example, AllegroGraph's fti:match Magic Property will signal an error at plan time if the triple-store has no i index since the lack of such an index will make the query perform exceedingly poorly.
allow-object-arguments-to-be-unbound?
We mentioned above that all object-arguments will be treated as input, which means that an error will be signaled if they are not bound at query time. If the allow-object-arguments-to-be-unbound? parameter is t then this rule will no longer be enforced. This can be useful if the code in the body can handle unbound values. The <http://franz.com/ns/allegrograph/4.0/tripleId> example below will illustrate this.
More examples
All Pairs
Here is a Magic Property that iterates over all the pairs of the letters of two strings. E.g., allPairs of 'ab' and 'xyz' will be ('a' 'x'), ('a' 'y'), ('a' 'z'), ('b' 'x'), ('b' 'y'), and ('b' 'z'). It has two subject-arguments and two object arguments:
(ag.sbqe:defmagic-property !<http://example.org/allPairs>
  :subject-arguments (a b)
  :object-arguments (word1 word2)
  :body (loop for letter1 across word1 append
          (loop for letter2 across word2 collect
            (list letter1 letter2))))
This SPARQL query provides the same answer as we wrote out above:
select ?a ?b { (?a ?b) <http://example.org/allPairs> ('ab' 'xyz') . }
-------------
| a | b |
=============
| a | x |
| a | y |
| a | z |
| b | x |
| b | y |
| b | z |
-------------
Note what happens if we fill in one of the subject arguments by binding ?b to 'y':
select ?a ?b {
bind('y' as ?b)
(?a ?b) <http://example.org/allPairs> ('ab' 'xyz') .
}
-------------
| a | b |
=============
| a | y |
| b | y |
-------------
The filtering happened automatically. We could also do
# No bindings for `?b` anymore
select ?a {
(?a ?b) <http://example.org/allPairs> ('ab' 'xyz') .
}
to get
-------
| a |
=======
| a |
| b |
-------
Rest and keyword arguments
Here is a magic property that uses keyword arguments:
(ag.sbqe:defmagic-property !<ex://#divide>
  :subject-arguments (answer)
  :object-arguments (&key (numerator :type numeric) (denominator :type :numeric))
  :body (/ numerator denominator))
Keyword arguments are URIs that start with http://franz.com/ns/keyword#. Here are two ways of asking the same question using the <ex://#divide> magic property:
prefix : <http://franz.com/ns/keyword#>
select * {
?a <ex://#divide> (:numerator 4 :denominator 2) .
}
or
prefix : <http://franz.com/ns/keyword#>
select * {
?a <ex://#divide> (:denominator 2 :numerator 4 ) .
}
AllegroGraph ensures that numerator and denominator are bound correctly based on the keyword names.
Magic properties also support &rest parameters. For example:
(ag.sbqe:defmagic-property !<ex://#multiply>
  :subject-arguments (answer)
  :object-arguments (&rest values)
  :body (apply #'* values))
This can be queried like so:
select * {
?answer <ex://#multiply> (1 2 3 4 5)
}
Note that the values from the query will be coerced as if they had an unspecified datatype. See the discussion on datatypes above for details.
Connecting an existing function to a Magic Property
Suppose we have this function:
(defun get-github-project-names (user)
  ;; returns a list of lists of (<repository name> <description>)
  ;; for github user `user`.
  (loop for repo across
        (db.agraph.parser:read-json-into-lists
         (net.aserve.client:do-http-request
          (string+ "https://api.github.com/users/" user "/repos")
          :user-agent "ACL"))
        collect
        (list (cdr (assoc :name repo)) (cdr (assoc :description repo)))))
Here is a Magic Property that will call the function at query time:
(ag.sbqe:defmagic-property !<http://example.org/githubRepos>
  :subject-arguments (project (description optional))
  :object-arguments ((user :type :string))
  :body (get-github-project-names user))
For example:
select * { ?repo <http://example.org/githubRepos> 'gwkkwg' }
Will return
---------------------------
| repo |
===========================
| asdf-install |
| asdf-system-connections |
| cl-containers |
| cl-graph |
| cl-html-parse |
| cl-markdown |
| cl-mathstats |
| clnuplot |
| dynamic-classes |
| lift |
...
| trivial-shell |
| trivial-timeout |
---------------------------
Summary
defmagic-property makes it easy to connect Lisp code into the SPARQL query engine. Your feedback is welcome.
Footnotes
- See the discussion of time and timezones in upi->value and value->upi for more details.