Introduction

This tutorial assumes that you already know what SPARQL Magic Properties are and have seen them used elsewhere in AllegroGraph. Magic Properties are also called Computed Properties by the W3C.

Recall that a Magic Property associates a predicate URI with Lisp code that executes during a SPARQL query.

SPARQL syntax means that a Magic Property looks somewhat like a function call with multiple values on the subject side and the object side. For example, the freetext index match Magic Property performs a freetext search for a phrase (optionally restricted to a particular index) and binds the ?subject variable to the subject of each matching triple:

 ?subject fti:match ( 'phrase' 'indexName' ) . 

To define a Magic Property, we need to specify the arguments it takes on the subject side and the object side and the Lisp code that it should execute. As a toy example, suppose we wanted a Magic Property that would iterate over the letters in a string. The SPARQL query would look like:

 select ?letter {  
   ?letter <http://example.org/lettersOf> 'abc' .  
 } 

with the expected output being:

------------  
| letter   |  
============  
| a        |  
| b        |  
| c        |  
------------ 

This Magic Property has one argument on the subject side and one argument on the object side. Here is how to define it:

(ag.sbqe:defmagic-property !<http://example.org/lettersOf>  
  :subject-arguments (letter)  
  :object-arguments (word)  
  :body (loop for element across word collect element)) 

Note that the subject-argument letter is still used even though body does not reference it. AllegroGraph uses letter both to determine the output of this magic property and to potentially filter the results (see Body Output Handling below for details on automatic filtering).

Details on Arguments

Both subject-arguments and object-arguments have the form

(argument* [&rest variable-name] | [&key argument* [&allow-other-keys]]) 

Each argument can be a symbol naming the argument or a list consisting of

(argument-name [optional] [input] [:type <TYPE>]) 

Here, optional indicates that the argument can be left off and input indicates that the value must be bound when the Magic Property executes (i.e., it is only an input). By default, all object-arguments will be treated as input (see below for how to change this). Either &rest or &key may be specified, but not both.

Here is an example of a Magic Property with an optional argument:

(ag.sbqe:defmagic-property !<http://example.org/iterate>  
  :subject-arguments ((value :type :numeric))  
  :object-arguments (start end (step optional))  
  :body (loop for index from start to end by (or step 1) collect index)) 

We've also indicated that the type of the subject argument is :numeric. We'll discuss datatypes in more detail below. This property can be used like this:

select ?elt { ?elt <http://example.org/iterate> (1 4) . } 

Which will output:

---------  
| elt   |  
=========  
| 1     |  
| 2     |  
| 3     |  
| 4     |  
--------- 

It can also be used like this:

select ?elt { ?elt <http://example.org/iterate> (1 4 2) . } 

Which will output:

---------  
| elt   |  
=========  
| 1     |  
| 3     |  
--------- 
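The two results above follow directly from the loop in the body. As a minimal sketch, here is the same loop as an ordinary Lisp function that can be tried without a triple-store. The name iterate-values is hypothetical, and the sketch assumes that an omitted optional argument reaches the body as nil, which is what the (or step 1) form in the definition suggests.

```lisp
;; Hypothetical helper: the body of <http://example.org/iterate>
;; written as a plain function for experimentation.
(defun iterate-values (start end &optional step)
  "Return the integers from START to END inclusive, counting by STEP (default 1)."
  ;; When STEP is omitted it is NIL, so (or step 1) falls back to 1.
  (loop for index from start to end by (or step 1)
        collect index))

(iterate-values 1 4)    ; => (1 2 3 4)
(iterate-values 1 4 2)  ; => (1 3)
```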

Note that filtering happens automatically too. Look at the following query:

# We use BIND so that the variable `?elt` will still appear  
# in the output  
select ?elt {  
  bind(1 as ?elt)  
  ?elt <http://example.org/iterate> (1 4 2) .  
} 

Which will output:

---------  
| elt   |  
=========  
| 1     |  
--------- 

The property still generated 3 as a binding for ?elt, but because ?elt was already bound to 1, this binding was dropped.

Magic Property Machinery

The SPARQL query engine uses UPIs internally and bases most of its processing on the AllegroGraph cursor machinery. The defmagic-property macro hides most of these details in two ways:

1. Depending on the datatype of the argument, it converts UPIs into Lisp values. For example, an encoded long or double-float UPI will be converted into a number and a date-time UPI will be converted into a cons of a Lisp universal time and the timezone information (see footnote 1).

2. The output of the body is converted back into a cursor that the query engine can use.

Datatypes

Each value may be coerced from a UPI to a Lisp datatype depending on the datatype of the variable. If no datatype is specified, then the default behavior is based on the UPI's type-code:

Note that the behavior is currently unspecified if the datatype and the actual value clash, e.g., if you claim that an argument has :type :numeric and a non-numeric value is encountered during query processing.

Body output handling

body should return something that AllegroGraph can use to create bindings for the subject-arguments. The result can be one of these four things:

In the first four cases, the output will be transformed into a cursor that can be used by the query engine. Every Lisp value is converted into an equivalent UPI as follows:

If you use a generator function, each call to it should return a row of results. This row will be treated as above where the values in the row will be coerced into UPIs and placed into the appropriate subject arguments.

Note that any supplied subject arguments will act as filters on the output -- you do not need to worry about this yourself (see the <http://example.org/allPairs> example below).

The output of body should be able to supply bindings for every subject-argument. This means that when there are multiple subject-arguments, body should output either a cursor or a list of lists with each sublist having the same length as the number of subject-arguments. The entries in the sublists will be bound to each subject argument in order. E.g., if the property has subject arguments: entry, score, weight, then the output of body should look like:

  entry         score    weight  
((!<http://test1> 1.1      34.7)  
 (!<http://test2> 2.3       4.5)  
 ...  
) 

and these lists will create bindings for the variables in their column.

If these sublists are too short, then some subject-arguments will be unbound. If the sublists are too long, then items beyond the number of subject variables will be ignored.
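The positional matching described above can be modeled with a few lines of plain Lisp. This is only an illustrative sketch, not AllegroGraph's implementation: the function bind-row and the :unbound marker are hypothetical, and the real machinery works with UPIs and cursors rather than association lists.

```lisp
;; Hypothetical model of how one row returned by BODY lines up with
;; the subject-arguments. Not AllegroGraph's actual implementation.
(defun bind-row (subject-arguments row)
  "Pair each name in SUBJECT-ARGUMENTS with the value at the same position in ROW.
Names without a value are paired with :UNBOUND; surplus values are dropped."
  (loop for name in subject-arguments
        ;; REMAINING walks down ROW in step with the argument names.
        for remaining = row then (rest remaining)
        collect (cons name (if remaining (first remaining) :unbound))))

;; A row of exactly the right length binds every argument.
(bind-row '(entry score weight) '("http://test1" 1.1 34.7))
;; A short row leaves the trailing argument unbound.
(bind-row '(entry score weight) '("http://test1" 1.1))
;; A long row has its extra values ignored.
(bind-row '(entry score) '("http://test1" 1.1 34.7))
```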

More advanced customization

These parameters can generally be left off.

pattern-size-estimator and set-size-estimator

You may supply pattern-size-estimator or set-size-estimator to defmagic-property. These can either be numbers or symbols naming functions that take

db implementor method pattern 

as arguments. The pattern-size-estimator should return an estimate of how much work the magic property will require to evaluate pattern in terms of the number of rows a triple cursor would need to examine. The set-size-estimator should return an estimate of how large the resulting set will be. These values are used during query planning when reordering the clauses of a SPARQL BGP. It is often hard to provide good estimates of the work involved.
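As a rough illustration of the function form, an estimator can be as simple as the sketch below. The function name letters-of-pattern-size and the constant 100 are made up for this example, and the :pattern-size-estimator keyword spelling shown in the comment is an assumption modeled on the other defmagic-property options.

```lisp
;; Hypothetical estimator taking the four arguments described above.
;; It ignores them all and returns a fixed guess at the amount of work.
(defun letters-of-pattern-size (db implementor method pattern)
  (declare (ignore db implementor method pattern))
  100)

;; Assumed usage inside a definition (keyword spelling not confirmed here):
;;   :pattern-size-estimator letters-of-pattern-size
;; A plain number could be supplied instead of a function name.
```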

validation-function

You may supply a validation-function to defmagic-property. If supplied, it must be a symbol naming a function with the signature:

magic-property arguments 

This function will be called at plan time and may signal an error if the property cannot function with the supplied arguments. For example, AllegroGraph's fti:match Magic Property will signal an error at plan time if the triple-store has no i index since the lack of such an index will make the query perform exceedingly poorly.

allow-object-arguments-to-be-unbound?

We mentioned above that all object-arguments will be treated as input which means that an error will be signaled if they are not bound at query time. If the allow-object-arguments-to-be-unbound? parameter is t then this rule will no longer be enforced. This can be useful if the code in the body can handle unbound values. The <http://franz.com/ns/allegrograph/4.0/tripleId> example below will illustrate this.

More examples

All Pairs

Here is a Magic Property that iterates over all the pairs of the letters of two strings. E.g., allPairs of 'ab' and 'xyz' will be ('a' 'x'), ('a' 'y'), ('a' 'z'), ('b' 'x'), ('b' 'y'), and ('b' 'z'). It has two subject-arguments and two object arguments:

(ag.sbqe:defmagic-property !<http://example.org/allPairs>  
  :subject-arguments (a b)  
  :object-arguments (word1 word2)  
  :body (loop for letter1 across word1 append  
          (loop for letter2 across word2 collect  
            (list letter1 letter2)))) 

This SPARQL query provides the same answer as we wrote out above:

select ?a ?b { (?a ?b) <http://example.org/allPairs> ('ab' 'xyz') . }  
 
-------------  
| a   | b   |  
=============  
| a   | x   |  
| a   | y   |  
| a   | z   |  
| b   | x   |  
| b   | y   |  
| b   | z   |  
------------- 

Note what happens if we fill in one of the subject arguments by replacing ?b with 'y':

select ?a ?b {  
  bind('y' as ?b)  
  (?a ?b) <http://example.org/allPairs> ('ab' 'xyz') .  
}  
-------------  
| a   | b   |  
=============  
| a   | y   |  
| b   | y   |  
------------- 

The filtering happened automatically. We could also do

# No bindings for `?b` anymore  
select ?a {  
  (?a ?b) <http://example.org/allPairs> ('ab' 'xyz') .  
} 

to get

-------  
| a   |  
=======  
| a   |  
| b   |  
------- 
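The filtering in the queries above can be pictured with ordinary Lisp lists. The following is a minimal sketch under stated assumptions, not the engine's real code: all-pairs repeats the body of the property as a plain function, and filter-rows is a hypothetical stand-in for the automatic filtering, using nil to mean that a subject argument is unbound.

```lisp
;; Hypothetical model of automatic filtering. AllegroGraph's real
;; machinery works on UPIs and cursors rather than Lisp lists.
(defun all-pairs (word1 word2)
  "Return every (letter1 letter2) pair, as the allPairs body does."
  (loop for letter1 across word1
        append (loop for letter2 across word2
                     collect (list letter1 letter2))))

(defun filter-rows (rows bound-values)
  "Keep the rows that agree with BOUND-VALUES.
BOUND-VALUES has one entry per subject argument; NIL means unbound."
  (remove-if-not
   (lambda (row)
     (every (lambda (value bound)
              (or (null bound) (eql value bound)))
            row bound-values))
   rows))

;; ?a is unbound and ?b is bound to the letter y.
(filter-rows (all-pairs "ab" "xyz") (list nil #\y))
;; => ((#\a #\y) (#\b #\y))
```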

Rest and keyword arguments

Here is a magic property that uses keyword arguments:

(ag.sbqe:defmagic-property !<ex://#divide>  
    :subject-arguments (answer)  
    :object-arguments (&key (numerator :type :numeric) (denominator :type :numeric))  
    :body (/ numerator denominator)) 

Keyword arguments are URIs that start with http://franz.com/ns/keyword#. Here are two ways of asking the same question using the <ex://#divide> magic property:

prefix : <http://franz.com/ns/keyword#>  
select * {  
  ?a <ex://#divide> (:numerator 4 :denominator 2) .  
} 

or

prefix : <http://franz.com/ns/keyword#>  
select * {  
  ?a <ex://#divide> (:denominator 2 :numerator 4 ) .  
} 

AllegroGraph ensures that numerator and denominator are bound correctly based on the keyword names.

Magic properties also support &rest parameters. For example:

(ag.sbqe:defmagic-property !<ex://#multiply>  
    :subject-arguments (answer)  
    :object-arguments (&rest values)  
    :body (apply #'* values)) 

This can be queried like so:

select * {  
  ?answer <ex://#multiply> (1 2 3 4 5)  
} 

Note that the values from the query will be coerced as if they had an unspecified datatype. See the discussion on datatypes above for details.
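The &key and &rest markers appear to behave much as they do in ordinary Lisp lambda lists. As a sketch of that analogy, the two bodies can be written as plain functions. The names divide and multiply are hypothetical, and the sketch leaves out the datatype coercion that the query engine performs.

```lisp
;; Plain Lisp analogues of the two property bodies.
;; DIVIDE and MULTIPLY are hypothetical names used only here.
(defun divide (&key numerator denominator)
  "Keyword arguments are matched by name, so their order does not matter."
  (/ numerator denominator))

(defun multiply (&rest values)
  "All supplied arguments are gathered into the list VALUES."
  (apply #'* values))

(divide :numerator 4 :denominator 2)   ; => 2
(divide :denominator 2 :numerator 4)   ; => 2
(multiply 1 2 3 4 5)                   ; => 120
```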

Connecting an existing function to a Magic Property

Suppose we have this function:

(defun get-github-project-names (user)  
  ;; returns a list of lists of (<repository name> <description>)  
  ;; for github user `user`.  
  (loop for repo across  
     (db.agraph.parser:read-json-into-lists  
       (net.aserve.client:do-http-request  
        (string+ "https://api.github.com/users/" user "/repos")  
        :user-agent "ACL"))  
    collect  
    (list (cdr (assoc :name repo)) (cdr (assoc :description repo))))) 

Here is a Magic Property that will call the function at query time:

(ag.sbqe:defmagic-property !<http://example.org/githubRepos>  
    :subject-arguments (project (description optional))  
    :object-arguments ((user :type :string))  
    :body (get-github-project-names user)) 

For example:

select * { ?repo <http://example.org/githubRepos> 'gwkkwg' } 

Will return

---------------------------  
| repo                    |  
===========================  
| asdf-install            |  
| asdf-system-connections |  
| cl-containers           |  
| cl-graph                |  
| cl-html-parse           |  
| cl-markdown             |  
| cl-mathstats            |  
| clnuplot                |  
| dynamic-classes         |  
| lift                    |  
...  
| trivial-shell           |  
| trivial-timeout         |  
--------------------------- 

Summary

defmagic-property makes it easy to connect Lisp code to the SPARQL query engine. Your feedback is welcome.


Footnotes

  1. See the discussion of time and timezones in upi->value and value->upi for more details.