Example 10: Graphs in SPARQL

In Example 6: Importing triples and Example 7: Querying multiple contexts, we saw how to import data into a non-default context and run queries against that data. In this example we’ll explore the facilities that SPARQL and the AllegroGraph Python client provide for handling multiple contexts.

We’ll start by creating two URIs that will represent named contexts.

conn = connect()
context1 = conn.createURI("ex://context1")
context2 = conn.createURI("ex://context2")

The first context will be filled using the addData() method:

conn.addData("""
    @prefix : <ex://> .
    :alice a :person ;
           :name "Alice" .""",
    context=context1)

The second context will be filled using addTriple(). Notice how we use a constant defined in the RDF class to obtain the URI of the type predicate:

from franz.openrdf.vocabulary.rdf import RDF

bob = conn.createURI('ex://bob')
bob_name = conn.createLiteral('Bob')
name = conn.createURI('ex://name')
person = conn.createURI('ex://person')
conn.addTriple(bob, RDF.TYPE, person,
               contexts=[context2])
conn.addTriple(bob, name, bob_name,
               contexts=[context2])

Finally we’ll add two triples to the default context using addStatement():

from franz.openrdf.model import Statement

ted = conn.createURI('ex://ted')
ted_name = conn.createLiteral('Ted')
stmt1 = Statement(ted, name, ted_name)
stmt2 = Statement(ted, RDF.TYPE, person)
conn.addStatement(stmt1)
conn.addStatement(stmt2)

Warning

The Statement object contains a context field. This field is ignored by addStatement(). If you wish to add a statement object to a specific context, use the contexts parameter.

As we’ve seen already in Example 7: Querying multiple contexts, a call to getStatements() will return triples from all contexts:

with conn.getStatements() as result:
    print('getStatements(): {0}'.format(len(result)))
print('size(): {0}'.format(conn.size()))

size() will also process all contexts by default.

getStatements(): 6
size(): 6

Both getStatements() and size() accept a contexts parameter that can be used to limit processing to a specified list of graphs:

contexts = [context1, context2]
with conn.getStatements(contexts=contexts) as result:
    print('getStatements(): {0}'.format(len(result)))
print('size(): {0}'.format(conn.size(contexts=contexts)))

As expected, triples from the default context are not processed:

getStatements(): 4
size(): 4

To include the default graph when using the contexts parameter, pass None as one of the graph URIs:

contexts = [context1, None]
with conn.getStatements(contexts=contexts) as result:
    print('getStatements(): {0}'.format(len(result)))
print('size(): {0}'.format(conn.size(contexts=contexts)))

Now triples from the default context and from one of our named contexts are processed:

getStatements(): 4
size(): 4
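
The effect of the contexts parameter can be illustrated without a running server. The following is a minimal sketch in plain Python, not the client’s implementation: QUADS, select_quads() and the ALL sentinel are hypothetical names invented for this illustration, and None stands for the default graph as in the calls above.

```python
# Toy stand-in for the store built above: (subject, predicate, object, context).
# A context of None marks a triple in the default (unnamed) graph.
QUADS = [
    ("ex://alice", "rdf:type", "ex://person", "ex://context1"),
    ("ex://alice", "ex://name", "Alice", "ex://context1"),
    ("ex://bob", "rdf:type", "ex://person", "ex://context2"),
    ("ex://bob", "ex://name", "Bob", "ex://context2"),
    ("ex://ted", "rdf:type", "ex://person", None),
    ("ex://ted", "ex://name", "Ted", None),
]

# Hypothetical sentinel meaning "do not filter by context".
ALL = object()


def select_quads(quads, contexts=ALL):
    """Return the quads whose context appears in `contexts`."""
    if contexts is ALL:
        return list(quads)
    wanted = set(contexts)
    return [quad for quad in quads if quad[3] in wanted]


print(len(select_quads(QUADS)))                                      # 6
print(len(select_quads(QUADS, ["ex://context1", "ex://context2"])))  # 4
print(len(select_quads(QUADS, ["ex://context1", None])))             # 4
```

The three counts mirror the three outputs shown above: no filter, two named contexts, and one named context plus the default graph.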

SPARQL using FROM, FROM DEFAULT, and FROM NAMED

In many of our examples we have used a simple SPARQL query to retrieve triples from AllegroGraph’s default graph. This has been very convenient but it is also misleading. As soon as we tell SPARQL to search a specific graph, we lose the ability to search AllegroGraph’s default graph! Triples from the null graph vanish from the search results. Why is that?

It is important to understand that AllegroGraph and SPARQL use the phrase “default graph” to identify two very different things.

  • AllegroGraph’s default graph, or null context, is simply the set of all triples that have null in the fourth field of the “triple.” The default graph is an unnamed subgraph of the AllegroGraph triple store.
  • SPARQL uses “default graph” to describe something very different. In SPARQL, the default graph is a temporary pool of triples imported from one or more named graphs. It is constructed and discarded in the service of a single query. Standard SPARQL was designed for named graphs only, and has no syntax to identify a truly unnamed graph. AllegroGraph’s SPARQL, however, has been extended to allow the unnamed graph to participate in multi-graph queries.

We can use AllegroGraph’s SPARQL to search specific subgraphs in three ways.

  • We can create a temporary default graph using the FROM operator.
  • We can put AllegroGraph’s unnamed graph into SPARQL’s default graph using FROM DEFAULT.
  • Or we can target specific named graphs using the FROM NAMED operator.
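
Since the three forms listed above are plain clauses in the query text, they can be assembled programmatically when the set of graphs is only known at run time. The sketch below is one possible way to do that; dataset_clauses() and its parameters are hypothetical names, not part of the AllegroGraph client, and the helper assumes the URIs need no escaping.

```python
def dataset_clauses(default_graphs=(), named_graphs=(), include_unnamed=False):
    """Return the FROM / FROM NAMED clauses for a query, one per line.

    Assumes the graph URIs are already valid IRIs (no escaping is done).
    """
    clauses = []
    if include_unnamed:
        # FROM DEFAULT is the AllegroGraph extension described in the text.
        clauses.append("FROM DEFAULT")
    clauses.extend("FROM <{0}>".format(uri) for uri in default_graphs)
    clauses.extend("FROM NAMED <{0}>".format(uri) for uri in named_graphs)
    return "\n".join(clauses)


clauses = dataset_clauses(
    default_graphs=["ex://context1"],
    named_graphs=["ex://context2"],
    include_unnamed=True,
)
query_text = "SELECT DISTINCT ?s\n" + clauses + "\n{ ?s ?p ?o }"
print(query_text)
```

The resulting string could then be passed to conn.prepareTupleQuery(query=query_text) in the same way as the hand-written queries in this example.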

Here’s an example of a query that accesses the unnamed graph explicitly:

query = conn.prepareTupleQuery(query="""
    SELECT DISTINCT ?s FROM DEFAULT {
        ?s ?p ?o
    }""")
query.evaluate(output=True)

This will not process any of the triples in named contexts:

------------
| s        |
============
| ex://ted |
------------

Here’s an example of a query that uses FROM. It instructs SPARQL to regard context1 as the default graph for the purposes of this query.

query = conn.prepareTupleQuery(query="""
    SELECT DISTINCT ?s FROM <ex://context1> {
        ?s ?p ?o
    }""")
query.evaluate(output=True)

Now only one context is processed:

--------------
| s          |
==============
| ex://alice |
--------------

The next example changes FROM to FROM NAMED in the same query:

query = conn.prepareTupleQuery(query="""
    SELECT DISTINCT ?s FROM NAMED <ex://context1> {
        ?s ?p ?o
    }""")
query.evaluate(output=True)

There are no matches now! The pattern { ?s ?p ?o . } only matches the SPARQL default graph. We declared context1 to be a named graph, so it is no longer the default graph.

-----
| s |
=====
-----

To match triples in named graphs, SPARQL requires a GRAPH pattern:

query = conn.prepareTupleQuery(query="""
    SELECT DISTINCT ?s ?g FROM NAMED <ex://context1> {
        GRAPH ?g { ?s ?p ?o }
    }""")
query.evaluate(output=True)

This time we’ll also print the graph:

------------------------------
| s          | g             |
==============================
| ex://alice | ex://context1 |
------------------------------

We can also combine all the forms presented above:

query = conn.prepareTupleQuery(query="""
    SELECT DISTINCT ?s ?g
    FROM DEFAULT
    FROM <ex://context1>
    FROM NAMED <ex://context2> {
        { ?s ?p ?o } UNION { GRAPH ?g { ?s ?p ?o } }
    }""")
query.evaluate(output=True)

This query puts AllegroGraph’s unnamed graph and the context1 graph into SPARQL’s default graph, where the triples can be found with a simple { ?s ?p ?o } pattern. It then identifies context2 as a named graph, which can be searched using a GRAPH pattern. In the final line, a UNION operator combines the matches of the simple and GRAPH patterns.

This query should find all three subjects:

------------------------------
| s          | g             |
==============================
| ex://alice | ---           |
| ex://ted   | ---           |
| ex://bob   | ex://context2 |
------------------------------
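
The way the dataset clauses decide which pattern can see which triples can be mimicked in a few lines of plain Python. This is only a conceptual model under simplifying assumptions, not how AllegroGraph evaluates queries: QUADS and evaluate_union() are hypothetical names, None stands for the unnamed graph, and the model covers only queries that specify their graphs explicitly.

```python
# Toy copy of the example data: (subject, predicate, object, context).
QUADS = [
    ("ex://alice", "rdf:type", "ex://person", "ex://context1"),
    ("ex://alice", "ex://name", "Alice", "ex://context1"),
    ("ex://bob", "rdf:type", "ex://person", "ex://context2"),
    ("ex://bob", "ex://name", "Bob", "ex://context2"),
    ("ex://ted", "rdf:type", "ex://person", None),
    ("ex://ted", "ex://name", "Ted", None),
]


def evaluate_union(quads, default_graphs, named_graphs):
    """Model SELECT DISTINCT ?s ?g { {?s ?p ?o} UNION {GRAPH ?g {?s ?p ?o}} }.

    `default_graphs` lists contexts merged into SPARQL's default graph
    (None stands for FROM DEFAULT); `named_graphs` lists FROM NAMED contexts.
    """
    rows = []
    # Plain pattern: sees only the merged default graph, so ?g stays unbound.
    for s, _p, _o, g in quads:
        if g in default_graphs and (s, None) not in rows:
            rows.append((s, None))
    # GRAPH pattern: sees only the named graphs and binds ?g to the context.
    for s, _p, _o, g in quads:
        if g in named_graphs and (s, g) not in rows:
            rows.append((s, g))
    return rows


rows = evaluate_union(QUADS, [None, "ex://context1"], ["ex://context2"])
for subject, graph in rows:
    print(subject, graph or "---")
```

Calling evaluate_union(QUADS, [], ["ex://context1"]) corresponds to the earlier FROM NAMED query with a GRAPH pattern: only ex://alice is returned, together with its graph. Row order in a real SPARQL result is not guaranteed; the model simply follows the order of QUADS.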

SPARQL with Dataset object

A Dataset object is a construct that contains two lists of named graphs. There is one list of graphs that will become the SPARQL default graph, just like using FROM in the query. There is a second list of graphs that will be named graphs in the query, just like using FROM NAMED. To use the dataset, we put the graph URIs into the dataset object, and then add the dataset to the query object. When we evaluate the query, the results will be confined to the graphs listed in the dataset.

from franz.openrdf.query.dataset import Dataset

dataset = Dataset()
dataset.addDefaultGraph(context1)
dataset.addNamedGraph(context2)
query = conn.prepareTupleQuery(query="""
    SELECT DISTINCT ?s ?g {
      { ?s ?p ?o } UNION { GRAPH ?g { ?s ?p ?o } }
    }""")
query.setDataset(dataset)
query.evaluate(output=True)

Note that, since we’re explicitly specifying graphs (through a dataset object), we need a GRAPH pattern to match triples from the named graphs. Triples from the unnamed graph are not matched at all, since that graph is not a part of the dataset.

------------------------------
| s          | g             |
==============================
| ex://alice | ---           |
| ex://bob   | ex://context2 |
------------------------------