Example 10: Graphs in SPARQL
In Example 6: Importing triples and Example 7: Querying multiple contexts we’ve seen how to import data into a non-default context and run queries against it. In this example we’ll explore the facilities that SPARQL and the AllegroGraph Python client provide for handling multiple contexts.
We’ll start by opening a connection:
from franz.openrdf.connect import ag_connect
conn = ag_connect('python-tutorial', create=True, clear=True)
Now we will create two URIs that will represent named contexts.
context1 = conn.createURI("ex://context1")
context2 = conn.createURI("ex://context2")
The first context will be filled using the addData() method:
conn.addData("""
@prefix : <ex://> .
:alice a :person ;
:name "Alice" .""",
context=context1)
The second context will be filled using addTriple(). Notice how we use a constant defined in the RDF class to obtain the URI of the type predicate:
from franz.openrdf.vocabulary.rdf import RDF
bob = conn.createURI('ex://bob')
bob_name = conn.createLiteral('Bob')
name = conn.createURI('ex://name')
person = conn.createURI('ex://person')
conn.addTriple(bob, RDF.TYPE, person,
contexts=[context2])
conn.addTriple(bob, name, bob_name,
contexts=[context2])
Finally we’ll add two triples to the default context using addStatement():
from franz.openrdf.model import Statement
ted = conn.createURI('ex://ted')
ted_name = conn.createLiteral('Ted')
stmt1 = Statement(ted, name, ted_name)
stmt2 = Statement(ted, RDF.TYPE, person)
conn.addStatement(stmt1)
conn.addStatement(stmt2)
Warning
The Statement object contains a context field. This field is ignored by addStatement(). If you wish to add a statement object to a specific context, use the contexts parameter.
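The warning above can be turned into a small helper. This is only a sketch: add_statements_to_context is a hypothetical name, and it assumes that addStatement() accepts contexts as a list, in the same way addTriple() does earlier in this example.

```python
def add_statements_to_context(conn, statements, context):
    """Add each statement to `context` through the `contexts` parameter.

    Hypothetical helper: the context stored on the Statement objects is
    never relied on, because addStatement() ignores that field.
    Assumption: `contexts` takes a list of context URIs.
    """
    for stmt in statements:
        conn.addStatement(stmt, contexts=[context])
```

With the objects defined above, add_statements_to_context(conn, [stmt1, stmt2], context1) would place Ted's triples in context1 rather than in the default context.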
As we’ve seen already in Example 7: Querying multiple contexts, a call to getStatements() will return triples from all contexts:
with conn.getStatements() as result:
    print('getStatements(): {0}'.format(len(result)))
print('size(): {0}'.format(conn.size()))
size() will also process all contexts by default.
getStatements(): 6
size(): 6
Both getStatements() and size() accept a contexts parameter that can be used to limit processing to a specified list of graphs:
contexts = [context1, context2]
with conn.getStatements(contexts=contexts) as result:
    print('getStatements(): {0}'.format(len(result)))
print('size(): {0}'.format(conn.size(contexts=contexts)))
As expected, triples from the default context are not processed:
getStatements(): 4
size(): 4
To include the default graph when using the contexts parameter, use None as a graph URI:
contexts = [context1, None]
with conn.getStatements(contexts=contexts) as result:
    print('getStatements(): {0}'.format(len(result)))
print('size(): {0}'.format(conn.size(contexts=contexts)))
Now triples from the default context and from one of our named contexts are processed:
getStatements(): 4
size(): 4
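The three counts above follow from a simple filtering rule, which can be mimicked without a server. The sketch below is a toy model, not AllegroGraph's implementation: QUADS and count_quads are hypothetical names, each triple is stored as a 4-tuple whose last field is the context URI, and None stands for the default graph.

```python
# Toy model only: the six triples of this example as (s, p, o, context) tuples.
C1, C2 = "ex://context1", "ex://context2"

QUADS = [
    ("ex://alice", "rdf:type", "ex://person", C1),
    ("ex://alice", "ex://name", "Alice", C1),
    ("ex://bob", "rdf:type", "ex://person", C2),
    ("ex://bob", "ex://name", "Bob", C2),
    ("ex://ted", "ex://name", "Ted", None),       # default graph
    ("ex://ted", "rdf:type", "ex://person", None),  # default graph
]


def count_quads(quads, contexts=None):
    """Count quads, optionally restricted to a list of contexts.

    contexts=None means "no restriction"; a None *inside* the list
    selects the default graph, mirroring the calls shown above.
    """
    if contexts is None:
        return len(quads)
    wanted = set(contexts)
    return sum(1 for (_s, _p, _o, g) in quads if g in wanted)


print(count_quads(QUADS))                       # 6: all contexts
print(count_quads(QUADS, contexts=[C1, C2]))    # 4: named contexts only
print(count_quads(QUADS, contexts=[C1, None]))  # 4: context1 + default graph
```

Note the two different roles of None: as the value of the whole parameter it means "every context", while as a list element it names the default graph.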
SPARQL using FROM, FROM DEFAULT, and FROM NAMED
In many of our examples we have used a simple SPARQL query to retrieve triples from AllegroGraph’s default graph. This has been very convenient but it is also misleading. As soon as we tell SPARQL to search a specific graph, we lose the ability to search AllegroGraph’s default graph! Triples from the null graph vanish from the search results. Why is that?
It is important to understand that AllegroGraph and SPARQL use the phrase “default graph” to identify two very different things.
- AllegroGraph’s default graph, or null context, is simply the set of all triples that have null in the fourth field of the “triple.” The default graph is an unnamed subgraph of the AllegroGraph triple store.
- SPARQL uses default graph to describe something that is very different. In SPARQL, the default graph is a temporary pool of triples imported from one or more named graphs. SPARQL’s default graph is constructed and discarded in the service of a single query. Standard SPARQL was designed for named graphs only, and has no syntax to identify a truly unnamed graph. AllegroGraph’s SPARQL, however, has been extended to allow the unnamed graph to participate in multi-graph queries.
We can use AllegroGraph’s SPARQL to search specific subgraphs in three ways.
- We can create a temporary default graph using the FROM operator.
- We can put AllegroGraph’s unnamed graph into SPARQL’s default graph using FROM DEFAULT.
- Or we can target specific named graphs using the FROM NAMED operator.
Here’s an example of a query that accesses the unnamed graph explicitly:
query = conn.prepareTupleQuery(query="""
SELECT DISTINCT ?s FROM DEFAULT {
?s ?p ?o
}""")
query.evaluate(output=True)
This will not process any of the triples in named contexts:
------------
| s        |
============
| ex://ted |
------------
Here’s an example of a query that uses FROM. It instructs SPARQL to regard context1 as the default graph for the purposes of this query.
query = conn.prepareTupleQuery(query="""
SELECT DISTINCT ?s FROM <ex://context1> {
?s ?p ?o
}""")
query.evaluate(output=True)
Now only one context is processed:
--------------
| s          |
==============
| ex://alice |
--------------
The next example changes FROM to FROM NAMED in the same query:
query = conn.prepareTupleQuery(query="""
SELECT DISTINCT ?s FROM NAMED <ex://context1> {
?s ?p ?o
}""")
query.evaluate(output=True)
There are no matches now! The pattern { ?s ?p ?o . } only matches the SPARQL default graph. We declared context1 to be a named graph, so it is no longer part of the default graph, and since the query has no FROM clause, the default graph is empty.
-----
| s |
=====
-----
To match triples in named graphs, SPARQL requires a GRAPH pattern:
query = conn.prepareTupleQuery(query="""
SELECT DISTINCT ?s ?g FROM NAMED <ex://context1> {
GRAPH ?g { ?s ?p ?o }
}""")
query.evaluate(output=True)
This time we’ll also print the graph:
------------------------------
| s          | g             |
==============================
| ex://alice | ex://context1 |
------------------------------
We can also combine all the forms presented above:
query = conn.prepareTupleQuery(query="""
SELECT DISTINCT ?s ?g
FROM DEFAULT
FROM <ex://context1>
FROM NAMED <ex://context2> {
{ ?s ?p ?o } UNION { GRAPH ?g { ?s ?p ?o } }
}""")
query.evaluate(output=True)
This query puts AllegroGraph’s unnamed graph and the context1 graph into SPARQL’s default graph, where the triples can be found by using a simple { ?s ?p ?o . } pattern. Then it identifies context2 as a named graph, which can be searched using a GRAPH pattern. In the query body, we used a UNION operator to combine the matches of the simple and GRAPH patterns.
This query should find all three subjects:
------------------------------
| s          | g             |
==============================
| ex://alice | ---           |
| ex://ted   | ---           |
| ex://bob   | ex://context2 |
------------------------------
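The matching rules behind the queries in this section can be summarized in a few lines of plain Python. This is a toy model under stated assumptions, not how AllegroGraph evaluates SPARQL: STORE, match_plain and match_graph are hypothetical names, only subjects and graphs are tracked, and the model covers only queries that contain at least one FROM, FROM DEFAULT or FROM NAMED clause.

```python
C1, C2 = "ex://context1", "ex://context2"
DEFAULT = None  # stands in for AllegroGraph's unnamed graph (FROM DEFAULT)

# One (subject, graph) pair per subject is enough for these queries.
STORE = [("ex://alice", C1), ("ex://bob", C2), ("ex://ted", DEFAULT)]


def match_plain(from_graphs):
    """Subjects matched by { ?s ?p ?o }.

    Only SPARQL's default graph is searched, i.e. the merge of the
    graphs listed in FROM / FROM DEFAULT clauses.
    """
    return sorted({s for s, g in STORE if g in from_graphs})


def match_graph(from_named):
    """(s, g) rows matched by GRAPH ?g { ?s ?p ?o }.

    Only the graphs listed in FROM NAMED clauses are searched.
    """
    return sorted((s, g) for s, g in STORE if g in from_named)


print(match_plain([DEFAULT]))      # FROM DEFAULT       -> ['ex://ted']
print(match_plain([C1]))           # FROM <context1>    -> ['ex://alice']
print(match_plain([]))             # FROM NAMED only    -> []
print(match_graph([C1]))           # GRAPH + FROM NAMED -> [('ex://alice', 'ex://context1')]
# Combined query: both halves of the UNION
print(match_plain([DEFAULT, C1]))  # -> ['ex://alice', 'ex://ted']
print(match_graph([C2]))           # -> [('ex://bob', 'ex://context2')]
```

The empty list in the third call corresponds to the empty result table above: a graph listed only under FROM NAMED never contributes to the plain pattern.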
SPARQL with Dataset object
A Dataset object is a construct that contains two lists of named graphs. One list holds the graphs that will be merged into the SPARQL default graph, just like using FROM in the query. The other holds the graphs that will be named graphs in the query, just like using FROM NAMED. To use the dataset, we put the graph URIs into the dataset object, and then add the dataset to the query object. When we evaluate the query, the results will be confined to the graphs listed in the dataset.
from franz.openrdf.query.dataset import Dataset
dataset = Dataset()
dataset.addDefaultGraph(context1)
dataset.addNamedGraph(context2)
query = conn.prepareTupleQuery(query="""
SELECT DISTINCT ?s ?g {
{ ?s ?p ?o } UNION { GRAPH ?g { ?s ?p ?o } }
}""")
query.setDataset(dataset)
query.evaluate(output=True)
Note that, since we’re explicitly specifying graphs (through a dataset object), we need a GRAPH pattern to match triples from the named graphs. Triples from the unnamed graph are not matched at all, since that graph is not a part of the dataset.
------------------------------
| s          | g             |
==============================
| ex://alice | ---           |
| ex://bob   | ex://context2 |
------------------------------
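Since the two lists in a Dataset correspond to FROM and FROM NAMED, the same dataset can also be spelled out in the query text. The sketch below only builds that text; dataset_clauses is a hypothetical helper, not part of the AllegroGraph client, and it takes graph URIs as plain strings.

```python
def dataset_clauses(default_graphs=(), named_graphs=()):
    """Render FROM / FROM NAMED lines for the given graph URIs (plain strings)."""
    lines = ["FROM <{0}>".format(uri) for uri in default_graphs]
    lines += ["FROM NAMED <{0}>".format(uri) for uri in named_graphs]
    return "\n".join(lines)


# Doubled braces are literal braces in str.format().
query_text = """SELECT DISTINCT ?s ?g
{clauses} {{
  {{ ?s ?p ?o }} UNION {{ GRAPH ?g {{ ?s ?p ?o }} }}
}}""".format(clauses=dataset_clauses(["ex://context1"], ["ex://context2"]))

print(query_text)
```

Passing query_text to conn.prepareTupleQuery() without calling setDataset() should select the same graphs as the Dataset object above, assuming the correspondence with FROM and FROM NAMED described in this section.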