.. _example10:

Example 10: Graphs in SPARQL
----------------------------

In :ref:`example6` and :ref:`example7` we've seen how to import data
to a non-default context and run queries against such data. In this
example we'll explore facilities for handling multiple contexts
provided by SPARQL and the AllegroGraph Python client.

We'll start by opening a connection:

.. literalinclude:: doctest_setup.py
   :language: python_rdf
   :start-after: BEGIN-CONNECT
   :end-before: END-CONNECT

Now we will create two URIs that will represent named contexts.

.. testcode:: example10

   context1 = conn.createURI("ex://context1")
   context2 = conn.createURI("ex://context2")

The first context will be filled using the :meth:`addData` method:

.. testcode:: example10

   conn.addData("""
     @prefix : <ex://> .
     :alice a :person ;
            :name "Alice" .""",
     context=context1)

The second context will be filled using :meth:`addTriple`. Notice how
we use a constant defined in the ``RDF`` class to obtain the URI of
the ``type`` predicate:

.. testcode:: example10

   from franz.openrdf.vocabulary.rdf import RDF

   bob = conn.createURI('ex://bob')
   bob_name = conn.createLiteral('Bob')
   name = conn.createURI('ex://name')
   person = conn.createURI('ex://person')
   conn.addTriple(bob, RDF.TYPE, person, contexts=[context2])
   conn.addTriple(bob, name, bob_name, contexts=[context2])

Finally we'll add two triples to the default context using
:meth:`addStatement`:

.. testcode:: example10

   from franz.openrdf.model import Statement

   ted = conn.createURI('ex://ted')
   ted_name = conn.createLiteral('Ted')
   stmt1 = Statement(ted, name, ted_name)
   stmt2 = Statement(ted, RDF.TYPE, person)
   conn.addStatement(stmt1)
   conn.addStatement(stmt2)

.. warning::

   The :class:`.Statement` object contains a ``context`` field.
   This field is *ignored* by :meth:`addStatement`. If you wish to add
   a statement object to a specific context, use the ``contexts``
   parameter.

As we've seen already in :ref:`example7`, a call to
:meth:`getStatements` will return triples from all contexts:

.. testcode:: example10

   with conn.getStatements() as result:
       print('getStatements(): {0}'.format(len(result)))
   print('size(): {0}'.format(conn.size()))

:meth:`size` will also process all contexts by default.

.. testoutput:: example10

   getStatements(): 6
   size(): 6

Both :meth:`getStatements` and :meth:`size` accept a ``contexts``
parameter that can be used to limit processing to a specified list of
graphs:

.. testcode:: example10

   contexts = [context1, context2]
   with conn.getStatements(contexts=contexts) as result:
       print('getStatements(): {0}'.format(len(result)))
   print('size(): {0}'.format(conn.size(contexts=contexts)))

As expected, triples from the default context are not processed:

.. testoutput:: example10

   getStatements(): 4
   size(): 4

To include the default graph when using the ``contexts`` parameter,
use ``None`` as a graph URI:

.. testcode:: example10

   contexts = [context1, None]
   with conn.getStatements(contexts=contexts) as result:
       print('getStatements(): {0}'.format(len(result)))
   print('size(): {0}'.format(conn.size(contexts=contexts)))

Now triples from the default context and from one of our named
contexts are processed:

.. testoutput:: example10

   getStatements(): 4
   size(): 4

SPARQL using ``FROM``, ``FROM DEFAULT``, and ``FROM NAMED``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In many of our examples we have used a simple SPARQL query to retrieve
triples from AllegroGraph's default graph. This has been very
convenient, but it is also misleading. As soon as we tell SPARQL to
search a specific graph, we lose the ability to search AllegroGraph's
default graph! Triples from the null graph vanish from the search
results. Why is that?

It is important to understand that AllegroGraph and SPARQL use the
phrase "default graph" to identify two very different things.

* AllegroGraph's default graph, or null context, is simply the set of
  all triples that have ``null`` in the fourth field of the "triple."
  The *default graph* is an unnamed subgraph of the AllegroGraph
  triple store.

* SPARQL uses *default graph* to describe something that is very
  different. In SPARQL, the *default graph* is a temporary pool of
  triples imported from one or more *named* graphs. SPARQL's *default
  graph* is constructed and discarded in the service of a single
  query.

Standard SPARQL was designed for named graphs only, and has no syntax
to identify a truly unnamed graph. AllegroGraph's SPARQL, however, has
been extended to allow the unnamed graph to participate in multi-graph
queries.

We can use AllegroGraph's SPARQL to search specific subgraphs in three
ways.

* We can create a temporary *default graph* using the ``FROM``
  operator.

* We can put AllegroGraph's unnamed graph into SPARQL's default graph
  using ``FROM DEFAULT``.

* Or we can target specific named graphs using the ``FROM NAMED``
  operator.

Here's an example of a query that accesses the unnamed graph
explicitly:

.. testcode:: example10

   query = conn.prepareTupleQuery(query="""
     SELECT DISTINCT ?s FROM DEFAULT {
       ?s ?p ?o
     }""")
   query.evaluate(output=True)

This will not process any of the triples in named contexts:

.. testoutput:: example10

   ------------
   | s        |
   ============
   | ex://ted |
   ------------

Here's an example of a query that uses ``FROM``. It instructs SPARQL
to regard ``context1`` as the default graph for the purposes of this
query.

.. testcode:: example10

   query = conn.prepareTupleQuery(query="""
     SELECT DISTINCT ?s FROM <ex://context1> {
       ?s ?p ?o
     }""")
   query.evaluate(output=True)

Now only one context is processed:

.. testoutput:: example10

   --------------
   | s          |
   ==============
   | ex://alice |
   --------------

The next example changes ``FROM`` to ``FROM NAMED`` in the same query:

.. testcode:: example10

   query = conn.prepareTupleQuery(query="""
     SELECT DISTINCT ?s FROM NAMED <ex://context1> {
       ?s ?p ?o
     }""")
   query.evaluate(output=True)

There are no matches now! The pattern ``{ ?s ?p ?o . }`` only matches
the SPARQL default graph.
We declared ``context1`` to be a *named* graph, so it is no longer the
default graph.

.. testoutput:: example10

   -----
   | s |
   =====
   -----

To match triples in named graphs, SPARQL requires a ``GRAPH`` pattern:

.. testcode:: example10

   query = conn.prepareTupleQuery(query="""
     SELECT DISTINCT ?s ?g FROM NAMED <ex://context1> {
       GRAPH ?g { ?s ?p ?o }
     }""")
   query.evaluate(output=True)

This time we'll also print the graph:

.. testoutput:: example10

   ------------------------------
   | s          | g             |
   ==============================
   | ex://alice | ex://context1 |
   ------------------------------

We can also combine all the forms presented above:

.. testcode:: example10

   query = conn.prepareTupleQuery(query="""
     SELECT DISTINCT ?s ?g
     FROM DEFAULT
     FROM <ex://context1>
     FROM NAMED <ex://context2> {
       { ?s ?p ?o }
       UNION { GRAPH ?g { ?s ?p ?o } }
     }""")
   query.evaluate(output=True)

This query puts AllegroGraph's unnamed graph and the ``context1``
graph into SPARQL's default graph, where the triples can be found by
using a simple ``{?s ?p ?o . }`` query. Then it identifies
``context2`` as a named graph, which can be searched using a ``GRAPH``
pattern. In the final line, we used a ``UNION`` operator to combine
the matches of the simple and ``GRAPH`` patterns.

This query should find all three subjects:

.. testoutput:: example10
   :options: +SORT

   ------------------------------
   | s          | g             |
   ==============================
   | ex://alice | ---           |
   | ex://ted   | ---           |
   | ex://bob   | ex://context2 |
   ------------------------------

SPARQL with :class:`.Dataset` object
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A :class:`.Dataset` object is a construct that contains two lists of
named graphs. There is one list of graphs that will become the SPARQL
default graph, just like using ``FROM`` in the query. There is a
second list of graphs that will be *named graphs* in the query, just
like using ``FROM NAMED``. To use the dataset, we put the graph URIs
into the dataset object, and then add the dataset to the query object.
When we evaluate the query, the results will be confined to the graphs
listed in the dataset.

.. exttestcode:: example10
   :emphasize-lines: 10

   from franz.openrdf.query.dataset import Dataset
   dataset = Dataset()
   dataset.addDefaultGraph(context1)
   dataset.addNamedGraph(context2)
   query = conn.prepareTupleQuery(query="""
     SELECT DISTINCT ?s ?g {
       { ?s ?p ?o }
       UNION { GRAPH ?g { ?s ?p ?o } }
     }""")
   query.setDataset(dataset)
   query.evaluate(output=True)

Note that, since we're explicitly specifying graphs (through a dataset
object), we need a ``GRAPH`` pattern to match triples from the named
graphs. Triples from the unnamed graph are not matched at all, since
that graph is not a part of the dataset.

.. testoutput:: example10
   :options: +SORT

   ------------------------------
   | s          | g             |
   ==============================
   | ex://alice | ---           |
   | ex://bob   | ex://context2 |
   ------------------------------
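
The dataset rules used throughout this example can also be mimicked
without a server. The following is only a toy model in plain Python,
not how AllegroGraph evaluates queries: the ``QUADS`` list and the
``match_default`` and ``match_named`` helpers are hypothetical names
invented for this sketch, and ``rdf:type`` is written as a plain
string for brevity. It assumes the six triples added earlier and
treats ``None`` as the unnamed graph.

.. code-block:: python

   # Toy model of SPARQL dataset semantics; NOT the AllegroGraph API.
   # Each quad is (subject, predicate, object, graph); a graph of None
   # stands for the store's unnamed graph.
   QUADS = [
       ("ex://alice", "rdf:type", "ex://person", "ex://context1"),
       ("ex://alice", "ex://name", "Alice", "ex://context1"),
       ("ex://bob", "rdf:type", "ex://person", "ex://context2"),
       ("ex://bob", "ex://name", "Bob", "ex://context2"),
       ("ex://ted", "ex://name", "Ted", None),
       ("ex://ted", "rdf:type", "ex://person", None),
   ]


   def match_default(quads, default_graphs):
       """Subjects a plain ``{ ?s ?p ?o }`` pattern would match.

       Only graphs merged into the query's default graph are visible.
       """
       return sorted({s for s, _, _, g in quads if g in default_graphs})


   def match_named(quads, named_graphs):
       """(subject, graph) pairs a ``GRAPH ?g { ... }`` pattern would match.

       Only graphs declared as named graphs are visible.
       """
       return sorted({(s, g) for s, _, _, g in quads if g in named_graphs})


   # Mirror the Dataset above: context1 is the default graph and
   # context2 is a named graph. The unnamed graph is in neither list,
   # so ex://ted never appears.
   print(match_default(QUADS, ["ex://context1"]))
   print(match_named(QUADS, ["ex://context2"]))

Passing ``[None, "ex://context1"]`` as ``default_graphs`` plays the
role of ``FROM DEFAULT`` combined with ``FROM`` and makes ``ex://ted``
visible to the plain pattern again.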