.. _example6:

Example 6: Importing triples
----------------------------
AllegroGraph can import files in multiple RDF :class:`formats
<.RDFFormat>`, such as `Turtle`_ or `N-Triples`_. The example below
calls the connection object's :meth:`add` method to load an N-Triples
file, and :meth:`addFile` to load an RDF/XML file. Both methods work,
but the best practice is to use :meth:`addFile`.
The :download:`RDF/XML file <../../data/vcards.rdf>` contains a short
list of v-cards (virtual business cards), like this one:

.. code-block:: xml

   John Smith
   Smith
   John

Save this file in :file:`./data/vcards.rdf` (or choose another path
and adjust the code below).
The :download:`N-Triples file <../../data/kennedy.ntriples>` contains
a graph of resources describing the Kennedy family, the places where
they were each born, their colleges, and their professions. A typical
entry from that file looks like this:

.. code-block:: text

   "Joseph" .
   "Patrick" .
   "Kennedy" .
   "none" .
   .
   "1888" .
   "1969" .
   .
   .
   .
   .
   .
   .

Save the file to :file:`./data/kennedy.ntriples`.
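
N-Triples is a line-based format, so a quick sanity check before importing is to count the statements in the file and compare that figure with the triple count reported after loading. The following is only a minimal sketch using the standard library. The helper name ``count_ntriples_statements`` is hypothetical (it is not part of the AllegroGraph client), and it assumes that each statement sits on its own line and that comment lines start with ``#``. The sample data is made up for the demonstration.

```python
import os.path
import tempfile


def count_ntriples_statements(path):
    """Count lines that are neither blank nor comments.

    Hypothetical helper: assumes one statement per line.
    """
    count = 0
    with open(path, encoding='utf-8') as handle:
        for line in handle:
            stripped = line.strip()
            if stripped and not stripped.startswith('#'):
                count += 1
    return count


# Made-up sample data, written to a throwaway directory.
sample = (
    '# a comment line\n'
    '<http://example.org/s> <http://example.org/p> "o" .\n'
    '\n'
    '<http://example.org/s> <http://example.org/q> <http://example.org/o> .\n'
)

with tempfile.TemporaryDirectory() as demo_dir:
    demo_path = os.path.join(demo_dir, 'sample.ntriples')
    with open(demo_path, 'w', encoding='utf-8') as handle:
        handle.write(sample)
    demo_count = count_ntriples_statements(demo_path)

print(demo_count)
```

Pointing the same helper at :file:`./data/kennedy.ntriples` gives a figure to compare against the triple count that the repository reports after the import.
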
Note that AllegroGraph can segregate triples into contexts (subgraphs)
by treating them as quads, but the N-Triples and RDF/XML formats
cannot include context information (unlike, e.g., `N-Quads`_ or
`Trig`_). They deal with triples only, so there is no place to store a
fourth field in those formats. In the :meth:`add` call we leave the
context unset (``contexts=None``), so the triples are loaded into the
default graph (sometimes called the "null context"). The
:meth:`addFile` call includes an explicit context setting, so the
fourth field of each VCard triple will be the context named
``http://example.org#vcards``.

The connection :meth:`size` method takes an optional context argument.
With no argument, it returns the total number of triples in the
repository. Below, it returns ``16`` for the ``context`` argument and
``1214`` for the null context (passed as ``'null'`` in the code
below).
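
To make the relationship between triples, quads and contexts concrete, here is a toy in-memory model. It is purely illustrative and makes no claim about how AllegroGraph stores data. The class ``ToyQuadStore`` and its methods are hypothetical names, and the default graph is represented by ``None`` as an assumption of this sketch.

```python
class ToyQuadStore:
    """Illustrative model only: each statement is a 4-tuple (s, p, o, context)."""

    def __init__(self):
        self._quads = []

    def add(self, subject, predicate, obj, context=None):
        # A context of None stands for the default graph in this sketch.
        self._quads.append((subject, predicate, obj, context))

    def size(self, *contexts):
        # No argument: count everything. Otherwise count only the given contexts.
        if not contexts:
            return len(self._quads)
        return sum(1 for quad in self._quads if quad[3] in contexts)


store = ToyQuadStore()
vcards = 'http://example.org#vcards'

# Two statements in a named context, three in the default graph.
store.add('ex:john', 'vcard:FN', 'John Smith', context=vcards)
store.add('ex:john', 'vcard:Family', 'Smith', context=vcards)
store.add('ex:p1', 'ex:first-name', 'Joseph')
store.add('ex:p1', 'ex:last-name', 'Kennedy')
store.add('ex:p1', 'ex:birth-year', '1888')

print(store.size(vcards))  # statements in the named context
print(store.size(None))    # statements in the default graph
print(store.size())        # all statements
```

The point of the model is that a file format without a fourth field can still be loaded into a named context, because the context is supplied by the caller at load time and not by the file.
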

.. literalinclude:: doctest_setup.py
   :language: python_rdf
   :start-after: BEGIN-CONNECT
   :end-before: END-CONNECT

The variables ``path1`` and ``path2`` are bound to the RDF/XML and
N-Triples files, respectively.

.. testcode:: example6

   import os.path

   # We assume that our data files live in this directory.
   DATA_DIR = 'data'
   path1 = os.path.join(DATA_DIR, 'vcards.rdf')
   path2 = os.path.join(DATA_DIR, 'kennedy.ntriples')

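
Since the paths above are relative to the current working directory, it can help to confirm that both files are present before trying to load them. The sketch below uses only the standard library. ``find_missing_files`` is a hypothetical helper name, and the demonstration runs in a temporary directory so that it does not depend on your actual data files.

```python
import os.path
import tempfile


def find_missing_files(paths):
    """Return the entries of ``paths`` that are not existing regular files."""
    return [path for path in paths if not os.path.isfile(path)]


# Demonstration: create one of the two files and leave the other absent.
with tempfile.TemporaryDirectory() as demo_dir:
    present = os.path.join(demo_dir, 'vcards.rdf')
    absent = os.path.join(demo_dir, 'kennedy.ntriples')
    with open(present, 'w', encoding='utf-8') as handle:
        handle.write('')
    missing = find_missing_files([present, absent])
    missing_names = [os.path.basename(path) for path in missing]

print(missing_names)
```

In the tutorial setting one would call the helper with ``[path1, path2]`` and stop early if the returned list is not empty.
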
The triples about the VCards will be added to a specific context, so
naturally we need a URI to identify that context.

.. testcode:: example6

   context = conn.createURI("http://example.org#vcards")

In the next step we use :meth:`addFile` to load the VCard triples into
the ``#vcards`` context:

.. testcode:: example6

   from franz.openrdf.rio.rdfformat import RDFFormat

   conn.addFile(path1, None, format=RDFFormat.RDFXML, context=context)

Then we use :meth:`add` to load the Kennedy family tree into the
default context:

.. testcode:: example6

   conn.add(path2, base=None, format=RDFFormat.NTRIPLES, contexts=None)

Now we'll ask AllegroGraph to report how many triples it sees in
the default context and in the ``#vcards`` context:

.. testcode:: example6

   print('VCard triples (in {context}): {count}'.format(
       count=conn.size(context), context=context))
   print('Kennedy triples (default graph): {count}'.format(
       count=conn.size('null')))

The output of this report was:

.. testoutput:: example6

   VCard triples (in <http://example.org#vcards>): 16
   Kennedy triples (default graph): 1214