Example 6: Importing triples

AllegroGraph can import files in multiple RDF formats, such as Turtle or N-Triples. The example below calls the connection object’s add() method to load an N-Triples file, and addFile() to load an RDF/XML file. Both methods work, but the best practice is to use addFile().

The RDF/XML file contains a short list of vCards (virtual business cards), like this one:

<rdf:Description rdf:about="http://somewhere/JohnSmith/">
  <vCard:FN>John Smith</vCard:FN>
  <vCard:N rdf:parseType="Resource">
    <vCard:Family>Smith</vCard:Family>
    <vCard:Given>John</vCard:Given>
  </vCard:N>
</rdf:Description>

Save this file in ./data/vcards.rdf (or choose another path and adjust the code below).

The N-Triples file contains a graph of resources describing the Kennedy family, the places where they were each born, their colleges, and their professions. A typical entry from that file looks like this:

<http://www.franz.com/simple#person1> <http://www.franz.com/simple#first-name> "Joseph" .
<http://www.franz.com/simple#person1> <http://www.franz.com/simple#middle-initial> "Patrick" .
<http://www.franz.com/simple#person1> <http://www.franz.com/simple#last-name> "Kennedy" .
<http://www.franz.com/simple#person1> <http://www.franz.com/simple#suffix> "none" .
<http://www.franz.com/simple#person1> <http://www.franz.com/simple#alma-mater> <http://www.franz.com/simple#Harvard> .
<http://www.franz.com/simple#person1> <http://www.franz.com/simple#birth-year> "1888" .
<http://www.franz.com/simple#person1> <http://www.franz.com/simple#death-year> "1969" .
<http://www.franz.com/simple#person1> <http://www.franz.com/simple#sex> <http://www.franz.com/simple#male> .
<http://www.franz.com/simple#person1> <http://www.franz.com/simple#spouse> <http://www.franz.com/simple#person2> .
<http://www.franz.com/simple#person1> <http://www.franz.com/simple#has-child> <http://www.franz.com/simple#person3> .
<http://www.franz.com/simple#person1> <http://www.franz.com/simple#profession> <http://www.franz.com/simple#banker> .
<http://www.franz.com/simple#person1> <http://www.franz.com/simple#birth-place> <http://www.franz.com/simple#place5> .
<http://www.franz.com/simple#person1> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.franz.com/simple#person> .

Save the file to ./data/kennedy.ntriples.
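Before importing, it can be handy to sanity-check the N-Triples file. The following is a minimal sketch using only the Python standard library; it is not part of the AllegroGraph client. The helper name count_statements and the regular expression are illustrative assumptions, and the pattern is deliberately loose (it does not fully validate IRIs or literal escapes).

```python
import re

# Rough pattern for one N-Triples statement: subject, predicate, object, dot.
# This is an approximation for sanity checks, not a conforming parser.
_STATEMENT = re.compile(
    r'^\s*(<[^>]*>|_:\S+)\s+'        # subject: IRI or blank node
    r'(<[^>]*>)\s+'                  # predicate: IRI
    r'(<[^>]*>|_:\S+|".*"\S*)\s*'    # object: IRI, blank node, or literal
    r'\.\s*$'                        # terminating dot
)


def count_statements(lines):
    """Return (statement_count, bad_line_numbers) for an iterable of lines."""
    count, bad = 0, []
    for number, line in enumerate(lines, start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith('#'):
            continue  # blank lines and comments carry no triple
        if _STATEMENT.match(line):
            count += 1
        else:
            bad.append(number)  # e.g. a line with a fourth (graph) field
    return count, bad


sample = [
    '<http://www.franz.com/simple#person1> '
    '<http://www.franz.com/simple#first-name> "Joseph" .',
    '<http://www.franz.com/simple#person1> '
    '<http://www.franz.com/simple#spouse> '
    '<http://www.franz.com/simple#person2> .',
]
print(count_statements(sample))
```

To check the real file, pass an open file object instead of the sample list, for example count_statements(open('data/kennedy.ntriples', encoding='utf-8')). The resulting count can then be compared with what the server reports after the import.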

Note that AllegroGraph can segregate triples into contexts (subgraphs) by treating them as quads, but the N-Triples and RDF/XML formats cannot carry context information (unlike, e.g., N-Quads or TriG). They deal with triples only, so there is no place to store a fourth field in those formats. In the add() call below we pass contexts=None, so the triples are loaded into the default graph (sometimes called the “null context”). The addFile() call includes an explicit context setting, so the fourth field of each VCard triple will be the context named http://example.org#vcards.

The connection’s size() method takes an optional context argument. With no argument, it returns the total number of triples in the repository. Below, it returns 16 when passed the context variable, and 1214 when passed 'null', which selects the default graph.
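To make the triple-versus-quad idea concrete, here is a toy in-memory model. It is only an illustration of the concept, not how AllegroGraph stores data; the class ToyQuadStore, the DEFAULT_GRAPH marker, and the abbreviated resource names are all hypothetical.

```python
DEFAULT_GRAPH = None  # toy marker for the default graph; not an AllegroGraph constant


class ToyQuadStore:
    """In-memory illustration of triples stored as quads (not AllegroGraph code)."""

    def __init__(self):
        self._quads = []

    def add(self, triples, context=DEFAULT_GRAPH):
        # The file supplies three fields; the loader supplies the fourth.
        for subject, predicate, obj in triples:
            self._quads.append((subject, predicate, obj, context))

    def size(self, *contexts):
        # No argument: count everything. Otherwise count only the given contexts.
        if not contexts:
            return len(self._quads)
        return sum(1 for quad in self._quads if quad[3] in contexts)


store = ToyQuadStore()
store.add([("ex:JohnSmith", "vCard:FN", "John Smith")],
          context="http://example.org#vcards")
store.add([
    ("simple:person1", "simple:first-name", "Joseph"),
    ("simple:person1", "simple:last-name", "Kennedy"),
])

# One triple in the named context, two in the default graph, three in total.
print(store.size("http://example.org#vcards"),
      store.size(DEFAULT_GRAPH),
      store.size())  # prints: 1 2 3
```

The point of the sketch is that the context is a property of the load operation rather than of the file: the same three-field statements end up in different subgraphs depending on the argument given at import time.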

from franz.openrdf.connect import ag_connect

conn = ag_connect('python-tutorial', create=True, clear=True)

The variables path1 and path2 are bound to the RDF/XML and N-Triples files, respectively.

import os.path

# We assume that our data files live in this directory.
DATA_DIR = 'data'
path1 = os.path.join(DATA_DIR, 'vcards.rdf')
path2 = os.path.join(DATA_DIR, 'kennedy.ntriples')

The triples about the VCards will be added to a specific context, so naturally we need a URI to identify that context.

context = conn.createURI("http://example.org#vcards")

In the next step we use addFile() to load the VCard triples into the #vcards context:

from franz.openrdf.rio.rdfformat import RDFFormat

conn.addFile(path1, None, format=RDFFormat.RDFXML, context=context)

Then we use add() to load the Kennedy family tree into the default context:

conn.add(path2, base=None, format=RDFFormat.NTRIPLES, contexts=None)

Now we’ll ask AllegroGraph to report on how many triples it sees in the default context and in the #vcards context:

print('VCard triples (in {context}): {count}'.format(
      count=conn.size(context), context=context))

print('Kennedy triples (default graph): {count}'.format(
      count=conn.size('null')))

The output of this report was:

VCard triples (in <http://example.org#vcards>): 16
Kennedy triples (default graph): 1214