Example 6: Importing triples

AllegroGraph can import files in multiple RDF formats, such as Turtle or N-Triples. The example below calls the connection object’s add() method to load an N-Triples file, and addFile() to load an RDF/XML file. Both methods work, but the best practice is to use addFile().

Note

If you get a ‘file not found’ error while executing this example, make sure that the DATA_DIR setting (described in the Setting the environment for the tutorial section of this tutorial) is correct and points to the directory that contains the example data files.
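
One way to rule out a path problem before running the example is to check that both data files are present. The sketch below is only an illustration: missing_data_files() is a hypothetical helper that is not part of the AllegroGraph client, and the file names are the ones used later in this example.

```python
import os.path

def missing_data_files(data_dir, filenames):
    """Return the names in `filenames` that are not regular files in `data_dir`."""
    return [name for name in filenames
            if not os.path.isfile(os.path.join(data_dir, name))]

# Hypothetical usage, assuming DATA_DIR was set as described in the tutorial setup:
# missing = missing_data_files(DATA_DIR, ['vcards.rdf', 'kennedy.ntriples'])
# if missing:
#     print('Missing data files:', missing)
```

An empty list means both files were found, so a remaining error is more likely to come from somewhere other than DATA_DIR.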

The RDF/XML file contains a short list of v-cards (virtual business cards), like this one:

<rdf:Description rdf:about="http://somewhere/JohnSmith/">
  <vCard:FN>John Smith</vCard:FN>
  <vCard:N rdf:parseType="Resource">
    <vCard:Family>Smith</vCard:Family>
    <vCard:Given>John</vCard:Given>
  </vCard:N>
</rdf:Description>

The N-Triples file contains a graph of resources describing the Kennedy family, the places where they were each born, their colleges, and their professions. A typical entry from that file looks like this:

<http://www.franz.com/simple#person1> <http://www.franz.com/simple#first-name> "Joseph" .
<http://www.franz.com/simple#person1> <http://www.franz.com/simple#middle-initial> "Patrick" .
<http://www.franz.com/simple#person1> <http://www.franz.com/simple#last-name> "Kennedy" .
<http://www.franz.com/simple#person1> <http://www.franz.com/simple#suffix> "none" .
<http://www.franz.com/simple#person1> <http://www.franz.com/simple#alma-mater> <http://www.franz.com/simple#Harvard> .
<http://www.franz.com/simple#person1> <http://www.franz.com/simple#birth-year> "1888" .
<http://www.franz.com/simple#person1> <http://www.franz.com/simple#death-year> "1969" .
<http://www.franz.com/simple#person1> <http://www.franz.com/simple#sex> <http://www.franz.com/simple#male> .
<http://www.franz.com/simple#person1> <http://www.franz.com/simple#spouse> <http://www.franz.com/simple#person2> .
<http://www.franz.com/simple#person1> <http://www.franz.com/simple#has-child> <http://www.franz.com/simple#person3> .
<http://www.franz.com/simple#person1> <http://www.franz.com/simple#profession> <http://www.franz.com/simple#banker> .
<http://www.franz.com/simple#person1> <http://www.franz.com/simple#birth-place> <http://www.franz.com/simple#place5> .
<http://www.franz.com/simple#person1> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.franz.com/simple#person> .

Note that AllegroGraph can segregate triples into contexts (subgraphs) by treating them as quads, but the N-Triples and RDF/XML formats cannot include context information (unlike, e.g., N-Quads or TriG). They deal with triples only, so there is no place to store a fourth field in those formats. In the case of the add() call, we have omitted the context argument, so the triples are loaded into the default graph (sometimes called the “null context”). The addFile() call includes an explicit context setting, so the fourth field of each VCard triple will be the context named http://example.org#vcards.

The connection size() method takes an optional context argument. With no argument, it returns the total number of triples in the repository. Below, it returns the number 16 for the context argument, and the number 1214 for the default graph (the 'null' argument).
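
The relationship between triples, quads, and per-context counts can be illustrated with a toy model in plain Python. This is only a sketch of the idea and does not use or mirror the AllegroGraph API: as_quads(), count_in_context(), the DEFAULT_GRAPH marker, and the abbreviated sample statements are all hypothetical.

```python
# Toy model only: plain tuples stand in for stored statements.
DEFAULT_GRAPH = 'default-graph'  # hypothetical marker, not an AllegroGraph value

def as_quads(triples, context=DEFAULT_GRAPH):
    """Attach a context (the fourth field) to each (s, p, o) triple."""
    return [(s, p, o, context) for (s, p, o) in triples]

def count_in_context(quads, context):
    """Count the quads whose fourth field equals `context`."""
    return sum(1 for quad in quads if quad[3] == context)

# Abbreviated sample statements, for illustration only.
vcard_triples = [
    ('http://somewhere/JohnSmith/', 'vCard:FN', 'John Smith'),
]
kennedy_triples = [
    ('simple:person1', 'simple:first-name', 'Joseph'),
    ('simple:person1', 'simple:last-name', 'Kennedy'),
]

# The source formats carry no context, so it is supplied at load time.
store = (as_quads(vcard_triples, 'http://example.org#vcards')
         + as_quads(kennedy_triples))

print(count_in_context(store, 'http://example.org#vcards'))  # 1
print(count_in_context(store, DEFAULT_GRAPH))                # 2
print(len(store))                                            # 3
```

The point of the model is that the context is chosen by the loader rather than read from the file, and that a count can be taken either per context or over the whole store.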

from franz.openrdf.rio.rdfformat import RDFFormat
import os.path

conn = connect()

The variables path1 and path2 are bound to the RDF/XML and N-Triples files, respectively.

path1 = os.path.join(DATA_DIR, 'vcards.rdf')
path2 = os.path.join(DATA_DIR, 'kennedy.ntriples')

The triples about the VCards will be added to a specific context, so naturally we need a URI to identify that context.

context = conn.createURI("http://example.org#vcards")

In the next step we use addFile() to load the VCard triples into the #vcards context:

conn.addFile(path1, None, format=RDFFormat.RDFXML, context=context)

Then we use add() to load the Kennedy family tree into the default context:

conn.add(path2, base=None, format=RDFFormat.NTRIPLES, contexts=None)

Now we’ll ask AllegroGraph to report on how many triples it sees in the default context and in the #vcards context:

print('VCard triples (in {context}): {count}'.format(
      count=conn.size(context), context=context))

print('Kennedy triples (default graph): {count}'.format(
      count=conn.size('null')))

The output of this report was:

VCard triples (in <http://example.org#vcards>): 16
Kennedy triples (default graph): 1214
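
As a rough cross-check of the reported count, the statements in the N-Triples file can be counted directly, since N-Triples puts each statement on its own line. The helper below is a hypothetical sketch, not part of the AllegroGraph client, and it assumes a UTF-8 file in which every non-blank, non-comment line is one statement.

```python
def count_ntriples_statements(path):
    """Count lines that look like statements: non-blank and not a '#' comment."""
    count = 0
    with open(path, encoding='utf-8') as f:
        for line in f:
            stripped = line.strip()
            if stripped and not stripped.startswith('#'):
                count += 1
    return count

# Hypothetical usage with the path2 variable defined earlier in this example:
# print(count_ntriples_statements(path2))
```

The two numbers need not agree exactly, for instance if the repository already contained triples in the default graph before the file was loaded.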