This document provides an overview of the SPARQL extension syntax for AllegroGraph's geospatial subsystem. By using this syntax, you can write queries that efficiently search geospatial information, yet seamlessly integrate with your other RDF data.

Invocation

To use geospatial syntax in your SPARQL queries, pass t for the :extendedp argument to run-sparql. From Java, call setExtended(true) on your SPARQLQuery instance.

Syntax

Because a geospatial query pattern contributes new result rows, it cannot reasonably be implemented as a mere FILTER. Instead, a syntax similar to SPARQL's existing GRAPH operator is used. Wherever you can write a graph pattern — GRAPH ?x { ... } — you can write a geospatial query pattern.

A geospatial query pattern looks like this:

GEO geo-spec geo-op geo-args {  
  triple-patterns     # ← geospatial query pattern body  
}  
WHERE {  
  patterns  
}  

A grammar is included at the end of this document.

geo-spec

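The geospatial system needs two input settings to establish how a pattern is interpreted: which part of each triple carries the geospatial value (its object or its graph), and which geospatial subtype applies.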

geo-spec, then, can be 0 or 1 of the keywords OBJECT or GRAPH, followed by the keyword SUBTYPE and a subtype identifier (described below).

If neither OBJECT nor GRAPH is provided, OBJECT is assumed.

geo-op, geo-args

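Next in the pattern comes the operator itself, and its arguments. In the current incarnation of AllegroGraph, the available operators are RADIUS, HAVERSINE, BOUNDINGBOX, and POLYGON; their argument forms are given in the grammar at the end of this document.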

Geospatial query pattern body

This is a sequence of triple patterns. The first pattern is privileged: it is the point of unification between the results produced by the geo engine (as described by geo-op) and the rest of the query.

This first pattern provides the predicate and input/output for the subject, object, and graph of each considered triple.

Points

A point is an (x, y) or (longitude, latitude) pair. Because geospatial information can be stored as discrete values, or combined into a geospatial UPI, the POINT pseudo-operator is provided to yield a single value from two inputs (either SPARQL variables or literals in the query). The geospatial query operations are all defined in terms of points.

Note that the order of the arguments is always horizontal then vertical, even for latitude and longitude. This choice is made for consistency's sake.
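The argument order is easy to get wrong when coordinates arrive in the common (latitude, longitude) order. As a minimal sketch, a client-side Python helper could render the POINT pseudo-operator with the horizontal value first. The function names below are hypothetical and are not part of AllegroGraph.

```python
def _term(value):
    """Render a SPARQL variable (given as a string) or a numeric literal."""
    return value if isinstance(value, str) else repr(float(value))


def point_expr(x, y):
    """Render POINT(x, y): horizontal value first, vertical value second."""
    return f"POINT({_term(x)}, {_term(y)})"


def point_from_lat_lon(lat, lon):
    """Accept the common (latitude, longitude) order and swap it for POINT."""
    return point_expr(lon, lat)


print(point_from_lat_lon(37.8036, -122.275))  # POINT(-122.275, 37.8036)
print(point_expr("?lon", "?lat"))             # POINT(?lon, ?lat)
```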

The WHERE clause

An AllegroGraph geospatial query requires a certain set of inputs — for instance, get-triples-geospatial-radius needs a predicate, a point, and a radius. The predicate is established by the geospatial pattern body, but how are the other inputs provided?

The algebraic definition of SPARQL mandates the semantics (if not the execution strategy) of bottom-up query evaluation: each pattern stands alone, combined with other patterns through algebraic operators such as Join. The input to the geospatial pattern can't come from sibling patterns. In order to ensure that a geospatial pattern executes with the inputs it needs, the pattern must be associated with a subordinate WHERE clause.

Bindings established by the WHERE clause are used as input to the geospatial body. All bindings established by the WHERE clause are shared with the geospatial pattern. This is the most expressive possible implementation strategy.

If at run-time a binding is not established by the WHERE clause — for example, if an OPTIONAL pattern fails to match — then that result row is skipped. For example, running the query

SELECT * {  
  GEO  
  SUBTYPE ...  
  RADIUS (POINT(?lon, ?lat), ?rad) {  
    # Some patterns, possibly mentioning ?lon, ?lat, ?rad.  
  }  
  WHERE {  
    ?p foo:placename "Home" ;  
       geo:lat ?lat ;  
       geo:lon ?lon .  
    OPTIONAL {  
      ?p geo:radius ?rad .  
    }  
  }  
} 

against the following data (in Turtle format):

@prefix ex: <http://example.com/foo#> .  
@prefix foo: <http://example.com/foo#> .  
@prefix geo: <http://example.com/geo#> .  
ex:a foo:placename "Home" ;  
     geo:lat 5.0 ;  
     geo:lon 5.0 .  
ex:b foo:placename "Home" ;  
     geo:lat 8.0 ;  
     geo:lon 8.0 ;  
     geo:radius 10.0 .  

will cause the GEO clause to execute only with the inputs (8.0, 8.0, 10.0): the bindings established by ex:a are incomplete, so that row is skipped. Omitting the OPTIONAL clause entirely means ?rad is never bound by the WHERE clause, which results in an error before query execution.
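The row-skipping rule can be modelled in a few lines of Python. This is only an illustration of the semantics described above, not AllegroGraph's implementation; the row representation (one dict per WHERE-clause solution) and the function name are assumptions made for the sketch.

```python
REQUIRED = ("lon", "lat", "rad")  # variables the RADIUS operator needs

# One dict per solution of the WHERE clause. ex:a has no geo:radius,
# so the OPTIONAL pattern leaves "rad" unbound for that row.
where_rows = [
    {"p": "ex:a", "lon": 5.0, "lat": 5.0},
    {"p": "ex:b", "lon": 8.0, "lat": 8.0, "rad": 10.0},
]


def geo_inputs(rows, required=REQUIRED):
    """Return input tuples for the rows that bind every required variable."""
    inputs = []
    for row in rows:
        if all(name in row for name in required):
            inputs.append(tuple(row[name] for name in required))
        # A row with a missing binding is skipped, not treated as an error.
    return inputs


print(geo_inputs(where_rows))  # [(8.0, 8.0, 10.0)]
```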

A GEO query with only constant inputs can leave the WHERE clause empty:

SELECT * {  
  GEO SUBTYPE ... RADIUS (POINT(-122.275, 37.8036), 10.0) {  
    # Some patterns.  
  }  
  WHERE {}  
} 

Subtype identifiers

Geo subtypes must be identified through the augmented UUID syntax understood by AllegroGraph's geo component. This is the identifier returned by add-geospatial-subtype-to-db:

sparql(6): (add-geospatial-subtype-to-db *lat-lon-5*  *db*)  
"21e6000c-0b43-11dd-a684-000bcdce3e4b-[-180.0,180.0][35.0,55.0]-5.0-miles" 

This syntax, while cumbersome, is at least unambiguous. Future extensions might improve usability in this area.

Grammar

WhereClause, Var, NumericLiteral, String, and Triples are all defined by the SPARQL grammar itself. Terminals are in uppercase. | denotes alternation, and unescaped ( and ) group alternatives; sequencing is implicit; ? means optional; \( and \) are literal parentheses.

GeoPattern ::=  
  GEO PartSelection? Subtype GeoOp GeoBody WhereClause  
 
PartSelection ::=  
  ( OBJECT | GRAPH )  
 
Subtype ::=  
  SUBTYPE String  
 
GeoOp ::=  
    RADIUS \( Point , Radius \)  
  | HAVERSINE \( Point , Radius ( MILES | KM ) \)  
  | BOUNDINGBOX \( Point , Point \)  
  | POLYGON \( RESOURCE Var \)  
 
Point ::=  
    Var  
  | POINT \( LatLon , LatLon \)  
 
 
LatLon ::=  
  ( Var | NumericLiteral )  
 
GeoBody ::=  
  { Triples }  
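For client code that assembles queries as strings, the GeoPattern production can be followed mechanically. The Python sketch below covers only the RADIUS and HAVERSINE operators; every function name is hypothetical, and the ex:location predicate in the usage lines is made up for illustration.

```python
def radius_op(point, radius):
    """RADIUS \\( Point , Radius \\)"""
    return f"RADIUS ({point}, {radius})"


def haversine_op(point, radius, unit):
    """HAVERSINE \\( Point , Radius ( MILES | KM ) \\)"""
    if unit not in ("MILES", "KM"):
        raise ValueError("unit must be MILES or KM")
    return f"HAVERSINE ({point}, {radius} {unit})"


def geo_pattern(subtype, geo_op, body, where="", part=None):
    """GEO PartSelection? Subtype GeoOp GeoBody WhereClause"""
    if part not in (None, "OBJECT", "GRAPH"):
        raise ValueError("part must be OBJECT, GRAPH, or None")
    spec = f"{part} " if part else ""  # OBJECT is assumed when omitted
    # Escape the identifier so it is a valid double-quoted SPARQL string.
    escaped = subtype.replace("\\", "\\\\").replace('"', '\\"')
    where_block = f"{{ {where} }}" if where else "{}"
    return (
        f'GEO {spec}SUBTYPE "{escaped}" {geo_op} '
        f"{{ {body} }} WHERE {where_block}"
    )


subtype_id = (
    "21e6000c-0b43-11dd-a684-000bcdce3e4b"
    "-[-180.0,180.0][35.0,55.0]-5.0-miles"
)
print(geo_pattern(subtype_id,
                  radius_op("POINT(-122.275, 37.8036)", 10.0),
                  "?place ex:location ?loc ."))
```

Because the WHERE clause is a required part of the production, the helper emits an empty WHERE {} when no patterns are supplied.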

Extension functions

It is very common to want to leverage geospatial information in filters and ordering expressions — e.g., to order results by distance. To this end we expose the following functions, suitable for use within a FILTER or ORDER BY expression.

All of these functions will raise a type error if their arguments are not points. If you need to synthesize a point from two numeric values (or the numeric bindings of variables), you can use the POINT builtin. For example:

geo:cartesian-distance(?point, POINT(10.0, ?y))  

The prefix geo is bound to <http://franz.com/ns/allegrograph/3.0/geospatial/fn/>.

geo:cartesian-distance-squared (p1, p2): Returns the square of the distance between the two points.

geo:cartesian-distance (p1, p2): Returns the distance between the two points.

geo:haversine-r (p1, p2): Returns the Haversine distance between two points, in radians.

geo:haversine-miles (p1, p2): Returns the Haversine distance between two points, in miles.

geo:haversine-km (p1, p2): Returns the Haversine distance between two points, in kilometers.

geo:longitude (p): Returns the longitude component of a point.

geo:latitude (p): Returns the latitude component of a point.

geo:cartesian-x (p): Returns the x component of a point.

geo:cartesian-y (p): Returns the y component of a point.
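To make the relationships among the distance functions concrete, here is a Python sketch of the standard formulas. It is not AllegroGraph's implementation: points are modelled as (x, y) or (longitude, latitude) tuples in degrees, and the Earth radius constants are assumed mean values that may differ from the ones the server uses.

```python
import math

EARTH_RADIUS_KM = 6371.0     # assumed mean Earth radius
EARTH_RADIUS_MILES = 3958.8  # assumed mean Earth radius


def cartesian_distance_squared(p1, p2):
    """Square of the Euclidean distance between two (x, y) points."""
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    return dx * dx + dy * dy


def cartesian_distance(p1, p2):
    """Euclidean distance between two (x, y) points."""
    return math.sqrt(cartesian_distance_squared(p1, p2))


def haversine_r(p1, p2):
    """Great-circle distance in radians between (longitude, latitude) points."""
    lon1, lat1, lon2, lat2 = map(math.radians, (*p1, *p2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # Clamp to guard against rounding pushing sqrt(a) slightly above 1.
    return 2 * math.asin(min(1.0, math.sqrt(a)))


def haversine_km(p1, p2):
    """Great-circle distance in kilometers: radians times Earth radius."""
    return haversine_r(p1, p2) * EARTH_RADIUS_KM


def haversine_miles(p1, p2):
    """Great-circle distance in miles: radians times Earth radius."""
    return haversine_r(p1, p2) * EARTH_RADIUS_MILES


print(cartesian_distance_squared((0, 0), (3, 4)))  # 25
print(cartesian_distance((0, 0), (3, 4)))          # 5.0
```

The squared distance orders points exactly as the plain distance does, so it is presumably the cheaper choice when only the ordering matters, as in ORDER BY.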

A note

This SPARQL extension is a work in progress; as the geospatial component of AllegroGraph evolves, and more user experience is gained, it is expected that this interface will change. Franz, Inc. welcomes your feedback.