This document provides an overview of the SPARQL extension syntax for AllegroGraph's geospatial subsystem. By using this syntax, you can write queries that efficiently search geospatial information, yet seamlessly integrate with your other RDF data.
Invocation
To use geospatial syntax in your SPARQL queries, pass t for the :extendedp argument to run-sparql. From Java, call setExtended(true) on your SPARQLQuery instance.
Syntax
Because a geospatial query pattern contributes new result rows, it cannot reasonably be implemented as a mere FILTER. Instead, a syntax similar to SPARQL's existing GRAPH operator is used. Wherever you can write a graph pattern such as GRAPH ?x { ... }, you can write a geospatial query pattern.
A geospatial query pattern looks like this:
GEO geo-spec geo-op geo-args {
  triple-patterns # ← geospatial query pattern body
}
WHERE {
  patterns
}
A grammar is included at the end of this document.
geo-spec
The geospatial system needs two input settings that establish its interpretation: whether the geospatial data is stored in the object or the graph part of the triple, and which geospatial subtype is to be queried.
geo-spec, then, consists of at most one of the keywords OBJECT or GRAPH, followed by the keyword SUBTYPE and a subtype identifier (described below).
If neither OBJECT nor GRAPH is provided, OBJECT is assumed.
geo-op, geo-args
Next in the pattern comes the operator itself, and its arguments. In the current incarnation of AllegroGraph, the available operators are:
RADIUS, taking a point (cartesian; x and y) and a numeric radius.
BOUNDINGBOX, taking two points.
HAVERSINE, taking a point (spherical; longitude and latitude) and a radius given in MILES or KM.
POLYGON, taking either a resource in the store that describes a polygon, or a polygon specification.
POLYGON is not yet supported; the interface is described here for completeness.
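To make the operator descriptions concrete, here is a minimal Python sketch of the membership tests that RADIUS and BOUNDINGBOX describe for cartesian points. It is not AllegroGraph code: the function names in_radius and in_bounding_box are hypothetical, and treating points on the boundary as matches is an assumption that this document does not confirm.

```python
import math

def in_radius(point, center, radius):
    """Return True if point lies within radius of center (cartesian x, y)."""
    dx = point[0] - center[0]
    dy = point[1] - center[1]
    # Assumption: a point exactly on the circle counts as a match.
    return math.hypot(dx, dy) <= radius

def in_bounding_box(point, corner_a, corner_b):
    """Return True if point lies in the axis-aligned box spanned by two corners."""
    # Sorting lets the caller pass the two corners in either order.
    x_lo, x_hi = sorted((corner_a[0], corner_b[0]))
    y_lo, y_hi = sorted((corner_a[1], corner_b[1]))
    return x_lo <= point[0] <= x_hi and y_lo <= point[1] <= y_hi
```

In both helpers a point is an (x, y) tuple, horizontal coordinate first.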
Geospatial query pattern body
This is a sequence of triple patterns. The first pattern is privileged: this is the point of unification between results produced by the geo engine (as described by the geo op) and the rest of the query.
This first pattern provides the predicate and input/output for the subject, object, and graph of each considered triple.
Points
A point is an (x, y) or (longitude, latitude) pair. Because geospatial information can be stored as discrete values, or combined into a geospatial UPI, the POINT pseudo-operator is provided to yield a single value from two inputs (either SPARQL variables or literals in the query). The geospatial query operations are all defined in terms of points.
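Note that the order of the arguments is always horizontal then vertical, so a spherical point is written as POINT(longitude, latitude) even though the pair is conventionally spoken of as latitude and longitude. This choice is made for consistency's sake.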
The WHERE clause
An AllegroGraph geospatial query requires a certain set of inputs: for instance, get-triples-geospatial-radius needs a predicate, a point, and a radius. The predicate is established by the geospatial pattern body, but how are the other inputs provided?
The algebraic definition of SPARQL mandates the semantics (if not the execution strategy) of bottom-up query evaluation: each pattern stands alone, combined with other patterns through algebraic operators such as Join. The input to the geospatial pattern therefore cannot come from sibling patterns. To ensure that a geospatial pattern executes with the inputs it needs, the pattern must be associated with a subordinate WHERE clause.
Bindings established by the WHERE clause are used as input to the geospatial body. All bindings established by the WHERE clause are shared with the geospatial pattern. This is the most expressive possible implementation strategy.
If at run-time a required binding is not established by the WHERE clause (for example, if an OPTIONAL pattern fails to match), then that result row is skipped. For example, running the query
SELECT * {
  GEO
    SUBTYPE ...
    RADIUS (POINT(?lon, ?lat), ?rad) {
    # Some patterns, possibly mentioning ?lon, ?lat, ?rad.
  }
  WHERE {
    ?p foo:placename "Home" ;
       geo:lat ?lat ;
       geo:lon ?lon .
    OPTIONAL {
      ?p geo:radius ?rad .
    }
  }
}
against the following data (in Turtle format):
@prefix ex: <http://example.com/foo#> .
@prefix foo: <http://example.com/foo#> .
@prefix geo: <http://example.com/geo#> .
ex:a foo:placename "Home" ;
     geo:lat 5.0 ;
     geo:lon 5.0 .
ex:b foo:placename "Home" ;
     geo:lat 8.0 ;
     geo:lon 8.0 ;
     geo:radius 10.0 .
will only cause the GEO clause to execute with the inputs (8.0, 8.0, 10.0): the bindings established by ex:a are incomplete, so the row is skipped. Omitting the OPTIONAL clause entirely leaves ?rad with no possible binding, which results in an error before query execution.
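The row-skipping rule can be sketched in a few lines of Python. This is only an illustration of the semantics described above, not how AllegroGraph implements it; the function geo_inputs and the dictionary representation of result rows are hypothetical.

```python
REQUIRED = ("lon", "lat", "rad")  # inputs of RADIUS (POINT(?lon, ?lat), ?rad)

def geo_inputs(rows, required=REQUIRED):
    """Yield one input tuple for each row that binds every required variable."""
    for row in rows:
        # A row with a missing binding (e.g. a failed OPTIONAL) is skipped.
        if all(row.get(name) is not None for name in required):
            yield tuple(row[name] for name in required)

# Rows the WHERE clause would produce for the Turtle data above:
# ex:a has no geo:radius, so ?rad is unbound in its row.
rows = [
    {"p": "ex:a", "lat": 5.0, "lon": 5.0},
    {"p": "ex:b", "lat": 8.0, "lon": 8.0, "rad": 10.0},
]

inputs = list(geo_inputs(rows))  # only the ex:b row survives
```

Here inputs holds the single tuple (8.0, 8.0, 10.0), in the order longitude, latitude, radius.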
A GEO query with only constant inputs can leave the WHERE clause empty:
SELECT * {
  GEO SUBTYPE ... RADIUS (POINT(-122.275, 37.8036), 10.0) {
    # Some patterns.
  }
  WHERE {}
}
Subtype identifiers
Geo subtypes must be identified through the augmented UUID syntax understood by AllegroGraph's geo component. This is the identifier returned by add-geospatial-subtype-to-db:
sparql(6): (add-geospatial-subtype-to-db *lat-lon-5* *db*)
"21e6000c-0b43-11dd-a684-000bcdce3e4b-[-180.0,180.0][35.0,55.0]-5.0-miles"
This syntax, while cumbersome, is at least unambiguous. Future extensions might improve usability in this area.
Grammar
WhereClause, Var, NumericLiteral, and Triples are all defined by the SPARQL grammar itself. Terminals are in uppercase. (|) denotes alternation; sequencing is implicit; ? means optional; \( and \) are literal parentheses.
GeoPattern ::=
  GEO PartSelection? Subtype GeoOp GeoBody WhereClause
PartSelection ::=
  ( OBJECT | GRAPH )
Subtype ::=
  SUBTYPE String
GeoOp ::=
    RADIUS \( Point , Radius \)
  | HAVERSINE \( Point , Radius ( MILES | KM ) \)
  | BOUNDINGBOX \( Point , Point \)
  | POLYGON \( RESOURCE Var \)
Point ::=
    Var
  | POINT \( LatLon , LatLon \)
LatLon ::=
  ( Var | NumericLiteral )
GeoBody ::=
  { Triples }
Extension functions
It is very common to want to use geospatial information in filters and ordering expressions, for example to order results by distance. To this end we expose the following functions, suitable for use within a FILTER or ORDER BY expression.
All of these functions will raise a type error if their arguments are not points. If you need to synthesize a point from two numeric values (or the numeric bindings of variables), you can use the POINT builtin. For example:
geo:cartesian-distance(?point, POINT(10.0, ?y))
The prefix geo is bound to <http://franz.com/ns/allegrograph/3.0/geospatial/fn/>.
geo:cartesian-distance-squared (p1, p2): Returns the square of the distance between the two points.
geo:cartesian-distance (p1, p2): Returns the distance between the two points.
geo:haversine-r (p1, p2): Returns the Haversine distance between two points, in radians.
geo:haversine-miles (p1, p2): Returns the Haversine distance between two points, in miles.
geo:haversine-km (p1, p2): Returns the Haversine distance between two points, in kilometers.
geo:longitude (p): Returns the longitude component of a point.
geo:latitude (p): Returns the latitude component of a point.
geo:cartesian-x (p): Returns the x component of a point.
geo:cartesian-y (p): Returns the y component of a point.
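For readers who want to check results by hand, the following Python sketch shows the standard formulas that the distance functions are named after. It is not the AllegroGraph implementation: the Python function names are hypothetical, points are assumed to be plain (horizontal, vertical) tuples with spherical coordinates in degrees, and the Earth radius constant is an assumption, so values in miles or kilometers may differ slightly from what the server returns.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # assumed mean Earth radius; the server's value may differ
KM_PER_MILE = 1.609344       # definition of the international mile

def cartesian_distance_squared(p1, p2):
    """Squared Euclidean distance between two (x, y) points."""
    return (p1[0] - p2[0]) ** 2 + (p1[1] - p2[1]) ** 2

def cartesian_distance(p1, p2):
    """Euclidean distance between two (x, y) points."""
    return math.sqrt(cartesian_distance_squared(p1, p2))

def haversine_r(p1, p2):
    """Great-circle distance in radians between two (longitude, latitude) points."""
    lon1, lat1, lon2, lat2 = map(math.radians, (p1[0], p1[1], p2[0], p2[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # Clamp to 1.0 to guard against rounding error before asin.
    return 2 * math.asin(min(1.0, math.sqrt(a)))

def haversine_km(p1, p2):
    """Great-circle distance in kilometers under the assumed Earth radius."""
    return haversine_r(p1, p2) * EARTH_RADIUS_KM

def haversine_miles(p1, p2):
    """Great-circle distance in miles under the assumed Earth radius."""
    return haversine_km(p1, p2) / KM_PER_MILE
```

Comparing squared distances avoids the square root, which is why a squared variant is useful when only the ordering of results matters.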
A note
This SPARQL extension is a work in progress; as the geospatial component of AllegroGraph evolves, and more user experience is gained, it is expected that this interface will change. Franz, Inc. welcomes your feedback.