AllegroGraph GeoSPARQL Tutorial
================================

AllegroGraph supports GeoSPARQL, the OGC standard for representing and querying
geospatial linked data in SPARQL. This tutorial covers the vocabulary, query
functions, and AllegroGraph extensions. You already have sparql_query and
add_triples tools — no special APIs needed.

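Standard prefixes used throughout:
  PREFIX geo: <http://www.opengis.net/ont/geosparql#>
  PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
  PREFIX geoext: <http://franz.com/ns/allegrograph/3.0/geosparql/ext#>
  PREFIX qudt: <http://qudt.org/vocab/unit/>
  PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
  PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
  PREFIX : <http://example.org/>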

COPY-PASTE SYNTAX REFERENCE (use these EXACTLY):

Serialization properties:
  geo:asWKT       — Well-Known Text serialization (with datatype geo:wktLiteral)
  geo:asGeoJSON   — GeoJSON serialization (with datatype geo:geoJSONLiteral)

Simple Features relation FUNCTIONS (use in FILTER, return boolean):
  geof:sfEquals(?geom1, ?geom2)
  geof:sfDisjoint(?geom1, ?geom2)
  geof:sfIntersects(?geom1, ?geom2)
  geof:sfTouches(?geom1, ?geom2)
  geof:sfCrosses(?geom1, ?geom2)
  geof:sfWithin(?geom1, ?geom2)
  geof:sfContains(?geom1, ?geom2)
  geof:sfOverlaps(?geom1, ?geom2)

Simple Features relation MAGIC PROPERTIES (use as triple patterns, REQUIRE geohash index):
  ?geom1 geo:sfEquals ?geom2
  ?geom1 geo:sfDisjoint ?geom2
  ?geom1 geo:sfIntersects ?geom2
  ?geom1 geo:sfTouches ?geom2
  ?geom1 geo:sfCrosses ?geom2
  ?geom1 geo:sfWithin ?geom2       — ?geom1 is inside ?geom2
  ?geom1 geo:sfContains ?geom2     — ?geom1 contains ?geom2
  ?geom1 geo:sfOverlaps ?geom2

  Magic properties can also take inline WKT/GeoJSON literals on the right side:
  ?geom geo:sfWithin "POLYGON((...))..."^^geo:wktLiteral

Egenhofer relations (SPARQL functions only):
  geof:ehEquals, geof:ehDisjoint, geof:ehMeet, geof:ehOverlap,
  geof:ehCovers, geof:ehCoveredBy, geof:ehInside, geof:ehContains

RCC8 relations (SPARQL functions only):
  geof:rcc8eq, geof:rcc8dc, geof:rcc8ec, geof:rcc8po,
  geof:rcc8tpp, geof:rcc8tppi, geof:rcc8ntpp, geof:rcc8ntppi

Non-topological functions:
  geof:distance(?geom1, ?geom2, <unit>)   — shortest distance between geometries
  geof:intersection(?geom1, ?geom2)       — geometry of intersection
  geof:union(?geom1, ?geom2)              — merged geometry
  geof:difference(?geom1, ?geom2)         — geometry in ?geom1 but not ?geom2
  geof:symDifference(?geom1, ?geom2)      — geometry in either but not both
  geof:convexHull(?geom)                  — convex hull of a geometry
  geof:envelope(?geom)                    — bounding box
  geof:centroid(?geom)                    — center point
  geof:isEmpty(?geom)                     — true if geometry is empty
  geof:getSRID(?geom)                     — spatial reference identifier

AllegroGraph extensions:
  geoext:haversineDistance(?geom1, ?geom2)          — great-circle distance (default: meters)
  geoext:haversineDistance(?geom1, ?geom2, <unit>)  — with unit
  ?geom geoext:nearby (<point> <radius> <unit>)     — proximity search (magic property, REQUIRES geohash index)
  ?success geoext:buildGeohashIndex ()               — incremental index build
  ?success geoext:rebuildGeohashIndex ()             — full index rebuild

Units of measurement (qudt: prefix = http://qudt.org/vocab/unit/):
  qudt:M       Meter (default)
  qudt:KiloM   Kilometer
  qudt:MI      International Mile
  qudt:MI_US   Mile US Statute
  qudt:YD      Yard
  qudt:FT      Foot
  qudt:FT_US   US Survey Foot


1. RDF DATA MODEL FOR GEOSPATIAL DATA
======================================

GeoSPARQL uses two core classes connected by geo:hasGeometry:
  - geo:Feature  — the real-world thing (a city, a building, a restaurant)
  - geo:Geometry — the geometric representation (point, polygon, etc.)

The geometry is serialized as a literal via geo:asWKT or geo:asGeoJSON.

Turtle example (WKT):

  @prefix geo: <http://www.opengis.net/ont/geosparql#> .
  @prefix : <http://example.org/> .

  :Franz a geo:Feature ;
         geo:hasGeometry :FranzGeom .
  :FranzGeom a geo:Geometry ;
             geo:asWKT "POINT(-122.12956 37.88963)"^^geo:wktLiteral .

Turtle example (GeoJSON):

  :FranzGeom geo:asGeoJSON '''{"type": "Point",
    "coordinates": [-122.12956, 37.88963]}'''^^geo:geoJSONLiteral .

IMPORTANT: Geometry literals MUST be typed with geo:wktLiteral or geo:geoJSONLiteral.

WKT geometry types: POINT, LINESTRING, POLYGON, MULTIPOINT, MULTILINESTRING,
MULTIPOLYGON, GEOMETRYCOLLECTION.

WKT format:
  "POINT(longitude latitude)"^^geo:wktLiteral
  "LINESTRING(lon1 lat1, lon2 lat2, ...)"^^geo:wktLiteral
  "POLYGON((lon1 lat1, lon2 lat2, ..., lon1 lat1))"^^geo:wktLiteral
  Note: Polygons must close (first point = last point).

Optional SRS prefix in WKT (CRS84 is default if omitted):
  "<http://www.opengis.net/def/crs/OGC/1.3/CRS84> POINT(-83.4 34.3)"^^geo:wktLiteral

GeoJSON format follows RFC 7946:
  '''{"type": "Point", "coordinates": [-122.12956, 37.88963]}'''^^geo:geoJSONLiteral
  '''{"type": "Polygon", "coordinates": [[[lon1,lat1],[lon2,lat2],...,[lon1,lat1]]]}'''^^geo:geoJSONLiteral

WKT and GeoJSON literals can be used interchangeably in queries.
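
As a sketch of that interchangeability, the query below tests stored WKT
geometries against an inline GeoJSON polygon. It assumes the geometries were
loaded with geo:asWKT as in the Turtle example above; the search box
coordinates are illustrative only.

```sparql
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>

# Stored geometries are WKT; the search area is supplied as GeoJSON.
SELECT ?feature WHERE {
  ?feature geo:hasGeometry ?geom .
  ?geom geo:asWKT ?wkt .
  FILTER (
    geof:sfWithin(
      ?wkt,
      '''{"type": "Polygon",
          "coordinates": [[[-122.15, 37.88], [-122.10, 37.88],
                           [-122.10, 37.90], [-122.15, 37.90],
                           [-122.15, 37.88]]]}'''^^geo:geoJSONLiteral
    )
  )
}
```

With the sample data above, :Franz lies inside this box and would be expected
among the results.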


2. SIMPLE FEATURES RELATION FUNCTIONS
======================================

These SPARQL extension functions compare two geometries and return a boolean.
They work WITHOUT a geohash index (but may be slower on large datasets).

All functions take two geometry serializations (WKT or GeoJSON literals).

General query pattern:

  PREFIX geo: <http://www.opengis.net/ont/geosparql#>
  PREFIX geof: <http://www.opengis.net/def/function/geosparql/>

  SELECT ?feature WHERE {
    ?feature geo:hasGeometry ?geom .
    ?geom geo:asWKT ?wkt .
    ?referenceFeature geo:hasGeometry ?refGeom .
    ?refGeom geo:asWKT ?refWkt .
    FILTER (geof:sfContains(?refWkt, ?wkt))
  }

2a. geof:sfContains — find features contained by another feature
----------------------------------------------------------------

  SELECT ?f WHERE {
    :A :hasExactGeometry ?aGeom .
    ?aGeom geo:asWKT ?aWKT .
    ?f :hasExactGeometry ?fGeom .
    ?fGeom geo:asWKT ?fWKT .
    FILTER (
      geof:sfContains(?aWKT, ?fWKT) &&
        !sameTerm(?aGeom, ?fGeom)
    )
  }

2b. geof:sfWithin — find features within a bounding box or polygon
------------------------------------------------------------------

You can use inline geometry literals (no need to store the polygon):

  SELECT ?f WHERE {
    ?f :hasPointGeometry ?fGeom .
    ?fGeom geo:asWKT ?fWKT .
    FILTER (
      geof:sfWithin(
        ?fWKT,
        '''<http://www.opengis.net/def/crs/OGC/1.3/CRS84>
        Polygon ((-83.4 34.0, -83.1 34.0,
                  -83.1 34.2, -83.4 34.2,
                  -83.4 34.0))'''^^geo:wktLiteral
      )
    )
  }

2c. geof:sfIntersects — find features that a line or polygon intersects
-----------------------------------------------------------------------

  SELECT ?label WHERE {
    BIND ('''{
      "type": "LineString",
      "coordinates": [[6.074, 53.510], [5.887, 50.750]]
    }'''^^geo:geoJSONLiteral AS ?line)
    ?province geo:hasGeometry ?geometry ;
              rdfs:label ?label .
    ?geometry geo:asGeoJSON ?geoJsonLiteral .
    BIND (geof:sfIntersects(?line, ?geoJsonLiteral) AS ?result)
    FILTER (?result = "true"^^xsd:boolean)
  }

Key relationships:
  sfWithin is the inverse of sfContains:
  - sfWithin: geometry A is completely inside geometry B
  - sfContains: geometry A completely encompasses geometry B
  sfIntersects: geometries share at least one point
  sfTouches: geometries share boundary points but not interior points
  sfOverlaps: geometries share some but not all interior points
  sfCrosses: geometries cross each other (e.g., line crosses polygon)
  sfDisjoint: geometries share no points at all
  sfEquals: geometries are topologically equal
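
A quick way to check these definitions is to evaluate the functions on small
inline literals, with no stored data involved. The geometries below are made
up for illustration: a point inside a unit square, and a second square that
shares one edge with the first.

```sparql
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>

SELECT ?within ?contains ?disjoint ?touches WHERE {
  # Illustrative geometries, not taken from any dataset.
  BIND ("POINT(0.5 0.5)"^^geo:wktLiteral AS ?point)
  BIND ("POLYGON((0 0, 1 0, 1 1, 0 1, 0 0))"^^geo:wktLiteral AS ?square)
  BIND ("POLYGON((1 0, 2 0, 2 1, 1 1, 1 0))"^^geo:wktLiteral AS ?neighbor)

  BIND (geof:sfWithin(?point, ?square) AS ?within)       # point inside square
  BIND (geof:sfContains(?square, ?point) AS ?contains)   # inverse of sfWithin
  BIND (geof:sfDisjoint(?point, ?square) AS ?disjoint)   # no shared points?
  BIND (geof:sfTouches(?square, ?neighbor) AS ?touches)  # shared edge only
}
```

Following the definitions above, ?within, ?contains, and ?touches should come
back true and ?disjoint false.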


3. NON-TOPOLOGICAL FUNCTIONS
=============================

3a. geof:distance — shortest distance between geometries
---------------------------------------------------------

Takes two geometry literals and a unit of measurement.

  PREFIX qudt: <http://qudt.org/vocab/unit/>

  SELECT ?f WHERE {
    :C geo:hasGeometry ?cGeom .
    ?cGeom geo:asWKT ?cWKT .
    ?f geo:hasGeometry ?fGeom .
    ?fGeom geo:asWKT ?fWKT .
    FILTER (?fGeom != ?cGeom)
  }
  ORDER BY ASC(geof:distance(?cWKT, ?fWKT, qudt:M))
  LIMIT 3

Example — distance between two municipalities in km:

  SELECT (geof:distance(?aGeo, ?bGeo, qudt:KiloM) AS ?distance) WHERE {
    ?a :municipalityName "Amsterdam" ;
       geo:hasGeometry ?aGeom .
    ?aGeom geo:asGeoJSON ?aGeo .
    ?b :municipalityName "Groningen" ;
       geo:hasGeometry ?bGeom .
    ?bGeom geo:asGeoJSON ?bGeo .
  }

3b. geof:union — merge two geometries
--------------------------------------

Returns a geometry representing all points in either geometry.

  SELECT ?f WHERE {
    ?f :hasExactGeometry ?fGeom .
    ?fGeom geo:asWKT ?fWKT .
    :A :hasExactGeometry ?aGeom .
    ?aGeom geo:asWKT ?aWKT .
    :D :hasExactGeometry ?dGeom .
    ?dGeom geo:asWKT ?dWKT .
    FILTER (
      geof:sfTouches(?fWKT, geof:union(?aWKT, ?dWKT))
    )
  }

3c. geof:intersection — common area of two geometries
------------------------------------------------------

Returns a geometry representing points shared by both geometries.

  SELECT (geof:intersection(?aGeo, ?bGeo) AS ?intersection) WHERE {
    ?a :municipalityName "Amsterdam" ;
       geo:hasGeometry ?aGeom .
    ?aGeom geo:asGeoJSON ?aGeo .
    ?b :provinceName "Noord-Holland" ;
       geo:hasGeometry ?bGeom .
    ?bGeom geo:asGeoJSON ?bGeo .
  }

3d. Other non-topological functions
------------------------------------

  geof:difference(?geom1, ?geom2)    — points in geom1 but not geom2
  geof:symDifference(?geom1, ?geom2) — points in either but not both
  geof:convexHull(?geom)             — smallest convex polygon enclosing geometry
  geof:envelope(?geom)               — axis-aligned bounding box
  geof:centroid(?geom)               — center point of geometry
  geof:isEmpty(?geom)                — true if geometry has no points
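
These functions can be tried on inline literals before applying them to stored
geometries. The sketch below uses two overlapping rectangles invented for
illustration; the exact serialization of the returned geometries depends on
the server.

```sparql
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>

SELECT ?difference ?symDifference ?hull ?envelope ?centroid ?isEmpty WHERE {
  # Two illustrative rectangles that overlap between x = 2 and x = 4.
  BIND ("POLYGON((0 0, 4 0, 4 3, 0 3, 0 0))"^^geo:wktLiteral AS ?a)
  BIND ("POLYGON((2 0, 6 0, 6 3, 2 3, 2 0))"^^geo:wktLiteral AS ?b)

  BIND (geof:difference(?a, ?b) AS ?difference)        # part of ?a outside ?b
  BIND (geof:symDifference(?a, ?b) AS ?symDifference)  # non-shared parts of both
  BIND (geof:convexHull(?a) AS ?hull)
  BIND (geof:envelope(?a) AS ?envelope)
  BIND (geof:centroid(?a) AS ?centroid)
  BIND (geof:isEmpty(?a) AS ?isEmpty)
}
```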


4. GEOHASH INDEXING
===================

Geohash is an algorithm that encodes geographic locations into short strings
(e.g., lat 37.889, lon -122.129 encodes as "9q9psceun6"). AllegroGraph uses
geohash indexing to accelerate spatial queries.

CRITICAL: The geohash index is REQUIRED for:
  - Simple Features magic properties (geo:sfWithin, geo:sfContains, etc.)
  - geoext:nearby proximity queries

Build the index BEFORE using these features.

4a. Build geohash index (incremental — only indexes new triples):

  PREFIX geoext: <http://franz.com/ns/allegrograph/3.0/geosparql/ext#>

  SELECT ?success WHERE {
    ?success geoext:buildGeohashIndex () .
  }

  Returns "true"^^xsd:boolean on success, "false" on failure.

4b. Rebuild geohash index (deletes existing index and rebuilds from scratch):

  SELECT ?success WHERE {
    ?success geoext:rebuildGeohashIndex () .
  }

  Use rebuildGeohashIndex when the data has changed significantly or the
  index seems corrupted. Use buildGeohashIndex for incremental updates.

4c. Verify the index — after indexing, each geometry is linked to its geohash:

  PREFIX geoext: <http://franz.com/ns/allegrograph/3.0/geosparql/ext#>

  SELECT ?place ?geometry ?geohash WHERE {
    ?place geo:hasGeometry ?geometry .
    ?geometry geoext:geohash ?geohash .
  }
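
The same link can be used the other way around to look for geometries that
have not been indexed yet. This is a sketch that assumes every indexed
geometry carries a geoext:geohash triple, as the verification query above
suggests.

```sparql
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geoext: <http://franz.com/ns/allegrograph/3.0/geosparql/ext#>

# List geometries that have no geohash attached (candidates for indexing).
SELECT ?place ?geometry WHERE {
  ?place geo:hasGeometry ?geometry .
  FILTER NOT EXISTS { ?geometry geoext:geohash ?geohash }
}
LIMIT 20
```

An empty result suggests the index covers all geometries; any rows returned
are a hint to run geoext:buildGeohashIndex again.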


5. SIMPLE FEATURES MAGIC PROPERTIES (INDEXED)
==============================================

After building the geohash index, you can use Simple Features relations as
magic properties (triple patterns) instead of FILTER functions. These use the
geo: namespace (not geof:) and leverage the geohash index for much better
performance on large datasets.

Magic property syntax vs. function syntax:

  FUNCTION (no index needed, slower on large data):
    FILTER (geof:sfContains(?aWKT, ?fWKT))

  MAGIC PROPERTY (requires geohash index, faster):
    ?aGeom geo:sfContains ?fGeom .

Note: Magic properties operate on GEOMETRY RESOURCES (the geo:Geometry nodes),
not on the WKT/GeoJSON literal values directly. The one exception is that
you can use an inline literal on the right side.

5a. geo:sfContains as magic property:

  SELECT ?f WHERE {
    :A :hasExactGeometry ?aGeom .
    ?f :hasExactGeometry ?fGeom .
    ?aGeom geo:sfContains ?fGeom .
    FILTER (!sameTerm(?aGeom, ?fGeom))
  }

5b. geo:sfWithin with inline WKT literal:

  SELECT ?f WHERE {
    ?f :hasPointGeometry ?fGeom .
    ?fGeom geo:sfWithin '''
      <http://www.opengis.net/def/crs/OGC/1.3/CRS84>
        Polygon ((-83.4 34.0, -83.1 34.0,
                  -83.1 34.2, -83.4 34.2,
                  -83.4 34.0))
    '''^^geo:wktLiteral
  }

5c. geo:sfWithin with GeoJSON literal (useful for polygon-based search):

  SELECT ?name WHERE {
    ?s :name ?name ;
       geo:hasGeometry ?geom .
    ?geom geo:sfWithin '''{"type": "Polygon",
      "coordinates": [[[-122.15, 37.88], [-122.10, 37.88],
                        [-122.10, 37.90], [-122.15, 37.90],
                        [-122.15, 37.88]]]}'''^^geo:geoJSONLiteral
  }

5d. Finding which province a municipality is in (geohash-accelerated):

  PREFIX franzOption_clauseReorderer: <franz:identity>

  SELECT ?provinceLabel ?muniLabel WHERE {
    ?province geo:hasGeometry ?provinceGeo ;
              a :Province ;
              rdfs:label ?provinceLabel .
    ?municipality geo:hasGeometry ?muniGeo ;
                  rdfs:label ?muniLabel .
    ?muniGeo geo:sfWithin ?provinceGeo .
  }

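  NOTE: The franzOption_clauseReorderer: <franz:identity> prefix turns off
  AllegroGraph's clause reordering, so the patterns are evaluated in the order
  written. This can help when joining spatial and non-spatial patterns.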


6. ALLEGROGRAPH EXTENSIONS
==========================

6a. geoext:haversineDistance — great-circle distance
----------------------------------------------------

Computes the haversine (great-circle) distance between two points on the
Earth's surface. More accurate than geof:distance for large distances.

Default unit is meters. Optional third argument specifies unit.

  PREFIX geoext: <http://franz.com/ns/allegrograph/3.0/geosparql/ext#>

  SELECT ?haversineDistance WHERE {
    BIND (geoext:haversineDistance(
      "POINT(-74.006 40.7128)"^^geo:wktLiteral,
      "POINT(-0.1278 51.5074)"^^geo:wktLiteral
    ) AS ?haversineDistance)
  }

With unit (e.g., feet):

  PREFIX qudt: <http://qudt.org/vocab/unit/>

  SELECT ?distanceFeet WHERE {
    BIND (geoext:haversineDistance(
      "POINT(-74.006 40.7128)"^^geo:wktLiteral,
      "POINT(-0.1278 51.5074)"^^geo:wktLiteral,
      qudt:FT
    ) AS ?distanceFeet)
  }
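
To see how much the two distance functions differ for a given pair of points,
both can be computed side by side. The sketch below reuses the two points from
the examples above and requests kilometers from each function.

```sparql
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
PREFIX geoext: <http://franz.com/ns/allegrograph/3.0/geosparql/ext#>
PREFIX qudt: <http://qudt.org/vocab/unit/>

SELECT ?standardKm ?haversineKm WHERE {
  BIND ("POINT(-74.006 40.7128)"^^geo:wktLiteral AS ?p1)
  BIND ("POINT(-0.1278 51.5074)"^^geo:wktLiteral AS ?p2)

  # Standard GeoSPARQL distance (unit argument is required).
  BIND (geof:distance(?p1, ?p2, qudt:KiloM) AS ?standardKm)
  # AllegroGraph great-circle distance, same unit for comparison.
  BIND (geoext:haversineDistance(?p1, ?p2, qudt:KiloM) AS ?haversineKm)
}
```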

6b. geoext:nearby — proximity search (magic property)
-----------------------------------------------------

Returns all geometries within a specified radius of a given point.
REQUIRES geohash index to be built first.

Syntax:
  ?geom geoext:nearby (<point> <radius> <unit>)

  - Argument 1: Point geometry as WKT literal (REQUIRED)
  - Argument 2: Radius value (REQUIRED)
  - Argument 3: Unit of measurement (OPTIONAL, default: qudt:M)

Example — find restaurants within 1 km of a location:

  PREFIX geoext: <http://franz.com/ns/allegrograph/3.0/geosparql/ext#>
  PREFIX qudt: <http://qudt.org/vocab/unit/>

  SELECT ?restaurant ?distance WHERE {
    :FranzGeom geo:asWKT ?franzWKT .

    ?restaurant a :Restaurant ;
           geo:hasGeometry ?restaurantGeom .
    ?restaurantGeom geo:asWKT ?restaurantLiteral .

    BIND (geof:distance(?franzWKT, ?restaurantLiteral, qudt:KiloM)
          AS ?distance)

    ?restaurantGeom geoext:nearby (?franzWKT 1 qudt:KiloM) .
  }
  ORDER BY ASC(?distance)

Example — nearby with stored point geometry:

  SELECT ?place ?distance WHERE {
    :MyLocation geo:hasGeometry ?myGeom .
    ?myGeom geo:asWKT ?myWKT .

    ?place geo:hasGeometry ?placeGeom .
    ?placeGeom geo:asWKT ?placeWKT .

    ?placeGeom geoext:nearby (?myWKT 5 qudt:KiloM) .

    BIND (geof:distance(?myWKT, ?placeWKT, qudt:KiloM) AS ?distance)
  }
  ORDER BY ASC(?distance)
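
When the geohash index has not been built, a similar radius search can be
sketched with geof:distance in a FILTER. This scans every candidate geometry,
so it is likely to be slower on large datasets. :MyLocation is the same
placeholder resource as above, and the : prefix is assumed to be
<http://example.org/>.

```sparql
PREFIX : <http://example.org/>
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
PREFIX qudt: <http://qudt.org/vocab/unit/>

SELECT ?place ?distance WHERE {
  :MyLocation geo:hasGeometry ?myGeom .
  ?myGeom geo:asWKT ?myWKT .

  ?place geo:hasGeometry ?placeGeom .
  ?placeGeom geo:asWKT ?placeWKT .
  FILTER (?placeGeom != ?myGeom)    # skip the center point itself

  BIND (geof:distance(?myWKT, ?placeWKT, qudt:KiloM) AS ?distance)
  FILTER (?distance <= 5)           # keep places within 5 km
}
ORDER BY ASC(?distance)
```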


7. ETL PATTERNS FOR LOADING GEOSPATIAL DATA
============================================

When loading geospatial data, follow this RDF pattern:

  1. Create a geo:Feature for each real-world entity
  2. Create a geo:Geometry for each feature
  3. Link them with geo:hasGeometry
  4. Attach the serialized geometry via geo:asWKT or geo:asGeoJSON

Turtle example (point locations with WKT):

  @prefix : <http://example.org/> .
  @prefix geo: <http://www.opengis.net/ont/geosparql#> .

  :Franz a geo:Feature ;
         geo:hasGeometry :FranzGeom .
  :FranzGeom a geo:Geometry ;
             geo:asWKT "POINT(-122.12956 37.88963)"^^geo:wktLiteral .

  :OyamaSushi a :Restaurant, geo:Feature ;
              geo:hasGeometry :OyamaSushiGeom .
  :OyamaSushiGeom a geo:Geometry ;
                  geo:asWKT "POINT(-122.12755 37.89076)"^^geo:wktLiteral .

Turtle example (polygon with GeoJSON):

  :NoordHolland a :Province, geo:Feature ;
                geo:hasGeometry :NoordHollandGeom .
  :NoordHollandGeom a geo:Geometry ;
                    geo:asGeoJSON '''{"type": "MultiPolygon",
                      "coordinates": [[[[4.73, 52.40], [4.73, 52.95],
                        [5.08, 52.95], [5.08, 52.40],
                        [4.73, 52.40]]]]}'''^^geo:geoJSONLiteral .

After loading, build the geohash index for optimal query performance:

  SELECT ?success WHERE {
    ?success geoext:buildGeohashIndex () .
  }
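
A post-load sanity check can catch geometry literals that were loaded without
the required datatype. The query below is a sketch using only standard SPARQL
(DATATYPE and UNION); it lists serializations whose datatype is not the
expected one.

```sparql
PREFIX geo: <http://www.opengis.net/ont/geosparql#>

SELECT ?geom ?literal (DATATYPE(?literal) AS ?datatype) WHERE {
  {
    # WKT serializations must be typed geo:wktLiteral.
    ?geom geo:asWKT ?literal .
    FILTER (DATATYPE(?literal) != geo:wktLiteral)
  } UNION {
    # GeoJSON serializations must be typed geo:geoJSONLiteral.
    ?geom geo:asGeoJSON ?literal .
    FILTER (DATATYPE(?literal) != geo:geoJSONLiteral)
  }
}
LIMIT 20
```

Any rows returned point at literals that need to be reloaded with the correct
datatype before spatial functions will work on them.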


8. GOTCHAS AND COMMON MISTAKES
===============================

1. GEOMETRY LITERALS MUST BE TYPED: Always use ^^geo:wktLiteral or
   ^^geo:geoJSONLiteral. Untyped string literals will NOT work.
   WRONG:  "POINT(10 20)"
   RIGHT:  "POINT(10 20)"^^geo:wktLiteral

2. WKT COORDINATE ORDER IS LONGITUDE LATITUDE (x y), not lat/lon.
   WRONG:  "POINT(37.889 -122.129)"^^geo:wktLiteral  (lat first)
   RIGHT:  "POINT(-122.129 37.889)"^^geo:wktLiteral   (lon first)
   Same for GeoJSON: "coordinates": [-122.129, 37.889]

3. POLYGONS MUST BE CLOSED: The first and last coordinate must be identical.
   WRONG:  "POLYGON((0 0, 1 0, 1 1, 0 1))"^^geo:wktLiteral
   RIGHT:  "POLYGON((0 0, 1 0, 1 1, 0 1, 0 0))"^^geo:wktLiteral

4. MAGIC PROPERTIES vs FUNCTIONS: The geo: namespace magic properties
   (geo:sfWithin, geo:sfContains, etc.) operate on geometry RESOURCES and
   REQUIRE the geohash index. The geof: namespace functions operate on
   geometry LITERALS (WKT/GeoJSON values) and work without an index.
   - Use geof: functions for small datasets or ad-hoc queries
   - Use geo: magic properties for large datasets (with geohash index)

5. BUILD GEOHASH INDEX BEFORE using magic properties or geoext:nearby.
   If the index has not been built, these queries return empty results
   rather than errors.

6. geoext:nearby ONLY WORKS WITH POINTS as the center. You cannot use
   a polygon as the center of a nearby search.

7. FILTER BOOLEAN COMPARISON: When using geof: functions in BIND + FILTER
   (instead of directly in FILTER), compare against typed boolean:
   WRONG:  FILTER (?result = "true")   (plain string, not a boolean)
   RIGHT:  FILTER (?result = "true"^^xsd:boolean)
   BETTER: FILTER (geof:sfWithin(?wkt1, ?wkt2))  — use directly in FILTER

8. DISTANCE REQUIRES THREE ARGUMENTS: geof:distance needs two geometries
   AND a unit. Omitting the unit may cause errors.
   RIGHT:  geof:distance(?a, ?b, qudt:M)

9. HAVERSINE vs DISTANCE: geof:distance computes Euclidean-style distance.
   geoext:haversineDistance computes great-circle distance on Earth's surface.
   For geographic coordinates, haversineDistance is more accurate for large
   distances. For nearby points, the difference is negligible.

10. ADDING DATA AFTER INDEX: If you add new geospatial triples after building
    the index, run geoext:buildGeohashIndex again (incremental — only indexes
    new triples). Or use geoext:rebuildGeohashIndex for a full rebuild.


9. RECOMMENDED WORKFLOW
========================

When a user asks for geospatial queries:

Step 1: Examine the data schema (get_shacl)
  - Look for geo:hasGeometry, geo:asWKT, geo:asGeoJSON properties
  - Identify which classes have geometry data
  - Check what serialization format is used (WKT vs GeoJSON)

Step 2: Explore the geometric data
  - Query for sample geometries to understand the data:

    SELECT ?feature ?type ?wkt WHERE {
      ?feature a geo:Feature ;
               a ?type ;
               geo:hasGeometry ?geom .
      ?geom geo:asWKT ?wkt .
    } LIMIT 10

  - Or with GeoJSON:

    SELECT ?feature ?geojson WHERE {
      ?feature geo:hasGeometry ?geom .
      ?geom geo:asGeoJSON ?geojson .
    } LIMIT 10

Step 3: Build geohash index (if using magic properties or nearby)

    SELECT ?success WHERE {
      ?success geoext:buildGeohashIndex () .
    }

Step 4: Choose the right query approach
  - Point-in-polygon / containment → geof:sfWithin or geof:sfContains
  - Intersection test → geof:sfIntersects
  - Proximity search → geoext:nearby (needs index) or geof:distance + ORDER BY
  - Distance calculation → geof:distance or geoext:haversineDistance
  - Geometry operations → geof:union, geof:intersection, geof:difference

Step 5: For large datasets, prefer magic properties over FILTER functions
  - Build geohash index first
  - Use geo:sfWithin instead of FILTER(geof:sfWithin(...))
  - Use geoext:nearby for radius searches

Step 6: Present results clearly
  - Include distance values with units for proximity queries
  - ORDER BY ASC(?distance) for nearest-first results
  - LIMIT for manageable result sets
