Introduction
AllegroGraph implements the SPARQL 1.1 standard, with certain conformance notes. Deviations from the standard are considered bugs.
SPARQL query execution in AllegroGraph is influenced by the configuration of the SPARQL dataset, by the query options in effect, and by the choice of query engine (see SPARQL Query Engines).
The query engine will generate query execution warnings for suspicious query patterns.
AllegroGraph also provides support for SPIN, and reification of triples using triple IDs.
AllegroGraph can also be queried using Prolog.
SPARQL implementation notes
The SPARQL CONSTRUCT query form combines the triples constructed from the bindings produced by the WHERE clause into a single RDF graph by set union (i.e., all duplicates are removed). For example, given the data set
:s :p 1.
:s :p 2.
:s :p 3.
the following query
CONSTRUCT { ?s ?p 4 }
WHERE { ?s ?p ?o }
produces the single triple
:s :p 4.
instead of three triples
:s :p 4
:s :p 4
:s :p 4
The single-triple result is one of the possible interpretations of the specification. Note that some other implementations return the three-triple result.
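The set-union behavior described above can be illustrated with a small Python sketch. This is not AllegroGraph code: triples are modeled as plain tuples, and the helper name `construct` is hypothetical.

```python
# Illustrative model of CONSTRUCT semantics; not AllegroGraph's implementation.
# Triples are plain (subject, predicate, object) tuples.
data = [(":s", ":p", 1), (":s", ":p", 2), (":s", ":p", 3)]

def construct(bindings):
    """Instantiate the template { ?s ?p 4 } once per solution and
    combine the results by set union, which drops duplicates."""
    return {(s, p, 4) for (s, p, _o) in bindings}

# WHERE { ?s ?p ?o } yields one solution per stored triple.
result = construct(data)
print(result)  # {(':s', ':p', 4)}
```

Using a list instead of a set in `construct` would model the alternative interpretation that returns three identical triples.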
Executing SPARQL queries
Queries can be run in various ways:
- WebView: New Query
- Java: AGRepositoryConnection.prepareTupleQuery().evaluate()
- Lisp: function run-sparql
- Python: RepositoryConnection.prepareTupleQuery().evaluate()
- HTTP: /catalogs/[catname]/repositories/[name]/sparql
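As a rough sketch of the HTTP route, the snippet below only builds a request URL and does not contact a server. The path layout is taken from the list above and the `query` parameter name from the SPARQL 1.1 Protocol. The host, port, catalog and repository names are placeholders, and the helper `sparql_endpoint_url` is hypothetical.

```python
from urllib.parse import quote, urlencode

def sparql_endpoint_url(base_url, catalog, repository, query):
    """Build a GET URL for a repository's SPARQL endpoint.
    The path follows /catalogs/[catname]/repositories/[name]/sparql."""
    path = "/catalogs/{}/repositories/{}/sparql".format(
        quote(catalog, safe=""), quote(repository, safe=""))
    return base_url.rstrip("/") + path + "?" + urlencode({"query": query})

# Placeholder server, catalog and repository names (assumptions).
url = sparql_endpoint_url("http://localhost:10035", "mycatalog", "myrepo",
                          "SELECT * { ?s ?p ?o } LIMIT 10")
print(url)
```

A real request would also need authentication and an Accept header for the desired result format, which are outside the scope of this sketch.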
Limiting results of a query
Query results can be limited with a LIMIT clause, or by control boxes in AGWebView. Individual users can also have the number of results limited (as well as export and duplicates export operations). See Limiting the results a user can see for details.
The size of the limit (for all users and repos) is set by the QueryResultsLimit configuration directive, whose default value is 1000.
For complex queries the limit is applied using the rule
<query-limit> + <query-offset> <= <query-results-limit>
That is, if a query asks for results beyond the results limit through its limit and offset modifiers, new modifier values will be chosen to satisfy the above condition.
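The rule above does not say how the adjusted values are picked. The following sketch shows one way to satisfy it, under the assumption that the offset is kept where possible and the limit is shrunk. AllegroGraph's actual choice may differ, and the function name `fit_to_results_limit` is hypothetical.

```python
def fit_to_results_limit(limit, offset, results_limit=1000):
    """Return (limit, offset) such that limit + offset <= results_limit.

    Assumption: the offset is preserved where possible and the limit is
    reduced; this is only one way to satisfy the rule."""
    offset = min(offset, results_limit)
    limit = min(limit, results_limit - offset)
    return limit, offset

print(fit_to_results_limit(500, 200))    # (500, 200): already within the limit
print(fit_to_results_limit(5000, 200))   # (800, 200): limit reduced
print(fit_to_results_limit(100, 2000))   # (0, 1000): nothing left to return
```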
Query results caching
AllegroGraph supports caching of SPARQL query results. When results caching is enabled, the same query (with perhaps different offsets and limits) can be performed multiple times without having to do the actual query processing.
This behavior has to be enabled explicitly on a per-query basis using the allowCachingResults query option.
The results cache size can be configured using the configuration directives QueryResultsCacheSize and QueryResultsCacheStorageSize.
See the allowCachingResults query option for more details.
Comparing SPARQL and Prolog
To give a comparison of Prolog and SPARQL when it comes to querying an AllegroGraph repository:
Prolog is a general purpose logic programming language, while SPARQL is a query language specific to triple stores.
Prolog makes it easy to write rules and express concepts unrelated to triple data, whereas SPARQL requires either using SPIN or magic properties.
SPARQL is linked to RDF when it comes to data types, operators and functions, so that SPARQL can do e.g. string and numerical operations. Within Prolog you would have to use the corresponding Lisp function.
The Prolog and the SPARQL query engines are separate components with different performance characteristics:
SPARQL has two query engines (see SPARQL Query Engines), SBQE and MJQE. SBQE query execution can be depth-first or breadth-first, depending on the query and the Chunk at a Time query option. The MJQE engine uses a different method altogether.
Prolog queries are executed depth-first.
Both SPARQL engines have extensive knowledge of SPARQL query patterns and their optimal evaluation strategy.
Please see the Prolog tutorial and Prolog select documentation.
First class triples
AllegroGraph currently permits you to make assertions about triples via the tripleId magic property or Lisp function triple-id.
A tripleId value is a special numerical value that can be projected from a query, but which does not interact well with other SPARQL features like FILTER or ORDER BY. Please contact us if you have questions or requests in this area.
AllegroGraph provides complete (according to the current version of the draft spec) support for RDF-star and SPARQL-star.
SPARQL dataset
The SPARQL dataset refers to the set of IRIs that make up the default graph and named graphs in which triples are looked up. In principle the FROM and FROM NAMED clauses in a query specify the dataset. AllegroGraph offers options to customize and override this.
Dataset loading
In the Lisp client the handling of FROM and FROM NAMED clauses can be overridden by specifying a :load-function argument to run-sparql, or by setting *dataset-load-function*. This enables dynamically loading triples into the repository during query execution.
Default dataset handling
For queries that do not provide a FROM or FROM NAMED clause, the defaultDatasetBehavior query option controls which triples are in the default graph and named graphs of the dataset. In the Lisp client this can also be configured via the run-sparql :default-dataset-behavior argument, defaulting to *default-dataset-behavior*.
Query execution options
AllegroGraph provides control over SPARQL query execution via various options. These options can be specified per query by including a special PREFIX line of the form:
PREFIX franzOption_optionName: <franz:optionValue>
It is also possible to set option values for all queries in the Server Configuration.
You can specify query options on the Query page in AGWebView. See the WebView document, in particular the New Query Page section.
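When queries are assembled programmatically, the PREFIX form shown above can be generated by a small helper. The sketch below is plain string handling and not part of any AllegroGraph client; the function name `with_query_options` is hypothetical.

```python
def with_query_options(query, **options):
    """Prepend one franzOption PREFIX line per option to a SPARQL query."""
    prefixes = [
        "PREFIX franzOption_{}: <franz:{}>".format(name, value)
        for name, value in options.items()
    ]
    return "\n".join(prefixes + [query])

q = with_query_options("SELECT * { ?s ?p ?o }",
                       logQuery="onFailure", queryTimeout=30)
print(q)
```

The resulting text can be submitted through any of the query interfaces, since the options travel inside the query itself.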
The following query execution options are currently available:
allowCachingResults
If yes, query results may be cached on disk.
The query will be executed ignoring the limit and offset modifiers and its results will be stored on disk. All subsequent calls of the same query, potentially with different limit and offset values, will read results from the cache, apply the new limit and offset, and return them.
Here is an example of using query results caching for paging SPARQL query results:
Let us assume a complex SPARQL query that takes a long time to run and returns a lot of results
SELECT <variables> { <patterns> }
The results of the query can be split into pages of size 1000 using the combination of limit and offset modifiers:
SELECT <variables> { <patterns> } LIMIT 1000 OFFSET 0
SELECT <variables> { <patterns> } LIMIT 1000 OFFSET 1000
SELECT <variables> { <patterns> } LIMIT 1000 OFFSET <N * 1000>
but this is very inefficient because the execution of the query for a given page takes about the same amount of time as the whole query. With query results caching, the first page query will take the usual amount of time
PREFIX franzOption_allowCachingResults: <franz:yes>
SELECT <variables> { <patterns> } LIMIT 1000 OFFSET 0
and will cache all the results of the query, so all subsequent calls to
PREFIX franzOption_allowCachingResults: <franz:yes>
SELECT <variables> { <patterns> } LIMIT <N> OFFSET <M>
for any values of limit and offset will read the results from cache and will be significantly faster.
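The paging pattern above can be generated mechanically. The following sketch builds the text of each page query with the caching option enabled. It only produces query strings, and the names `CACHE_PREFIX` and `page_queries` are hypothetical.

```python
CACHE_PREFIX = "PREFIX franzOption_allowCachingResults: <franz:yes>"

def page_queries(select_body, page_size, pages):
    """Yield one query per page, each with results caching enabled.
    select_body is the SELECT query without limit/offset modifiers."""
    for n in range(pages):
        yield "{}\n{} LIMIT {} OFFSET {}".format(
            CACHE_PREFIX, select_body, page_size, n * page_size)

queries = list(page_queries("SELECT ?s { ?s ?p ?o }", 1000, 3))
print(queries[1])
```

Only the first of these queries pays the full execution cost when caching is possible; the later ones differ solely in their limit and offset values.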
Please note that query results caching is not always possible. If it is not possible, the query will be executed as usual, but a query warning will be returned, explaining why caching is impossible.
The query results cache is currently stored on disk, and both the maximum number of cache entries and the maximum disk space allowed can be configured using the respective configuration options.
The possible values are:
- yes: turn the option on
- no: turn the option off
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_allowCachingResults: <franz:no>
The default value is: no
cancelQueryOnWarnings
If yes, then warnings found during query parsing, planning and execution will cause a query to fail immediately rather than continuing.
Warnings include things like unknown variables in an ORDER BY clause or FILTER expression, constants in the query that cannot be in the store, and so on.
The possible values are:
- yes: turn the option on
- no: turn the option off
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_cancelQueryOnWarnings: <franz:no>
The default value is: no
chunkProcessingAllowed
Controls whether to use Chunk at a Time (CaaT) processing.
It can be:
- possibly: use CaaT for unordered queries with small limits and use the single-set approach otherwise. Note that this works best when solutions are found in the first several chunks processed, which means that the query can finish quickly. If a large portion of the search space must be scanned, then the single-set approach can be faster.
- yes: always use CaaT when possible (some query clauses like EXISTS filters and SPIN magic properties do not yet support CaaT).
- no: always use the single-set approach and never use CaaT.
The default value is possibly, which means that AllegroGraph is optimizing for speed rather than space. The no option is focused on speed at the possible cost of higher memory use, whereas the yes option is more constrained in memory use at the cost of slower queries.
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_chunkProcessingAllowed: <franz:possibly>
chunkProcessingMemory
Specifies the maximum amount of memory used by a single chunk.
Controls the size (in bytes) of the chunks used by the CaaT executor. This option takes precedence over the deprecated chunkProcessingSize option.
The minimum allowed value is 200M.
See the chunkProcessingAllowed option for additional query control.
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_chunkProcessingMemory: <franz:4294967296>
The default value is: 4,294,967,296
chunkProcessingSize
(Deprecated) Specifies the chunk processing size in rows.
Deprecated in favor of the chunkProcessingMemory option.
Controls the size (in rows of answers) of the chunks used by the CaaT executor. The higher the number, the larger the chunks processed will be, which is both more efficient and more memory intensive. A typical value is 400000 or 1000000.
See the chunkProcessingAllowed option for additional control.
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_chunkProcessingSize: <franz:400000>
The default value is: 400,000
clauseReorderer
The strategy used to reorder triple patterns in a query.
This option controls how the triple patterns in a single Basic Graph Pattern (BGP) are reordered.
The available strategies depend on the query engine being used but always include identity, which tells the query planner not to reorder the triple patterns of the BGPs. Another common choice is statistical, which uses the statistics of the triple-store to try to reorder clauses most efficiently.
Note that other algebraic manipulations during query planning may cause BGPs in your query to be merged, and that reordering does not extend to larger query structures (like UNION or OPTIONAL).
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_clauseReorderer: <franz:statistical>
The clause reorderer defaults to statistical.
defaultAttributes
Specify the default attributes to assign to any triple created by a SPARQL update command.
The attributes must be specified in URL encoded JSON format. The example below uses the URL encoded form of {"rank": "High" }.
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_defaultAttributes: <franz:%7B%22rank%22%3A%20%22High%22%20%7D>
The default is to assign no attributes.
defaultDatasetBehavior
In a SPARQL query the FROM clause specifies the default graph of the dataset, and the FROM NAMED clause specifies the named graphs of the dataset (see Specifying RDF Datasets in the SPARQL 1.1 standard).
If a SPARQL query does not specify FROM or FROM NAMED then it is up to the implementation to choose a behaviour. AllegroGraph offers three possibilities. The table below indicates which triples are present in which part of the dataset for each possible behaviour:
                                        +---------+---------+---------+
default dataset behaviour:              |   all   |   rdf   | default |
                                        +---------+---------+---------+
dataset's default graph / named graphs: | DG   NG | DG   NG | DG   NG |
                                        |         |         |         |
triple (s,p,o) in default graph:        | x       | x       | x       |
triple (s,p,o,g) in named graph:        | x    x  |      x  |         |   "x" means present
                                        +---------+---------+---------+
The possible values are:
- all: All triples are present in the dataset's default graph. Triples in a named graph are present in the dataset in that named graph. Triples in a named graph thus occur twice in the dataset: once in the default graph, and once in the named graph.
- rdf: Triples in the default graph are present in the dataset's default graph. Triples in a named graph are present in the dataset in that named graph.
- default: Triples in the default graph are present in the dataset's default graph. The dataset has no named graphs.
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_defaultDatasetBehavior: <franz:all>
The default value is: all
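The three behaviours can be modeled in a few lines of Python. This is an illustrative sketch of the table and list above, not AllegroGraph code: quads are plain tuples, a graph of `None` stands for the store's default graph, and the function name `build_dataset` is hypothetical.

```python
def build_dataset(quads, behavior="all"):
    """Model the dataset of a query without FROM / FROM NAMED.

    quads: iterable of (s, p, o, g); g is None for the default graph.
    Returns (default_graph, named_graphs): a set of triples and a dict
    mapping each graph name to its set of triples."""
    if behavior not in ("all", "rdf", "default"):
        raise ValueError("unknown behavior: " + behavior)
    default_graph, named_graphs = set(), {}
    for s, p, o, g in quads:
        # Default-graph triples are always visible; with "all",
        # named-graph triples are visible in the default graph too.
        if g is None or behavior == "all":
            default_graph.add((s, p, o))
        # Named graphs exist in the dataset only for "all" and "rdf".
        if g is not None and behavior in ("all", "rdf"):
            named_graphs.setdefault(g, set()).add((s, p, o))
    return default_graph, named_graphs

quads = [("s1", "p", "o1", None), ("s2", "p", "o2", "g1")]
for behavior in ("all", "rdf", "default"):
    print(behavior, build_dataset(quads, behavior))
```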
diskChunkRowCount
Specifies the number of solutions to keep in memory before writing temporary files.
This should be a number like 500000 or 100m. The larger the value, the more memory AllegroGraph will use during query processing. Smaller values can be more memory efficient but can also perform more slowly because there will be more I/O activity.
Note that this setting controls the memory used to hold completed solutions, not the memory used to hold intermediate solutions. See the chunkProcessingMemory option for more details.
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_diskChunkRowCount: <franz:500000>
The default value is: 500,000
engine
The query engine to use.
The possible values are:
- mjqe: New merge-join based query engine for supported queries (e.g. select); falls back to sbqe for other queries.
- sbqe: Old set-based query engine (with or without Chunk-at-a-Time) for all queries.
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_engine: <franz:sbqe>
The default value is: sbqe
logBacktraceOnQueryFailure
If yes, the query engine prints backtraces of any errors that happen during query execution.
Only applicable when queries are executed through the HTTP endpoint.
The possible values are:
- yes: turn the option on
- no: turn the option off
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_logBacktraceOnQueryFailure: <franz:no>
The default value is: no
logLineLength
Controls the length of query log lines.
The logLineLength query option limits the maximum length of each line of the query log. This can make the log easier to read at the cost of removing some information. Use zero to print the entirety of every log message.
See the logQuery option for more details.
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_logLineLength: <franz:512>
The default value is: 512
logQuery
Controls whether or not query execution details are logged.
logQuery can be no, yes, or onFailure.
The length of the log lines can be limited by using the logLineLength query option.
If logging is on, the query engine prints additional information to the AllegroGraph log file as it plans and executes a query. If logging is onFailure, then query log information is gathered but not emitted unless there is a query failure.
Logging on failure has a small cost, especially when the amount of data logged is high (e.g., when chunkProcessingAllowed is turned on). We recommend setting the value to onFailure during development and then turning it to no for production.
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_logQuery: <franz:no>
The default value is: no
maximumSolutionsSize
Specifies an upper limit on the number of solutions that are allowed during query processing before a warning is logged.
Queries run best when the solution space is kept small. This warning is an indication that a query is generating many intermediate results. This is a normal part of query processing but can indicate that a query should be optimized.
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_maximumSolutionsSize: <franz:100k>
The default is to warn when the intermediate solution space is larger than 100,000,000 solutions
maximumValuesCountForService
This option limits the number of VALUES that AllegroGraph will send to a SPARQL endpoint when executing a SERVICE clause. Sending partial results to the endpoint can help it answer the query more quickly, but if the number of partial results is very large, the cost of data transfer can offset the help of supplying the data.
If the number of VALUES exceeds the limit, the query will be sent to the endpoint with no VALUES supplied.
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_maximumValuesCountForService: <franz:1048576>
The default is to have no limit.
memoryExhaustionWarningPercentage
Specify how much system memory must be free for a query to continue.
If the query process is using more than this setting's percentage of total physical memory, then the query will be canceled. The default value is 90%.
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_memoryExhaustionWarningPercentage: <franz:90.0>
The default value is: 90.0
memoryLimit
Specifies the memory limit per query.
If a query tries to use more than this, it will be canceled.
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_memoryLimit: <franz:8G>
The default value will be 85% of the physical memory on the server
openaiApiKey
The OpenAI API key needed for the magic predicate and SPARQL functions in the LLM namespace.
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_openaiApiKey: <franz:sk-U01ABc2defGHIJKlmnOpQ3RstvVWxyZABcD4eFG5jiJKlmno>
You must provide an OpenAI API key to use LLM Magic Predicates.
presentationTimeZone
The timezone in which xsd:dateTimes and xsd:times are serialized.
For example, if presentationTimeZone is "-02:00", then "2013-10-01T15:21:23+03:00" is serialized as "2013-10-01T10:21:23-02:00". Zoneless xsd:dateTimes and xsd:times are always presented without a timezone. This option has no effect on what is stored in the database. The allowed values are strings representing the timezone. The format of these strings is the same as in xsd:dateTimes. The special value "none" means that no conversion will take place.
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_presentationTimeZone: <franz:-5:00>
The default is set to none.
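The conversion in the presentationTimeZone example can be reproduced with Python's standard library. This sketch only illustrates the arithmetic and is not how AllegroGraph serializes values. The helper name `present` and its `offset_minutes` parameter are hypothetical.

```python
from datetime import datetime, timedelta, timezone

def present(value, offset_minutes):
    """Re-express an xsd:dateTime lexical value in a fixed UTC offset.
    Zoneless values are returned unchanged."""
    dt = datetime.fromisoformat(value)
    if dt.tzinfo is None:
        return value
    target = timezone(timedelta(minutes=offset_minutes))
    return dt.astimezone(target).isoformat()

# "-02:00" corresponds to an offset of -120 minutes.
print(present("2013-10-01T15:21:23+03:00", -120))  # 2013-10-01T10:21:23-02:00
print(present("2013-10-01T15:21:23", -120))        # 2013-10-01T15:21:23
```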
profileOutlineDepth
Controls the depth of the profile outline.
Only applicable when the outline format value is used for the profileOutputFormat option; ignored otherwise.
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_profileOutlineDepth: <franz:100>
The default value is: 100
profileOutputFormat
Controls the profile output format.
profileOutputFormat can be call-graph, flat or outline, which correspond to prof:show-call-graph, prof:show-flat-profile and prof:show-outline-profile respectively.
See https://franz.com/support/documentation/current/doc/runtime-analyzer.htm for more information on the call-graph, flat and outline options.
This option is only applicable for space and time profiling, and is ignored if the value of profileQuery is perf.
The possible values are:
- call-graph: Use call graph profile output format
- flat: Use flat profile output format
- outline: Use outline profile output format (see profileOutlineDepth for depth configuration)
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_profileOutputFormat: <franz:flat>
The default value is: flat
profilePerfEvents
List of perf events to use when profiling a query with the perf tool.
The value should be a list of comma-separated words denoting perf events (see perf list for a list of pre-defined events). If the list is empty, perf will be run without the --events argument, falling back to the default set of events.
See https://perf.wiki.kernel.org/index.php/Main_Page for more information on perf.
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_profilePerfEvents: <franz:cache-misses,cache-references>
The default value is an empty list (i.e. no --events argument).
profileQuery
Controls whether or not the profile information is collected during query execution.
profileQuery can be no, space, time or perf.
If profiling is enabled, the query profile will be written to the same place where the query log is written. If the value is perf but the tool is not available, a warning will be written to the query log and no information will be collected.
The possible values are:
- no: Do not collect profile information
- space: Collect memory usage profile information
- time: Collect CPU usage profile information
- perf: Collect profile information using the perf tool
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_profileQuery: <franz:no>
The default value is: no
queryTimeout
Specifies a query timeout value in seconds.
Note that the timeout is not an interrupt; AllegroGraph checks for query timeout relatively infrequently, so a query can run for many seconds longer than the specified timeout. This is especially true for operations involving reasoning or non-triple-pattern based queries like free-text indexing or SNA path planning operators.
Setting the timeout to zero is the same as having no timeout.
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_queryTimeout: <franz:30>
The default is to have no query timeout. I.e., queries will run until complete.
reorderDuringExecution
Controls whether or not AllegroGraph interleaves query execution and triple-pattern selection.
If no, then AllegroGraph will perform all reordering during query planning. If yes, then AllegroGraph will defer reordering until query execution time. In many cases, the additional information available at execution time can enhance query performance.
Note that interleaving reordering is not always a win, because performing all ordering at query planning time allows the query engine to introduce joins, which can sometimes enhance query performance.
This option may not be supported on all query engines. If yes is specified for an engine where it is not supported, the option is silently ignored.
See the clauseReorderer option for additional information.
The possible values are:
- yes: turn the option on
- no: turn the option off
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_reorderDuringExecution: <franz:no>
The default value is: no
serviceTimeout
The number of seconds to wait before a remote query times out.
This will also have an effect on SPARQL Federated queries (i.e., those using the SERVICE clause).
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_serviceTimeout: <franz:120>
The default value is: 120
slowQueryLogThreshold
Specifies the query duration threshold, in milliseconds, that triggers slow query logging.
If a query's runtime exceeds the threshold, it will be logged either to the file specified by the slowQueryLogFile configuration setting or to agraph.log.
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_slowQueryLogThreshold: <franz:1000>
The default is to not log slow queries.
solrQueryLimit
Specifies the maximum number of results to return from a given SOLR query.
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_solrQueryLimit: <franz:100>
The default value is: 100
temporaryFilesystemSpaceLimit
Specifies the maximum amount of temporary file space that may be used by a query.
If a query tries to use more file space than this, it will be canceled.
Queries write intermediate results to the filesystem when they will not fit in memory. With a huge query it is possible for such temporary files to fill the filesystem. In order to prevent this, the temporaryFilesystemSpaceLimit query option may be set.
The minimum allowable value for this setting is 2 gigabytes.
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_temporaryFilesystemSpaceLimit: <franz:1017198592>
The default value is to use the minimum of 8-gigabytes and one quarter of the available filesystem space at the time the query begins.
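The default described above can be approximated as follows. This is a sketch of the stated rule only: it assumes a gigabyte means 2**30 bytes and measures the filesystem of the current directory, neither of which is confirmed by the text. The function name `default_temp_space_limit` is hypothetical.

```python
import shutil

GIB = 1024 ** 3  # assumption: "gigabyte" is read as 2**30 bytes

def default_temp_space_limit(free_bytes):
    """Minimum of 8 gigabytes and one quarter of the free space."""
    return min(8 * GIB, free_bytes // 4)

# Free space on the filesystem holding the current directory (an assumption;
# the server would use the filesystem of its temporary directory).
free = shutil.disk_usage(".").free
print(default_temp_space_limit(free))
```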
trustEncodedDatatypesForRangeQueries
If yes, then range queries will not scan typed literal triples.
This means that only encoded triples will be considered. The only reason to set this option to no is if your triple-store contains typed literals that are not encoded (i.e., that are in the string-table), which could happen if you disabled AllegroGraph's datatype mapping.
The possible values are:
- yes: turn the option on
- no: turn the option off
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_trustEncodedDatatypesForRangeQueries: <franz:yes>
The default value is: yes
trustPredicateTypeMappingsForRangeQueries
If yes, then predicate type mappings will be used for range queries.
This means that any triples whose encoded data-type does not match their predicate mapping will be ignored. This could happen only if a predicate mapping was added or changed after triples had been added.
The possible values are:
- yes: turn the option on
- no: turn the option off
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_trustPredicateTypeMappingsForRangeQueries: <franz:yes>
The default value is: yes
unifyOwlSameAsDuplicates
If yes, when reasoning is used, all occurrences of resources in results that are declared to be owl:sameAs other resources will be replaced with a canonical member of their owl:sameAs equivalence class and deduplicated. For example, given the following data
@prefix owl: <http://www.w3.org/2002/07/owl#>.
@prefix : <http://example.org/>.
:Batman :livesIn :Gotham;
owl:sameAs :Bruce_Wayne.
the following query would normally return two different results
PREFIX : <http://example.org/>
SELECT * { ?s :livesIn :Gotham }
----------------
| s |
================
| :Bruce_Wayne |
| :Batman |
----------------
but with this query option, it will return a single one
PREFIX franzOption_unifyOwlSameAsDuplicates: <franz:yes>
PREFIX : <http://example.org/>
SELECT * { ?s :livesIn :Gotham }
-----------
| s |
===========
| :Batman |
-----------
Please note that the canonical instance is chosen using the shortest string representation criterion. In the example above, :Batman is the shortest-string instance of the given owl:sameAs class, so it is chosen as the canonical instance. If another instance with a shorter string representation is added to the triple store, for example:
:Bruce owl:sameAs :Bruce_Wayne.
the result will change because :Bruce is shorter than :Batman:
PREFIX franzOption_unifyOwlSameAsDuplicates: <franz:yes>
PREFIX : <http://example.org/>
SELECT * { ?s :livesIn :Gotham }
----------
| s |
==========
| :Bruce |
----------
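The selection rule described above amounts to taking a minimum by string length. The sketch below illustrates it on the example resources. How ties between equally short names are resolved is not stated in the text, so the alphabetical tie-break here is an assumption. The function name `canonical_member` is hypothetical.

```python
def canonical_member(equivalence_class):
    """Pick the member with the shortest string representation.
    Ties are broken alphabetically; that tie-break is an assumption."""
    return min(equivalence_class, key=lambda r: (len(r), r))

same_as = {":Batman", ":Bruce_Wayne"}
print(canonical_member(same_as))               # :Batman
print(canonical_member(same_as | {":Bruce"}))  # :Bruce
```

Because the canonical member depends on the data, results produced with this option can change when new owl:sameAs statements are added.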
The possible values are:
- yes: turn the option on
- no: turn the option off
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_unifyOwlSameAsDuplicates: <franz:no>
The default value is: no
usePredicateConstrainedUpiTypeInformation
Use subject and object UPI type-codes to improve constraint inference.
If yes, then the query engine will gather information about the subjects and objects associated with particular predicates. This can be used in constraint analysis and query transformations. As an example, suppose we have a query like:
?one ex:date ?date1 .
?two ex:date ?date2 .
filter( ?date1 > ?date2 )
If there is no predicate type-mapping, then the query engine cannot make any assumptions about the range comparison. If there is a predicate type-mapping and trustPredicateTypeMappingsForRangeQueries is yes, then the engine can know that the filter can be treated as a date comparison. If usePredicateConstrainedUpiTypeInformation is yes, then the query engine will check the triple-store to determine which UPI type-codes the subjects and objects associated with ex:date can take on. If the objects of ex:date only have, e.g., UPI type-code +rdf-date+, then the filter will be handled more efficiently.
The type-code information is cached, but if the store is changing rapidly, then the cache will often be invalid and this computation will slightly add to the cost of queries.
The possible values are:
- yes: turn the option on
- no: turn the option off
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_usePredicateConstrainedUpiTypeInformation: <franz:yes>
The default value is: yes
usePredicateConstrainedXsdTypeInformation
Use typed-literal XSD types to improve constraint inference.
Similar to usePredicateConstrainedUpiTypeInformation but involves a scan of all typed-literals (which can be expensive). This is currently not cached!
The possible values are:
- yes: turn the option on
- no: turn the option off
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_usePredicateConstrainedXsdTypeInformation: <franz:no>
The default value is: no
userAttributes
Specify the user attributes to use while evaluating the query.
The attributes must be specified in URL encoded JSON format. The example below uses the URL encoded form of {"access-level": "medium", "department": "hr"}.
Example of specifying the query option value via a PREFIX:
PREFIX franzOption_userAttributes: <franz:%7B%22access-level%22%3A%20%22medium%22%2C%20%22department%22%3A%20%22hr%22%7D>
The default is empty (no user attributes).
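The URL encoded JSON form used by the userAttributes and defaultAttributes options can be produced with Python's standard library. The sketch below reproduces the userAttributes example value; the helper name `encode_attributes` is hypothetical.

```python
import json
from urllib.parse import quote, unquote

def encode_attributes(attributes):
    """Serialize a dict to JSON and percent-encode every reserved character."""
    return quote(json.dumps(attributes), safe="")

encoded = encode_attributes({"access-level": "medium", "department": "hr"})
print("PREFIX franzOption_userAttributes: <franz:{}>".format(encoded))
# Decoding recovers the JSON text.
print(unquote(encoded))
```

Passing `safe=""` matters here: by default `quote` leaves `/` unescaped, which would be wrong for attribute values that contain slashes.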
Query execution warnings
SPARQL is relatively lax when it comes to accepting and evaluating queries. For example, this query is valid SPARQL but it is probably not what was intended, as ?var is not actually bound by the triple patterns in the rest of the query:
select ?var {
?s ?p ?o .
}
A more typical example could be caused by a typo. For example, changing the case of a variable:
select ?subjectOne ?class {
?subjectone rdf:type ?Class .
}
Neither ?subjectOne nor ?class will be bound since the query uses ?subjectone and ?Class. When AllegroGraph plans and executes a query, it will detect problems like the above and generate query warnings. The cancelQueryOnWarnings query option can be used to stop execution at the first warning.
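A similar check can be run on the client side before a query is submitted. The sketch below is a deliberately naive, regex-based approximation for simple SELECT queries. It is not AllegroGraph's analysis and does not understand expressions such as (COUNT(?x) AS ?n). The function name `unbound_projected_variables` is hypothetical.

```python
import re

# SPARQL variables start with ? or $; names are case sensitive.
VAR = re.compile(r"[?$]([A-Za-z_][A-Za-z0-9_]*)")

def unbound_projected_variables(query):
    """Return projected variables that never occur in the group pattern."""
    head, _, body = query.partition("{")
    projected = VAR.findall(head)
    used = set(VAR.findall(body))
    return [v for v in projected if v not in used]

query = """select ?subjectOne ?class {
  ?subjectone rdf:type ?Class .
}"""
print(unbound_projected_variables(query))  # ['subjectOne', 'class']
```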
The following is a list of the currently defined warnings:
This warning indicates that one or more basic graph patterns (BGPs) in the query create a cross product. I.e., there are patterns in the query that have disjoint sets of variables which will cause the SPARQL engine to find all possible matches between the sets which can lead to very large solution sets. For example:
SELECT * {
?a ?b ?c .
?d ?e ?f .
}
Since the first triple-pattern and the second triple-pattern share no variables, the above query will find all possible combinations of each pair of triples in the underlying repository.
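Whether a set of triple patterns forms a cross product can be determined by grouping patterns that are connected through shared variables. The sketch below illustrates the idea on patterns written as tuples of strings. It is not the engine's actual analysis, it assumes every pattern contains at least one variable, and the function name `variable_groups` is hypothetical.

```python
def variable_groups(patterns):
    """Group triple patterns that are connected through shared variables.
    Each pattern is a tuple of terms; variables start with '?'.
    More than one group means the patterns form a cross product."""
    groups = []  # list of (variables, patterns) pairs with disjoint variables
    for pattern in patterns:
        variables = {t for t in pattern if t.startswith("?")}
        merged = [pattern]
        remaining = []
        for group_vars, group_patterns in groups:
            if group_vars & variables:
                # Shares a variable: fold the existing group into this one.
                variables |= group_vars
                merged = group_patterns + merged
            else:
                remaining.append((group_vars, group_patterns))
        groups = remaining + [(variables, merged)]
    return [group_patterns for _, group_patterns in groups]

bgp = [("?a", "?b", "?c"), ("?d", "?e", "?f")]
print(len(variable_groups(bgp)))     # 2 groups: cross product
joined = [("?a", "?b", "?c"), ("?c", "?e", "?f")]
print(len(variable_groups(joined)))  # 1 group: ordinary join
```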
This warning is signaled when a query specifies a DATASET and one or more graphs in its FROM or FROM NAMED portions are not in the repository.
For example, if <http://example#22> is not in the repository, then these queries will signal the warning:
SELECT *
FROM <http://example#1>
FROM <http://example#22> {
?s ?p ?o .
}
SELECT *
FROM NAMED <http://example#22> {
?s ?p ?o .
}
run-sparql
or in the HTTP request), and these two values do not match. When this happens, the smaller of the two values is used.
run-sparql
or in the HTTP request). In this case, the sum of the two offsets will be used.
This warning is signaled when a SPARQL query uses BIND in such a way that no solutions are available to be processed. For example, the BIND in this query appears before anything else so there is nothing for it to do and ?X will always be unbound:
SELECT * {
BIND(?object as ?X)
?s ?p ?object .
}
This warning is signaled when a query uses a DATASET and the graph of a GRAPH clause is not in its FROM NAMED portion. For example
SELECT *
FROM NAMED <http://graph1>
FROM NAMED <http://graph2> {
GRAPH <http://example#22> {
?s ?p ?o
}
}
FILTER ( false )
can never succeed.
FILTER( ?x IN ( ) )
.
This warning indicates that a FILTER expression puts invalid constraints on the language of a variable. For example:
SELECT * { ?s ?p ?o . FILTER( LANG(?o) = 'es' && LANG(?o) = 'en') }
cannot succeed because the LANG(?o) cannot be both 'es' and 'en'.
This warning is signaled when a query has a filter with multiple alternatives, and it is known at plan time that none of the alternatives will succeed.
Alternatives can occur in various forms:
{ ?s (<ex:foo> | <ex:bar>) ?o } -- matches <ex:foo> or <ex:bar>
FILTER(?var in (<ex:foo>, <ex:bar>)) -- matches <ex:foo> or <ex:bar>
FILTER(str(?var) = 'ex:foo') -- matches 'ex:foo' or <ex:foo>
FILTER( ?o > 3 && ?o < 0 )
.
This warning indicates that a FILTER expression cannot succeed because it is comparing a string with a non-string. E.g.,
SELECT * { ?s ?p ?o . FILTER( STR(?o) = 45 ) }
This warning is signaled at query time when a CONSTRUCT or UPDATE template generates invalid RDF triples. For example, this query will emit no triples because ?s takes on only literal bindings and these are not valid in the subject position:
CONSTRUCT {
?s a example:Car .
} WHERE {
VALUES ?s { 'mazda 3' 'ford pinto' 'bmw 300i' }
}
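The validity rule at work here can be illustrated with a short Python sketch. It is only a model of the check, with a made-up term representation ((kind, value) pairs) and a placeholder IRI standing in for example:Car. It is not AllegroGraph's implementation.

```python
# Terms are (kind, value) pairs; kind is "iri", "bnode" or "literal".
# All names and data here are illustrative.
RDF_TYPE = ("iri", "http://www.w3.org/1999/02/22-rdf-syntax-ns#type")
CAR = ("iri", "http://example.org/Car")  # stands in for example:Car

def is_valid_triple(s, p, o):
    """RDF allows IRIs or blank nodes as subjects and only IRIs as predicates."""
    return s[0] in ("iri", "bnode") and p[0] == "iri"

def construct(bindings):
    """Instantiate the template `?s a example:Car` for each binding of ?s."""
    emitted, rejected = [], []
    for s in bindings:
        triple = (s, RDF_TYPE, CAR)
        (emitted if is_valid_triple(*triple) else rejected).append(triple)
    return emitted, rejected

literal_bindings = [("literal", name) for name in ("mazda 3", "ford pinto", "bmw 300i")]
emitted, rejected = construct(literal_bindings)
print(len(emitted), len(rejected))
```

Every instantiated triple has a literal subject, so all three are rejected and nothing is emitted. Binding ?s to IRIs instead would let the template produce triples.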
This warning indicates that the query algebra contains one or more cross products. That is, portions of the algebra have disjoint sets of variables, so the SPARQL engine must find all possible combinations of matches between them, which can lead to very large solution sets. For example:
SELECT * {
?a a ?type .
VALUES ?FOO { <ex://a> <ex://b> }
}
Since the triple-pattern and the VALUES clause share no variables, the above query will find all possible combinations from the two sets.
This warning is signaled if a GRAPH clause specifies a literal as its graph. For example:
SELECT * {
GRAPH ?g {
?s ?p ?o .
}
BIND( 'literal' as ?g )
}
?s ?p ?o . FILTER( ?s = 34 )
must fail because ?s is bound to the subject of a triple and subjects cannot be literals.
This warning is signaled when a predicate has a type mapping which does not match the datatype of the values used in a range FILTER. For example, if <http://example#age> has a mapping to xsd:byte, then this query will signal the warning:
SELECT * {
?s <http://example#age> ?age .
FILTER( ?age > '2001-10-15'^^xsd:date )
}
because the FILTER cannot succeed.
This warning is signaled when a query contains a constant value (e.g., an IRI or literal value) that is not present in the repository. For example, if <ex:pred> does not appear in the repository:
SELECT * { ?s <ex:pred> ?o } -- gives: "No such predicate <ex:pred>"
This warning is signaled when it can be determined that a variable used in an expression is not bound anywhere in the query. Examples include:
SELECT * { ?s ?p ?o . } ORDER BY ?missing
SELECT * { ?s ?p ?o . FILTER( ?missing > 5 ) }
CONSTRUCT { ?missing ?p ?o } WHERE { ?s ?p ?o }
and so on.
Standard SPARQL functions
AllegroGraph supports the standard functions specified by SPARQL 1.1, several XPath constructor functions, XPath mathematical functions, and a number of custom functions designed to help with using AllegroGraph's extensions such as Geospatial and Social Network Analysis.
XPath Constructor Functions
AllegroGraph supports the standard SPARQL casting operations. For details, refer to the SPARQL reference.
- <http://www.w3.org/2001/XMLSchema#boolean> ( a )
- <http://www.w3.org/2001/XMLSchema#byte> ( a )
- <http://www.w3.org/2001/XMLSchema#date> ( a )
- <http://www.w3.org/2001/XMLSchema#dateTime> ( a )
- <http://www.w3.org/2001/XMLSchema#decimal> ( a )
- <http://www.w3.org/2001/XMLSchema#double> ( a )
- <http://www.w3.org/2001/XMLSchema#float> ( a )
- <http://www.w3.org/2001/XMLSchema#int> ( a )
- <http://www.w3.org/2001/XMLSchema#integer> ( a )
- <http://www.w3.org/2001/XMLSchema#long> ( a )
- <http://www.w3.org/2001/XMLSchema#short> ( a )
- <http://www.w3.org/2001/XMLSchema#string> ( a )
- <http://www.w3.org/2001/XMLSchema#time> ( a )
- <http://www.w3.org/2001/XMLSchema#unsignedByte> ( a )
- <http://www.w3.org/2001/XMLSchema#unsignedInt> ( a )
- <http://www.w3.org/2001/XMLSchema#unsignedLong> ( a )
- <http://www.w3.org/2001/XMLSchema#unsignedShort> ( a )
Functions on Dates and Times
- day ( dateTime )
- hours ( dateTime )
- minutes ( dateTime )
- month ( dateTime )
- now ( )
- seconds ( dateTime )
- timezone ( dateTime )
- tz ( dateTime )
- year ( dateTime )
Hash Functions
Functions on Numerics
SPARQL Operators
These functions are described in detail in the Operator Mappings section of the W3C SPARQL reference.
- != ( a, b )
- * ( a, b )
- + ( a, b )
- + ( a )
- - ( a, b )
- - ( a )
- / ( a, b )
- / ( a )
- < ( a, b )
- <= ( a, b )
- = ( a, b )
- > ( a, b )
- >= ( a, b )
Functions on RDF terms
- bnode ( )
- bnode ( identifier )
- datatype ( literal )
- iri ( iri )
- isBlank ( rdf-term )
- isIRI ( rdf-term )
- isLiteral ( rdf-term )
- isNumeric ( rdf-term )
- isURI ( rdf-term )
- lang ( literal )
- rdf-star-isTriple ( rdf-term )
- rdf-star-object ( rdf-term )
- rdf-star-predicate ( rdf-term )
- rdf-star-subject ( rdf-term )
- rdf-star-triple ( rdf-term, rdf-term, rdf-term )
- str ( literal )
- strdt ( literal, datatype )
- strlang ( literal, language )
- struuid ( )
- uri ( uri )
- uuid ( )
Functional Forms
- RDFterm-equal ( a, b )
- bound ( uri )
- coalesce ( ... )
- exists ( pattern )
- if ( expression, then )
- if ( expression, then, else )
- in ( expression, ... )
- logical-and ( a, b )
- logical-or ( a, b )
- not in ( expression, ... )
- not-exists ( pattern )
- sameTerm ( a, b )
Functions on Strings
- concat ( ... )
- contains ( string-literal1, string-literal2 )
- encode_for_uri ( string-literal )
- langMatches ( language-tag, language-range )
- lcase ( string-literal )
- regex ( target, regex, flags )
- replace ( literal, pattern, replacement )
- replace ( literal, pattern, replacement, flags )
- strafter ( string-literal1, string-literal2 )
- strbefore ( string-literal1, string-literal2 )
- strends ( string-literal1, string-literal2 )
- strlen ( string-literal )
- strstarts ( string-literal1, string-literal2 )
- substr ( string-literal, start )
- substr ( string-literal, start, length )
- ucase ( string-literal )
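One detail that often surprises users coming from general-purpose languages is that substr, like XPath's fn:substring, numbers characters starting at 1. The Python sketch below mimics that behavior for the simple case of integer arguments with start >= 1. The helper name sparql_substr is made up, and the full rounding and out-of-range rules of the specification are deliberately not modeled.

```python
def sparql_substr(source, start, length=None):
    """Simplified SUBSTR for integer arguments with start >= 1.

    SPARQL (like XPath fn:substring) numbers characters from 1,
    whereas Python slices are 0-based.
    """
    if start < 1 or (length is not None and length < 0):
        raise ValueError("this sketch only handles start >= 1 and length >= 0")
    begin = start - 1  # convert the 1-based position to a 0-based index
    if length is None:
        return source[begin:]
    return source[begin:begin + length]

print(sparql_substr("foobar", 4))     # same arguments as substr("foobar", 4)
print(sparql_substr("foobar", 4, 1))  # same arguments as substr("foobar", 4, 1)
```

The two calls correspond to the examples given for SUBSTR in the SPARQL 1.1 specification.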
Supported XPath functions
AllegroGraph also supports several XPath functions:
Functions on Dates and Times
- <http://www.w3.org/2005/xpath-functions#current-date> ( )
- <http://www.w3.org/2005/xpath-functions#current-dateTime> ( )
- <http://www.w3.org/2005/xpath-functions#current-time> ( )
XPath Mathematical Functions
For additional details on the XPath mathematical functions, see https://www.w3.org/TR/xpath-functions-3/.
- <http://www.w3.org/2005/xpath-functions/math#acos> ( x )
- <http://www.w3.org/2005/xpath-functions/math#asin> ( x )
- <http://www.w3.org/2005/xpath-functions/math#atan> ( x )
- <http://www.w3.org/2005/xpath-functions/math#atan2> ( x, y )
- <http://www.w3.org/2005/xpath-functions/math#cos> ( x )
- <http://www.w3.org/2005/xpath-functions/math#exp> ( x )
- <http://www.w3.org/2005/xpath-functions/math#exp10> ( x )
- <http://www.w3.org/2005/xpath-functions/math#log> ( x )
- <http://www.w3.org/2005/xpath-functions/math#log10> ( x )
- <http://www.w3.org/2005/xpath-functions/math#pi> ( )
- <http://www.w3.org/2005/xpath-functions/math#pow> ( x, y )
- <http://www.w3.org/2005/xpath-functions/math#sin> ( x )
- <http://www.w3.org/2005/xpath-functions/math#sqrt> ( x )
- <http://www.w3.org/2005/xpath-functions/math#tan> ( x )
Miscellaneous functions
- <http://www.w3.org/2005/xpath-functions#compare> ( a, b )
- <http://www.w3.org/2005/xpath-functions#error> ( ... )
- <http://www.w3.org/2005/xpath-functions#false> ( )
- <http://www.w3.org/2005/xpath-functions#not> ( a )
- <http://www.w3.org/2005/xpath-functions#true> ( )
Functions on Numerics
- <http://www.w3.org/2005/xpath-functions#abs> ( a )
- <http://www.w3.org/2005/xpath-functions#ceil> ( a )
- <http://www.w3.org/2005/xpath-functions#floor> ( a )
- <http://www.w3.org/2005/xpath-functions#round> ( a )
- <http://www.w3.org/2005/xpath-functions#round-half-to-even> ( a )
- <http://www.w3.org/2005/xpath-functions#round-half-to-even> ( a, b )
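The difference between round and round-half-to-even only shows up on exact ties: round-half-to-even sends a tie to the nearest even digit. The following Python sketch reproduces that rule with the standard decimal module. The helper name and the choice of passing values as decimal strings are assumptions of the sketch, and it covers only non-negative precision.

```python
from decimal import Decimal, ROUND_HALF_EVEN

def round_half_to_even(value, precision=0):
    """Round a decimal string to `precision` fractional digits, ties to even."""
    quantum = Decimal(1).scaleb(-precision)  # precision=2 -> Decimal("0.01")
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_EVEN)

for text in ("0.5", "1.5", "2.5"):
    print(text, "->", round_half_to_even(text))
print("3.567 ->", round_half_to_even("3.567", 2))
```

Here 0.5 and 2.5 round down to the even neighbors 0 and 2, while 1.5 rounds up to 2. The optional second argument plays the role of the precision parameter b in the two-argument form listed above.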
Functions on Strings
- <http://www.w3.org/2005/xpath-functions#concat> ( a, b, ... )
- <http://www.w3.org/2005/xpath-functions#contains> ( string-literal1, string-literal2 )
- <http://www.w3.org/2005/xpath-functions#ends-with> ( string-literal1, string-literal2 )
- <http://www.w3.org/2005/xpath-functions#lower-case> ( string-literal )
- <http://www.w3.org/2005/xpath-functions#normalize-space> ( )
- <http://www.w3.org/2005/xpath-functions#normalize-space> ( string-literal )
- <http://www.w3.org/2005/xpath-functions#starts-with> ( string-literal1, string-literal2 )
- <http://www.w3.org/2005/xpath-functions#string-length> ( )
- <http://www.w3.org/2005/xpath-functions#string-length> ( string-literal )
- <http://www.w3.org/2005/xpath-functions#substring> ( string-literal, start )
- <http://www.w3.org/2005/xpath-functions#substring> ( string-literal, start, length )
- <http://www.w3.org/2005/xpath-functions#translate> ( string-literal, from, to )
- <http://www.w3.org/2005/xpath-functions#upper-case> ( string-literal )
AllegroGraph extensions
AllegroGraph supports several functions above and beyond those defined by the SPARQL standard. The additional functions are named by URI (or prefixed name) and can appear anywhere in a SPARQL expression (e.g., in BIND, FILTER or ORDER BY).
nD Geospatial
These functions are useful in working with AllegroGraph nD-geospatial facilities:
- <http://franz.com/ns/allegrograph/5.0/geo/nd/fn#haversineLatLonLatLon> ( lat1, lon1, lat2, lon2, [ units ] )
- <http://franz.com/ns/allegrograph/5.0/geo/nd/fn#haversineLatLonLoc> ( lat, lon, loc, [ units ] )
- <http://franz.com/ns/allegrograph/5.0/geo/nd/fn#haversineLocLoc> ( loc1, loc2, [ units ] )
- <http://franz.com/ns/allegrograph/5.0/geo/nd/fn#ordinateValue> ( value, name )
- <http://franz.com/ns/allegrograph/5.0/geo/nd/fn#ordinatesToValue> ( value, ... )
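For readers unfamiliar with the haversine family, the underlying computation is the standard great-circle distance formula. The Python sketch below shows that formula for two latitude/longitude pairs given in degrees. The Earth radius constant, the fixed kilometer unit, and the function name are assumptions of the sketch. AllegroGraph's own constants and its handling of the optional units argument are not described here and may differ.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))

# A quarter of the way around the equator.
print(haversine_km(0.0, 0.0, 0.0, 90.0))
```

The argument order mirrors haversineLatLonLatLon ( lat1, lon1, lat2, lon2 ); the result for the quarter-equator case is pi / 2 times the assumed radius.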
2D Geospatial
These functions are useful in working with the older AllegroGraph 2D Geospatial facilities:
- <http://franz.com/ns/allegrograph/3.0/geospatial/fn/cartesianDistance> ( p1, p2 )
- <http://franz.com/ns/allegrograph/3.0/geospatial/fn/cartesianDistanceSquared> ( p1, p2 )
- <http://franz.com/ns/allegrograph/3.0/geospatial/fn/cartesianX> ( point )
- <http://franz.com/ns/allegrograph/3.0/geospatial/fn/cartesianY> ( point )
- <http://franz.com/ns/allegrograph/3.0/geospatial/fn/haversineKilometers> ( p1, p2 )
- <http://franz.com/ns/allegrograph/3.0/geospatial/fn/haversineMiles> ( p1, p2 )
- <http://franz.com/ns/allegrograph/3.0/geospatial/fn/haversineRadians> ( p1, p2 )
- <http://franz.com/ns/allegrograph/3.0/geospatial/fn/latitude> ( point )
- <http://franz.com/ns/allegrograph/3.0/geospatial/fn/longitude> ( point )
- <http://franz.com/ns/allegrograph/3.0/geospatial/fn/toPointLonLat> ( predicate, longitude, latitude )
- <http://franz.com/ns/allegrograph/3.0/geospatial/fn/toPointXY> ( predicate, x, y )
2D Geospatial (deprecated)
These deprecated 2D Geospatial functions are still supported but you should change your queries to use the above functions instead.
- <http://franz.com/ns/allegrograph/3.0/geospatial/fn/cartesian-distance> ( p1, p2 )
- <http://franz.com/ns/allegrograph/3.0/geospatial/fn/cartesian-distance-squared> ( p1, p2 )
- <http://franz.com/ns/allegrograph/3.0/geospatial/fn/cartesian-x> ( p )
- <http://franz.com/ns/allegrograph/3.0/geospatial/fn/cartesian-y> ( p )
- <http://franz.com/ns/allegrograph/3.0/geospatial/fn/haversine-km> ( p1, p2 )
- <http://franz.com/ns/allegrograph/3.0/geospatial/fn/haversine-miles> ( p1, p2 )
- <http://franz.com/ns/allegrograph/3.0/geospatial/fn/haversine-r> ( p1, p2 )
Social Networking (SNA) Related functions
These functions can help when using AllegroGraph's SNA extensions.
- <http://franz.com/ns/allegrograph/4.11/sna/egoGroup> ( generator, node, depth )
- <http://franz.com/ns/allegrograph/4.11/sna/neighborCache> ( generator, node, depth )
Large Language Models (LLM) LLMagic related functions
These functions are useful in queries that use LLMs. See the LLM examples file for examples of these functions in use (search, e.g., for llm:response in that file). There are also LLM SPARQL magic properties; see Large Language Models for more information.
- <http://franz.com/ns/allegrograph/8.0.0/llm/node> ( prompt )
- <http://franz.com/ns/allegrograph/8.0.0/llm/response> ( prompt )
Miscellaneous other functions
These functions help connect SPARQL to various other AllegroGraph features.
- <http://franz.com/ns/allegrograph/6.5.0/fn#lookupRdfList> ( subject-or-object, [ predicate ] )
- <http://franz.com/ns/allegrograph/6.5.0/fn#makeSPARQLList> ( ... )
- <http://franz.com/ns/allegrograph/6.5.0/fn#makeSPARQLSet> ( ... )
- <http://franz.com/ns/allegrograph/fn#encodedIdId> ( resource )
Defining SPARQL extension functions
SPARQL allows for query engines to associate extension functions with URIs, and call them from within queries.
You can define your own URI functions through defurifun, or associate existing functions with a URI through associate-function-with-uri. defurifun does some manipulation of the arguments, so you should use it whenever possible.
uri, which is a string or a valid part, and the provided function, which is a symbol or a function. If cache-now-p, and function is a symbol, its function binding is stored instead of the symbol itself.
stream
*standard-output*
by default).
name, and associate it with uri as with associate-function-with-uri. args is not evaluated, exactly as with defun.
Here's an example: a function that will do an HTTP HEAD request against the provided URL, returning the HTTP status code as an integer literal, or 0 if there's a problem.
(The built-in functions are quite robust, so a Lisp integer will be treated as an RDF literal with data type xsd:integer.)
(defurifun ex-head-request !<http://example.com/fn/head> (uri)
  (or
   (when uri
     (ignore-errors
       (format t "~&Performing HTTP HEAD request on <~A>...~%"
               (upi->value uri))
       (second
        (multiple-value-list
         (net.aserve.client:do-http-request (upi->value uri)
           :method :head)))))
   0))
You can use this function in a query exactly as you would a built-in function.
Using this data as an example:
<http://ex.com/a> <http://ex.com/foo> "200"^^<http://www.w3.org/2001/XMLSchema#integer> .
we can run a query like so:
sparql(54): (run-sparql
"
PREFIX f: <http://example.com/fn/>
SELECT ?x {
?x <http://ex.com/foo> ?y .
FILTER ( ?y = f:head(\"http://franz.com\") )
}"
:results-format :count)
which produces this output:
Performing HTTP HEAD request on <http://franz.com>...
1
:select
(?x)
… we know, then, that franz.com is returning a 200 status code.
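The same query text could in principle be sent through the HTTP interface instead of run-sparql, assuming the extension function has been made available on the server. The Python sketch below only builds the request URL and does not contact any server. The host, port, catalog and repository names are hypothetical, and the parameter name query follows the SPARQL protocol convention.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical server, catalog and repository names.
BASE = "http://localhost:10035"
CATALOG = "mycatalog"
REPOSITORY = "myrepo"

# No Lisp-level escaping of the inner quotes is needed here.
QUERY = """
PREFIX f: <http://example.com/fn/>
SELECT ?x {
  ?x <http://ex.com/foo> ?y .
  FILTER ( ?y = f:head("http://franz.com") )
}
"""

def sparql_url(base, catalog, repository, query):
    """Build a SPARQL-protocol GET URL with the query percent-encoded."""
    path = f"/catalogs/{catalog}/repositories/{repository}/sparql"
    return f"{base}{path}?{urlencode({'query': query})}"

url = sparql_url(BASE, CATALOG, REPOSITORY, QUERY)
print(url)

# Round-trip check: decoding the URL gives back the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == QUERY)
```

Percent-encoding matters because the query contains spaces, quotes, angle brackets and newlines, none of which may appear raw in a URL.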
Note that these filter functions can be called an arbitrary number of times during the execution of a query. It's not a good idea to actually perform expensive operations like HTTP requests in your queries.