Introduction
AllegroGraph supports datatypes in two ways. First, you can define any datatype and tag data with that type. Data tagged this way is actually stored as a string.
Second, AllegroGraph can encode the data so that it knows what the value is. Encoded datatypes are stored not as strings but in a special format that allows fast lookup and conversion. Range and order queries on encoded datatypes (where they are meaningful) are also much faster: on unencoded types, such queries can require a full scan.
The following XSD datatypes can be encoded (all XSD datatypes are supported, but only these are encoded). The xsd: namespace prefix is defined to be http://www.w3.org/2001/XMLSchema# and is used in the examples. The XML Datatype Schema is described in http://www.w3.org/TR/xmlschema-2/. Note that because of limitations in the number of bytes available for encoding, the range may be narrower than the xsd range in some cases. Values outside the supported range can still be input, but will be stored as strings with the limitations of unencoded data.
- Integers
  - 8-bit: <http://www.w3.org/2001/XMLSchema#byte>
  - 16-bit: <http://www.w3.org/2001/XMLSchema#short>
  - 32-bit: <http://www.w3.org/2001/XMLSchema#int>
  - 64-bit: <http://www.w3.org/2001/XMLSchema#long>
- Unsigned Integers
  - 8-bit: <http://www.w3.org/2001/XMLSchema#unsignedByte>
  - 16-bit: <http://www.w3.org/2001/XMLSchema#unsignedShort>
  - 32-bit: <http://www.w3.org/2001/XMLSchema#unsignedInt>
  - 64-bit: <http://www.w3.org/2001/XMLSchema#unsignedLong>
- Floating point
  - single-precision: <http://www.w3.org/2001/XMLSchema#float>
  - double-precision: <http://www.w3.org/2001/XMLSchema#double>
- Decimals
  - decimal: <http://www.w3.org/2001/XMLSchema#decimal>
  - integer: <http://www.w3.org/2001/XMLSchema#integer>
- Times and Dates
  - times: <http://www.w3.org/2001/XMLSchema#time>
  - dates: <http://www.w3.org/2001/XMLSchema#date>
  - date-times: <http://www.w3.org/2001/XMLSchema#dateTime>
Typed literals of these types will be encoded by default. Also, untyped literals in the object position may be automatically typed (and encoded if supported) based on the predicate. See Data-type and Predicate Mapping in the Lisp Reference Guide and the Type mapping section in the HTTP Protocol document.
Short strings (11 bytes or fewer) are also encoded. Encoded strings behave like unencoded strings, except that access is much faster.
XSD datatypes derived from IEEE numerical standards
The following XSD datatypes come from IEEE numeric standards and are (usually) represented in a fixed number of bytes (1, 2, 4, or 8) on standard hardware:
- Integers
  - 8-bit: <http://www.w3.org/2001/XMLSchema#byte>
  - 16-bit: <http://www.w3.org/2001/XMLSchema#short>
  - 32-bit: <http://www.w3.org/2001/XMLSchema#int>
  - 64-bit: <http://www.w3.org/2001/XMLSchema#long>
- Unsigned Integers
  - 8-bit: <http://www.w3.org/2001/XMLSchema#unsignedByte>
  - 16-bit: <http://www.w3.org/2001/XMLSchema#unsignedShort>
  - 32-bit: <http://www.w3.org/2001/XMLSchema#unsignedInt>
  - 64-bit: <http://www.w3.org/2001/XMLSchema#unsignedLong>
- Floating point
  - single-precision: <http://www.w3.org/2001/XMLSchema#float>
  - double-precision: <http://www.w3.org/2001/XMLSchema#double>
All are supported over their full ranges, which are defined in http://www.w3.org/TR/xmlschema-2/.
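As a rough illustration of what "full range" means for the integer types, the following Python sketch computes the two's-complement limits for each width. The helper name xsd_integer_range and the table XSD_INTEGER_TYPES are hypothetical names chosen for this example; they are not part of AllegroGraph.

```python
def xsd_integer_range(bits, signed=True):
    """Return (min, max) for a two's-complement integer of the given bit width."""
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1

# XSD type names from the list above, mapped to (bit width, signed?).
# This table is an illustration only, not an AllegroGraph data structure.
XSD_INTEGER_TYPES = {
    "xsd:byte": (8, True),
    "xsd:short": (16, True),
    "xsd:int": (32, True),
    "xsd:long": (64, True),
    "xsd:unsignedByte": (8, False),
    "xsd:unsignedShort": (16, False),
    "xsd:unsignedInt": (32, False),
    "xsd:unsignedLong": (64, False),
}

for name, (bits, signed) in XSD_INTEGER_TYPES.items():
    low, high = xsd_integer_range(bits, signed)
    print(f"{name:18} {low} .. {high}")
```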
xsd:decimal and xsd:integer types
The xsd:decimal type differs from the IEEE numbers defined above by storing decimal rather than floating point values. This makes little difference for integers but is very important for numbers with fractional parts (such as monetary values), since binary floating point cannot represent most decimal fractions exactly.
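The difference between decimal and floating point values can be seen in any language with both kinds of number. The sketch below uses Python's standard decimal module purely as an illustration; it says nothing about how AllegroGraph stores decimals internally.

```python
from decimal import Decimal

# Binary floating point: 0.10 and 0.20 have no exact representation,
# so their sum is not exactly 0.30.
float_sum = 0.10 + 0.20
print(float_sum == 0.30)               # False

# Decimal arithmetic keeps the decimal digits exactly.
decimal_sum = Decimal("0.10") + Decimal("0.20")
print(decimal_sum == Decimal("0.30"))  # True
```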
xsd:decimal is described in http://www.w3.org/TR/xmlschema-2/#decimal. The derived type xsd:integer is for whole numbers only.
The xsd:integer range is
-154742504910672534362390528
154742504910672534362390527
The xsd:decimal range is
min -9999999999999999999900000000000000000000000000000000000000000000
max 9999999999999999999900000000000000000000000000000000000000000000
That is, 20 nines followed by 44 zeros in both cases.
The smallest positive decimal is
0.000000000000000000000000000000000000000000000000000000000000001
That is 62 zeros after the decimal point, followed by a 1.
Similarly, the largest negative decimal is
-0.000000000000000000000000000000000000000000000000000000000000001
That is 62 zeros after the decimal point, followed by a 1.
An xsd:decimal is specified by up to 20 significant digits (counted from the first non-zero digit through the last), with a decimal point optionally placed somewhere.
An xsd:integer may have up to 27 digits (more than a decimal, but with no fractional part), within the range given above.
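The limits above can be checked before loading data. The following Python sketch is an assumption-laden illustration, not AllegroGraph's own check: the function names are hypothetical, it relies on the observation that the stated xsd:integer bounds are -2^87 and 2^87 - 1, and it counts significant digits only (it does not test the position of the decimal point against the decimal range).

```python
from decimal import Decimal

# The xsd:integer bounds quoted above are -2**87 and 2**87 - 1.
XSD_INTEGER_MIN = -(2 ** 87)
XSD_INTEGER_MAX = 2 ** 87 - 1
MAX_SIGNIFICANT_DIGITS = 20

def fits_encoded_integer(value):
    """True if an int lies inside the encoded xsd:integer range."""
    return XSD_INTEGER_MIN <= value <= XSD_INTEGER_MAX

def significant_digits(text):
    """Count digits from the first non-zero digit through the last non-zero digit."""
    digits = "".join(str(d) for d in Decimal(text).as_tuple().digits)
    return len(digits.strip("0"))

print(fits_encoded_integer(154742504910672534362390527))   # True
print(fits_encoded_integer(154742504910672534362390528))   # False
print(significant_digits("123456.7890123456"))             # 16
print(significant_digits("123456.7890123456") <= MAX_SIGNIFICANT_DIGITS)  # True
```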
Human readable representation
The Lisp functions part->concise and part->terse return strings which represent xsd:decimals and xsd:integers in short human-readable format:
(value->upi "123456.7890123456" :encoded-decimal)
returns
["\"123456.7890123456\"^^<http://www.w3.org/2001/XMLSchema#decimal>"]
(part->concise (value->upi "123456.7890123456" :encoded-decimal))
returns
"123456.7890123456"
XSD Date and Time datatypes
AllegroGraph supports the xsd:date, xsd:time, and xsd:dateTime datatypes (see the ISO 8601 Date and Time Formats).
The xsd:date format is [-]YYYYYYY-MM-DD[Z | +/-HH:MM], though most typical dates use four year (Y) digits. December 18, 1948 is 1948-12-18, for example. The range is
-8917687-01-25
8921486-12-07
The range is for encoded dates. Unencoded dates are stored as strings and have no limits (but note operations on unencoded values are slower and you must take care that AllegroGraph does not try to encode an out of range date as that will cause an error). Encoding will happen if the type is specified (such as "2014-02-23"^^xsd:date, where the ^^xsd:date indicates the type) or if there is a predicate mapping specifying the type (see Data-type and Predicate Mapping in the Lisp Reference Guide and the Type mapping section in the HTTP Protocol document).
The year range is quite suitable for all human history applications, but is likely too small for geology and dinosaurs.
The optional timezone can be Z (meaning Zulu time, which is Greenwich Mean Time) or +/-HH:MM. The sign and both hour and minute digits must be included. Examples of proper dates are 1948-12-18Z and 1948-12-18+12:00. The improper dates 1948-12-1805:00, 1948-12-18+12, and 1948-12-18-5:00 will signal errors (the first is missing a + or -, the second has no minute digits, and the third has a one-digit hour). The timezone range is -14:00 to 14:00.
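The format rules for dates can be approximated with a regular expression. The Python sketch below is only an illustration of the syntax described above: the names DATE_PATTERN and is_well_formed_date are hypothetical, the year is assumed to have four to seven digits, and the check does not validate month or day values or the overall date range.

```python
import re

DATE_PATTERN = re.compile(
    r"-?\d{4,7}-\d{2}-\d{2}"          # [-]YYYY...-MM-DD (assumed 4 to 7 year digits)
    r"(?P<tz>Z|[+-]\d{2}:\d{2})?"     # optional Z or signed HH:MM
)

def is_well_formed_date(text):
    """Check the date syntax and the -14:00 to +14:00 timezone range."""
    match = DATE_PATTERN.fullmatch(text)
    if match is None:
        return False
    tz = match.group("tz")
    if tz is None or tz == "Z":
        return True
    hours, minutes = int(tz[1:3]), int(tz[4:6])
    return minutes < 60 and hours * 60 + minutes <= 14 * 60

for candidate in ["1948-12-18Z", "1948-12-18+12:00",
                  "1948-12-1805:00", "1948-12-18+12", "1948-12-18-5:00"]:
    print(candidate, is_well_formed_date(candidate))
```

The first two candidates are accepted and the last three are rejected, matching the proper and improper examples in the text.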
The xsd:time format is HH:MM:SS[.SSSSSSSS][Z | +/-HH:MM]. Ten minutes after midnight is 00:10:00, and 4 minutes, 3.5 seconds after noon is 12:04:03.5. All times of the day, with fractional seconds of up to 8 digits, are supported.
The optional timezone can be Z (meaning Zulu time, which is Greenwich Mean Time) or +/-HH:MM. The sign and both hour and minute digits must be included. Examples of proper times are 12:04:03.5Z, 12:04:03-05:00, and 12:04:03.5+12:30. The improper times 12:04:0305:00, 12:04:03+05, and 12:04:03-5:00 will signal errors (the first is missing a + or -, the second is missing the minutes, the third has a one digit hour). The timezone range is -14:00 to 14:00.
Note that you can supply more than 8 fractional digits for seconds. The extra digits are rounded off:
(value->upi "00:00:00.123456789" :time) => {00:00:00.1234568}
The rounding formula is
(round (* fraction 10000000))
where fraction is the fractional seconds expressed as a rational number, so in our example
(round (* 123456789/1000000000 10000000))
which is 1234568.
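The same arithmetic can be reproduced outside Lisp with exact rational numbers. This Python sketch mirrors the formula above using the standard fractions module; the function name round_fractional_seconds is hypothetical. Python's round on a Fraction rounds ties to the nearest even integer, as Common Lisp's round does.

```python
from fractions import Fraction

def round_fractional_seconds(digits):
    """Apply (round (* fraction 10000000)) to the digits after the decimal point."""
    fraction = Fraction(int(digits), 10 ** len(digits))  # exact rational value
    return round(fraction * 10_000_000)

print(round_fractional_seconds("123456789"))  # 1234568
```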
As usual, you can append a type value, in the form ^^xsd:type, to a data string. For example:
"10:05:01.1234"^^xsd:time
The xsd:dateTime format is [xsd:date format]T[xsd:time format][TZ], where TZ is the (optional) time zone. Time zone Z means Zulu, which is Greenwich Mean Time. Otherwise TZ is in the range -14:00 to +14:00. Resolution is to one minute (most, but not all, timezones are an integer number of hours). The + or - must be specified.
So, ten minutes after midnight on December 18, 1948 in Washington DC, USA is 1948-12-18T00:10:00-05:00.
dateTime strings can be tagged with ^^xsd:dateTime to tell AllegroGraph that the data is a dateTime:
"2014-02-23T08:18:59+05:00"^^xsd:dateTime
"2014-02-23T03:18:59Z"^^xsd:dateTime
"2014-02-23T03:18:59"^^xsd:dateTime
The last does not have a timezone specified and that fact is stored (rather than assuming the timezone is Z, for example).
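The distinction between the three strings can be illustrated with Python's standard datetime module, which likewise keeps track of whether a value carries a timezone. This is only a sketch: the helper parse_xsd_datetime is a hypothetical name, it handles just the four-digit-year forms shown above, and it is not how AllegroGraph parses values.

```python
from datetime import datetime

def parse_xsd_datetime(text):
    """Parse a four-digit-year dateTime string; 'Z' is treated as +00:00."""
    return datetime.fromisoformat(text.replace("Z", "+00:00"))

with_offset = parse_xsd_datetime("2014-02-23T08:18:59+05:00")
zulu = parse_xsd_datetime("2014-02-23T03:18:59Z")
no_timezone = parse_xsd_datetime("2014-02-23T03:18:59")

print(with_offset == zulu)         # True: both name the same instant
print(no_timezone.tzinfo is None)  # True: the absence of a timezone is preserved
```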
The encoded xsd:dateTime range is
minimum -8917687-01-25T13:15:44
maximum 8921486-12-07T10:44:15
The limits are for encoded dateTimes. There are no limits on unencoded values (which are stored in the string table -- note operations on unencoded values are slower). If AllegroGraph tries to encode a value outside the limits, an error is signaled. Encoding will happen if the type is specified (such as "2014-02-23T08:18:59+05:00"^^xsd:dateTime, where the ^^xsd:dateTime indicates the type) or if there is a predicate mapping specifying the type (see Data-type and Predicate Mapping in the Lisp Reference Guide and the Type mapping section in the HTTP Protocol document).
Suppressing automatic encoding
Note that you can suppress the automatic encoding of a data type by removing the datatype mapping, though this is not recommended since it slows down handling of the now untyped data (which are now stored as strings). To do so, see the Lisp function datatype-mapping and the Type mapping section in the HTTP Protocol document for information on modifying automatic mappings.
Comparing dateTimes, dates, and times
Comparing two dates, dateTimes, or times that both have timezones is easy. Two dateTimes can be adjusted to a reference timezone, such as Zulu (which is Greenwich Mean Time); they then represent actual instants, and either they are equal or one is greater than (later than) the other. With times, the rule is to assign the same (arbitrary) date to each and compare them as dateTimes. One date is earlier than another if its starting instant is earlier: convert the starting instants of the two dates with timezones to dateTimes in the Zulu timezone, and the earlier is less than the later (or they are equal if the starting instants are the same).
Comparing two dates, dateTimes, or times, both without timezones is also straightforward: just assume they are Zulu timezone and proceed as above.
But comparing a date with a timezone to one without, or a dateTime with a timezone to one without, or a time with a timezone to one without is harder, and the meaning is not intuitive. For this reason, having some data with timezone and some without is not recommended unless absolutely necessary.
Here are the comparison rules:
Such pairs (two dateTimes, two dates, or two times, with one having a timezone and the other not having a timezone) are never equal. One may be less than the other, or greater than the other, or neither greater than nor less than the other.
A date/dateTime/time with a timezone is greater than a date/dateTime/time without a timezone if there is no timezone (in the allowable range -14:00 to +14:00) which can be assigned to the date/dateTime/time without a timezone which makes it greater than or equal to the original date/dateTime/time with a timezone.
A date/dateTime/time with a timezone is less than a date/dateTime/time without a timezone if there is no timezone (in the allowable range -14:00 to +14:00) which can be assigned to the date/dateTime/time without a timezone which makes it less than or equal to the original date/dateTime/time with a timezone. We have examples below.
Note that when comparing two dates/dateTimes/times with timezones, they are either equal or one is greater than the other. Similarly for two dates/dateTimes/times without timezones. Therefore, if you have tested any two of equal-to/greater-than/less-than, and both are false, you can conclude the third is true and do not have to test. But when one object has a timezone and one does not, you know they are not equal, but you have to test both greater than and less than since both may be false. You cannot infer from one not being greater than the other that it is therefore less than the other.
Comparing dates with times, dates with dateTimes, and times with dateTimes
It makes no sense (and so is an error) to compare a time with a date or a dateTime. (A time is unanchored by any date: it is meaningless to ask if 9:00 AM is greater than 2:00 AM, July 4, 1776.)
While comparing a date and a dateTime might be defined, it instead raises an error in AllegroGraph. The date must be cast to a dateTime to compare them (or the dateTime converted to a date).
When is one date earlier than another?
You compare dates by comparing their starting instants, and the later starting instant is the later date (even if the durations of the dates overlap). Two dates are equal if they have the same starting instant. Two dates with the same timezone, or two dates without timezones use regular calendar order. So:
1948-12-18Z is equal to 1948-12-18Z,
1948-12-18Z is greater (later) than 1947-12-18Z
1948-12-18Z is less (earlier) than 1948-12-19Z
1948-12-18 is equal to 1948-12-18,
1948-12-18 is greater (later) than 1947-12-18
1948-12-18 is less (earlier) than 1948-12-19
2001-03-04-03:00 has starting dateTime in Zulu 2001-03-04T03:00:00Z
2001-03-04+11:00 has starting dateTime in Zulu 2001-03-03T13:00:00Z
So
2001-03-04-03:00 is greater than 2001-03-04+11:00 since
2001-03-04T03:00:00Z is greater than 2001-03-03T13:00:00Z
1948-12-18-12:00 is equal to 1948-12-19+12:00
(When Dec. 18 is starting in the -12:00 TZ,
Dec. 19 is just starting in the +12:00 TZ.)
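The starting-instant rule for dates with timezones can be checked with a short Python sketch. The function date_start is a hypothetical helper written for this example; it assumes a four-digit year and a timezone suffix of Z or +/-HH:MM, and it uses Python's standard datetime module rather than anything from AllegroGraph.

```python
from datetime import datetime, timezone

def date_start(text):
    """Starting instant (midnight) of a date with a timezone, as an aware datetime."""
    day, tz = text[:10], text[10:].replace("Z", "+00:00")
    return datetime.fromisoformat(f"{day}T00:00:00{tz}")

# Starting instants expressed in Zulu time.
print(date_start("2001-03-04-03:00").astimezone(timezone.utc).isoformat())
print(date_start("2001-03-04+11:00").astimezone(timezone.utc).isoformat())

# The comparisons from the examples above.
print(date_start("2001-03-04-03:00") > date_start("2001-03-04+11:00"))   # True
print(date_start("1948-12-18-12:00") == date_start("1948-12-19+12:00"))  # True
```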
Time comparison examples
You compare two times by assuming they are on the same date. If neither has a timezone, then compare them as if they were both in the same timezone, so
01:03:00 is less than 02:00:00 is less than 12:00:00 is less than 18:22:13
01:03:00 equals 01:03:00 and is greater than 00:10:10
If they both have timezones, assume the same date (say 2013-01-01) and compare as dateTimes:
You are in the -13:00 timezone, and your clock says 1:00 AM.
Assume it is January 1, 2013. Around the world at that instant clocks
and calendars say:
TZ       Clock time   Calendar date   Equal to 2013-01-01T01:00:00-13:00?
------   ----------   -------------   -----------------------------------
-13:00   01:00:00     2013-01-01      Yes
-10:00   04:00:00     2013-01-01      Yes
-05:00   09:00:00     2013-01-01      Yes
Z        14:00:00     2013-01-01      Yes
+03:15   17:15:00     2013-01-01      Yes
+09:00   23:00:00     2013-01-01      Yes
+10:00   00:00:00     2013-01-02      No
+11:00   01:00:00     2013-01-02      No
Note that the last two entries are not equal to our reference entry. People looking at clocks in those timezones will see the times indicated, midnight (00:00:00) and 1:00 AM, but when they look at the calendar, they see a different day.
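The "same date" rule for times with timezones can be sketched in Python by attaching each time to one fixed calendar date and comparing the resulting instants. The names REFERENCE_DATE and times_equal are hypothetical, and the sketch covers only the case where both times carry a timezone offset written as +/-HH:MM.

```python
from datetime import date, datetime, time

# Any date would do; 2013-01-01 matches the example in the text.
REFERENCE_DATE = date(2013, 1, 1)

def times_equal(a, b):
    """Compare two times with timezones by placing both on the same calendar date."""
    instant_a = datetime.combine(REFERENCE_DATE, time.fromisoformat(a))
    instant_b = datetime.combine(REFERENCE_DATE, time.fromisoformat(b))
    return instant_a == instant_b

print(times_equal("01:00:00-13:00", "14:00:00+00:00"))  # True
print(times_equal("01:00:00-13:00", "17:15:00+03:15"))  # True
print(times_equal("01:00:00-13:00", "00:00:00+10:00"))  # False: a day apart
```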
Comparing a dateTime with a timezone with a dateTime without a timezone
This case is defined by convention. Unlike the cases where both have timezones or neither has a timezone, relying on the plain meaning of the terms leads to ambiguity. We have a dateTime with no timezone, dt-wo-tz, and a dateTime with a timezone, dt-w-tz.
Let us consider two additional dateTimes with timezones: one with the date and time of dt-wo-tz and with timezone -14:00, which we call dt-tz-latest, and one with the date and time of dt-wo-tz and with timezone +14:00, which we call dt-tz-earliest. Here is an example:
Consider the dateTime without timezone
1066-01-10T12:00:00
Now consider
1066-01-10T12:00:00-14:00
This is the same instant as
1066-01-11T02:00:00Z
since Zulu time is 14 hours later than timezone -14:00. Note the
date is different (the 11th rather than the 10th).
This is dt-tz-latest.
And consider
1066-01-10T12:00:00+14:00
This is the same instant as
1066-01-09T22:00:00Z
Since Zulu time is 14 hours earlier than timezone +14:00.
This is dt-tz-earliest. (Again, the date is different.)
So now we have three dateTimes with timezones, dt-w-tz, dt-tz-latest, and dt-tz-earliest. Here then are the comparison rules:
dt-w-tz and dt-wo-tz are never equal.
dt-w-tz is greater than (later than) dt-wo-tz if it is greater than dt-tz-latest.
dt-w-tz is less than (earlier than) dt-wo-tz if it is less than dt-tz-earliest.
dt-w-tz is neither greater than nor less than (nor equal to) dt-wo-tz if dt-w-tz is later than or equal to dt-tz-earliest and earlier than or equal to dt-tz-latest.
Here are some examples:
Again, dt-wo-tz is 1066-01-10T12:00:00, so
1066-01-09T22:00:00Z is dt-tz-earliest and
1066-01-11T02:00:00Z is dt-tz-latest.
Here are some values for dt-w-tz, the same date time in Zulu
time, and the comparison with dt-wo-tz. 'Undecided' means neither
earlier (<) nor later (>) nor equal (a dateTime with a timezone can
never equal a dateTime without a timezone). Note that SPARQL >, =, and <
comparisons return FALSE for UNDECIDED.
dt-w-tz                     In Zulu time           Comparison with dt-wo-tz
-------------------------   --------------------   ------------------------
1066-01-09T17:00:00-05:00   1066-01-09T22:00:00Z   Undecided because it equals dt-tz-earliest
                                                   (so FALSE since undecided)
2013-06-01T02:13:14.5Z      [same]                 Later (hundreds of years later
                                                   than dt-tz-latest)
1066-01-10T17:00:00+03:12   1066-01-10T13:48:00Z   Undecided because it is between
                                                   dt-tz-earliest and dt-tz-latest
                                                   (so FALSE since undecided)
1066-01-09T15:00:00-01:00   1066-01-09T16:00:00Z   Earlier because it is before
                                                   dt-tz-earliest
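The comparison rules and the four examples can be reproduced with a small Python sketch. The function compare_mixed and the constant MAX_OFFSET are hypothetical names for this illustration; the code follows the convention described in the text and is not AllegroGraph's implementation.

```python
from datetime import datetime, timedelta, timezone

MAX_OFFSET = timedelta(hours=14)  # the timezone range is -14:00 to +14:00

def compare_mixed(dt_w_tz, dt_wo_tz):
    """Compare an aware datetime with a naive one: 'later', 'earlier', or 'undecided'."""
    as_zulu = dt_wo_tz.replace(tzinfo=timezone.utc)
    earliest = as_zulu - MAX_OFFSET  # dt-wo-tz read as +14:00
    latest = as_zulu + MAX_OFFSET    # dt-wo-tz read as -14:00
    if dt_w_tz > latest:
        return "later"
    if dt_w_tz < earliest:
        return "earlier"
    return "undecided"

dt_wo_tz = datetime(1066, 1, 10, 12, 0, 0)
examples = [
    datetime.fromisoformat("1066-01-09T17:00:00-05:00"),
    datetime(2013, 6, 1, 2, 13, 14, 500000, tzinfo=timezone.utc),
    datetime.fromisoformat("1066-01-10T17:00:00+03:12"),
    datetime.fromisoformat("1066-01-09T15:00:00-01:00"),
]
for dt_w_tz in examples:
    print(dt_w_tz.isoformat(), compare_mixed(dt_w_tz, dt_wo_tz))
```

The four results are undecided, later, undecided, and earlier, in that order. An "undecided" result corresponds to SPARQL >, =, and < all returning FALSE.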
Comparing two dates with timezones
A date YYYY-MM-DD with timezone TZ has starting instant YYYY-MM-DDT00:00:00TZ, for example 2013-01-01-08:00 has starting dateTime 2013-01-01T00:00:00-08:00. To compare two dates with timezones, convert them to dateTimes with time 00:00:00, and you now have two dateTimes with timezones. Comparing two dateTimes with timezones is discussed above.
Comparing dates without timezones
Dates without timezones are treated as if they both have the same timezone. Thus, they are equal if they specify the same calendar date, and one is greater than the other if it has a later calendar date.
Comparing dates where one has a timezone and the other does not
Just as with comparing a dateTime without a timezone to one with a timezone, the comparison of two dates, one with and one without a timezone, uses two derived dateTimes: the instant at which the timezoneless date would start if it had timezone +14:00, called dateTime+14, and the instant at which it would start if it had timezone -14:00, called dateTime-14. We now have three dateTimes:
dateTime+14
dateTime-14
dateTime-wtz = the starting instant of the date with timezone
If dateTime-14 < dateTime-wtz THEN date-wo-tz < date-w-tz
If dateTime-wtz < dateTime+14 THEN date-w-tz < date-wo-tz
If dateTime+14 <= dateTime-wtz <= dateTime-14 THEN
date-w-tz is neither greater than nor less than (nor equal to)
date-wo-tz
So, compare date 1948-12-19+05:00 with 1948-12-18
1948-12-19+05:00 starts at 1948-12-18T19:00:00Z
1948-12-18+14:00 starts at 1948-12-17T10:00:00Z
1948-12-18-14:00 starts at 1948-12-18T14:00:00Z
Since 1948-12-18T19:00:00Z is later than
1948-12-18T14:00:00Z
1948-12-19+05:00 is greater than 1948-12-18
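The worked example can be verified with a Python sketch of the three rules. The function compare_dates is a hypothetical helper for this illustration; it assumes four-digit years and a +/-HH:MM timezone on the first argument, and it is not AllegroGraph code.

```python
from datetime import datetime, timedelta

MAX_OFFSET = timedelta(hours=14)

def compare_dates(date_w_tz, date_wo_tz):
    """Return 'greater', 'less', or 'undecided' for date_w_tz relative to date_wo_tz."""
    # Starting instant of the date that has a timezone (dateTime-wtz).
    start_w_tz = datetime.fromisoformat(f"{date_w_tz[:10]}T00:00:00{date_w_tz[10:]}")
    midnight_zulu = datetime.fromisoformat(f"{date_wo_tz}T00:00:00+00:00")
    start_plus_14 = midnight_zulu - MAX_OFFSET   # dateTime+14
    start_minus_14 = midnight_zulu + MAX_OFFSET  # dateTime-14
    if start_w_tz > start_minus_14:
        return "greater"
    if start_w_tz < start_plus_14:
        return "less"
    return "undecided"

print(compare_dates("1948-12-19+05:00", "1948-12-18"))  # greater
print(compare_dates("1948-12-18+05:00", "1948-12-18"))  # undecided
print(compare_dates("1948-12-16+05:00", "1948-12-18"))  # less
```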
Human readable representation
The Lisp functions part->concise and part->terse return strings with xsd:dates and xsd:dateTimes in a readable format:
(part->concise (value->upi "2013-01-03T10:34:00-05:00" :date-time))
=> "2013-01-03T10:34:00-05:00"
Temporal datatypes derived from dateTimes
There are various temporal datatypes derived from dateTimes, described here.
Geospatial datatypes
Geospatial datatypes are described here