Introduction
agtool is a program for performing a variety of operations on an AllegroGraph server or repository. Its general calling sequence for tools that operate on repositories is
% agtool <tool> [options] [REPO-SPEC] [arguments]
Some tools operate just on the running AllegroGraph server. The general form for such tools is
% agtool <tool> [--server SERVER-SPEC] [options] [arguments]
Most tools are documented in separate documents linked to from this document.
REPO-SPECs
A repository specification or REPO-SPEC argument identifies a repository and also contains the necessary SERVER-SPEC information. The Repository Specification document describes the REPO-SPEC format. In its concise form, a REPO-SPEC encodes all the necessary information to identify a repository. Its general form is:
[USER:PASSWORD@][[HOST][:[PORT][s]]/][CATALOG:]REPO
The various elements are:
- The scheme
- Either http or https. The default is http. The scheme is encoded by the presence or absence of the s following the PORT: https when the s is present, http when it is not.
- USER and PASSWORD
- A valid AllegroGraph user and that user's password.
- HOST
- The host on which the AllegroGraph server is running. The default is 127.0.0.1, which is the same as localhost.
- PORT
- The port which the HOST is listening on. The default is 10035.
- CATALOG
- The name of the catalog which contains the specified REPOSITORY. Catalogs are defined in the configuration file and can only be created at server startup time so the specified CATALOG must already exist. If not specified, defaults to the root catalog.
- REPOSITORY
- The repository of interest. Depending on the tool, this repository may or may not exist (some tools create repositories).
In its simplest form, a REPO-SPEC is just a repository name, which then expands to
localhost:10035/repository-name
If anything must have a value other than the default, more elements must be used. See Repository Specification for numerous examples.
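The concise form described above can be assembled mechanically. The following is a minimal sketch, not part of agtool: repo_spec is a hypothetical helper name, and its positional argument order is an assumption chosen for illustration. It always writes out the host and port, applying the documented defaults when they are omitted.

```shell
# Hypothetical helper (not part of agtool): assemble a concise REPO-SPEC.
# Usage: repo_spec REPO [CATALOG] [HOST] [PORT] [USER] [PASSWORD] [SCHEME]
# Empty or missing arguments fall back to the defaults described above.
repo_spec() {
  repo=$1 catalog=${2:-} host=${3:-127.0.0.1} port=${4:-10035}
  user=${5:-} password=${6:-} scheme=${7:-http}

  spec=""
  # USER:PASSWORD@ is optional and has no default.
  if [ -n "$user" ]; then spec="${user}:${password}@"; fi

  # An "s" following the port selects https.
  suffix=""
  if [ "$scheme" = "https" ]; then suffix="s"; fi
  spec="${spec}${host}:${port}${suffix}/"

  # CATALOG: is left out for the root catalog.
  if [ -n "$catalog" ]; then spec="${spec}${catalog}:"; fi

  printf '%s\n' "${spec}${repo}"
}

repo_spec my-repository
repo_spec repo-7 "" agmachine 10077 test xyzzy
repo_spec my-repo2 mycatalog agmachine 10077 test xyzzy https
```

The three calls print 127.0.0.1:10035/my-repository, test:xyzzy@agmachine:10077/repo-7, and test:xyzzy@agmachine:10077s/mycatalog:my-repo2. The host, user, and password values are placeholders.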
So for example, to load an Ntriples file mydata.nt into the my-repository repository, the command might be
% agtool load my-repository mydata.nt
or equivalently
% agtool load localhost:10035/my-repository mydata.nt
Tools that operate on repositories accept REPO-SPECs for identifying repositories.
Older repository specification arguments
In earlier releases, most tools which acted on repositories had options --scheme, --user, --host, --port, and --catalog (some of which had abbreviations). --catalog is no longer supported at all. Some tools still accept the remaining arguments. For those tools, use of these older arguments is deprecated and will signal a warning when used. If these arguments are accepted and used and a REPO-SPEC is also specified, values must match the values specified in the REPO-SPEC argument.
Because the required matching includes default values, some commands may have unmatched values even though they are not apparent. Note the username and password are considered part of the host but do not have default values. The --scheme argument is handled specially. Here are some examples using agtool archive backup:
# Allowed, as no host and port values are present in the REPO-SPEC:
$ agtool archive backup repo1 \
--host user:pass@host --port 30035 repo1.bak
# Allowed, as host and port values are the same.
$ agtool archive backup user:pass@host:30035/repo1 \
--host user:pass@host --port 30035 repo1.bak
# Not allowed, since --host conflicts with the REPO-SPEC.
# --host user:pass@host2 without a --port argument (as above)
# is interpreted as --host user:pass@host2:10035 which conflicts
# with user:pass@host2:30035 in the REPO-SPEC.
$ agtool archive backup user:pass@host2:30035/repo1 \
--host user:pass@host2 repo1.bak
# Not allowed, the host in the REPO-SPEC defaults to 127.0.0.1
# (i.e. localhost) which does not match the value of --host (host2).
$ agtool archive backup user:pass@30035/repo1 \
--host user:pass@host2 --port 30035 repo1.bak
The following two examples show how --scheme is handled specially.
# Allowed to override the default scheme (http).
$ agtool archive backup http://host:10035/repositories/repo1 \
--scheme https repo1.bak
# Not allowed to override non-default scheme.
$ agtool archive backup https://host:10035/repositories/repo1 \
--scheme http repo1.bak
SERVER-SPECs
A SERVER-SPEC specifies the running AllegroGraph server which the tool will operate on. If the tool operates on a repository, the SERVER-SPEC information is included in the REPO-SPEC which identifies the repository. Some tools, however, like agtool users, do not operate on repositories.
A SERVER-SPEC is typically the value of the --server option (some tools use other option names to identify servers; replicate, for example, uses --primary and --secondary). The option defaults to
--server http://127.0.0.1:10035
You must specify --server SERVER-SPEC if any of the defaults do not apply.
A SERVER-SPEC can be
HOSTNAME
HOSTNAME:PORT
SCHEME://[user:password@]HOSTNAME
SCHEME://[user:password@]HOSTNAME:PORT
Missing elements except for user and password are taken from the default above, so --server myhost becomes --server http://myhost:10035.
SCHEME can be http or https.
Note that the SERVER-SPEC does not use the same encoding as a concise REPO-SPEC. In particular, the scheme cannot be specified by an s after the port number as it can be in a concise REPO-SPEC.
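The defaulting rule for SERVER-SPECs can be illustrated with a small shell function. This is only a sketch of the rule as described above, not agtool's own parser: expand_server_spec is a hypothetical name, and the function assumes hostnames without colons (so it does not handle IPv6 literals).

```shell
# Hypothetical helper (not part of agtool): expand a SERVER-SPEC to the
# full SCHEME://[user:password@]HOSTNAME:PORT form using the defaults
# http and 10035. User and password have no default.
expand_server_spec() {
  spec=$1

  # Split off the scheme, if one is given.
  scheme=http
  case $spec in
    *://*) scheme=${spec%%://*}; spec=${spec#*://} ;;
  esac

  # Split off user:password@, if given (everything up to the last @).
  creds=""
  case $spec in
    *@*) creds="${spec%@*}@"; spec=${spec##*@} ;;
  esac

  # What remains is HOSTNAME or HOSTNAME:PORT.
  case $spec in
    *:*) host=${spec%%:*}; port=${spec##*:} ;;
    *)   host=$spec; port=10035 ;;
  esac

  printf '%s://%s%s:%s\n' "$scheme" "$creds" "$host" "$port"
}

expand_server_spec myhost
expand_server_spec myhost:10077
expand_server_spec https://user:password@myhost
```

The first call prints http://myhost:10035, matching the --server myhost example above. The other two print http://myhost:10077 and https://user:password@myhost:10035.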
If the user who started the AllegroGraph server is executing agtool on the machine on which AllegroGraph server is running, agtool will then run under the same uid that the AllegroGraph server is running under. In that case for most tools, agtool acts as if it is being run by a superuser if no user and password are specified. This behavior is called OS authentication. It is common for production versions to be run this way, and it is also typical for users who are testing personal copies.
A superuser can generally run any agtool operation. Other users can run some, depending on their actual permissions. See Managing Users for information on user permissions.
Help
% agtool --help
displays a help string giving a brief description of each tool.
% agtool TOOL --help
displays a help string for the specific TOOL.
Tools
agtool can run the following tools. Use agtool TOOL --help to get more information on any tool.
- archive -- see Repository Backup and Restore
- auto
- cancel-purge-deleted-triples
- catalogs -- create and delete dynamic catalogs and list all catalogs
- create-db -- alias for repo create
- define-attribute
- delete-attribute-definition
- delete-static-attribute-filter
- export -- see Repository Export
- get-metadata
- gruff -- see Gruff in AllegroGraph
- load -- see Data Import
- lookup-attribute-definitions
- materialize -- see Materializer
- memory-lock
- memory-unlock
- namespaces
- optimize
- purge-deleted-triples
- purge-rate-limit
- query -- see Querying using agtool
- query-options
- read-only-mode
- recover -- see Point-in-Time Recovery
- repl -- see Multi-master Replication
- repos -- create new repos, delete existing ones, and list all repos
- replicate -- see Replication
- roles -- see Managing Users
- shacl-validate -- see SHACL
- scheduler -- see Event Scheduler
- set-purge-rate-limit
- set-static-attribute-filter
- storage-report
- tokens
- triple-count
- upgrade -- see the agtool upgrade program in the Repository Upgrading document
- users -- see Managing Users
- version
- view-tlog
- vload and virtualized graphs
There are additional agtool tools that are not documented here. Some provide information useful for dealing with problem reports, and users may be asked to run those as part of dealing with a problem report. Others are associated with other AllegroGraph features and should only be run as part of using those features. Those tools are documented with the feature rather than here.
Most agtool tools take the same options as the program they are replacing, but some have either new options or no longer accept previous options (usually the abbreviation is not accepted but the long form is). These changes are noted for each relevant tool.
Accessing and operating on files on Amazon S3
The following options allow some agtool commands to access files located on Amazon S3.
- --aws-access-key-id ACCESS-KEY-ID
- The Amazon access key id.
- --aws-configuration-file CONFIGURATION
- The location of the Amazon configuration file. Default: ~/.aws/config. See also the description of --aws-credentials-file just below.
- --aws-credentials-file CREDENTIALS
- The location of the Amazon credentials file. The default location is ~/.aws/credentials. See also the description of --aws-configuration-file just above.
- --aws-profile PROFILE
- The profile to select in the Amazon configuration file. This defaults to the default profile if --aws-access-key-id and --aws-secret-access-key are not specified.
- --aws-secret-access-key SECRET-ACCESS-KEY
- Amazon secret access key.
Either --aws-profile or both --aws-access-key-id and --aws-secret-access-key must be given in order to change the source or destination to S3. The profile name refers to an entry in the file indicated by --aws-configuration-file, which defaults to ~/.aws/config.
These options can be used with agtool commands that operate on (read or write) files that are stored on Amazon S3. To denote such files, preface the file path with s3://. For example
s3://bucketname/a/b/c/filename
Tools that operate on files include load, export, and archive.
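As an illustration of how the S3 options and an s3:// path fit into a command line, the sketch below prints (rather than runs) an agtool load command. s3_load_command is a hypothetical helper, and the profile, repository, bucket, and file names are placeholders. The option placement is an assumption based on the agtool load calling template (options first, then the REPO-SPEC, then the sources).

```shell
# Hypothetical helper: print an agtool load command that reads from S3.
# Nothing is executed; the command line is only written to standard output.
# Usage: s3_load_command PROFILE REPO-SPEC BUCKET KEY
s3_load_command() {
  profile=$1 repo_spec=$2 bucket=$3 key=$4
  # The source is marked as an S3 object by the s3:// prefix.
  printf 'agtool load --aws-profile %s %s s3://%s/%s\n' \
    "$profile" "$repo_spec" "$bucket" "$key"
}

s3_load_command default my-repository bucketname a/b/c/mydata.nt
```

This prints agtool load --aws-profile default my-repository s3://bucketname/a/b/c/mydata.nt, which can then be reviewed and run by hand.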
Temporary credentials
AWS allows users to generate temporary credentials that include a session token in addition to the usual access key ID and secret access key. When temporary credentials are used, failure to specify the session token results in a 403 HTTP response with an error message like
<Code>
InvalidAccessKeyId
</Code>
<Message>
The AWS Access Key Id you provided does not exist in our
records.
</Message>
...
Session tokens can be included in the .aws/config or .aws/credentials files under the field name aws_session_token or specified explicitly as command line arguments, as in the following agtool archive command:
$ agtool archive --aws-access-key-id <key-id> \
--aws-secret-access-key <access-key> \
--aws-session-token <session-token> \
backup <repository> s3://<bucket-name>
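For the file-based alternative, the session token sits next to the other two keys in a profile section. The sketch below prints such a section in the standard AWS credentials file layout. credentials_stanza is a hypothetical helper, and the values passed to it are placeholders, not real credentials.

```shell
# Hypothetical helper: print a profile section for ~/.aws/credentials
# that carries temporary credentials, including the session token.
# Usage: credentials_stanza PROFILE KEY-ID SECRET-KEY SESSION-TOKEN
credentials_stanza() {
  cat <<EOF
[$1]
aws_access_key_id = $2
aws_secret_access_key = $3
aws_session_token = $4
EOF
}

# Placeholder values only; substitute the values issued by AWS.
credentials_stanza default KEY-ID-PLACEHOLDER SECRET-PLACEHOLDER TOKEN-PLACEHOLDER
```

The printed section can be appended to the credentials file, after which --aws-profile default selects it without any of the three values appearing on the command line.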
Attribute and filter support
AllegroGraph supports triple attributes (key/value pairs associated with individual triples) and static filters (statements that can restrict access to triples based on their attribute values). These features are described in the Triple Attributes document.
Attributes can only be associated with a triple when the attribute has been defined. agtool can be used to define attributes (agtool define-attribute), delete attribute definitions (agtool delete-attribute-definition), look up attribute definitions (agtool lookup-attribute-definitions), and set and delete a static attribute filter (agtool set-static-attribute-filter and agtool delete-static-attribute-filter). To see the calling sequence for these commands, execute
% agtool COMMAND --help
For example
% agtool define-attribute --help
The archive tool
Command calling template:
% agtool archive [options] command [command-args]
options are prefixed with double dashes (single dash in some cases) and may also take arguments. To get usage information, enter
% agtool archive --help
The commands which write to or read from files (backup, backup-all, backup-settings, restore, restore-all, and restore-settings) are passed a directory and perhaps additional information like a database name. The specific files are located and named within that directory following standard rules (described here). Unless the --supersede option is specified to backup commands, the archive directory must either not exist or be empty.
agtool archive can be used to upgrade a database from the format of one AllegroGraph release to the format of a later release. This is the recommended way to upgrade since it provides a backup copy of the database in the earlier format, which allows easy recovery if there are problems with the later version. The agtool upgrade tool does not (by itself) back up a database before upgrading.
See the Repository Backup and Restore document for a complete list of commands and options.
The auto tool
The auto tool sets up background tools that run to perform certain tasks, either forever (until the server is restarted) or for a specified period of time. In this release, there is only one task, optimize, which we describe now.
agtool auto optimize [options] repo-spec
The options for optimize are:
- --idle SECONDS
- An index must be idle for this number of seconds (default 100) before being considered for optimization, which in most cases prevents optimization from interfering with queries and other tasks.
- --interval SECONDS
- Seconds between checks for possible optimization. Default: 120.
- --operate SECONDS
- Operate for this many seconds and then exit (optimizations which have started will complete). Zero means run forever (until the agtool process is interrupted with Control-C or killed). Default: 0.
- --quiet YES-OR-NO
- If yes, messages will only be printed when there is an index to optimize. If no, messages are printed every interval number of seconds about what the optimizer is doing. Default: no.
- --verbose
- If specified, print more information. Can be specified multiple times.
- --workers POSITIVE-INTEGER
- Maximum number of optimization operations to perform at once. Default: 1.
Once auto optimize is started, it optimizes indices whose Oscores are greater than 1. The automatic optimizer can also be started and stopped in AGWebView. Triple indices are described in the Triple Indices document. See particularly the Optimizing Indices section.
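To show how the options above combine, the sketch below prints an agtool auto optimize command line with the documented defaults filled in for any value that is not supplied. auto_optimize_command is a hypothetical helper and my-repository is a placeholder REPO-SPEC. The command is only printed, not run.

```shell
# Hypothetical helper: print an "agtool auto optimize" command line.
# Usage: auto_optimize_command REPO-SPEC [IDLE] [INTERVAL] [OPERATE] [WORKERS]
# Omitted values use the defaults listed above (100, 120, 0 and 1).
auto_optimize_command() {
  repo_spec=$1
  idle=${2:-100} interval=${3:-120} operate=${4:-0} workers=${5:-1}
  printf 'agtool auto optimize --idle %s --interval %s --operate %s --workers %s %s\n' \
    "$idle" "$interval" "$operate" "$workers" "$repo_spec"
}

# All defaults: check every 120 seconds, run until interrupted.
auto_optimize_command my-repository

# Require 300 idle seconds, check every 60 seconds, stop after one hour.
auto_optimize_command my-repository 300 60 3600
```

The second call prints agtool auto optimize --idle 300 --interval 60 --operate 3600 --workers 1 my-repository.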
The catalogs tool
The catalogs tool can be used to create and delete dynamic catalogs, which are described in the Catalog definitions section of the Server Configuration and Control document. You must have a DynamicCatalog definition in your configuration file to be able to create dynamic catalogs. If you do have such a definition, then this command will create one or more new dynamic catalogs:
agtool catalogs create SERVER_SPEC CATALOG_NAME*
Dynamic catalogs can be deleted with
agtool catalogs delete SERVER_SPEC CATALOG_NAME*
Deleting a dynamic catalog also deletes all the repos it contains.
Finally, all catalogs (dynamic and static) are listed by
agtool catalogs list [SERVER_SPEC]
The SERVER_SPEC can only be left out if the server is http://localhost:10035.
The repos tool
The repos tool can be used to:
- Create new repos including multi-master replica sets and FedShard distributed repos
- Delete existing repos
- List existing repos
We describe each action in turn.
Using 'agtool repos create' to create new or supersede existing repos
agtool repos create will create new ordinary repos, new multi-master replica sets, each of which has a controlling instance and one or more replicas, or new distributed repositories.
The older tool agtool create-db is still supported but simply as an alias for the agtool repos create command.
Here we create a new, ordinary repository:
agtool repos create http://user:password@host:port/repositories/newrepo
We have provided a complete REPO_SPEC. Bits can be left out, as described in the REPO_SPEC section above.
If newrepo already exists, the command will fail unless the --supersede option is specified, in which case the existing repo will be deleted (along with all its data) and the new empty repo will be created.
Here we create a new multi-master replica set with a controlling instance and a single additional instance:
agtool repos create '<http://test:xyzzy@localhost:10700/repositories/mmrep1>|<http://test:xyzzy@localhost:10700/repositories/mmrep2>'
Some notes:
- Neither repo (mmrep1 and mmrep2) should already exist.
- At least two repos must be specified, with specs enclosed in angle brackets and separated by a |.
- The repo specs must be enclosed in quotes (as shown) so the | is not interpreted as a shell character.
- The --supersede option is not supported.
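The angle-bracket and | syntax from the notes above is easy to get wrong by hand, so it may help to build the argument programmatically. The following sketch uses a hypothetical helper, replica_set_spec, which is not part of agtool. It enforces the two-repo minimum and produces a single string to be passed, quoted, to agtool repos create.

```shell
# Hypothetical helper: join two or more repo specs into the
# '<spec1>|<spec2>|...' form used to create a multi-master replica set.
replica_set_spec() {
  if [ "$#" -lt 2 ]; then
    echo "replica_set_spec: at least two repo specs are required" >&2
    return 1
  fi
  out=""
  for spec in "$@"; do
    # Separate entries with | and wrap each one in angle brackets.
    if [ -n "$out" ]; then out="${out}|"; fi
    out="${out}<${spec}>"
  done
  printf '%s\n' "$out"
}

replica_set_spec \
  http://test:xyzzy@localhost:10700/repositories/mmrep1 \
  http://test:xyzzy@localhost:10700/repositories/mmrep2
```

The output is the same argument used in the example above. When passing it on, keep it in double quotes, for example agtool repos create "$(replica_set_spec SPEC1 SPEC2)", so that the shell does not interpret the | or the angle brackets.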
Here we create a new FedShard distributed repository cluster.
ag710/bin/agtool repos create http://test:xyzzy@localhost:10700/repositories/ClusterRepoName
The --supersede option is not supported.
Using 'agtool repos delete' to delete repos
ag710/bin/agtool repos delete REPO_SPEC
Using 'agtool repos list' to list repos
agtool repos list [SERVER_SPEC]
lists all existing repos.
The SERVER_SPEC can only be left out if the server is http://localhost:10035.
The materializer tool
Command calling template:
% agtool materialize [options] command [command-args]
See the Command Line Interface section in the Materializer document for details on the options.
The gruff tool
Gruff is a graph visualization and graphical query builder designed to work with AllegroGraph. It runs in a browser initiated by links from AGWebView. The gruff tool can be used to manage Gruff in AllegroGraph. See Gruff in AllegroGraph.
The namespaces tool
With agtool namespaces you can list, add and remove namespaces that will be used in queries. See the Namespaces and query options document. Here is a brief example:
% agtool namespaces add user:pass@aghost:10035/repo \
ex 'http://example.org#' \
franz 'http://franz.com#'
This adds the namespace prefixes ex and franz to the user namespace list for the repo.
The optimize tool
See also agtool auto optimize above. The optimize command calling template is
% agtool optimize [ OPTIONS ] REPO-SPEC
The options are:
- --indices INDICES
- A comma separated list of indices to optimize (such as spogi,psogi). If unspecified or all, all indices are optimized.
- --level 2-or-3
- Optimization effort level: 2 (the default) writes out in-memory triple data and then performs an index optimization of all overlapping chunks; 3 is the same as 2, but will process all chunks regardless of overlap (useful for ensuring that all purgeable deleted triples are purged).
- --wait BOOLEAN
- If true, do not return until optimization completes. Default is false.
The query tool
Command calling template:
% agtool query [options] REPO-SPEC [QUERY-FILE]*
See the Querying using agtool document for details on the arguments and options.
The query-options tool
With agtool query-options you can list, add and remove query options that will be used in queries. See the Namespaces and query options document. Here is a brief example:
% agtool query-options add user:pass@aghost:10035/repo \
engine mjqe \
memoryLimit 2G
This sets the query engine to mjqe and the memoryLimit to 2G. agtool query-options list REPO prints the options in effect. agtool query-options remove REPO QUERY-OPTIONS removes the listed options.
The read-only-mode tool
Command calling template:
% agtool read-only-mode COMMAND [ OPTIONS ] REPO-SPEC
COMMAND can be:
- help - Display read-only-mode usage.
- enable - Enable read-only (no-commit) mode.
- disable - Disable read-only (no-commit) mode.
- show - Show if given repository is currently in read-only (no-commit) mode.
When a repository is in read-only mode, no commits may be made. Although triples can be added and deleted in a session as usual, those changes will not be seen by other users and cannot be committed. Only a superuser can enable or disable read-only-mode. (See the Managing Users document for information on superusers.)
The export tool
Command calling template:
% agtool export [options] REPO-SPEC FILE
This will export the data in the repository identified by REPO-SPEC to FILE. For example,
% agtool export --output-format rdfxml user1:my-pw@agmachine/lesmis lesmis.rdf
exports the data in the lesmis repo (in the root catalog, since no catalog is specified) to the file lesmis.rdf. The format is rdfxml. user1 must have read permission in lesmis.
See the Repository Export document for details on the options.
The load tool
The agtool load tool can be used to load data into a store from a file or a collection of files. See the Data Import document for details.
The tool calling template is:
% agtool load [options] REPO-SPEC SOURCE*
where REPO-SPEC identifies the repository into which the data should be loaded (the catalog can be specified as part of the REPO-SPEC, as can user and connection information). The SOURCEs are the file(s) to be loaded. Names of files on Amazon S3 must be preceded by s3://. The SOURCE can be standard input. Numerous options control the loading process. All are described in the Data Import document.
agtool load can be run on the same machine as the AllegroGraph server by the same user who started the server, in which case no username or password is required, or on a different machine and/or by a different user, in which case the username and password of a user with AllegroGraph superuser privileges must be specified as part of the REPO-SPEC.
Here are a couple of examples of typical uses. In this first example, the user executing the command must be the user who started the AllegroGraph server and the command must be run on the same machine as the server:
% agtool load my-repository mydata.nt
Load the Ntriples file mydata.nt into the store my-repository in the root catalog. Since no port is specified, the port is the default, 10035.
In this example, the command is again run on the same machine as the AllegroGraph server and by the user who started AllegroGraph. A non-default port is used.
% agtool load 10077/mycatalog:my-repo2 mydata-2.ttl mydata-3.ttl
Load the turtle format files mydata-2.ttl and mydata-3.ttl (the format determined by the file extensions) into the store my-repo2 in the catalog mycatalog. The server is listening on port 10077.
Finally, a call from a different machine. Here the host running the server must be specified by name and the user and password must be provided:
% agtool load test:xyzzy@agmachine:10077/repo-7 mydata.ttl mydata.nq
Load the turtle format file mydata.ttl and an NQUAD format file mydata.nq (the format determined by the file extensions) into the store repo-7 in the (default) root catalog. The server is running on the host agmachine and is listening on port 10077.
The recover tool
Command calling template:
% agtool recover [options] archive database
See the Point-in-Time Recovery document for details on the options and general usage.
The repl tool
The repl tool allows you to manage a multi-master replication cluster. See Using the agtool repl command in the Multi-master Replication document for more information.
A multi-master cluster should not be confused with single-master replication described below. Single-master replication allows for one master repository, where modifications (adding and deleting triples) can be made and replicas which cannot make modifications. Multi-master replication allows each replica to make modifications.
The replicate tool
The replicate tool allows you to set up single-master replication where you have several identical instances of AllegroGraph running simultaneously, so if one fails one of the others can take over. The replicate command calling template is:
% agtool replicate [options]
See the Replication Details document and also the Replication document for more information on agtool replicate and its options.
Single-master replication should not be confused with multi-master replication, described above.
The storage-report tool
This tool (command line agtool storage-report repo-spec) shows storage used by the specified repository. Each index is listed with its storage use. The percentage of space taken by deleted triples is also displayed. See the Purging Deleted Triples document for more information on deleted triples.
Here is a portion of a storage report for a repo (information on some indices is deleted, as shown by the suspension points [...]):
% ag710/bin/agtool storage-report http://test:xyzzy@localhost:10700/repositories/sapnew
## DB storage report: sapnew
-- Mem chunks --
Chunk #xffffd44df8c (offset #x1e7698c): Type: mem, Refcount: 7, Used: 68896
-- Flavors --
[ 0] spogi: style: 1 last-id: 0 #chunks: 1 Oscore: 1.000, size: 1.5 MiB (1,560,576), deleted 27%, 15.61 bytes/triple, idle
[ 8] posgi: style: 1 last-id: 0 #chunks: 1 Oscore: 1.000, size: 2.0 MiB (2,146,304), deleted 27%, 21.46 bytes/triple, idle
[... lines deleted for space]
[22] gospi: style: 1 last-id: 0 #chunks: 1 Oscore: 1.000, size: 1.9 MiB (2,035,712), deleted 27%, 20.36 bytes/triple, idle
[24] i : style: 1 last-id: 0 #chunks: 1 Oscore: 1.000, size: 1.3 MiB (1,388,544), deleted 27%, 13.89 bytes/triple, idle
Total live disk chunks size : 12.2 MiB (12,836,864)
Triple count : 142,009
Total live disk chunk triples : 700,000
Index triples deleted : 16%
Bytes/triple/index (disk chunks only): 18.338
Active index count : 7
Bytes/triple (disk chunks only) : 128.369 (assumes each triple exists in all active indices)
Obsolete chunks : 0 (0.0% of all available)
Total obsolete disk chunks size : 0 bytes (0)
nil
%
The Oscore value of each index shows the number of index chunks which must be examined to find triples matching a pattern. The optimal Oscore value is 1.0. When you see values higher than 1.0, consider optimizing an index, as queries will run faster when Oscores are lower (sometimes very significantly faster). See agtool optimize.
The information on each index includes the approximate percentage of deleted triples and the summary at the end also shows the approximate percentage of deleted triples. A graphical display of much of this information is found in the Triple indices graph, one of the AGWebView storage reports.
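The Oscore values in the per-index lines of the report can be picked out with standard text tools. The sketch below assumes the line layout shown in the sample report above (lines beginning with [N], then the index name, with Oscore: followed by a number and a comma). high_oscore_indices is a hypothetical helper, and the sample lines it is fed use made-up Oscore values for illustration. It prints a comma separated list suitable for the --indices option of agtool optimize.

```shell
# Hypothetical helper: read storage-report text on standard input and
# print a comma separated list of the indices whose Oscore exceeds 1.
high_oscore_indices() {
  awk '
    /^\[.*Oscore:/ {
      # Index name: text after the "[N]" prefix, up to the first colon.
      name = $0
      sub(/^\[[ 0-9]*\][ ]*/, "", name)
      sub(/[ ]*:.*/, "", name)
      # Oscore value: the number between "Oscore:" and the next comma.
      score = $0
      sub(/.*Oscore:[ ]*/, "", score)
      sub(/,.*/, "", score)
      if (score + 0 > 1.0) { out = out sep name; sep = "," }
    }
    END { if (out != "") print out }'
}

# Illustrative input only: the Oscore values here are invented.
sample_report='[ 0] spogi: style: 1 last-id: 0 #chunks: 3 Oscore: 2.500, size: 1.5 MiB, idle
[ 8] posgi: style: 1 last-id: 0 #chunks: 1 Oscore: 1.000, size: 2.0 MiB, idle
[24] i    : style: 1 last-id: 0 #chunks: 2 Oscore: 1.750, size: 1.3 MiB, idle'

indices=$(printf '%s\n' "$sample_report" | high_oscore_indices)
if [ -n "$indices" ]; then
  # Print, rather than run, the matching optimize command.
  printf 'agtool optimize --indices %s --wait true %s\n' "$indices" my-repository
fi
```

In practice the input would come from agtool storage-report REPO-SPEC rather than from a sample string, and my-repository is a placeholder REPO-SPEC.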
The upgrade tool
agtool upgrade will do an in-place upgrade of a single database from an earlier release to the current release (the release containing the agtool being run). The upgrade is done without backing the database up. Backing up databases prior to upgrading is strongly recommended. See Repository Backup and Restore.
Command calling template:
% agtool upgrade GROUND-REPO-SPEC
A GROUND-REPO-SPEC specifies a single repository (not one that is federated or includes reasoning). See the agtool upgrade program in the Repository Upgrading document for further details. Be sure to back up any database before upgrading.
Managing users and roles
agtool users can be used to add and remove users and to set user permissions and access rights. agtool roles can be used to create and manage roles. See Managing Users for information on managing users with agtool and with AGWebView.
SHACL, the SHApe Constraint Language supported by AllegroGraph, can be used to verify that a set of triples meets various constraints. The validation is done by a call to agtool shacl-validate. See the SHACL document for more information.
The event scheduler tool
Users can schedule events to run scripts at specified times in the future. The scripts can be run once or repeated on a regular schedule. See the Event Scheduler document for more information. The utility for using agtool to schedule events is scheduler. The various commands are:
bin/agtool scheduler --help
Usage: agtool scheduler COMMAND [ OPTIONS ] ...
where COMMAND is one of:
* help - Display scheduler usage.
* delete-events - Delete events.
* log - Display the log of events
* events - List pending events.
* create-event - Create a new event.
To describe a particular command run:
agtool scheduler COMMAND --help
The triple-count tool
The triple-count tool returns the number of triples in a repository. The command format is
agtool triple-count REPO-SPEC
For example, running on the same computer as the AllegroGraph server and as the same user who started the server (so username:password is not needed) we find the kennedy database example has 1214 triples:
% agtool triple-count localhost:10035/kennedy
1214
The vload tool and virtualized graphs
The vload tool communicates with an instance of the Ontop program which materializes a relational database as RDF triples. agtool vload then loads the materialized triples. See the Creating Virtualized Graphs document for more information.