Introduction
AllegroGraph supports online full backups. A full backup contains all data files and information required to restore a database to the state it was in at the time of the backup. Backups are performed while the database is running.
The backup program is used both for backing up data for use later by the same server and for backing up data which will be upgraded to a new (later) version.
The backup utility backs up the data (all the triples) and the server configuration. This makes upgrading straightforward. See below and Database Upgrade.
Backing up/restoring AllegroGraph releases prior to 4.12.2
Please contact [email protected] if you wish to back up data from a running AllegroGraph release prior to 4.12.2 or restore data backed up from such a release. Various programmatic changes mean the procedure is different from what is described in this document.
Backing up/restoring AllegroGraph releases prior to 6.2.0
In releases prior to 6.2.0, the program for backing up and restoring data was agraph-backup. Starting in release 6.2.0, it is the agtool archive program. (Most command-line programs have been folded into the single agtool program, whose first argument specifies the utility to run.) The calling templates for the older agraph-backup and the new agtool archive are the same. Wherever this document says to run agtool archive, run agraph-backup instead if you are using a release prior to 6.2.0.
The agtool archive program
The single program agtool archive does both backing up and restoring, and a few additional bookkeeping tasks. The general calling template for agtool archive is:
agtool archive [options] command [command-args]
options are prefixed with double dashes (single dash in some cases) and may also take arguments.
The commands which write to or read from files (backup, backup-all, backup-settings, restore, restore-all, and restore-settings) are passed a directory and perhaps additional information like a database name. The specific files are located and named within that directory following standard rules described below. Unless the --supersede option is specified to backup commands, the archive directory must either not exist or be empty. See below for more information.
The supported commands are as follows:
- backup dbname archive
- This command saves the contents of the database dbname to a file in the directory named by archive. See more on this command, including details of the options, below.
- backup-all archive
- This command backs up all databases in the running AllegroGraph server to the directory archive. It also backs up settings (unless the --nosettings option is specified). See below for more on this command, including details of the options.
- restore dbname archive [archive-dbname]
- This command restores the data from archive-dbname from the archive archive into dbname. If archive-dbname is omitted, the archived repository searched for is dbname. archive can also be a specific backup file, which will be restored into dbname. When archive is a file, archive-dbname, even if supplied, is ignored. See below for more on this command, including details of the options.
- restore-all archive
- This command restores multiple databases and settings information previously archived in archive by the backup-all command. See below for more on this command, including details of the options.
- backup-settings archive
- This command backs up the settings information (users, stored queries/procedures, etc.). The settings themselves are copied directly from the filesystem. Knowledge of where the settings reside comes from either a running server or the config file. See below for more on this command, including details of the options.
- restore-settings archive
- This command restores the settings information (users, stored queries/procedures, etc.) stored in the settings subdirectory of archive. If settings are restored, the server must be restarted for them to take effect. See below for more on this command, including details of the options.
- list archive-file
- This command shows the contents of archive-file, which must be a backup file that contains repository data. See below for more details.
- no options or commands specified
- agtool archive called with no options and no command prints help documentation.
Backup archive directory structure
Most commands take a directory as a location argument. The <archive> directory has two subdirectories, archives/ and settings/. Here is the structure of the archives/ directory for a repository:
<archive>/archives/<catalog>/<dbname>/
This directory contains repository data for <dbname> as well as settings specific to that repository. Triples data is stored in a file named <dbname>.agbackup in that directory.
Server settings information is stored in files and subdirectories of
<archive>/settings/
agtool archive can find the files it needs based on the supplied archive directory, catalog name (which defaults to 'root'), and database name. There is in general no need to refer to specific filenames, although the restore command will accept a file argument instead of a directory. This allows restoring a single database from an earlier release.
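As a rough illustration of the layout described above, the following shell sketch builds the expected path of a repository's backup file from an archive directory, a database name, and an optional catalog name. The function name agbackup_path is hypothetical, and the sketch assumes that the root catalog's directory is literally named root, as the default mentioned above suggests.

```shell
# Hypothetical helper: print the path where the triples data for a
# repository is expected to live inside an archive directory:
#   <archive>/archives/<catalog>/<dbname>/<dbname>.agbackup
# Assumption: the default catalog directory is named "root".
agbackup_path() {
    archive=$1
    dbname=$2
    catalog=${3:-root}
    printf '%s/archives/%s/%s/%s.agbackup\n' \
        "$archive" "$catalog" "$dbname" "$dbname"
}

# Example call (the paths are placeholders):
#   agbackup_path /backups/nightly my-db
```

A helper like this is only useful for inspection or for passing a single file to the restore or list commands; the other commands need just the archive directory.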
Note that this structure is new in release 4.12.2. agtool archive knows the directory structures of earlier releases, so upgrading also only requires a directory name, but the information in this document does not apply to earlier releases.
Overwriting files and databases
If a non-empty directory is given as an argument to the backup, backup-all, or backup-settings commands, these commands will fail (with an error message) unless the --supersede option is specified. If --supersede is specified, the contents of the <archive> directory will be removed entirely, and then the new backup files will be written. Warning: you cannot update the backup of a specific database with the backup command, or update saved settings with backup-settings, in an existing archive directory, because if you specify --supersede, all data in the archive directory is deleted.
Similarly, if you try to restore an existing database, that restore will fail unless the --supersede option is specified. When the --supersede option is specified, repositories that exist on the server but not in the archive will not be removed.
The restore-settings command ignores the --supersede option because there are always settings, so restoring settings of necessity overwrites existing settings.
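Because --supersede wipes the whole archive directory, it can be safer to check the directory first and refuse to run rather than supersede. The following is a minimal sketch of such a check; the function names archive_is_fresh and safe_backup_all are hypothetical, and the port and path are supplied by the caller.

```shell
# Hypothetical pre-flight check: succeed only if the archive directory
# does not exist or is an empty directory, i.e. the case in which a
# backup command is allowed to run without --supersede.
archive_is_fresh() {
    dir=$1
    [ ! -e "$dir" ] && return 0     # does not exist: fine
    [ -d "$dir" ] || return 1       # exists but is not a directory
    # ls -A prints every entry except . and .. ; no output means empty
    [ -z "$(ls -A "$dir")" ]
}

# Run backup-all only when the archive directory is fresh.
safe_backup_all() {
    port=$1
    archive=$2
    if archive_is_fresh "$archive"; then
        agtool archive --port "$port" backup-all "$archive"
    else
        echo "refusing to back up: $archive exists and is not empty" >&2
        return 1
    fi
}
```

The check only avoids a failed call or an accidental wipe; agtool archive still performs its own test of the directory.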
agtool archive notes
agtool archive must be run by the user id under which the AllegroGraph server is running (commonly the agraph user). The agtool archive program must be run on the same machine as the AllegroGraph server.
There are many options and most apply to only certain commands. Specifying an option not relevant to the command specified is not an error. The irrelevant option will be ignored.
Using agtool archive to upgrade
When you have a new version of AllegroGraph, you can migrate your databases from the old version to the new using agtool archive. Let us say you are upgrading from AllegroGraph 6.1 to AllegroGraph 6.2.1. The steps are:
Start the AllegroGraph 6.1 server using port P. Choose a directory D (which must not exist) for the backup archive.
su to the user id under which the AllegroGraph server is running (usually the agraph user).
Run the 6.1 version of agraph-backup with the backup-all command (for additional options, see the AllegroGraph 6.1 documentation -- note we call agraph-backup because agtool was introduced in version 6.2.0):
agraph-backup --port P backup-all D
Stop the AllegroGraph 6.1 server.
Start the 6.2.1 server, using port P1.
Run the 6.2.1 version of agtool archive using port P1 with the restore-all command (see below for the full set of command options):
agtool archive --port P1 restore-all D
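The two command-line steps of the upgrade can be wrapped in a small script. This is only a sketch: the function names, the OLD_BACKUP and NEW_AGTOOL variables, and the install paths they default to are placeholders rather than AllegroGraph defaults, and stopping the old server and starting the new one are left out.

```shell
# Placeholder locations of the two program versions (hypothetical paths).
OLD_BACKUP=${OLD_BACKUP:-/opt/agraph-6.1/bin/agraph-backup}
NEW_AGTOOL=${NEW_AGTOOL:-/opt/agraph-6.2.1/bin/agtool}

# Backup step: with the old server listening on port $1, back up all
# databases and settings to directory $2, which must not exist yet.
upgrade_backup() {
    old_port=$1
    archive=$2
    if [ -e "$archive" ]; then
        echo "archive directory $archive already exists" >&2
        return 1
    fi
    "$OLD_BACKUP" --port "$old_port" backup-all "$archive"
}

# Restore step: with the new server listening on port $1, restore
# everything found in directory $2.
upgrade_restore() {
    new_port=$1
    archive=$2
    "$NEW_AGTOOL" archive --port "$new_port" restore-all "$archive"
}

# Between the two calls, stop the old server and start the new one.
# Run both steps as the user id under which the server runs.
```

Keeping the two steps as separate functions makes it possible to check the archive directory before the old server is stopped.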
Please contact [email protected] if you plan to restore AllegroGraph v4.0 backups into AllegroGraph v4.1 or newer, or any backups from a pre-4.0 version of AllegroGraph.
The backup and backup-all commands
The backup and backup-all commands back up a single database or all databases, respectively. backup-all additionally backs up settings unless the --nosettings option is specified. backup takes a database (also called repository) name and an archive directory as its arguments. backup-all takes an archive directory as its argument.
The archive must either not exist or be empty unless the --supersede option is specified. If --supersede is specified, and the archive exists and is not empty, archive will be cleared of all contents before new data is written to it. Therefore, you cannot update the backup of a single database to an existing archive with backup because other data in the directory will be deleted.
For backup only, <archive> can be -, which means send the output to standard output rather than to a file. See below for details.
With either backup command, database data archives are written to files named <archive>/archives/<catalog>/<dbname>/<dbname>.agbackup.
Here are the relevant options. Unless indicated, they apply to either command:
- --port port | -p port
- the port with which to communicate with the running server.
- --config config-file
- if supplied, settings location will come from the config file. This argument can be supplied but is not necessary as the config file location is available from the server, which must be running and listening on the specified port.
- --catalog
- the catalog that contains the repository. --catalog defaults to the root catalog. Not relevant for backup-all (where all catalogs are always saved).
- --supersede
- If specified, an existing, non-empty archive directory will be emptied (all existing files and subdirectories removed) before its standard directory structure is reestablished, and the desired backup files are written (for a single database for backup, for all databases and for settings for backup-all).
- --nosettings
- For backup-all, do not save settings (such as users, roles and stored procedures). For backup, do not save settings for the specified repository. This is an unusual option in either case.
- --ext-backup-command
- This option has been deprecated.
So, the template for a call to agtool archive using the backup command is:
agtool archive [--port port | -p port] [--supersede] [--nosettings] [--catalog catalog] backup <dbname> <archive>
And the template using the backup-all command is
agtool archive [--port port | -p port] [--supersede] [--nosettings] backup-all <archive>
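Since an archive directory cannot be updated in place, one common pattern is to write each backup to a new, dated directory. The sketch below shows one way to do that for a single repository. The function name backup_db_dated and the directory naming scheme are hypothetical, and the sketch assumes that the parent directory already exists and is writable by the user running the server.

```shell
# Hypothetical wrapper: back up one repository into a fresh, dated
# archive directory so that --supersede is never needed.
backup_db_dated() {
    port=$1
    dbname=$2
    base=$3                 # parent directory holding all archives
    catalog=${4:-}          # empty means the root catalog
    archive="$base/$dbname-$(date +%Y%m%d-%H%M%S)"
    if [ -n "$catalog" ]; then
        agtool archive --port "$port" --catalog "$catalog" \
            backup "$dbname" "$archive"
    else
        agtool archive --port "$port" backup "$dbname" "$archive"
    fi
}

# Example call (port, name and path are placeholders):
#   backup_db_dated 10035 my-db /var/backups/agraph
```

Old archive directories then have to be pruned separately, which is outside the scope of this sketch.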
The restore and restore-all commands
The restore and restore-all commands restore one or more databases archived with agtool archive with the backup or backup-all commands. restore-all also restores settings if present, unless --nosettings is specified. If settings are restored, you must restart the server before continuing.
During a restore, a progress report is periodically printed to stdout, showing the fraction of the data file which has been processed, and an estimate of the completion time.
The catalog specified by --catalog (defaults to root) must exist for restore. For restore-all, all archived catalogs must exist (the subdirectories of <archive>/archives/ are catalog names) in the running server.
Here are the relevant options. Unless indicated, they apply to either command:
- --port port | -p port
- the port with which to communicate with the running server.
- --config config-file
- if supplied, settings location will come from the config file. This argument can be supplied but is not necessary as the config file location is available from the server, which must be running and listening on the specified port.
- --nocommit
- When specified, the newly restored database will be set to no-commit mode. no-commit mode means that the database will not accept commits. Replicas must be created in no-commit mode. See Replication.
- --newuuid
- When specified, a new uuid will be generated for the database.
- --recover
- Shorthand for specifying both --nocommit and --newuuid.
- --replica
- When specified, the restored database is assumed to be a replication secondary (warmstandby client). It will cause --nocommit to be set. This flag is incompatible with --newuuid and --recover. Warm standby is discussed in Replication Details.
- --noconvert
- Only relevant when the AllegroGraph version which created the archive is different (earlier) than the one which is restoring the archive. Normally, the restore operation converts such archives to the new version. When --noconvert is specified, it does not do that conversion. If you specify this option, you will need to run the agtool upgrade command before accessing the restored database.
- --catalog
- The catalog that contains the repository. This catalog must exist in the server. Defaults to root. This argument is ignored by restore-all.
- --nosettings
- With restore-all, when specified, do not restore settings information from the restored archive. (If backup-all was called with the --nosettings option, the settings will not be available in any case.) With restore, when specified, do not restore repository settings information.
- --supersede
- if specified, existing databases in the server will be overwritten if backup data exists in <archive> (for the single database being restored by restore or all databases for restore-all).
So, the template for a call to agtool archive using the restore command is:
agtool archive [--port port | -p port] [--nocommit] [--newuuid] [--recover] [--replica] [--catalog catalog] [--supersede] restore <dbname> <archive> [<archive-dbname>]
And the template using the restore-all command is
agtool archive [--port port | -p port] [--nocommit] [--newuuid] [--recover] [--replica] [--nosettings] [--supersede] restore-all <archive-dir>
The archive argument for restore and restore-all should name a directory created by the backup command or the backup-all command. (Directories created by earlier AllegroGraph versions will also work.)
For restore, <archive> may also name a backup file or may be -, which means get the input from standard input rather than from a file. See below for details. If the optional <archive-dbname> is given, it is the repository name looked for, which will be restored to <dbname>, allowing you to change the name of the repository you are restoring. If omitted, <archive-dbname> defaults to <dbname>. If <archive> is a file or -, <archive-dbname> is ignored even if supplied.
The dbname argument for the restore command is the name that will be given to the restored database. For restore-all the names will be taken from the various archive files.
Here are some restore examples:
agtool archive [options] restore my-db archive
This command searches archive for data stored for the my-db repository and restores it into my-db.
agtool archive [options] restore my-db agbackup-file
This command reads data from the file agbackup-file and puts it into the my-db repository.
agtool archive [options] restore my-db archive my-old-db
This command finds the stored data for my-old-db in archive and puts it into a repository named my-db.
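The third form above can be wrapped with a quick check that the archive actually contains the repository being asked for. This is a sketch only: the function name restore_as is hypothetical, and it assumes the repository was backed up from the root catalog and that the root catalog's directory in the archive is named root.

```shell
# Hypothetical helper: restore <archive-dbname> from an archive
# directory into a repository named <dbname>, after checking that the
# expected backup file is present.
# Assumption: root catalog, stored under archives/root/.
restore_as() {
    port=$1
    dbname=$2                   # name the restored repository will get
    archive=$3
    source_db=${4:-$dbname}     # name it had when it was backed up
    file="$archive/archives/root/$source_db/$source_db.agbackup"
    if [ ! -f "$file" ]; then
        echo "no backup of $source_db found at $file" >&2
        return 1
    fi
    agtool archive --port "$port" restore "$dbname" "$archive" "$source_db"
}

# Example call (port, names and path are placeholders):
#   restore_as 10035 my-db /var/backups/agraph/archive my-old-db
```

If a repository named my-db already exists on the server, the restore still fails unless --supersede is added, which this sketch deliberately does not do.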
The backup-settings and restore-settings commands
The backup-settings command will store settings (users, stored queries/procedures, etc.) information. The <archive> directory must either not exist or must be empty unless the --supersede option is specified, in which case all files and subdirectories of an existing directory will be deleted before the settings information is written. Therefore, you cannot specify an existing backup archive directory and just have the settings superseded. Settings are written to the settings/ subdirectory of <archive>. You can manually replace the settings/ subdirectory of one backup archive with a different settings/ subdirectory if you want to change the saved settings in an archive.
The options are:
- --config
- must be the path of an AllegroGraph config file. If specified (and no value is specified for --port), agtool archive will look in the config file for the location on disk of settings information. The server need not be running.
- --port port | -p port
- the port with which to communicate with the running server. Not needed if the optional config-file argument is specified. If both are specified, the port takes precedence.
- --supersede
- if specified, the <archive> directory, if it exists and is non-empty, will be cleared (all subdirectories and files deleted) before the standard subdirectory structure is reestablished and settings information is written out. (Do not specify --supersede if you just want to update settings information in an existing backup archive as doing so will cause all other data to be deleted.)
The call template thus is:
agtool archive [--port port | -p port] [--config <config-file>] [--supersede] backup-settings <archive>
The restore-settings command replaces settings in a running server. It takes an <archive> argument, finds the settings data in that archive, and replaces the existing stored settings of the running server with the new settings. (If <archive> does not have any settings data, no change is made.) The --supersede option is ignored since there are always settings, so restoring settings to a running server of necessity supersedes the existing ones.
You must restart the server for the new settings to take effect.
The call template thus is:
agtool archive [--port port | -p port] restore-settings <archive>
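The two settings commands can be sketched as a pair of small wrappers. The function names are hypothetical. The first uses only --config, so it can run while the server is stopped; the second needs a running server and prints a reminder about the restart.

```shell
# Save settings using only the config file; the server need not be
# running. The archive directory must not exist or must be empty.
backup_settings_offline() {
    config=$1
    archive=$2
    agtool archive --config "$config" backup-settings "$archive"
}

# Restore settings into the server listening on the given port.
# The check for a settings/ subdirectory is a convenience added here.
restore_settings() {
    port=$1
    archive=$2
    if [ ! -d "$archive/settings" ]; then
        echo "no settings/ subdirectory in $archive" >&2
        return 1
    fi
    agtool archive --port "$port" restore-settings "$archive" &&
        echo "settings restored; restart the server for them to take effect"
}
```

Restarting the server is left to the operator because how the server is started and stopped depends on the installation.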
The list command
The list command lists the contents of an archive file, which is the backup file of a single database. These files can be found in the
<archive>/archives/<catalog>/<dbname>/
directory and have names of the form
<dbname>.agbackup
The call template is:
agtool archive list <archive-file>
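Because list takes a single file, inspecting a whole archive directory means calling it once per backup file. A minimal sketch, with the hypothetical function name list_archive, that relies only on the directory layout and file naming described in this document:

```shell
# Hypothetical helper: run the list command on every .agbackup file
# found under <archive>/archives/, in sorted order.
list_archive() {
    archive=$1
    find "$archive/archives" -type f -name '*.agbackup' | sort |
    while IFS= read -r file; do
        echo "== $file"
        # Redirect stdin so the command cannot consume the file list.
        agtool archive list "$file" </dev/null
    done
}

# Example call (the path is a placeholder):
#   list_archive /var/backups/agraph/archive
```

The sketch assumes file names without embedded newlines, which holds for the names agtool archive generates from ordinary repository names.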
Reading from standard input/ writing to standard output
If you specify - (a dash) in place of an archive with the backup and restore commands, the backup command will write to standard output and the restore command will read from standard input. This is useful when you wish to copy a database, as we describe in the next section, and also for streaming backups over a network.
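One use of the dash form is to compress a backup as it is written. The following sketch pipes the backup through gzip and later feeds the decompressed stream back to restore. The function names are hypothetical, gzip is an assumption (any stream compressor would do), and the sketch assumes the backup command writes nothing but backup data to standard output.

```shell
# Hypothetical helper: stream a backup of one repository through gzip
# into a single compressed file.
backup_compressed() {
    port=$1
    dbname=$2
    outfile=$3
    agtool archive --port "$port" backup "$dbname" - | gzip > "$outfile"
}

# Hypothetical helper: decompress such a file and stream it into restore.
restore_compressed() {
    port=$1
    dbname=$2
    infile=$3
    gzip -dc "$infile" | agtool archive --port "$port" restore "$dbname" -
}

# Caveat: in a plain POSIX shell the exit status of a pipeline is that
# of its last command, so backup_compressed reports success even if
# the backup itself failed. Check the output before relying on it.
```

The same pipe can carry the stream to another machine with a remote shell in place of gzip, but remember that agtool archive itself must run on the same machine as the server it talks to.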
Using agtool archive to copy databases
agtool archive can be used to copy a database to a new name and/or catalog. This is achieved by setting up a pipe with agtool archive backup writing the database to standard output, and agtool archive restore reading the backup archive into a different database. For example, to make a copy of the database lubm1 under the name lubm1 in the catalog experiments, the following command line could be used:
agtool archive backup lubm1 - | agtool archive --catalog experiments --newuuid restore lubm1 -
Note that a triple-store copy needs --newuuid if it is to run on the same server as the original triple store. Alternatively, you could give the restore side a different --port value to send the copy to a different server on the same computer. In that case --newuuid isn't necessary.
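The second case, copying to a different server on the same computer, might look like the following sketch. The function name copy_between_servers is hypothetical and the ports are supplied by the caller; --newuuid is omitted because the copy ends up on a different server.

```shell
# Hypothetical helper: copy a repository from the server on one port
# to the server on another port of the same machine, keeping its name.
copy_between_servers() {
    src_port=$1
    dest_port=$2
    dbname=$3
    agtool archive --port "$src_port" backup "$dbname" - |
        agtool archive --port "$dest_port" restore "$dbname" -
}

# Example call (ports and name are placeholders):
#   copy_between_servers 10035 10036 lubm1
```

Both servers must be running, and the restore fails if the destination server already has a repository with that name, since --supersede is not passed.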
Restoring a Pre-v4.1 backup
AllegroGraph versions prior to v4.1 used a different backup mechanism. Please contact [email protected] if you require assistance restoring v4.0 backups into AllegroGraph v4.1 or newer.
agtool archive help
Calling agtool archive --help displays help information.