Introduction
AllegroGraph supports online full backups. A full backup contains all data files and information required to restore a database to the state it was in at the time of the backup. Backups are performed while the database is running.
The backup program is used both for backing up data for use later by the same server and for backing up data which will be upgraded to a new (later) version. The program convertdb, which was previously used to convert databases for new AllegroGraph versions, is no longer needed (except in unusual circumstances described below).
The backup utility backs up the data (all the triples) and the server configuration. This makes upgrading straightforward. See below and Database Upgrade.
AllegroGraph 4.12.2 release notes for agraph-backup
The agraph-backup program was significantly modified in release 4.12.2. It is still compatible with earlier versions (meaning it can be used to restore databases backed up in earlier versions) but certain details of operation are significantly different. In particular, note the following:
File locations are generally specified by a single directory, even if the operation is on a single file. In earlier releases, backup-all and restore-all each took a directory as an argument but backup and restore were passed filenames. Now all these commands (and backup-settings and restore-settings) take a directory and write or find the desired files within the specified directory.
The new --supersede option allows you to permit operations to overwrite existing files (for backups) or existing databases (for restores). Operations will fail entirely if --supersede is not specified and any overwriting is determined to be necessary.
Backing up settings has changed. Previously, repository settings (such as namespaces) were not saved by the backup command and now they are. backup-settings now just saves system settings. System settings are restored by restore-settings. Repository settings are handled by backup and restore (and backup-all and restore-all). Settings can be backed up with backup-settings, even when the server is not running, by specifying the location of the config file.
The agraph-backup program
The single program agraph-backup does both backing up and restoring, and a few additional bookkeeping tasks. The general calling template for agraph-backup is:
agraph-backup [options] command [command-args]
options are prefixed with double dashes (single dash in some cases) and may also take arguments.
The commands which write to or read from files (backup, backup-all, backup-settings, restore, restore-all, and restore-settings) are passed a directory and perhaps additional information like a database name. The specific files are located and named within that directory following standard rules described below. Unless the --supersede option is specified to backup commands, the archive directory must either not exist or be empty. See below for more information.
The supported commands are as follows:
- backup dbname archive
- This command saves the contents of the database dbname to a file in the directory named by archive. See more on this command, including details of the options, below.
- backup-all archive
- This command backs up all databases in the running AllegroGraph server to the directory archive. It also backs up settings (unless the --nosettings option is specified). See below for more on this command, including details of the options.
- restore dbname archive [archive-dbname]
- This command restores the data from archive-dbname from the archive archive into dbname. If archive-dbname is omitted, the archived repository searched for is dbname. archive can also be a specific backup file, which will be restored into dbname. When archive is a file, archive-dbname, even if supplied, is ignored. See below for more on this command, including details of the options.
- restore-all archive
- This command restores multiple databases and settings information previously archived in archive by the backup-all command. See below for more on this command, including details of the options.
- backup-settings archive
- This command backs up the settings information (users, stored queries/procedures, etc.). The settings themselves are copied directly from the filesystem. Knowledge of where the settings reside comes from either a running server or the config file. See below for more on this command, including details of the options.
- restore-settings archive
- This command restores the settings information (users, stored queries/procedures, etc.) stored in the settings subdirectory of archive. If settings are restored, the server must be restarted for them to take effect. See below for more on this command, including details of the options.
- list archive-file
- This command shows the contents of archive-file, which must be a backup file that contains repository data. See below for more details.
- no options or commands specified
- agraph-backup called with no options and no command prints help documentation.
Backup archive directory structure
Most commands take a directory as a location argument. The <archive> directory has two subdirectories, archives/ and settings/. Here is the structure of the archives/ directory for a repository:
<archive>/archives/<catalog>/<dbname>/
This directory contains repository data for <dbname> as well as settings specific to that repository. Triples data is stored in a file named
<dbname>.agbackup
Server settings information is stored in files and subdirectories of
<archive>/settings/
agraph-backup can find the files it needs based on the supplied archive directory, catalog name (which defaults to 'root'), and database name. There is in general no need to refer to specific filenames, although the restore command will accept a file argument instead of a directory. This allows restoring a single database from an earlier release.
Note that this structure is new in release 4.12.2. agraph-backup knows the directory structures of earlier releases, so upgrading also only requires a directory name, but the information in this document does not apply to earlier releases.
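As an illustration of the layout described above, the following sketch builds the expected path of a repository's data file. The function name agbackup_path is hypothetical (it is not part of agraph-backup), and the sketch assumes the root catalog's directory is literally named root.

```shell
# Print the path where a repository's data file is expected inside an archive:
#   <archive>/archives/<catalog>/<dbname>/<dbname>.agbackup
# agbackup_path is a hypothetical helper, not part of agraph-backup.
agbackup_path() {
    archive=$1
    dbname=$2
    catalog=${3:-root}   # assumption: the default catalog directory is named "root"
    printf '%s/archives/%s/%s/%s.agbackup\n' "$archive" "$catalog" "$dbname" "$dbname"
}
```

For example, agbackup_path /backups/today lubm1 experiments prints the path for the lubm1 repository in the experiments catalog. agraph-backup itself only needs the archive directory, so a helper like this is mainly useful for inspection scripts.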
Overwriting files and databases
If a non-empty directory is given as an argument to the backup, backup-all, or backup-settings commands, these commands will fail (with an error message) unless the --supersede option is specified. If --supersede is specified, the contents of the <archive> directory will be removed entirely, and then the new backup files will be written. Warning: you cannot use an existing archive directory to update the backup of a specific database with the backup command, or to update saved settings with backup-settings, because specifying --supersede deletes all data in the archive directory.
Similarly, if you try to restore an existing database, that restore will fail unless the --supersede option is specified. When the --supersede option is specified, repositories that exist on the server but not in the archive will not be removed.
The restore-settings command ignores the --supersede option because there are always settings, so restoring settings of necessity overwrites existing settings.
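Because --supersede empties the whole archive directory, it can help to check a directory before running a backup command. The following is a minimal sketch of such a check; archive_is_fresh is a hypothetical helper name, and the test it performs (missing or empty directory) simply mirrors the rule stated above.

```shell
# Succeed (exit status 0) if a backup could be written to "$1" without
# --supersede, that is, if the directory does not exist or is empty.
# archive_is_fresh is a hypothetical helper, not part of agraph-backup.
archive_is_fresh() {
    dir=$1
    [ ! -e "$dir" ] && return 0      # a missing directory is acceptable
    [ -d "$dir" ] || return 1        # an existing non-directory is not
    # ls -A lists all entries except . and ..; empty output means empty directory
    [ -z "$(ls -A "$dir")" ]
}
```

A wrapper script might call archive_is_fresh "$archive" and refuse to continue on failure, rather than adding --supersede automatically and deleting an earlier backup.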
agraph-backup notes
agraph-backup must be run by the user id under which the AllegroGraph server is running (commonly the agraph user).
The agraph-backup program must be run on the same machine as the AllegroGraph server.
There are many options and most apply to only certain commands. Specifying an option not relevant to the command specified is not an error. The irrelevant option will be ignored.
Using agraph-backup to upgrade
When you have a new version of AllegroGraph, you can migrate your databases from the old version to the new using agraph-backup. Let us say you are upgrading from AllegroGraph 4.8 to AllegroGraph 6.1.1. The steps are:
Start the AllegroGraph 4.8 server using port P. Choose a directory D (which must not exist) for the backup archive.
su to the user id under which the AllegroGraph server is running (usually the agraph user).
Run the 4.8 version of agraph-backup with the backup-all command (for additional options, see the AllegroGraph 4.8 documentation):
agraph-backup --port P backup-all D
Stop the AllegroGraph 4.8 server.
Start the 6.1.1 server, using port P1.
Run the 6.1.1 version of agraph-backup using port P1 with the restore-all command (see below for the full set of command options):
agraph-backup --port P1 restore-all D
Please contact [email protected] if you plan to restore AllegroGraph v4.0 backups into AllegroGraph v4.1 or newer, or any backups from a pre-4.0 version of AllegroGraph.
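The backup and restore steps of the upgrade procedure above can be sketched as two small shell functions. The installation paths in OLD_AGRAPH_BACKUP and NEW_AGRAPH_BACKUP and the function names are assumptions made for illustration; stopping the old server and starting the new one still happen by hand between the two calls.

```shell
# Hypothetical locations of the two agraph-backup programs; adjust to your installs.
OLD_AGRAPH_BACKUP=${OLD_AGRAPH_BACKUP:-/opt/agraph-4.8/bin/agraph-backup}
NEW_AGRAPH_BACKUP=${NEW_AGRAPH_BACKUP:-/opt/agraph-6.1.1/bin/agraph-backup}

# Back up everything from the old server on port $1 into directory $2.
# The directory must not exist yet, as the upgrade steps require.
upgrade_backup() {
    port=$1 dir=$2
    if [ -e "$dir" ]; then
        echo "archive directory $dir already exists; choose a new one" >&2
        return 1
    fi
    "$OLD_AGRAPH_BACKUP" --port "$port" backup-all "$dir"
}

# Restore everything from directory $2 into the new server on port $1.
upgrade_restore() {
    port=$1 dir=$2
    "$NEW_AGRAPH_BACKUP" --port "$port" restore-all "$dir"
}
```

Both functions should be run as the user id under which the AllegroGraph server runs. Because restore-all restores settings, the new server has to be restarted afterwards for them to take effect.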
The backup and backup-all commands
The backup and backup-all commands back up a single database or all databases, respectively. backup-all additionally backs up settings unless the --nosettings option is specified. backup takes a database (also called repository) name and an archive directory as its arguments. backup-all takes an archive directory as its argument.
The archive must either not exist or be empty unless the --supersede option is specified. If --supersede is specified, and the archive exists and is not empty, archive will be cleared of all contents before new data is written to it. Therefore, you cannot update the backup of a single database to an existing archive with backup because other data in the directory will be deleted.
For backup only, <archive> can be -, which means send the output to standard output rather than to a file. See below for details.
With either backup command, database data archives are written to files named <archive>/archives/<catalog>/<dbname>/<dbname>.agbackup.
Here are the relevant options. Unless indicated, they apply to either command:
- --port port | -p port
- the port with which to communicate with the running server.
- --config config-file
- if supplied, settings location will come from the config file. This argument can be supplied but is not necessary as the config file location is available from the server, which must be running and listening on the specified port.
- --catalog
- the catalog that contains the repository. --catalog defaults to the root catalog. Not relevant for backup-all (where all catalogs are always saved).
- --supersede
- If specified, an existing, non-empty archive directory will be emptied (all existing files and subdirectories removed) before its standard directory structure is reestablished, and the desired backup files are written (for a single database for backup, for all databases and for settings for backup-all).
- --nosettings
- For backup-all, do not save settings (such as users, roles and stored procedures). For backup, do not save settings for the specified repository. This is an unusual option in either case.
- --ext-backup-command
- This option allows specifying the path of an agraph-backup version earlier than the one running, so that it will be called with the archive-dir argument to back up its databases (presumably its server is running). When given, this command will be invoked to archive the repository data into <archive>, while the current agraph-backup will collect any repository and system settings into <archive>.
So, the template for a call to agraph-backup using the backup command is:
agraph-backup [--port port | -p port] [--supersede] [--catalog catalog] [--ext-backup-command path] backup <dbname> <archive>
And the template using the backup-all command is
agraph-backup [--port port | -p port] [--supersede] [--nosettings] [--ext-backup-command path] backup-all <archive>
The archive argument must name a directory (or, for backup only, may be - to write to standard output).
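Since --supersede clears the archive directory, one way to keep several generations of backups is to write each backup-all run to a new directory. The sketch below assumes a timestamped directory naming scheme and a hypothetical function name backup_all_dated; only the --port option and the backup-all command come from the templates above.

```shell
# Program to run; override AGRAPH_BACKUP if agraph-backup is not on PATH.
AGRAPH_BACKUP=${AGRAPH_BACKUP:-agraph-backup}

# Back up all databases of the server on port $1 into a new, timestamped
# directory under $2, so that --supersede is never needed.
# On success, print the archive directory that was written.
backup_all_dated() {
    port=$1
    parent=$2
    archive="$parent/$(date +%Y%m%d-%H%M%S)"
    "$AGRAPH_BACKUP" --port "$port" backup-all "$archive" && printf '%s\n' "$archive"
}
```

The printed directory name can be recorded by the calling script and later passed to restore-all.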
The restore and restore-all commands
The restore and restore-all commands restore one or more databases archived with agraph-backup with the backup or backup-all commands. restore-all also restores settings if present, unless --nosettings is specified. If settings are restored, you must restart the server before continuing.
During a restore, a progress report is periodically printed to stdout, showing the fraction of the data file which has been processed, and an estimate of the completion time.
The catalog specified by --catalog (defaults to root) must exist for restore. For restore-all, all archived catalogs must exist (the subdirectories of <archive>/archives/ are catalog names) in the running server.
Here are the relevant options. Unless indicated, they apply to either command:
- --port port | -p port
- the port with which to communicate with the running server.
- --config config-file
- if supplied, settings location will come from the config file. This argument can be supplied but is not necessary as the config file location is available from the server, which must be running and listening on the specified port.
- --nocommit
- When specified, the newly restored database will be set to no-commit mode. no-commit mode means that the database will not accept commits. Replicas must be created in no-commit mode. See Replication.
- --newuuid
- When specified, a new uuid will be generated for the database.
- --recover
- Shorthand for specifying both --nocommit and --newuuid.
- --replica
- When specified, the restored database is assumed to be a replication secondary (warmstandby client). It will cause --nocommit to be set. This flag is incompatible with --newuuid and --recover. Warm standby is discussed in Replication Details.
- --noconvert
- Only relevant when the AllegroGraph version which created the archive is different from (earlier than) the one which is restoring the archive. Normally, the restore operation converts such archives to the new version. When --noconvert is specified, it does not do that conversion. If you specify this option, you will need to run the convertdb command before accessing the restored database.
- --catalog
- The catalog that contains the repository. This catalog must exist in the server. Defaults to root. This argument is ignored by restore-all.
- --nosettings
- With restore-all, when specified, do not restore settings information from the restored archive. (If backup-all was called with the --nosettings option, the settings will not be available in any case.) With restore, when specified, do not restore repository settings information.
- --supersede
- if specified, existing databases in the server will be overwritten if backup data exists in <archive> (for the single database being restored by restore or all databases for restore-all).
So, the template for a call to agraph-backup using the restore command is:
agraph-backup [--port port | -p port] [--nocommit] [--newuuid] [--recover] [--replica] [--catalog catalog] [--supersede] restore <dbname> <archive> [<archive-dbname>]
And the template using the restore-all command is
agraph-backup [--port port | -p port] [--nocommit] [--newuuid] [--recover] [--replica] [--nosettings] [--supersede] restore-all <archive-dir>
The archive argument for restore and restore-all should name a directory created by the backup command or the backup-all command. (Directories created by earlier AllegroGraph versions will also work.)
For restore, <archive> may also name a backup file or may be -, which means get the input from standard input rather than from a file. See below for details. If the optional <archive-dbname> is given, it is the repository name looked for, which will be restored to <dbname>, allowing you to change the name of the repository you are restoring. If omitted, <archive-dbname> defaults to <dbname>. If <archive> is a file or -, <archive-dbname> is ignored even if supplied.
The dbname argument for the restore command is the name that will be given to the restored database. For restore-all the names will be taken from the various archive files.
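The option list above states that --replica is incompatible with --newuuid and --recover. A wrapper script could check for that combination before calling agraph-backup, as in this minimal sketch. The function name check_restore_flags is hypothetical, and how agraph-backup itself reports the conflict is not shown here.

```shell
# Return non-zero if the given restore options combine --replica with
# --newuuid or --recover, which are documented as incompatible.
# check_restore_flags is a hypothetical helper, not part of agraph-backup.
check_restore_flags() {
    replica=no
    uuid=no
    for opt in "$@"; do
        case $opt in
            --replica) replica=yes ;;
            --newuuid|--recover) uuid=yes ;;   # --recover implies --newuuid
        esac
    done
    if [ "$replica" = yes ] && [ "$uuid" = yes ]; then
        echo "--replica cannot be combined with --newuuid or --recover" >&2
        return 1
    fi
    return 0
}
```

A script would pass its intended restore options to check_restore_flags first and only run the restore if the check succeeds.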
Here are some restore examples:
agraph-backup [options] restore my-db archive
This command searches archive for data saved for the my-db repository and restores it into my-db.
agraph-backup [options] restore my-db backup-file
This command reads data from the file backup-file and puts it into the my-db repository.
agraph-backup [options] restore my-db archive my-old-db
This command finds the stored data for my-old-db in archive and puts it into a repository named my-db.
The backup-settings and restore-settings commands
The backup-settings command will store settings (users, stored queries/procedures, etc.) information. The <archive> directory must either not exist or must be empty unless the --supersede option is specified, in which case all files and subdirectories of an existing directory will be deleted before the settings information is written. Therefore, you cannot specify an existing backup archive directory and just have the settings superseded. Settings are written to the settings/ subdirectory of <archive>. You can manually replace the settings/ subdirectory of one backup archive with a different settings/ subdirectory if you want to change the saved settings in an archive.
The options are:
- --config
- must be the path of an AllegroGraph config file. If specified (and no value is specified for --port), agraph-backup will look in the config file for the location on disk of settings information. The server need not be running.
- --port port | -p port
- the port with which to communicate with the running server. Not needed if the optional config-file argument is specified. If both are specified, the port takes precedence.
- --supersede
- if specified, the <archive> directory, if it exists and is non-empty, will be cleared (all subdirectories and files deleted) before the standard subdirectory structure is reestablished and settings information is written out. (Do not specify --supersede if you just want to update settings information in an existing backup archive as doing so will cause all other data to be deleted.)
The call template thus is:
agraph-backup [--port port | -p port] [--config <config-file>] [--supersede] backup-settings <archive>
The restore-settings command replaces settings in a running server. It takes an <archive> argument, finds the settings data in the settings/ subdirectory of that directory, and replaces the existing stored settings of the running server with the new settings. (If <archive> does not have any settings data, no change is made.) The --supersede option is ignored since there are always settings, so restoring settings to a running server of necessity supersedes the existing ones.
You must restart the server for the new settings to take effect.
The call template thus is:
agraph-backup [--port port | -p port] restore-settings <archive>
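As described above, backup-settings can run against a stopped server when only --config is given. The following is a small sketch of that case; the function name backup_settings_offline and the readability check are additions for illustration, not part of agraph-backup.

```shell
# Program to run; override AGRAPH_BACKUP if agraph-backup is not on PATH.
AGRAPH_BACKUP=${AGRAPH_BACKUP:-agraph-backup}

# Save system settings into archive directory $2 using only config file $1.
# No --port is passed, so the settings location is read from the config file
# and the server does not need to be running.
backup_settings_offline() {
    config=$1
    archive=$2
    if [ ! -r "$config" ]; then
        echo "cannot read config file $config" >&2
        return 1
    fi
    "$AGRAPH_BACKUP" --config "$config" backup-settings "$archive"
}
```

The archive directory passed in should be new or empty, since the function deliberately does not add --supersede.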
The list command
The list command lists the contents of an archive file, which is a single <dbname>.agbackup file. These files can be found in the
<archive>/archives/<catalog>/<dbname>/
directory and are named
<dbname>.agbackup
The call template is:
agraph-backup list <archive-file>
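Because list works on one file at a time, inspecting a whole archive means visiting every .agbackup file under archives/. The sketch below does that with find; the function name list_archive is hypothetical, and the output format of the list command is not shown since it is not described here.

```shell
# Program to run; override AGRAPH_BACKUP if agraph-backup is not on PATH.
AGRAPH_BACKUP=${AGRAPH_BACKUP:-agraph-backup}

# Run the list command on every <dbname>.agbackup file found under
# <archive>/archives/, where $1 is the archive directory.
list_archive() {
    archive=$1
    find "$archive/archives" -type f -name '*.agbackup' |
    while IFS= read -r file; do
        # stdin is redirected so the command cannot consume the file list
        "$AGRAPH_BACKUP" list "$file" </dev/null
    done
}
```

The order in which files are visited is whatever find produces, so pipe the result through sort if a stable order matters.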
Reading from standard input / writing to standard output
If you specify - (a dash) in place of an archive with the backup and restore commands, the backup command will write to standard output and the restore command will read from standard input. This is useful when you wish to copy a database, as we describe in the next section, and also for streaming backups over a network.
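One use of the dash form is to pass the backup stream through another program. The sketch below compresses the stream with gzip on the way to a file and decompresses it on the way back. It assumes the backup stream is ordinary binary data that survives a gzip round trip unchanged; the function names are hypothetical.

```shell
# Program to run; override AGRAPH_BACKUP if agraph-backup is not on PATH.
AGRAPH_BACKUP=${AGRAPH_BACKUP:-agraph-backup}

# Write a gzip-compressed backup of database $1 to file $2 by streaming
# the backup through standard output.
backup_compressed() {
    "$AGRAPH_BACKUP" backup "$1" - | gzip > "$2"
}

# Restore database $1 from a file $2 written by backup_compressed,
# feeding the decompressed stream to restore on standard input.
restore_compressed() {
    gzip -dc "$2" | "$AGRAPH_BACKUP" restore "$1" -
}
```

Note that the exit status of backup_compressed is that of gzip, so a failed backup may go unnoticed unless the script checks the resulting file or enables a pipefail-style option in shells that support one.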
Using agraph-backup to copy databases
agraph-backup can be used to copy a database to a new name and/or catalog. This is achieved by setting up a pipe with agraph-backup backup writing the database to standard output, and agraph-backup restore reading the backup archive into a different database. For example, to make a copy of the database lubm1 under the name lubm1 in the catalog experiments, the following command line could be used:
agraph-backup backup lubm1 - | agraph-backup --catalog experiments --newuuid restore lubm1 -
Note that a triple-store copy needs a --newuuid if it is to run on the same server as the original triple store. Alternately, you could specify the --port number to send the copy to a different server on the same computer. In that case the --newuuid isn't necessary.
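The copy pipeline above can be wrapped in a function so the source name, target catalog, and target name become parameters. This is only a sketch: copy_db is a hypothetical name, the source is assumed to be in the root catalog, and both commands talk to the same server on the default port, which is why --newuuid is passed.

```shell
# Program to run; override AGRAPH_BACKUP if agraph-backup is not on PATH.
AGRAPH_BACKUP=${AGRAPH_BACKUP:-agraph-backup}

# Copy database $1 (assumed to be in the root catalog) to a database named $3
# in catalog $2 on the same server. --newuuid gives the copy its own uuid,
# which is needed when it runs alongside the original.
copy_db() {
    src=$1 catalog=$2 dest=$3
    "$AGRAPH_BACKUP" backup "$src" - |
        "$AGRAPH_BACKUP" --catalog "$catalog" --newuuid restore "$dest" -
}
```

For example, copy_db lubm1 experiments lubm1 reproduces the command line shown above. The target catalog must already exist in the server.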
Restoring a Pre-v4.1 backup
AllegroGraph versions prior to v4.1 used a different backup mechanism. Databases saved using the 4.0.x agraph-backup need to be restored using agraph-restore-v4.0. Please contact [email protected] if you require assistance restoring v4.0 backups into AllegroGraph v4.1 or newer.
agraph-backup help
Calling agraph-backup without arguments displays help information.