A backup is a snapshot of the database at a given time. The backup consists of three main parts:
-
Metadata. The names and definitions of all database tables
-
Table records. The data actually stored in the database tables at the time that the backup was made
-
Transaction log. A sequential record telling how and when data was stored in the database
Each of these parts is saved on all nodes participating in the backup. During backup, each node saves these three parts into three files on disk:
-
BACKUP-
backup_id
.node_id
.ctlA control file containing control information and metadata. Each node saves the same table definitions (for all tables in the cluster) to its own version of this file.
-
BACKUP-
backup_id
-0.node_id
.dataA data file containing the table records, which are saved on a per-fragment basis. That is, different nodes save different fragments during the backup. The file saved by each node starts with a header that states the tables to which the records belong. Following the list of records there is a footer containing a checksum for all records.
-
BACKUP-
backup_id
.node_id
.logA log file containing records of committed transactions. Only transactions on tables stored in the backup are stored in the log. Nodes involved in the backup save different records because different nodes host different database fragments.
In the listing just shown, backup_id
stands for the backup identifier and node_id
is the unique identifier for the node creating the file.
The location of the backup files is determined by the BackupDataDir
parameter.