backup

Syntax

backup(backupDir, dbPath|sqlObj, [force=false], [parallel=false], [snapshot=true], [tableName], [partition])

Arguments

backupDir is a string indicating the directory to save the backup. For an AWS S3 directory, it must begin with s3://.

dbPath is a string indicating the database path. If specified, back up the database by copying partitions. If backupDir is an AWS S3 directory, only dbPath can be specified.

sqlObj is metacode of SQL statements indicating the data to be backed up. If specified, only the queried data is backed up.

force (optional) is a Boolean value. true performs a full backup; false performs an incremental backup. The default value is false.

parallel (optional) is a Boolean value indicating whether to back up partitions in parallel. The default value is false.

The following parameters only take effect when dbPath is specified:

snapshot (optional) is a Boolean value indicating whether to synchronize the deletion of tables/partitions to the backup database. It only takes effect when the parameter partition is empty. The default value is true, indicating the system will remove the deleted table/partitions from the backup files.

tableName (optional) is a STRING scalar or vector indicating the name(s) of the table(s) to be backed up. If unspecified, all tables of the database are backed up.

partition (optional) indicates the partitions to be backed up. It can be:

  • a STRING scalar or vector indicating the path(s) to one or multiple partitions of a database, and each path starts with "/". Note that for a compo-partitioned database, the path must include all partition levels.

  • filter condition(s). A filter condition can be a scalar or vector where each element represents a partition.

    • For a single-level partitioning database, it is a scalar.

    • For a compo-partitioned database, it is a tuple composed of filter conditions with each element for a partition level. If a partition level has no filter condition, the corresponding element in the tuple is empty.

  • unspecified to indicate all partitions.
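The interplay of force and parallel can be illustrated with a minimal sketch. It assumes a database dfs://compoDB like the one created in the Examples section; the directory /home/DolphinDB/backupIncr is a hypothetical path chosen for illustration only.

```
// Assumption: "/home/DolphinDB/backupIncr" is a hypothetical, initially empty directory.

// First run: force=true requests a full backup of every table in the database.
backup(backupDir="/home/DolphinDB/backupIncr", dbPath="dfs://compoDB", force=true)

// Later runs: force=false (the default) requests an incremental backup;
// parallel=true asks the system to back up partitions in parallel.
backup(backupDir="/home/DolphinDB/backupIncr", dbPath="dfs://compoDB", force=false, parallel=true)
```

Both calls specify dbPath, so the same backup method is used each time the directory is reused.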

Details

Back up all or specified partitions of a distributed table. Return an integer indicating the number of partitions that have been backed up successfully. It must be executed by a logged-in user.

Note:

  • For a database created with chunkGranularity='DATABASE', it can only be backed up with SQL statements.

  • If backup files already exist under the backupDir, the next backup must adopt the same method (by specifying dbPath or sqlObj). Otherwise the backup would fail.

  • If backupDir is an AWS S3 directory, the configuration file of the data node must specify preloadModules=plugins::awss3, and configure s3AccessKeyId, s3SecretAccessKey and s3Region.
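As a sketch of the S3 requirement above, the comments below show what the relevant lines of the data node's configuration file might look like, followed by a backup call. The bucket name and all angle-bracket values are placeholders, not real settings, and which configuration file applies depends on the deployment.

```
// In the data node's configuration file (not in the script); values are placeholders:
//   preloadModules=plugins::awss3
//   s3AccessKeyId=<your-access-key-id>
//   s3SecretAccessKey=<your-secret-access-key>
//   s3Region=<your-region>

// Assumption: "s3://my-bucket/backup" is a hypothetical S3 location.
// Only dbPath (not sqlObj) can be specified when backupDir is an S3 directory.
backup(backupDir="s3://my-bucket/backup", dbPath="dfs://compoDB")
```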

The following table compares the 2 types of backup:

Feature | copying files (specifying parameter dbPath) | SQL statements (specifying parameter sqlObj)
back up an entire database | √ | ×
data consistency | fully guaranteed | partially guaranteed
incremental backup of modified and added partitions | √ | √
incremental backup of deleted partitions | √ | ×
resume data transmission from a breakpoint | √ | ×
back up data with filter conditions | × | √
performance | lower memory usage | higher memory usage
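The two backup types can be sketched side by side as follows. This assumes the database dfs://compoDB and table pt created in the Examples section; both directory names are hypothetical. Separate directories are used because a directory that already holds backup files must keep using the same method.

```
// Assumption: both directories below are hypothetical paths used only for illustration.

// Type 1: copy partition files by specifying dbPath.
// Backs up every table in the database.
backup(backupDir="/home/DolphinDB/backupByFile", dbPath="dfs://compoDB")

// Type 2: SQL metacode by specifying sqlObj.
// Backs up only the data selected by the query.
backup("/home/DolphinDB/backupBySql", <select * from loadTable("dfs://compoDB","pt") where date=2017.08.07>)
```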

Examples

Create a DFS database dfs://compoDB

n=1000000
ID=rand(100, n)
dates=2017.08.07..2017.08.11
date=rand(dates, n)
x=rand(10.0, n)
t=table(ID, date, x)

dbDate = database(, VALUE, 2017.08.07..2017.08.11)
dbID=database(, RANGE, 0 50 100);
db = database("dfs://compoDB", COMPO, [dbDate, dbID])
pt = db.createPartitionedTable(t, `pt, `date`ID)
pt.append!(t)

Example 1. Back up the table pt.

backup("/home/DolphinDB/backup",<select * from loadTable("dfs://compoDB","pt")>,true);
// output
10

Example 2. Back up the partitions of table pt with date>2017.08.10.

backup("/home/DolphinDB/backup",<select * from loadTable("dfs://compoDB","pt") where date>2017.08.10>,true);
// output
2

Example 3. Back up tables in a database.

(1) Back up tables

// create table pt1 under database dfs://compoDB
pt1 = db.createPartitionedTable(t, `pt1, `date`ID)
pt1.append!(t)

// back up 2 tables
backup(backupDir="/home/DolphinDB/backup",dbPath="dfs://compoDB",force=true);
// output
20

(2) Back up specified partitions

// back up 5 partitions of table pt
partitions=["/20170807/0_50","/20170808/0_50","/20170809/0_50","/20170810/0_50","/20170811/0_50"]
backup(backupDir="/home/DolphinDB/backup",dbPath="dfs://compoDB",force=true,tableName=`pt,partition=partitions);
// output
5

(3) Back up partitions with filter conditions. Note that for a range domain, any value within the range can be specified to indicate the entire partition.

// back up the second-level partition 50_100 under the first-level partition 20170807
partitions=[2017.08.07,50]
backup(backupDir="/home/DolphinDB/backup",dbPath="dfs://compoDB",force=true,tableName=`pt,partition=partitions);
// output
1

// back up the second-level partition 0_50 under all first-level partitions
partitions=[,[0]]
backup(backupDir="/home/DolphinDB/backup",dbPath="dfs://compoDB",force=true,tableName=`pt,partition=partitions);
// output
5

Example 4. The following example explains how to use the parameter snapshot.

// delete partition "/20170811/0_50" from table pt
db.dropPartition("/20170811/0_50",`pt)

// back up again and set snapshot=false
backup(backupDir="/home/DolphinDB/backup1",dbPath="dfs://compoDB",force=true,snapshot=false,tableName=`pt);
// output
9

// restore from the backup and you can see that partition "/20170811/0_50" was not deleted
restore(backupDir="/home/DolphinDB/backup1",dbPath="dfs://compoDB",tableName=`pt,partition="%",force=true)
// output
["dfs://compoDB/20170807/0_50/9m9","dfs://compoDB/20170807/50_100/9m9","dfs://compoDB/20170808/0_50/9m9","dfs://compoDB/20170808/50_100/9m9","dfs://compoDB/20170809/0_50/9m9","dfs://compoDB/20170809/50_100/9m9","dfs://compoDB/20170810/0_50/9m9","dfs://compoDB/20170810/50_100/9m9","dfs://compoDB/20170811/0_50/9m9","dfs://compoDB/20170811/50_100/9m9"]

// delete partition "/20170811/0_50" from table pt
db.dropPartition("/20170811/0_50",`pt)

// back up again and set snapshot=true
backup(backupDir="/home/DolphinDB/backup",dbPath="dfs://compoDB",force=true,snapshot=true,tableName=`pt);
// output
9
    
// restore from the backup and you can see that partition "/20170811/0_50" was deleted
restore(backupDir="/home/DolphinDB/backup",dbPath="dfs://compoDB",tableName=`pt,partition="%",force=true)
// output
["dfs://compoDB/20170807/0_50/9m9","dfs://compoDB/20170807/50_100/9m9","dfs://compoDB/20170808/0_50/9m9","dfs://compoDB/20170808/50_100/9m9","dfs://compoDB/20170809/0_50/9m9","dfs://compoDB/20170809/50_100/9m9","dfs://compoDB/20170810/0_50/9m9","dfs://compoDB/20170810/50_100/9m9","dfs://compoDB/20170811/50_100/9m9"]

Related functions: backupDB, backupTable, restore, migrate