Cross-Cluster Asynchronous Replication

Cross-cluster asynchronous replication automatically replicates data from one DolphinDB cluster (the master cluster) to one or more other clusters (slave clusters), keeping the clusters consistent and supporting offsite disaster recovery. Because replication runs asynchronously, the master cluster does not wait for the slave clusters to finish applying a change.

Compatibility

Currently, cluster replication is supported only for DFS databases. Operations related to access control management, tiered storage, and storage engine configuration cannot be replicated.

The following table displays functions and DDL/DML operations supported in cluster replication:


Function	Operation Type
append / tableInsert	APPEND, APPEND_CHUNK_GRANULARITY
delete	SQL_DELETE
update	SQL_UPDATE
upsert!	UPSERT
dropTable	DROP_TABLE
dropPartition	DROP_PARTITION
dropDatabase	DROP_DB
addRangePartitions	ADD_RANGE_PARTITION
addValuePartitions	ADD_VALUE_PARTITION
database	CREATE_DOMAIN
create / createPartitionedTable / createDimensionTable	CREATE_TABLE, CREATE_PARTITIONED_TABLE
addColumn	ADD_COLUMN
dropColumns	DROP_COLUMN
renameTable	RENAME_TABLE
truncate	TRUNCATE_TABLE
replaceColumn	REPLACE_COLUMN
setColumnComment	SET_COLUMN_COMMENT
rename!	RENAME_COLUMN

Cluster Asynchronous Replication Process

1. Configure asynchronous replication among the clusters.
2. Start the server. Call setDatabaseForClusterReplication(dbHandle, true) on a data node of the master cluster to enable asynchronous replication for a specific database.
3. Once asynchronous replication is enabled on the master cluster, the slave cluster pulls replication tasks from the master cluster and executes them one by one.
4. If a replication task fails repeatedly and interrupts the overall replication process (indicated by the STOPPED status returned by getClusterReplicationMetrics), take the following steps:
   - (Optional) Call getSlaveReplicationStatus on the controller of the slave cluster to check the failure reasons, then resolve the failures. If a failed task cannot be resolved, call skipClusterReplicationTask to skip it.
   - Call startClusterReplication on the controller of the slave cluster to restart asynchronous replication.
5. To stop asynchronous replication for all databases, call stopClusterReplication on the controllers of the master and slave clusters separately. Ongoing tasks will continue, but the master cluster will stop adding replication tasks to the queue, and the slave cluster will stop pulling tasks.
6. To stop asynchronous replication for a specific database, call setDatabaseForClusterReplication(dbHandle, false) on the master cluster. Ongoing tasks will continue, but no new replication tasks will be generated for that database.
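
The process above can be sketched as the following script. It is a minimal illustration rather than a complete procedure: the database path dfs://replicated_db, the look-back argument passed to getClusterReplicationMetrics, and the task ID passed to skipClusterReplicationTask are all assumed placeholder values, and each group of statements must be run on the node named in its comment.

```
// --- Run on a data node of the master cluster ---
// "dfs://replicated_db" is a hypothetical path; replace it with an existing DFS database.
db = database("dfs://replicated_db")
// Enable asynchronous replication for this database only.
setDatabaseForClusterReplication(db, true)

// --- Run on the controller of the slave cluster ---
// Check whether replication has been interrupted (look for the STOPPED status).
// The argument 30 is an assumed look-back window in seconds.
getClusterReplicationMetrics(30)

// Optional: inspect the replication tasks to find the failure reasons.
getSlaveReplicationStatus()

// If a failed task cannot be resolved, skip it.
// The ID below is a placeholder; use the ID of the failed task reported above.
failedTaskId = 1
skipClusterReplicationTask(failedTaskId)

// Restart asynchronous replication on the slave cluster.
startClusterReplication()

// --- Run on a data node of the master cluster ---
// Stop generating new replication tasks for this database only.
setDatabaseForClusterReplication(db, false)

// --- Run on the controller of the master cluster, then on the controller of the slave cluster ---
// Stop asynchronous replication for all databases.
stopClusterReplication()
```

Keeping the per-database switch (setDatabaseForClusterReplication) separate from the cluster-wide switch (stopClusterReplication) makes it possible to pause replication for a single database without affecting the others.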