Cross-Cluster Asynchronous Replication

Cross-cluster asynchronous replication automatically replicates data from one DolphinDB cluster (the master cluster) to one or more other clusters (slave clusters), keeping data consistent across clusters and supporting offsite disaster recovery. It improves fault tolerance and is easy to maintain.

Compatibility

Currently, cluster replication is only supported on DFS databases. Operations regarding access control management, tiered storage and configuration for storage engines cannot be replicated.

The following table displays functions and DDL/DML operations supported in cluster replication:

Function                                                  Operation Type
append / tableInsert                                      APPEND, APPEND_CHUNK_GRANULARITY
delete                                                    SQL_DELETE
update                                                    SQL_UPDATE
upsert!                                                   UPSERT
dropTable                                                 DROP_TABLE
dropPartition                                             DROP_PARTITION
dropDatabase                                              DROP_DB
addRangePartitions                                        ADD_RANGE_PARTITION
addValuePartitions                                        ADD_VALUE_PARTITION
database                                                  CREATE_DOMAIN
create / createPartitionedTable / createDimensionTable    CREATE_TABLE, CREATE_PARTITIONED_TABLE
addColumn                                                 ADD_COLUMN
dropColumns                                               DROP_COLUMN
renameTable                                               RENAME_TABLE
truncate                                                  TRUNCATE_TABLE
replaceColumn                                             REPLACE_COLUMN
setColumnComment                                          SET_COLUMN_COMMENT
rename!                                                   RENAME_COLUMN

Cluster Asynchronous Replication Process

  1. Configure asynchronous replication among clusters.
  2. Start the server. Call setDatabaseForClusterReplication(dbHandle, true) on a data node of the master cluster to enable asynchronous replication for one specific database.
  3. After asynchronous replication is enabled on the master cluster, the slave cluster pulls the replication tasks from the master cluster and executes them one by one.
  4. If a replication task fails repeatedly and interrupts the overall replication process (indicated by the STOPPED status returned by getClusterReplicationMetrics), do the following:
    • (Optional) Call getSlaveReplicationStatus on the controller of the slave cluster to find the reasons for the failure and resolve it. If a failed task cannot be resolved, call skipClusterReplicationTask to skip it.
    • Call startClusterReplication on the controller of the slave cluster to restart asynchronous replication.
  5. To stop asynchronous replication for all databases, call stopClusterReplication on the controllers of the master and slave clusters separately. Ongoing tasks will continue, but the master cluster will stop adding replication tasks to the queue, and the slave cluster will stop pulling tasks.
  6. To stop asynchronous replication for a specific database, call setDatabaseForClusterReplication(dbHandle, false). Ongoing tasks will continue, but no new replication tasks will be generated for that database.
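
The steps above can be sketched as a short DolphinDB script. This is a minimal illustration rather than a definitive procedure. The database path dfs://replDemo is hypothetical. Only the function names and the setDatabaseForClusterReplication(dbHandle, true/false) call come from the steps above. The other argument lists are assumptions, as is running step 6 on a master data node like step 2. Check these against the function reference for your server version. Each group of calls runs on a different node, as noted in the comments, so the script is not meant to be executed in one session.

```
// Step 2: on a data node of the master cluster.
// "dfs://replDemo" is a hypothetical path to an existing DFS database.
db = database("dfs://replDemo")
setDatabaseForClusterReplication(db, true)

// Step 4: on the controller of the slave cluster, when replication has stopped.
// The arguments below are assumptions; consult the function reference.
getClusterReplicationMetrics(-1)    // check whether the status is STOPPED
getSlaveReplicationStatus()         // inspect the failed tasks and the failure reasons
// Only if a failed task cannot be resolved, skip it by its task ID:
// skipClusterReplicationTask(<taskIds>)
startClusterReplication()           // restart asynchronous replication

// Step 5: on the controller of the master cluster, and again on the
// controller of the slave cluster, to stop replication for all databases.
stopClusterReplication()

// Step 6: stop replication for a single database only
// (assumed to run on a data node of the master cluster, as in step 2).
setDatabaseForClusterReplication(db, false)
```

Skipping a task means the corresponding operation is not applied on the slave cluster, so it is worth resolving the failure first where possible and treating skipClusterReplicationTask as a last resort.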