dropPartition
Syntax
dropPartition(dbHandle, partitionPaths, tableName, [forceDelete=false],
[deleteSchema=false])
Arguments
dbHandle is a DolphinDB database handle.
-
By path: partitionPaths is a STRING scalar/vector indicating the path of one or multiple partitions. Each string must start with "/". For composite partitions, the path must include all partition levels.
-
By condition: partitionPaths is a scalar or vector indicating the value(s) in the partitioning column. The system will drop all partitions containing these values. For composite partitions, partitionPaths is a tuple where each element is a filtering condition for each partition level (starting from the first level). If you do not want to apply filtering at a certain partition level, leave the corresponding element empty.
tableName is a string indicating a table name. It can be left empty if the database chunk granularity is at DATABASE level (i.e., database: chunkGranularity = 'DATABASE'). Otherwise, it is a required parameter.
forceDelete (optional) is a Boolean value. If set to true, the specified partition(s) will be deleted even if the partition(s) is recovering. The default value is false.
dropPartition
function
with forceDelete=false, the number of available replicas for the chunks
involved in the transaction must be greater than or equal to the configured
dfsReplicationFactor.schema().partitionSchema
) is kept. When the following
conditions are satisfied, you can delete the schema of the selected partitions along
with the partition data by setting deleteSchema to true:- There's only one table in the database.
- The partitioning type of the database is VALUE.
- For composite partitions, the first level of partitioning type must be VALUE, and only the first level of partitions are selected for deletion.
Details
Delete data from one or multiple partitions from a DFS database.
If tableName is specified: delete one or multiple partitions of the given table.
If tableName is not specified: delete one or multiple partitions of all tables with this partition.
Examples
The script should be executed on a data node or compute node of a cluster.
n=1000000
ID=rand(150, n)
dates=2017.08.07..2017.08.11
date=rand(dates, n)
x=rand(10.0, n)
t=table(ID, date, x)
dbDate = database(, VALUE, 2017.08.07..2017.08.11)
dbID = database(, RANGE, 0 50 100 150)
db = database("dfs://compoDB", COMPO, [dbDate, dbID])
pt = db.createPartitionedTable(t, `pt, `date`ID)
pt.append!(t);
The script above created a database with composite partition. The first level is a value partition with partitioning column of date, and the second level is a range partition with partitioning column of ID.
Example 1. Delete one partition
Use either of the following ways to delete the partition "/20170807/0_50".
(1) Specify the partition path.
dropPartition(db,"/20170807/0_50");
(2) Specify the filtering condition.
dropPartition(db,[2017.08.07, 0]);
Here 0 means the partition of [0, 50). We can choose any number from 0 to 49 to represent this partition.
Example 2. Delete a first level partition
Use either of the following ways to delete the first level partition of 2017.08.08.
(1) Specify the path of all partitions under 2017.08.08.
partitions=["/20170808/0_50","/20170808/50_100","/20170808/100_150"]
dropPartition(db,partitions);
(2) Specify the filtering condition.
dropPartition(db, 2017.08.08, tableName=`pt);
After the deletion, check the table partitioning scheme with schema
schema(db);
/*output
partitionSchema->([2017.08.11,2017.08.10,2017.08.09,2017.08.08,2017.08.07],[0,50,100,150])
partitionSites->
partitionTypeName->[VALUE,RANGE]
atomic->TRANS
databaseDir->dfs://compoDB
chunkGranularity->TABLE
engineType->OLAP
partitionType->[1,2]
*/
We can see that "2017.08.08" is still in partitionSchema as
dropPartition
only deleted the data from this partition but
kept its schema. Since "2017.08.08" is a first level VALUE
partition, we can also delete its schema from the table partitioning scheme
along with its data by specifying deleteSchema = true:
dropPartition(db, 2017.08.08, tableName=`pt, deleteSchema = true);
Check the table partitioning scheme with schema - "2017.08.08" is no longer in partitionSchema:
schema(db);
/* output
partitionSchema->([2017.08.11,2017.08.10,2017.08.09,2017.08.07],[0,50,100,150])
...
*/
Example 3. Delete a second level partition
Use either of the following ways to delete the second level partition of [0,50).
(1) Specify the path of all partitions of [0,50).
partitions=["/20170807/0_50","/20170808/0_50","/20170809/0_50","/20170810/0_50","/20170811/0_50"]
dropPartition(db,partitions);
(2) Specify the filtering condition.
dropPartition(db,[,[0]]);
Example 4. Delete multiple same level partitions
To delete the second level partitions of [0,50) and [100,150):
dropPartition(db,[,[0,100]]);
Example 5: Modify data in a distributed table on disk
To revise data in a distributed table, we need to update the entire partitions that the rows to be updated belong to. The following example adds 10 to column x of the rows with date=2017.08.10 and ID=88 in the distributed table pt.
(1) First load the data of the partition with date=2017.08.10 and ID=88 into memory.
tmp=select * from loadTable("dfs://compoDB","pt") where date=2017.08.10 and 50<=ID<100 ;
(2) Then add 10 to column x of table tmp:
update tmp set x=x+10 where date=2017.08.10 and ID=88;
(3) Delete data in the relevant partition:
dropPartition(db,"/20170810/50_100",`pt);
(4) Append table tmp to table pt:
pt.append!(tmp);