Update Data
DolphinDB supports updating data through the UPDATE
statement,
assignment, and functions update!
, upsert!
, and
replaceColumn!
. The following table demonstrates the applicable
objects for each method:
DFS Table | Regular In-Memory Table | Keyed In-Memory Table | Indexed In-Memory Table | MVCC In-Memory Table | Partitioned In-Memory Table | |
---|---|---|---|---|---|---|
UPDATE |
√ | √ | √ | √ | √ | √ |
update! |
× | √ | √ | √ | √ | √ |
upsert! |
√ | × | √ | √ | × | × |
replaceColumn! |
Only supports changing data types in OLAP engine | √ | √ | √ | √ | × |
Assignment | × | √ | √ | √ | √ | √ |
Use the following script to create an in-memory table t, a keyed in-memory table kt, an indexed in-memory table it, an MVCC in-memory table mvcct, a partitioned in-memory table mpt, a partitioned DFS table pt, a dimension table dt.
// In-memory table
t = table(2023.10.01 + take(0..9,100) as date, take(["A001","B001","C001","D001"],100) as sym, 1..100 as val)
// Keyed in-memory table
kt = keyedTable(`date,1:0,`date`sym`val,[DATE,SYMBOL,INT])
kt.tableInsert(t)
// Indexed in-memory table
it = indexedTable(`date,1:0,`date`sym`val,[DATE,SYMBOL,INT])
it.tableInsert(t)
// MVCC in-memory table
mvcct = mvccTable(1:0,`date`sym`val,[DATE,SYMBOL,INT])
mvcct.tableInsert(t)
// Partitioned in-memory table
db1 = database(directory="", partitionType=VALUE, partitionScheme=2023.10.01..2023.10.10)
mpt = createPartitionedTable(dbHandle=db1, table=t, tableName=`pt, partitionColumns=`date)
mpt.tableInsert(t)
// Partitioned DFS table and dimension table
db = database("dfs://olapdemo",partitionType=VALUE, partitionScheme=2023.10.01..2023.10.10)
pt = createPartitionedTable(dbHandle=db, table=t, tableName=`pt, partitionColumns=`date)
pt.tableInsert(t)
dt = createDimensionTable(dbHandle=db, table=t, tableName=`dt)
dt.tableInsert(t)
Unless specified, "in-memory table" in the following text refers to non-stream in-memory tables, including regular, keyed, indexed, MVCC, and partitioned in-memory tables.
SQL UPDATE Statement
The UPDATE statement can be used to update in-memory tables and DFS tables.
For example, to update the "val" value to 101 where the date is 2023.10.01 in t, and to update the "val" value to 102 where the date is 2023.10.02 in pt:
// Update the in-memory table t
UPDATE t SET val=101 WHERE date=2023.10.01
// Update the partitioned DFS table pt
UPDATE pt SET val=102 WHERE date=2023.10.02
Note: When updating data with the UPDATE
statement, the
system loads data into memory and writes back the updated data using MVCC, which may
temporarily increase disk usage.
upsert!
The upsert! function can be used to update keyed, indexed, and DFS tables.
The following example inserts data from table t1 into the keyed table kt and the indexed table it. If the primary key "date" already exists, the corresponding value is updated.
t1 = table(2023.10.03..2023.10.05 as date, ["X001","Y001","Z001"] as sym, 66 77 88 as val)
upsert!(obj=kt, newData=t1, keyColNames=`date)
upsert!(obj=it, newData=t1, keyColNames=`date)
Since there is no key column in partitioned DFS tables and dimension tables, the keyColNames parameter must be specified, which forms keys with the partitioning column.
For example, to update the "sym" column in tables pt and dt:
t1 = table(2023.10.03..2023.10.05 as date, ["X001","Y001","Z001"] as sym, 66 77 88 as val)
upsert!(obj=pt, newData=t1, keyColNames=`sym)
upsert!(obj=dt, newData=t1, keyColNames=`sym)
Note: If parameter obj is a DFS table with duplicated "keys" (as
specified by keyColNames), upsert!
on rows with duplicated
keys only updates the first row.
update!
The update! function can be used to update columns of in-memory tables. If a column in colNames doesn't exist, the function creates a new column.
For example, to increase the "val" value for rows where date = 2023.10.03 and sym = "A001” by 2:
update!(table=t, colNames="val", newValues=<val+2>, filter=[<date=2023.10.03>,<sym="A001">])
To add a column named "val2" with values ranging from 1 to 100:
update!(table=t, colNames="val2", newValues=1..100)
replaceColumn!
The replaceColumn! function is used to replace column(s) in regular, keyed, indexed, and MVCC in-memory tables with the vector(s) specified by parameter newCol, which must contain the same number of elements as the rows of table.
For DFS tables in the OLAP database, replaceColumn!
can only modify
the data types of non-partitioning columns. In this case, the parameter
newCol is a vector specifying the target data types, with no restriction
on its length or values.
// Update the regular in-memory table t with 100 rows
replaceColumn!(table=t, colName="val", newCol=101..200)
// Update the keyed in-memory table kt and indexed in-memory table it with 10 rows
replaceColumn!(table=kt, colName="val", newCol=101..110)
replaceColumn!(table=it, colName="val", newCol=101..110)
// Change the data type of column "val" in the partitioned DFS table pt from INT to DOUBLE
replaceColumn!(table=pt, colName="val", newCol=array(DOUBLE,0))
Assignment
In-memory tables can also be updated by assigning values to an entire column or specified elements in the column.
For example, to assign value 66 to all rows in column "val":
t[`val] = 66
To assign value 100 to the first row in column "val":
t[`val,0] = 100