Catalog

As of version 3.00.0, DolphinDB introduced the organization of catalogs and schemas for managing DFS database system.

  • Catalog: The highest level of organization within a database system. A catalog holds one or more schemas and represents the complete set of schemas that a user or application can access.

  • Schema: A collection of tables, views, and procedures, grouped together within a catalog. In DolphinDB, a schema corresponds to a database. Note that the schema name should be used instead of database URL for catalog operations.

  • Table: The primary component of a schema. A data form that organizes information into rows and columns.

Operating Catalogs

Creating and Dropping

The name of a catalog or schema can contain letters, digits and underscores and must begin with a letter. The name is case-insensitive when referenced in statements or functions.

(1) Creating catalogs

To create a catalog, use function createCatalog. For example, create a catalog "trading":

createCatalog("trading")

To switch to a specific catalog, execute use statement and specify the catalog keyword. For example, switch to catalog "trading":

use catalog trading;

Once a catalog is used, subsequent operations would use the catalog by default unless a different catalog is explicitly specified.

(2) Creating schemas

To create a schema, use create database statement and specify catalog.schema instead of directory.

Syntax

create database catalog.schema partitioned by partitionType(partitionScheme),[partitionType(partitionScheme),partitionType(partitionScheme)],
[engine='OLAP'], [atomic='TRANS'], [chunkGranularity='TABLE']

For example, create a schema "stock" using VALUE partitions and OLAP storage engine.

create database stock partitioned by VALUE(1..10), engine='OLAP'

Since section (1) already specifies a default catalog "trading", stock in the above script is equivalent to trading.stock. If no catalog is used, the creation would fail and an error of "The catalog doesn't exist" will be thrown.

To add an existing database to a catalog, use createSchema. For example, add schema "stock2" referencing database "dfs://db1" to the catalog "trading".

database(directory="dfs://db1", partitionType=RANGE, partitionScheme=0 5 10)
createSchema("trading", "dfs://db1", "stock2")

(3) Creating tables

To create a table in a schema, use create table statement and specify catalog.schema.tableName instead of dbPath.

Syntax

create table catalog.schema.tableName(
    schema[columnDescription]
)
[partitioned by partitionColumns],
[sortColumns],
[keepDuplicates=ALL],
[sortKeyMappingFunction]

For example:

create table stock.quote (
     id INT,
     date DATE[comment="time_col", compress="delta"],
     value DOUBLE,
 )
 partitioned by id

Since section (1) already specifies a default catalog "trading", stock.quote in the above script is equivalent to trading.stock.quote.

(4) Dropping catalogs

To drop a catalog, call function dropCatalog. For example:

dropCatalog("catalog1")

(5) Dropping schemas from catalogs

To drop a schema from a catalog, use drop statement.

Syntax

drop table [if exists] catalog.schema

where catalog.schema specifies the schema to be dropped. If a default catalog is already used, the catalog can be omitted.

(6) Dropping tables from schemas

To drop a table from a schema, use drop statement.

Syntax

drop table [if exists] catalog.schema.tableName

where catalog.schema.tableName specifies the table to be dropped. If a default catalog is already used, the catalog can be omitted.

Using Catalogs with SQL

The following script shows how to use catelogs with SQL to append sample data into a table. The sample data contains 3 columns and 100 rows: the "id" column ranges from 1 to 10, the "date" column ranges from 2023.01.01 to 2023.01.3, and the "value" column consists of 100 random numbers within 10.0.
dbUrl = exec dbUrl from getSchemaByCatalog("trading") where schema = "stock"
data = table(take(1..10, 100) as id, take(2023.01.01..2023.01.30, 100) as date, rand(10.0, 100) as value)
loadTable(dbUrl[0], "quote").append!(data)

The following SQL statements can be used when managing catalogs and schemas:

Statement Type Description
create DDL Create schemas/tables
alter DDL Add columns to tables
drop DDL Drop schemas/tables
update DML Update tables
delete DML Delete records from tables
select DQL Access tables
Note: When using these SQL statements on catalogs/schemas, a default catalog must be used or a catalog must be specified in the statement, otherwise an error would be raised.

Example 1. Select data from "stock.quote" with filtering conditions:

select * from stock.quote where id = 1;
id date value
1 2023.01.01 1.4114
1 2023.01.11 3.5046
1 2023.01.21 1.927
1 2023.01.01 0.7042
1 2023.01.11 5.5941
1 2023.01.21 9.3329
1 2023.01.01 2.4815
1 2023.01.11 1.1198
1 2023.01.21 9.5832
1 2023.01.01 4.0467

Example 2. Update data of "stock.quote":

update stock.quote set value = -1.0 where id > 5;
select * from stock.quote where id > 5;
id date value
5 2023.01.01 -1.0000
5 2023.01.11 -1.0000
5 2023.01.21 -1.0000

Example 3. Delete data from "stock.quote":

delete from stock.quote where id < 2;
select * from stock.quote where id < 2;

Note: If a script contains a variable with the same name as schema, the query may fail to access the schema.

For example, if a variable stock is defined and shares the same name as the schema, stock in the script will be parsed as the variable name and anerror will be raised.

stock=1
select * from stock.quote where id = 1; 
//getMember method not supported

You may use undef to undefine such variable:

undef(`stock)

Managing Catalogs

(1) Checking default catalog

To view the default catalog used in the current session, call getCurrentCatalog.

use catalog trading;
getCurrentCatalog() 
// output: "trading"

use catalog trading2;
getCurrentCatalog() 
// output: "trading2"

(2) Renaming catalogs/schemas

To rename a catalog, call function renameCatalog.

renameCatalog("trading", "trading2") 
getAllCatalogs() 
// output: ["trading2"]

To rename a schema, call function renameSchema.

renameSchema("trading", "stock", "stock1") 
exec schema from getSchemaByCatalog("trading") 
// output: ["stock1"]

(3) Checking schema info

To get info on schemas within a catalog, call function getSchemaByCatalog. A schema created with create database has a unique identifier of dbUrl in the format "dfs://<name_timestamp>" which is unaltered even after renamed.

getSchemaByCatalog("trading")
schema dbUrl
stock dfs://trading_stock_1712077295373

(4) Checking logs on catalog operations

To view info of operations on catalogs and schemas, you can check the ACL Audit log on the controller. The log message is displayed in the following format:

ACL Audit: function createSchema [catalog=trading,dbUrl=dfs://db1,schema=stock2], called by user [xxx]

Managing User Access

Catalog-Level Access Control

The following access privileges with prefix "CATALOG_" can be set for databases/tables within a specific catalog.

Privilege Description
CATALOG_MANAGE MANAGE permission for operations including adding schemas, drop and rename catalogs
CATALOG_READ READ permission for all tables within a catalog
CATALOG_WRITE WRITE permission for write, update, and insert operations on all tables within a catalog
CATALOG_INSERT INSERT permission for all tables within a catalog
CATALOG_UPDATE UPDATE permission for all tables within a catalog
CATALOG_DELETE DELETE permission for all tables within a catalog

The parameter objs of function grant/deny/revoke should be the catalog name. For example, grant user Adam CATALOG_READ access on catalog "trading".

grant("Adam", CATALOG_READ, "trading")

Schema-Level Access Control

The following access privileges with prefix "SCHEMA_" can be set for schemas.

Privilege Description
SCHEMA_MANAGE MANAGE permission for schemas
SCHEMAOBJ_CREATE CREATE permission for all add DDL operations on tables within a schema
SCHEMAOBJ_DELETE DELETE permission for all delete DDL operations on tables within a schema
SCHEMA_READ READ permission for all tables within a schema
SCHEMA_WRITE WRITE permission for write, update, and insert operations on all tables within a schema
SCHEMA_INSERT INSERT permission for all tables within a schema
SCHEMA_UPDATE UPDATE permission for all tables within a schema
SCHEMA_DELETE DELETE permission for all tables within a schema

The parameter objs of function grant/deny/revoke should be in the format of catalog.schema. For example, grant user Adam SCHEMA_READ access on schema "trading.stock".

grant("Adam", SCHEMA_READ, "trading.stock")