PROTOCOL_DDB

PROTOCOL_DDB is DolphinDB's protocol for serializing and deserializing data. It is widely adopted by DolphinDB's APIs (Python, C++, Java, etc.) to transmit data. Among DolphinDB's data serialization protocols, PROTOCOL_DDB supports the greatest variety of DolphinDB data forms and data types.

Note

  • DolphinDB data forms refer to data structures, such as scalar, vector, table, etc.
  • DolphinDB data types refer to specific data types, such as INT, DOUBLE, DATETIME, etc.
  • In the following sections, the Python libraries NumPy and pandas will be referred to as np and pd, respectively.

Enabling PROTOCOL_DDB

To use PROTOCOL_DDB, set the protocol parameter to PROTOCOL_DDB when creating a Session or DBConnectionPool object.

import dolphindb as ddb
import dolphindb.settings as keys

s = ddb.Session(protocol=keys.PROTOCOL_DDB)
s.connect("localhost", 8848, "admin", "123456")

pool = ddb.DBConnectionPool("localhost", 8848, "admin", "123456", 10, protocol=keys.PROTOCOL_DDB)

Supported Data Forms

Additional Parameter     Data Form  Serialization  Deserialization
pickleTableToList=False  Scalar     √              √
pickleTableToList=False  Vector     √              √
pickleTableToList=False  Pair       ×              √
pickleTableToList=False  Matrix     √              √
pickleTableToList=False  Set        √              √
pickleTableToList=False  Dict       √              √
pickleTableToList=False  Table      √              √
pickleTableToList=True   Table      ×              √

Serialization: From Python to DolphinDB

In this section, we will demonstrate how Python objects are mapped to DolphinDB data types when uploaded to a DolphinDB server with PROTOCOL_DDB enabled. The upload() method is used as an example.

Scalar

The following table shows the data type mapping when uploading scalars to the DolphinDB server ("---" indicates that there is no matching Python data type).

Python Data Type   Python Example                                        DolphinDB Data Type  DolphinDB Example
NoneType           None                                                  VOID                 NULL
bool               True                                                  BOOL                 true
np.int8            np.int8(12)                                           CHAR                 char(12), 12c
np.int16           np.int16(12)                                          SHORT                short(12), 12h
np.int32           np.int32(12)                                          INT                  int(12), 12
np.int64           np.int64(12)                                          LONG                 long(12), 12l
int                12                                                    LONG                 long(12), 12l
np.datetime64[D]   np.datetime64("2012-01-02", "D")                      DATE                 2012.01.02
np.datetime64[M]   np.datetime64("2012-01", "M")                         MONTH                2012.01M
---                ---                                                   TIME                 ---
---                ---                                                   MINUTE               ---
---                ---                                                   SECOND               ---
np.datetime64[s]   np.datetime64("2012-01-02T01:02:03", "s")             DATETIME             datetime(2012.01.02T01:02:03)
np.datetime64[ms]  np.datetime64("2012-01-02T01:02:03.123", "ms")        TIMESTAMP            timestamp(2012.01.02T01:02:03.123)
---                ---                                                   NANOTIME             ---
np.datetime64[ns]  np.datetime64("2012-01-02T01:02:03.123456789", "ns")  NANOTIMESTAMP        nanotimestamp(2012.01.02T01:02:03.123456789)
np.datetime64      np.datetime64("")                                     NANOTIMESTAMP        nanotimestamp(NULL)
pd.Timestamp       pd.Timestamp("2012-01-02T01:02:03.123456789")         NANOTIMESTAMP        nanotimestamp(2012.01.02T01:02:03.123456789)
pd.NaTType         pd.NaT                                                NANOTIMESTAMP        nanotimestamp(NULL)
np.float32         np.float32(1.1)                                       FLOAT                float(1.1), 1.1f
np.float64         np.float64(1.2)                                       DOUBLE               double(1.2)
float              1.2                                                   DOUBLE               double(1.2)
float              np.nan                                                DOUBLE               double(NULL)
str                "abc"                                                 STRING               "abc"
str                ""                                                    STRING               ""
---                ---                                                   SYMBOL               ---
---                ---                                                   UUID                 ---
np.datetime64[h]   np.datetime64("2012-01-02T01", "h")                   DATEHOUR             2012.01.02T01
---                ---                                                   IPADDR               ---
bytes              bytes("abc", encoding="UTF-8")                        BLOB                 "abc"
---                ---                                                   DECIMAL32            ---
Decimal            decimal.Decimal("-10.21")                             DECIMAL64            decimal64(-10.21, 2)
Decimal            decimal.Decimal("NaN")                                DECIMAL64            decimal64(NULL, 2)
Decimal            decimal.Decimal("-10.00000000000000000021")           DECIMAL128           decimal128("-10.00000000000000000021", 20)

Note:

  • Starting from DolphinDB Python API version 1.30.22.6, PROTOCOL_DDB supports uploading Decimal128 data. Scalar values with a scale of 17 or less are uploaded as Decimal64 data; scalar values with a scale greater than 17 are uploaded as Decimal128 data.
  • Starting from version 2.0.11.0, the length of the uploaded BLOB data is no longer limited and the length of the uploaded SYMBOL/STRING data is limited to less than 256 KB.
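
The Decimal scale rule in the note above can be checked locally before uploading. The following is a minimal sketch that uses only Python's standard decimal module. The helper name expected_decimal_type is hypothetical and is not part of the DolphinDB Python API; it only restates the documented threshold for API version 1.30.22.6 and later.

```python
import decimal

def expected_decimal_type(value):
    """Return (type_name, scale) predicted by the documented rule for a Decimal scalar."""
    exponent = value.as_tuple().exponent
    if not isinstance(exponent, int):
        # NaN and infinities carry a non-numeric exponent and have no scale.
        raise ValueError("value has no scale")
    scale = max(-exponent, 0)
    # Documented rule: scale <= 17 -> DECIMAL64, scale > 17 -> DECIMAL128.
    return ("DECIMAL64" if scale <= 17 else "DECIMAL128", scale)

print(expected_decimal_type(decimal.Decimal("-10.21")))
print(expected_decimal_type(decimal.Decimal("-10.00000000000000000021")))
```

The two calls reproduce the DECIMAL64 and DECIMAL128 rows of the scalar table.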

Vector

When a vector-like data structure is uploaded to DolphinDB, its data type in DolphinDB is determined as follows:

(1) Check if the uploaded Python object is a vector-like data structure (e.g., tuple, list, np.ndarray, pd.Series, etc.);

(2) If the object is type-specific (e.g. np.ndarray) and its dtype is not 'object', perform a direct data type conversion. This is the most efficient case.

(3) If the object is non-typed (e.g. list) or its dtype is 'object' (np.ndarray), iterate through the data: (a) If the object contains null values (None, np.nan, pd.NaT), use the type of the first null value as the object’s type; (b) If two or more data types (see Notes below) are detected, or the object contains embedded vector-like data structures, the object is uploaded as a DolphinDB ANY vector and each element will be converted independently. This requires another iteration and reduces performance.

(4) Vector-like data structures largely follow the same type conversion rules as scalars. As with scalars, Python objects cannot be directly uploaded as DolphinDB TIME, UUID, or other unsupported vector types.

(5) If an np.ndarray's elements are vector-like data structures of the same length, upload it as a DolphinDB matrix.

(6) If the uploaded object is not an np.ndarray, but its elements are vector-like data structures of the same length, upload it as a DolphinDB ANY vector.

(7) If an object's elements are all nulls of different types, convert them as follows:

Types of Null Values in the Array / List    Python Data Type  DolphinDB Column Type
None                                        object            STRING
np.nan and None                             float64           DOUBLE
pd.NaT and None                             datetime64        NANOTIMESTAMP
np.nan and pd.NaT                           datetime64        NANOTIMESTAMP
None, np.nan and pd.NaT                     datetime64        NANOTIMESTAMP
None / pd.NaT / np.nan and non-null values  -                 the data type of the non-null values
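
The inference rules above can be approximated in pure Python for plain lists. The sketch below is an illustration under stated assumptions, not the API's actual implementation: it handles only built-in scalar types, treats None and float NaN as nulls, and leaves out pd.NaT and NumPy types so that it runs without third-party packages. The function name infer_vector_type is hypothetical.

```python
import math

# Built-in Python types and the DolphinDB types they map to in the scalar table.
_SCALAR_TYPES = {bool: "BOOL", int: "LONG", float: "DOUBLE", str: "STRING"}

def _is_null(x):
    return x is None or (isinstance(x, float) and math.isnan(x))

def infer_vector_type(values):
    """Predict the DolphinDB vector type for a Python list of built-in scalars."""
    # Rule (3b): nested vector-like elements force an ANY vector.
    if any(isinstance(v, (list, tuple)) for v in values):
        return "ANY"
    non_null_types = {_SCALAR_TYPES[type(v)] for v in values if not _is_null(v)}
    # Rule (3b): two or more DolphinDB types also give an ANY vector.
    if len(non_null_types) > 1:
        return "ANY"
    if non_null_types:
        return non_null_types.pop()
    # Rule (7): only nulls. None alone -> STRING; NaN present -> DOUBLE.
    return "DOUBLE" if any(isinstance(v, float) for v in values) else "STRING"

print(infer_vector_type([True, None, False]))
print(infer_vector_type([1.2, "abc"]))
print(infer_vector_type([None, float("nan")]))
```

An unsupported element type raises KeyError in this sketch, which is a simplification chosen for brevity.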

Notes

(1) "Two or more data types" refers to DolphinDB data types. For example, np.array([np.float64(12),13]) contains only one data type.

(2) The DolphinDB Python API does not support pd.array vectors in the current version.

(3) Null values in a list or in an np.ndarray with dtype='object' (None, np.nan, pd.NaT; decimal.Decimal("NaN") is not included) are treated as the same data type.

(4) Uploading objects as DolphinDB array vectors is not supported. Data such as np.array([[1], [2, 3]], dtype="object") will be uploaded as DolphinDB ANY vectors.

(5) For DolphinDB Python API versions lower than 3.0.1.0, ensure that all values in a DECIMAL column have the same number of digits after the decimal point. Otherwise, the column's scale defaults to that of the first non-null value. The following script demonstrates how to align the scale of DECIMAL data:

>>> import decimal
>>> b = decimal.Decimal("1.23")
>>> b
Decimal('1.23')
>>> b = b.quantize(decimal.Decimal("0.000"))
>>> b
Decimal('1.230')

For versions 3.0.1.0 and above, there's no need to ensure that all data in DECIMAL type columns have the same number of decimal places. The system will automatically adjust the precision based on the first non-null DECIMAL value encountered.
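
For API versions below 3.0.1.0, the quantize step can be applied to a whole column at once. The following is a minimal sketch using only the standard decimal module. The helper name align_decimal_scale is hypothetical, and it assumes that nulls in the column are represented as None.

```python
import decimal

def align_decimal_scale(values):
    """Quantize every non-null Decimal to the largest scale found in the list."""
    scales = [-v.as_tuple().exponent for v in values if v is not None]
    if not scales:
        return list(values)
    # Decimal(1).scaleb(-n) is a quantum with n digits after the decimal point.
    quantum = decimal.Decimal(1).scaleb(-max(max(scales), 0))
    return [v if v is None else v.quantize(quantum) for v in values]

column = [decimal.Decimal("1.23"), None, decimal.Decimal("4.5")]
aligned = align_decimal_scale(column)
print(aligned)
```

Using the largest scale avoids rounding away digits; a column whose values need more precision than the target DECIMAL type offers would still need separate handling.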

(6) Uploading Python bytes values as DolphinDB BLOB values is not currently supported.

Below are common examples of uploading vector-like data structures to DolphinDB.

Example 1. Upload BOOL, INT, DOUBLE, STRING and DATE vectors without null values.

>>> s.upload({'bool_v': np.array([True, False, False], dtype="bool")})
>>> s.upload({'int_v': np.array([1, 2, 4], dtype="int32")})
>>> s.upload({'double_v': [1.2, 2.456]})
>>> s.upload({'string_v': np.array(["abc", "123"], dtype="object")})
>>> s.upload({'date_v': np.array(["2012-01-02"], dtype="datetime64[D]")})

To check an uploaded object's data type in DolphinDB, call the server function typestr().

>>> s.run("typestr(bool_v)")
FAST BOOL VECTOR
>>> s.run("typestr(int_v)")
FAST INT VECTOR
>>> s.run("typestr(double_v)")
FAST DOUBLE VECTOR
>>> s.run("typestr(string_v)")
STRING VECTOR
>>> s.run("typestr(date_v)")
FAST DATE VECTOR

Example 2. Upload BOOL, INT, DOUBLE, STRING and DATE vectors containing null values.

>>> s.upload({'bool_v': [True, None, False]})
>>> s.upload({'int_v': np.array([None, np.int32(2), np.int32(12)], dtype="object")})
>>> s.upload({'double_v': np.array([1.1, np.nan, 3.456])})
>>> s.upload({'string_v': ["", "abc", "123"]})
>>> s.upload({'date_v': [pd.NaT, None, np.nan, np.datetime64("2012-01-03", "D")]})

Check the uploaded object's data type in DolphinDB:

>>> s.run("typestr(bool_v)")
FAST BOOL VECTOR
>>> s.run("typestr(int_v)")
FAST INT VECTOR
>>> s.run("typestr(double_v)")
FAST DOUBLE VECTOR
>>> s.run("typestr(string_v)")
STRING VECTOR
>>> s.run("typestr(date_v)")
FAST DATE VECTOR

Example 3. Upload ANY vectors

There are two options to upload data as ANY vectors:

  • Specify dtype=object when constructing np.ndarray.
  • Upload data in a list or tuple object.

Note that both options require the uploaded data to contain two or more data types, or to contain nested vector-like data structures.

>>> s.upload({'list_v': [1.2, "abc"]})
>>> s.upload({'array_v': np.array([1, 1.2], dtype="object")})
>>> s.upload({'list_av': [[1, 2], [3]]})
>>> s.upload({'array_av': np.array([[1], [2, 3]], dtype="object")})

Check the uploaded object's data type in DolphinDB:

>>> s.run("typestr(list_v)")
ANY VECTOR
>>> s.run("typestr(array_v)")
ANY VECTOR
>>> s.run("typestr(list_av)")
ANY VECTOR
>>> s.run("typestr(array_av)")
ANY VECTOR

Pair

Uploading Python objects as DolphinDB pairs is currently not supported with PROTOCOL_DDB.

Matrix

If an np.ndarray's elements are vector-like data structures of the same length, it will be treated as a DolphinDB matrix. If the uploaded object is not an np.ndarray, but its elements are vector-like data structures of the same length, it will be uploaded as a DolphinDB ANY vector.

Example

>>> s.upload({'int_m': np.array([[1, 2], [2, 3], [3, 4]])})
>>> s.run("typestr(int_m)")
FAST LONG MATRIX
>>> s.upload({'any_vec': [[1, 2], [2, 3], [3, 4]]})
>>> s.run("typestr(any_vec)")
ANY VECTOR

Note: Two-dimensional ndarrays with only one row will be uploaded as a vector instead of a matrix.

Set

When uploading a Python set object, DolphinDB iterates through its elements for type conversion, and then uploads the object as a DolphinDB set. The data type of the resulting DolphinDB set is determined by the converted type of its elements.

Example

>>> s.upload({'long_set': {1, 2}})
>>> s.run("typestr(long_set)")
LONG SET
>>> s.upload({'double_set': {1.2, np.double(5.5), pd.NaT}})
>>> s.run("typestr(double_set)")
DOUBLE SET

Note

(1) DolphinDB sets do not support elements of multiple data types, or vector elements.

(2) During type conversion, null values are not treated as any data type. Also, a set comprised only of null values is not allowed.
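
The two notes above can be expressed as a pre-upload check. The sketch below is illustrative only: the function name predict_set_type is hypothetical, it covers built-in scalar types only, and it uses None and float NaN as stand-ins for all null values so that it runs without pandas.

```python
import math

# Built-in Python types and the DolphinDB types they map to in the scalar table.
_SCALAR_TYPES = {bool: "BOOL", int: "LONG", float: "DOUBLE", str: "STRING"}

def predict_set_type(py_set):
    """Return the expected typestr of an uploaded set, or raise if it cannot be uploaded."""
    # Note (2): nulls do not contribute a data type.
    non_null = [x for x in py_set
                if x is not None and not (isinstance(x, float) and math.isnan(x))]
    if not non_null:
        raise ValueError("a set made up only of null values cannot be uploaded")
    # Note (1): one data type only, and no vector elements (e.g. tuples).
    kinds = {_SCALAR_TYPES.get(type(x)) for x in non_null}
    if None in kinds or len(kinds) != 1:
        raise ValueError("set elements must be scalars of a single data type")
    return kinds.pop() + " SET"

print(predict_set_type({1, 2}))
print(predict_set_type({1.2, 5.5, None}))
```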

Dictionary

Similar to sets, when uploading a Python dict object, DolphinDB iterates through the dict's elements and converts their types. It then uploads the object as a DolphinDB dictionary.

Example

>>> s.upload({'long_dict': {'a': None, 'b': 1}})
>>> s.run("typestr(long_dict)")
STRING->LONG DICTIONARY
>>> s.upload({'any_dict1': {'a': 1, 'b': [1.1, 2.4], 'c': np.array([1, "a"], dtype="object")}})
>>> s.run("typestr(any_dict1)")
STRING->ANY DICTIONARY
>>> s.upload({'any_dict2': {1: [[1], [2, 3]], 2: [[1.1, np.nan], [3.3]]}})
>>> s.run("typestr(any_dict2)")
STRING->ANY DICTIONARY

Note

(1) DolphinDB dictionaries do not allow keys of multiple data types, but the values can be of different data types.

(2) When converting a dictionary’s elements, vector-like elements with embedded vector-like elements are converted as DolphinDB ANY vectors, not array vectors.

(3) During type conversion, null values are not treated as any data type. Also, a dictionary comprised only of null values is not allowed.

Table

Python uses pd.DataFrame to represent tables. When converting a pd.DataFrame, DolphinDB converts the object column-by-column as vector-like data structures. Unlike when converting Python vector-like data structures, the vector-like objects in pd.DataFrame columns are treated as DolphinDB array vectors; columns cannot be of the ANY vector type.

Null values in a table are converted using the same logic as for vector-like objects. Direct upload of data into certain DolphinDB data types, such as BLOB or INT128, is not supported. To upload these data types, a __DolphinDB_Type__ attribute must be specified when uploading tables to explicitly cast columns into particular DolphinDB data types (see Explicit Type Conversion).

Note:

pd.DataFrame supports only datetime64[ns] as a temporal data type. Time values uploaded to DolphinDB directly are therefore all converted to DolphinDB's NANOTIMESTAMP type.

Example 1

>>> df1 = pd.DataFrame({
...     'int_v': [1, 2, 3],
...     'long_v': np.array([None, 3, np.int64(3)], dtype="object"),
...     'float_v': np.array([np.nan, 1.2, 3.3], dtype="float32")
... })
>>> s.upload({'df1': df1})
>>> s.run("schema(df1)")['colDefs']
      name typeString  typeInt  extra comment
0    int_v       LONG        5    NaN        
1   long_v       LONG        5    NaN        
2  float_v      FLOAT       15    NaN   

When an uploaded object's dtype is "object", DolphinDB checks the data type of its elements and takes the type of the first non-null element as the object's type. Null values (None, pd.NaT, np.nan) are not assigned a data type.

In the example above, the int values in the "int_v" column (without nulls) are converted to int64 by pandas, so they are uploaded as DolphinDB's LONG type. The "long_v" column's dtype is "object", so its type becomes that of its first non-null value, Python int 3, which maps to DolphinDB's LONG type. The "float_v" column's dtype is "float32", so it is uploaded as DolphinDB's FLOAT type regardless of any null values.

Note: In NumPy and Pandas, null values are not allowed when an object's dtype is "bool", "int8", "int16", "int32" or "int64". Therefore, to upload table columns containing null values to DolphinDB, specify the dtype as "object" when constructing a pd.DataFrame.

Example 2

>>> df2 = pd.DataFrame({
...     'day_v': np.array(["2012-01-02", "2022-02-05"], dtype="datetime64[D]"),
...     'month_v': np.array([np.datetime64("2012-01", "M"), None], dtype="datetime64[M]"),
... })
>>> s.upload({'df2': df2})
>>> s.run("schema(df2)")['colDefs']
      name     typeString  typeInt  extra comment
0    day_v  NANOTIMESTAMP       14    NaN        
1  month_v  NANOTIMESTAMP       14    NaN    

Pandas supports only datetime64[ns] as a temporal type, so data cannot be directly uploaded as some DolphinDB data types, such as DATE and MONTH. You can specify the __DolphinDB_Type__ attribute to explicitly cast columns into particular DolphinDB data types.

>>> import dolphindb.settings as keys
>>> df2.__DolphinDB_Type__ = {
...     "day_v": keys.DT_DATE,
...     "month_v": keys.DT_MONTH,
... }
...
>>> s.upload({'df2': df2})
>>> s.run("schema(df2)")['colDefs']
      name typeString  typeInt  extra comment
0    day_v       DATE        6    NaN        
1  month_v      MONTH        7    NaN 

Example 3

>>> df3 = pd.DataFrame({
...     'long_av': [[1, None], [3]],
...     'double_av': np.array([[1.1], [np.nan, 3.3]], dtype="object")
... })
>>> s.upload({'df3': df3})
>>> s.run("schema(df3)")['colDefs']
        name typeString  typeInt  extra comment
0    long_av     LONG[]       69    NaN        
1  double_av   DOUBLE[]       80    NaN  

In this example, when a column from a pd.DataFrame contains vector-like data structures, this column is converted as a DolphinDB array vector, not an ANY vector.

Example 4

Since version 3.0.0.0, the DolphinDB Python API supports uploading the following pandas ExtensionDtypes: Boolean/Int8/Int16/Int32/Int64/Float32/Float64/String.

>>> import pandas as pd
>>> df4 = pd.DataFrame({
...     'bool': pd.Series([True, False, None], dtype=pd.BooleanDtype()),
...     'int64': pd.Series([1, -100, None], dtype=pd.Int64Dtype()),
...     'float64': pd.Series([1.1, -0.23, None], dtype=pd.Float64Dtype()),
...     'string': pd.Series(["abc", "def", None], dtype=pd.StringDtype()),
... })
...
>>> df4.dtypes
bool              boolean
int64               Int64
float64           Float64
string     string[python]
dtype: object
>>> s.upload({'df4': df4})
>>> s.run("schema(df4)")['colDefs']
      name typeString  typeInt  extra comment
0     bool       BOOL        1    NaN        
1    int64       LONG        5    NaN        
2  float64     DOUBLE       16    NaN        
3   string     STRING       18    NaN    

The following table shows the data type mapping when uploading pandas ExtensionDtype to DolphinDB server. For detailed information on explicit type conversion, refer to Explicit Type Conversion.

pandas ExtensionDtype  DolphinDB Data Type  Note
BooleanDtype           BOOL
Int8Dtype              CHAR
Int16Dtype             SHORT
Int32Dtype             INT
Int64Dtype             LONG
Float32Dtype           FLOAT
Float64Dtype           DOUBLE
StringDtype            SYMBOL               explicit type conversion required
StringDtype            STRING
StringDtype            UUID                 explicit type conversion required
StringDtype            IPADDR               explicit type conversion required
StringDtype            INT128               explicit type conversion required
StringDtype            BLOB                 explicit type conversion required

Example 5

Starting from DolphinDB Python API 1.30.22.4, you can upload pandas 2.0 DataFrames that use PyArrow as the storage backend.

>>> import pandas as pd # pandas version >= 2.0.0
>>> df5 = pd.DataFrame({
...     'int64': pd.Series([1, 2, None, 4], dtype="int64[pyarrow]"),
...     'float64': pd.Series([1.1, 2.2, None, 4.4], dtype="float64[pyarrow]"),
...     'string': pd.Series(["aa", "bb", None, "cc"], dtype="string[pyarrow]"),
... })
...
>>> df5.dtypes
int64       int64[pyarrow]
float64    double[pyarrow]
string     string[pyarrow]
dtype: object
>>> s.upload({'df5': df5})
>>> s.run("schema(df5)")['colDefs']
      name typeString  typeInt  extra comment
0    int64       LONG        5    NaN        
1  float64     DOUBLE       16    NaN        
2   string     STRING       18    NaN  

The table below shows the data type mappings between DataFrame/Series, PyArrow and DolphinDB. For details on explicit type conversion, see Explicit Type Conversion.

DataFrame/Series                                              PyArrow                               DolphinDB        Note
bool[pyarrow]                                                 pa.bool_()                            BOOL
int8[pyarrow]                                                 pa.int8()                             CHAR
int16[pyarrow]                                                pa.int16()                            SHORT
int32[pyarrow]                                                pa.int32()                            INT
int64[pyarrow]                                                pa.int64()                            LONG
date32[day][pyarrow]                                          pa.date32()                           DATE
date32[day][pyarrow]                                          pa.date32()                           MONTH            explicit type conversion required
time32[ms][pyarrow]                                           pa.time32("ms")                       TIME
time32[s][pyarrow]                                            pa.time32("s")                        MINUTE           explicit type conversion required
time32[s][pyarrow]                                            pa.time32("s")                        SECOND
timestamp[s][pyarrow]                                         pa.timestamp("s")                     DATETIME
timestamp[ms][pyarrow]                                        pa.timestamp("ms")                    TIMESTAMP
time64[ns][pyarrow]                                           pa.time64("ns")                       NANOTIME
timestamp[ns][pyarrow]                                        pa.timestamp("ns")                    NANOTIMESTAMP
float[pyarrow]                                                pa.float32()                          FLOAT
double[pyarrow]                                               pa.float64()                          DOUBLE
dictionary<values=string, indices=int32, ordered=0>[pyarrow]  pa.dictionary(pa.int32(), pa.utf8())  SYMBOL
string[pyarrow]                                               pa.utf8()                             STRING
fixed_size_binary[16][pyarrow]                                pa.binary(16)                         UUID             explicit type conversion required
timestamp[s][pyarrow]                                         pa.timestamp("s")                     DATEHOUR         explicit type conversion required
string[pyarrow]                                               pa.utf8()                             IPADDR           explicit type conversion required
fixed_size_binary[16][pyarrow]                                pa.binary(16)                         INT128
large_binary[pyarrow]                                         pa.large_binary()                     BLOB
decimal128(38, S)[pyarrow]                                    pa.decimal128(38, S)                  DECIMAL32(S)     explicit type conversion required
decimal128(38, S)[pyarrow]                                    pa.decimal128(38, S)                  DECIMAL64(S)
decimal128(38, S)[pyarrow]                                    pa.decimal128(38, S)                  DECIMAL128(S)
list<item: T>[pyarrow]                                        pa.list_(T)                           ARRAYVECTOR      List arrays created by pa.list_ map to DolphinDB array vectors
list<item: int32>[pyarrow]                                    pa.list_(pa.int32())                  INT ARRAYVECTOR  Example of the row above

Note: Starting from DolphinDB Python API version 1.30.22.6, PROTOCOL_DDB supports uploading Decimal128 data. In earlier versions, pyarrow.decimal128 values were uploaded as Decimal64. Now pyarrow.decimal128 values are uploaded as Decimal128 by default. Alternatively, you can upload pyarrow.decimal128 data as Decimal32 or Decimal64 by explicitly specifying those types.

Deserialization: From DolphinDB to Python (When pickleTableToList=False)

In this section, we will demonstrate how various DolphinDB data types and values are downloaded as Python objects with PROTOCOL_DDB enabled. We will use the run() method as an example.

Note: In the following tables, "np.datetime64[D]" in the "Python Data Type" column indicates that the object’s type is np.datetime64 and dtype=datetime64[D].

Scalar

DolphinDB Data Type  DolphinDB Example                             Python Data Type   Python Example
VOID                 NULL                                          NoneType           None
INT                  int(NULL)                                     NoneType           None
STRING               string(NULL)                                  NoneType           None
BOOL                 true                                          bool               True
CHAR                 'a'                                           int                97
SHORT                224h                                          int                224
INT                  16                                            int                16
LONG                 3000l                                         int                3000
DATE                 2013.06.13                                    np.datetime64[D]   2013-06-13
MONTH                2012.06M                                      np.datetime64[M]   2012-06
TIME                 13:30:10.008                                  np.datetime64[ms]  1970-01-01T13:30:10.008
MINUTE               13:30m                                        np.datetime64[m]   1970-01-01T13:30
SECOND               13:30:10                                      np.datetime64[s]   1970-01-01T13:30:10
DATETIME             2012.06.13T13:30:10                           np.datetime64[s]   2012-06-13T13:30:10
TIMESTAMP            2012.06.13T13:30:10.008                       np.datetime64[ms]  2012-06-13T13:30:10.008
NANOTIME             13:30:10.008007006                            np.datetime64[ns]  1970-01-01T13:30:10.008007006
NANOTIMESTAMP        2012.06.13T13:30:10.008007006                 np.datetime64[ns]  2012-06-13T13:30:10.008007006
FLOAT                2.1f                                          float              2.0999999046325684
DOUBLE               2.1                                           float              2.1
SYMBOL               ---                                           ---                ---
STRING               "Hello"                                       str                "Hello"
UUID                 uuid("5d212a78-cc48-e3b1-4235-b4d91473ee87")  str                "5d212a78-cc48-e3b1-4235-b4d91473ee87"
DATEHOUR             datehour(2012.06.13T13:30:10)                 np.datetime64[h]   2012-06-13T13
IPADDR               ipaddr("192.168.1.13")                        str                "192.168.1.13"
INT128               int128("e1671797c52e15f763380b45e841ec32")    str                "e1671797c52e15f763380b45e841ec32"
BLOB                 blob("xxxyyyzzz")                             str                "xxxyyyzzz"
DECIMAL32            decimal32(1.111, 4)                           decimal.Decimal    1.1110
DECIMAL64            decimal64(1.123456789, 5)                     decimal.Decimal    1.12345

Note 1: In DolphinDB, null values can have data types other than VOID, such as INT and STRING. When downloaded to Python, DolphinDB's null scalar values will be converted to Python's None.

Note 2: DolphinDB does not have scalar SYMBOL values.
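
Two rows of the scalar table can look surprising: FLOAT 2.1f arrives as 2.0999999046325684, and CHAR 'a' arrives as 97. The sketch below reproduces both values with the standard library only. It assumes that a DolphinDB FLOAT is a 32-bit floating-point number that is widened to a 64-bit Python float on download; the helper name as_float32 is hypothetical.

```python
import struct

def as_float32(value):
    """Round a Python float to the nearest 32-bit float, then widen it back to 64 bits."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

print(as_float32(2.1))   # 2.0999999046325684
print(ord("a"))          # 97, the character code of 'a'
```

The extra digits are therefore the exact value of the nearest 32-bit float, not a transmission error.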

Vector

DolphinDB vectors are usually mapped to NumPy's numpy.ndarray in Python. However, DolphinDB's ANY vectors are mapped to Python's lists.

The following table shows how different types of vectors in DolphinDB are mapped to the dtype of numpy.ndarray in Python:

DolphinDB Data Type    np.dtype
BOOL (without nulls)   bool
CHAR (without nulls)   int8
SHORT (without nulls)  int16
INT (without nulls)    int32
LONG (without nulls)   int64
DATE                   datetime64[D]
MONTH                  datetime64[M]
TIME                   datetime64[ms]
TIMESTAMP              datetime64[ms]
MINUTE                 datetime64[m]
SECOND                 datetime64[s]
DATETIME               datetime64[s]
NANOTIME               datetime64[ns]
NANOTIMESTAMP          datetime64[ns]
FLOAT                  float32
DOUBLE                 float64
CHAR (with nulls)      float64
SHORT (with nulls)     float64
INT (with nulls)       float64
LONG (with nulls)      float64
DATEHOUR               datetime64[h]
BOOL (with nulls)      object
SYMBOL                 object
STRING                 object
UUID                   object
IPADDR                 object
INT128                 object
BLOB                   object
DECIMAL32              object
DECIMAL64              object
Array Vector           object

Note 1: NumPy's np.ndarray does not support INT null values. If a DolphinDB INT vector contains null values, the vector will be converted to float64 and the null values to np.nan.

Note 2: If a BOOL vector contains null values, the vector will be converted to np.ndarray with dtype=object rather than dtype=bool.

Note 3: DolphinDB's array vectors will be downloaded as Python objects with dtype=object. Each element of the Python object is an np.ndarray converted from one of the original array vector's elements.

Note 4: For DolphinDB Python API 1.30.17.2 or later, DolphinDB ANY vectors are downloaded as Python lists. For earlier versions, DolphinDB ANY vectors are downloaded as np.ndarray in Python.

Note 5: DolphinDB Python API 1.30.22.3 and later versions support downloading Decimal32 and Decimal64 array vectors.
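
When writing code that consumes downloaded vectors, it can help to have the mapping above available as a lookup. The sketch below restates the table and Notes 1 and 2 in plain Python. The function name expected_dtype and the dictionary names are hypothetical and are not part of the DolphinDB Python API.

```python
# Integral types: dtype used when the vector holds no nulls.
_INTEGRAL = {"BOOL": "bool", "CHAR": "int8", "SHORT": "int16",
             "INT": "int32", "LONG": "int64"}

# Types whose dtype does not depend on the presence of nulls.
_FIXED = {
    "DATE": "datetime64[D]", "MONTH": "datetime64[M]",
    "TIME": "datetime64[ms]", "TIMESTAMP": "datetime64[ms]",
    "MINUTE": "datetime64[m]",
    "SECOND": "datetime64[s]", "DATETIME": "datetime64[s]",
    "NANOTIME": "datetime64[ns]", "NANOTIMESTAMP": "datetime64[ns]",
    "DATEHOUR": "datetime64[h]",
    "FLOAT": "float32", "DOUBLE": "float64",
}

# Types that are always downloaded with dtype=object.
_OBJECT = {"SYMBOL", "STRING", "UUID", "IPADDR", "INT128", "BLOB",
           "DECIMAL32", "DECIMAL64"}

def expected_dtype(ddb_type, has_nulls=False):
    """Return the np.dtype name the table above lists for a DolphinDB vector type."""
    if ddb_type in _INTEGRAL:
        if not has_nulls:
            return _INTEGRAL[ddb_type]
        # BOOL with nulls becomes object; other integral types become float64.
        return "object" if ddb_type == "BOOL" else "float64"
    if ddb_type in _FIXED:
        return _FIXED[ddb_type]
    if ddb_type in _OBJECT:
        return "object"
    raise KeyError(ddb_type)

print(expected_dtype("INT"), expected_dtype("INT", has_nulls=True))
```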

Example

>>> re = s.run("[true, false]")
>>> re
[ True False]
>>> type(re)
<class 'numpy.ndarray'>
>>> re.dtype
bool

>>> re = s.run("[true, NULL]")
>>> re
[True None]
>>> re.dtype
object

In this example, we first download a BOOL vector without null values from DolphinDB. The result is an np.ndarray object with dtype=bool in Python. We then download a BOOL vector containing a null value; the downloaded object's dtype becomes "object".

In contrast, when downloading INT vectors containing null values, the downloaded object’s dtype is float64:

>>> re = s.run("[1, 2, 3, NULL]")
>>> re
[ 1.  2.  3. nan]
>>> re.dtype
float64

When downloading a DolphinDB array vector, DolphinDB iterates through the elements of the array vector and converts the data type of each element as if it were a standard vector. In the following example, one element of the INT array vector to be downloaded contains a NULL. As a result, the other array vector elements are downloaded with dtype=int32, whereas the element with a NULL is downloaded with dtype=float64.

>>> s.run("arrayVector(2 3 4, [1, 2, 3, NULL])")
[array([1, 2], dtype=int32) array([3], dtype=int32) array([nan])]
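
The way the script above groups its values can be sketched in plain Python. This illustration assumes, consistent with the output shown, that each entry of the first argument to arrayVector is the end offset (exclusive) of one row in the flat value vector. The helper name split_array_vector is hypothetical, and None stands in for NULL.

```python
def split_array_vector(index, values):
    """Split flat values into rows; index[i] is the end offset (exclusive) of row i."""
    rows, start = [], 0
    for end in index:
        rows.append(values[start:end])
        start = end
    return rows

print(split_array_vector([2, 3, 4], [1, 2, 3, None]))  # [[1, 2], [3], [None]]
```

Only the last row contains a null, which is why only that row is downloaded with dtype=float64.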

When downloading an ANY vector, DolphinDB iterates through the elements of the ANY vector and converts each element based on the associated rules. In the following example, we download a DolphinDB ANY vector containing another ANY vector, and both are converted to a Python list.

>>> re = s.run('''(1, 2, [12, "aaa"])''')
>>> re
[1, 2, [12, 'aaa']]
>>> type(re)
<class 'list'>
>>> type(re[2])
<class 'list'>

For DolphinDB Python API 1.30.17.1 or earlier, ANY vectors are converted to np.ndarray with dtype=object in Python.

Pair

DolphinDB pairs are mapped to Python lists where each element is converted based on the rules for converting DolphinDB scalars.

Example

>>> s.run("100:0")
[100, 0]

Matrix

DolphinDB matrices are mapped to Python np.ndarray. The following table shows the mappings:

DolphinDB Data Type    np.dtype
BOOL (without nulls)   bool
CHAR (without nulls)   int8
SHORT (without nulls)  int16
INT (without nulls)    int32
LONG (without nulls)   int64
DATE                   datetime64[D]
MONTH                  datetime64[M]
TIME                   datetime64[ms]
TIMESTAMP              datetime64[ms]
MINUTE                 datetime64[m]
SECOND                 datetime64[s]
DATETIME               datetime64[s]
NANOTIME               datetime64[ns]
NANOTIMESTAMP          datetime64[ns]
FLOAT                  float32
DOUBLE                 float64
CHAR (with nulls)      float64
SHORT (with nulls)     float64
INT (with nulls)       float64
LONG (with nulls)      float64
DATEHOUR               datetime64[h]
BOOL (with nulls)      object

Example

>>> s.run("""
...     mtx = 1..12$4:3;
...     mtx.rename!(1 2 3 4, `c1`c2`c3);
...     mtx
... """)
[array([[ 1,  5,  9],
       [ 2,  6, 10],
       [ 3,  7, 11],
       [ 4,  8, 12]], dtype=int32), array([1, 2, 3, 4], dtype=int32), array(['c1', 'c2', 'c3'], dtype=object)]

DolphinDB matrices are downloaded to Python lists with column and row names retained. If the matrix doesn’t contain row or column names, the corresponding element in the Python list is filled with a None.
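
Because the result is a three-element list, it is convenient to unpack it into named parts before use. The following is a minimal sketch: the helper name unpack_matrix is hypothetical, and nested Python lists stand in for the np.ndarray objects that s.run() returns so that the snippet runs without a server.

```python
def unpack_matrix(result):
    """Split the list returned for a matrix into data, row labels and column labels."""
    data, row_labels, col_labels = result
    return {"data": data, "rows": row_labels, "columns": col_labels}

# Stand-in for the value returned by the s.run(...) call in the example above.
result = [
    [[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]],
    [1, 2, 3, 4],
    ["c1", "c2", "c3"],
]
m = unpack_matrix(result)

# Look up the cell at row label 2, column label "c3".
value = m["data"][m["rows"].index(2)][m["columns"].index("c3")]
print(value)  # 10
```

For a matrix without labels, the rows or columns entry would be None, so label lookups like the one above need a guard.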

Note: With PROTOCOL_DDB, matrices of temporal data types follow conversion rules similar to those for vectors. With PROTOCOL_PICKLE, time values are all mapped to datetime64[ns]. For more information, see PROTOCOL_PICKLE.

Set

DolphinDB sets are mapped to sets in Python. Each element in the set is converted based on the rules for converting scalars (see section "Scalar").

Note: Only sets of the CHAR, SHORT, INT, LONG, FLOAT, DOUBLE, STRING or SYMBOL type can be downloaded to Python.

Example

>>> re = s.run("set(1..5)")
>>> re
{1, 2, 3, 4, 5}
>>> type(re)
<class 'set'>

Dict

DolphinDB dictionaries are mapped to dicts in Python. During the conversion, DolphinDB iterates through the key-value pairs in the dictionary. Each key is converted based on the rules for converting scalars. Each value is converted based on its data form and type, as described in the section "Deserialization: From DolphinDB to Python (When pickleTableToList=False)".

Example

>>> re = s.run('''{"a": 123, "b": [1.1, 2.2]}''')
>>> re
{'b': array([1.1, 2.2]), 'a': 123}
>>> type(re)
<class 'dict'>

Table

DolphinDB tables are mapped to pandas.DataFrame in Python. During conversion, each table column is converted based on the rules for converting vectors.

However, unlike when converting vectors, time values in the table are all converted to the datetime64[ns] type, which is the only time type supported by Python pandas.

Note: Table columns that are array vectors can be downloaded. Downloading table columns that are ANY vectors is not supported.

Deserialization: From DolphinDB to Python (When pickleTableToList=True)

When the additional parameter pickleTableToList is enabled, if the return value of the executed script is a table, it will be downloaded as a Python list instead of a pd.DataFrame, where each element of the list (np.ndarray) represents a column of the table.

Table

Unlike when converting vector-like data structures, when converting a DolphinDB table, each column is treated as part of the table rather than as a separate vector, so temporal types are converted to datetime64[ns].

Note: When downloading a table containing a column of array vectors, ensure that each element in the array vector has the same size.

Example

>>> re = s.run("table([1, NULL] as a, [2012.01.02, NULL] as b)", pickleTableToList=True)
>>> re
[array([ 1., nan]), array(['2012-01-02T00:00:00.000000000', 'NaT'],
      dtype='datetime64[ns]')]
>>> type(re)
<class 'list'>
>>> re[0].dtype
float64
>>> re[1].dtype
datetime64[ns]

>>> s.run("table(arrayVector(1 2 3, [1, 2, 3]) as a)", pickleTableToList=True)
[array([[1],
       [2],
       [3]], dtype=int32)]

In the example above, with pickleTableToList enabled, the table is downloaded to Python as a list with each element being an np.ndarray in Python. INT columns containing null values are downloaded with dtype=float64. Columns of time values are downloaded with dtype=datetime64[ns].
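
The list shown in the example holds only the column data, so the caller has to pair it with column names. The following is a minimal sketch of that step. The helper name columns_to_dict is hypothetical, the names "a" and "b" are taken from the script in the example, and plain lists stand in for the downloaded np.ndarray columns.

```python
def columns_to_dict(names, columns):
    """Pair caller-supplied column names with the downloaded list of columns."""
    if len(names) != len(columns):
        raise ValueError("expected one name per column")
    return dict(zip(names, columns))

# Stand-in for the list returned with pickleTableToList=True.
columns = [[1.0, None], ["2012-01-02", None]]
table = columns_to_dict(["a", "b"], columns)
print(table["a"])  # [1.0, None]
```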