pandas

Overview

DolphinDB Python Parser provides its own pandas library, DolphinDB pandas, which is based on Python pandas (V2.1.0). It differs from Python pandas in the following aspects:

  • DolphinDB pandas introduces lazy mode and non-lazy mode for Series and DataFrame. The two modes evaluate calculations differently:

    • In non-lazy mode, calculations on Series and DataFrames are executed immediately, as in Python pandas.

    • In lazy mode, calculations are deferred. Instead of computing results directly, function calls are stored to be invoked later, improving calculation performance.

  • Built into the DolphinDB server, DolphinDB pandas integrates seamlessly with DolphinDB databases and computing engines to process large-scale data efficiently. Because it requires no data import or data type conversion and can process data concurrently, it improves data analysis efficiency.
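
The difference between the two modes can be illustrated with a small, self-contained sketch in plain Python. It is a conceptual model only: LazySeries, map and compute are hypothetical names chosen for this illustration and are not part of the DolphinDB pandas API, whose internal mechanism is not described here.

```python
# Conceptual sketch only. LazySeries, map() and compute() are hypothetical
# names for illustration; they are NOT the DolphinDB pandas API.
class LazySeries:
    def __init__(self, data, ops=None):
        self._data = list(data)
        self._ops = list(ops or [])  # function calls recorded for later

    def map(self, func):
        # Lazy: store the call and return a new object without computing.
        return LazySeries(self._data, self._ops + [func])

    def compute(self):
        # Invoke every stored call, in order, in a single pass over the data.
        result = []
        for value in self._data:
            for func in self._ops:
                value = func(value)
            result.append(value)
        return result


calls = []  # records each invocation of double() so deferral is observable


def double(x):
    calls.append(x)
    return x * 2


def add_one(x):
    return x + 1


# Non-lazy style: each step runs immediately and builds an intermediate list.
eager_result = [add_one(v) for v in [double(v) for v in [1, 2, 3]]]
calls_after_eager = len(calls)

# Lazy style: chaining only stores the calls; nothing has run yet.
lazy = LazySeries([1, 2, 3]).map(double).map(add_one)
calls_before_compute = len(calls) - calls_after_eager

# The stored calls are invoked here.
lazy_result = lazy.compute()
```

In this sketch both styles produce the same values; the lazy variant simply postpones the work until compute() is called and avoids materializing the intermediate list.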

The current version of DolphinDB pandas only supports Series and DataFrame. For details about related functions and parameters, please refer to the following sections.

How to Use

To use DolphinDB pandas, import it with the following command in the DolphinDB server:

import pandas as pd

Compatibility Statement

The behavior of functions in DolphinDB pandas differs from Python pandas in some respects. This section describes the incompatibilities that apply to all functions in DolphinDB pandas:

  • Strings cannot be converted to numeric types for calculations.

  • A NULL value is treated as the smallest value in ranking.

  • A complex number will be converted to an integer before being sorted (with sort or groupby).

  • Setting a parameter to "None" is equivalent to not specifying the parameter.

  • "None" equals "dolphindb.NULL", that is, both None==dolphindb.NULL and None==None return True.

  • When a Series or DataFrame object is initialized from a list whose elements are all "None", DolphinDB pandas converts the list to "dolphindb.DOUBLE".

  • In lazy mode, DolphinDB pandas returns an empty Series or DataFrame if the indices are out of bounds.
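
As an illustration of the ranking rule listed above, the following plain-Python sketch ranks a list in ascending order while treating None as smaller than any number. The function name rank_nulls_first is hypothetical, and the handling of ties (earlier elements rank first) is an assumption made for this sketch rather than documented DolphinDB pandas behavior. For contrast, Python pandas by default leaves missing values unranked (na_option="keep" in Series.rank).

```python
# Illustrative sketch: rank_nulls_first is a hypothetical helper, not a
# DolphinDB pandas function. Tie handling here is an assumption.
def rank_nulls_first(values):
    """Return 1-based ascending ranks, treating None as the smallest value."""

    def sort_key(index):
        value = values[index]
        # (False, 0) for None sorts before (True, value) for real numbers.
        return (value is not None, 0 if value is None else value)

    # sorted() is stable, so equal keys keep their original order.
    order = sorted(range(len(values)), key=sort_key)

    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


ranks = rank_nulls_first([3.0, None, 1.0])
print(ranks)  # [3, 1, 2]: the None element receives rank 1
```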