Series
As one of the most common data structures in DolphinDB pandas, Series is a one-dimensional labeled array capable of holding various data types, including numeric values, strings, and Boolean values. Each Series consists of two components:
-
Data: contains the actual values stored in Series.
-
Index: contains labels used to identify and access the actual values.
Constructor
Series(data=None, index=None, lazy=False)
-
data must be strongly typed and can be a Python list, DolphinDB vector, pandas Series, or None.
-
index must be strongly typed and can be a Python list, DolphinDB vector, pandas Series, index, or None. Multiple indices are currently not supported.
-
lazy (optional) is a Boolean value indicating whether to create a lazy Series. The default value is False. A lazy Series is a view of the original object, and calculations on it are not executed immediately; a non-lazy Series is a copy of the original object, and calculations on it are executed immediately.
Note:
-
The current version of DolphinDB pandas does not support specifying data as the column of a partitioned table.
-
If data is NULL, it is automatically filled with None and the data type is DOUBLE.
-
If data is specified as a pandas Series, index cannot be specified.
-
If lazy = True, index cannot contain a list with all items being None, such as [None,None].
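As a rough illustration of the data and index arguments, the sketch below builds a Series with standard pandas, used here only as a stand-in. It assumes DolphinDB pandas accepts the same call shape; the DolphinDB-specific lazy argument is left out because standard pandas has no such parameter.

```python
import pandas as pd  # standard pandas, a stand-in for DolphinDB pandas

# data and index given as plain Python lists, each holding a single type
temps = pd.Series(data=[20, 21, 12], index=["London", "New York", "Helsinki"])
print(temps.index.tolist())  # the labels
print(temps.tolist())        # the values

# data given as an existing Series: the labels come from that Series,
# so no separate index argument is passed
same = pd.Series(temps)
print(same.index.tolist())
```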
Pre-Use Note:
-
The dtype parameter of Series functions can only be of the data types supported by DolphinDB, such as ddb.DOUBLE and ddb.STRING.
-
Some Series functions only support part of the parameters, and the rest of the parameters must be passed through keywords. For instance, the Series.groupby function in DolphinDB pandas only supports the parameters by and dropna, as follows: Series.groupby(by=["a", "a", "b", "b"], dropna=False).
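The keyword-style groupby call can be tried with standard pandas, which serves here as a stand-in on the assumption that DolphinDB pandas returns the same groups. The values in the Series are made up for the illustration.

```python
import pandas as pd  # standard pandas, a stand-in for DolphinDB pandas

s = pd.Series([1, 2, 3, 4])

# Both supported parameters are passed by keyword; the list given to `by`
# assigns one group key to each element of the Series.
totals = s.groupby(by=["a", "a", "b", "b"], dropna=False).sum()
print(totals.to_dict())
```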
Attributes
Axis
DolphinDB pandas currently supports the following axis attributes:
Function | Description | Compatibility Statement |
---|---|---|
Series.index | The index (axis labels) of the Series. | Multiple indices are not supported. |
Series.values | Return Series as a DolphinDB vector. | |
Conversion
DolphinDB pandas currently supports the following functions for conversion:
Function | Description | Compatibility Statement |
---|---|---|
Series.astype(dtype) | Cast a pandas object to a specified dtype. | Parameters copy and errors are currently not supported. |
Series.copy() | Copy the data and indices of the calling object. | Parameter deep is currently not supported. |
Series.bool() | Return the Boolean scalar value presented in the Series. | |
Series.to_list() | Return a list of the Series values. | |
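A minimal sketch of the conversion functions, written against standard pandas as a stand-in. The dtype string "float64" belongs to standard pandas; in DolphinDB pandas the dtype would be a DolphinDB type such as ddb.DOUBLE instead.

```python
import pandas as pd  # standard pandas, a stand-in for DolphinDB pandas

s = pd.Series([1, 2, 3])

as_float = s.astype("float64")  # DolphinDB pandas would take e.g. ddb.DOUBLE
duplicate = s.copy()            # independent copy of data and index
duplicate.iloc[0] = 99          # changes the copy only
values = s.to_list()            # plain Python list of the original values

print(as_float.dtype, duplicate.tolist(), values)
```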
Indexing / Iteration
DolphinDB pandas currently supports the following functions for indexing or iteration:
Function | Description | Compatibility Statement |
---|---|---|
Series.get(key) | Get the item from the object for a given key. | Series.get(None) or Series.get(ddb.NULL) returns None. |
Series.at | Access a single value by label. | When using at[i]=val to change an item, if the data types of val and the Series differ, an attempt will be made to convert the Series type. If the conversion fails, an error will be raised. Accessing a lazy Series with the Series.at method returns a Series. |
Series.iat | Access a single value by integer position. | Same as above. |
Series.loc | Access a group of rows and columns by label(s) or a Boolean array. | Same as above. |
Series.iloc | Purely integer-location based indexing for selection by position. | Same as above. |
Series.keys() | Return alias for index. | |
Series.pop(item) | Drop items from the Series and return the dropped items. | |
Series.item() | Return the scalar value presented in the Series. | |
Series.xs(key[, axis, level, drop_level]) | Return a cross-section from the Series for the given key value. | |
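The sketch below exercises get, keys, pop, and item with standard pandas as a stand-in. It assumes DolphinDB pandas behaves the same way for a non-lazy Series; the city data is reused from the examples on this page.

```python
import pandas as pd  # standard pandas, a stand-in for DolphinDB pandas

s = pd.Series([20, 21, 12], index=["London", "New York", "Helsinki"])

print(s.get("London"))       # value stored under an existing label
print(s.get("Paris"))        # missing label: get returns None instead of raising
print(list(s.keys()))        # keys() is an alias for the index

removed = s.pop("Helsinki")  # removes the item in place and returns its value
print(removed, list(s.keys()))

print(s.iloc[[0]].item())    # item() needs a Series holding exactly one value
```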
Binary Operator Functions
DolphinDB pandas supports all binary operator functions of Python pandas.Series.
Note:
-
The parameter level is not supported.
-
The parameter other in all functions only supports scalar, DolphinDB vector, list, and Series.
-
If the parameter other in eq is a scalar, the parameter fill_value cannot be specified.
-
The parameter func in combine only supports DolphinDB's built-in functions and does not support user-defined functions.
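To make the accepted forms of other concrete, here is a small sketch in standard pandas, used as a stand-in. Python's built-in max is passed to combine only because standard pandas is being used; in DolphinDB pandas the func argument would be a DolphinDB built-in function, as the note above states.

```python
import pandas as pd  # standard pandas, a stand-in for DolphinDB pandas

s = pd.Series([1, 2, 3])
other = pd.Series([1, 0, 5])

plus_scalar = s.add(10)          # other as a scalar
plus_list = s.add([1, 1, 1])     # other as a list of the same length
equal = s.eq(other)              # other as a Series, no fill_value
larger = s.combine(other, max)   # element-wise combination with a built-in

print(plus_scalar.tolist(), plus_list.tolist(), equal.tolist(), larger.tolist())
```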
Function Application / GroupBy / Window
The parameter func of the following functions can only be built-in functions of pandas and user-defined functions (lambda expressions included). The parameter **kwargs is currently not supported.
Function | Compatibility Statement |
---|---|
Series.apply(func[, args]) | The parameter convert_dtype is not supported. |
Series.agg(func) | The parameter axis is not supported. |
Series.aggregate(func) | The parameter axis is not supported. |
Series.transform(func) | The parameter axis is not supported. |
Series.groupby(by, dropna=True) | Parameters axis, level, as_index, sort, group_keys, squeeze, and observed are not supported. |
rolling | Parameters center, win_type, on, axis, closed, step, and method are not supported. |
ewm | Parameters ignore_na, axis, times, and method are not supported. |
map | The parameter arg cannot be specified as a Series. |
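A short sketch of the function-application calls listed above, using standard pandas as a stand-in and only parameters the table does not rule out. The lambda and the mapping dictionary are made up for the illustration.

```python
import pandas as pd  # standard pandas, a stand-in for DolphinDB pandas

s = pd.Series([1, 2, 3, 4])

doubled = s.apply(lambda x: x * 2)    # user-defined function (lambda)
total = s.agg("sum")                  # built-in aggregation referenced by name
rolled = s.rolling(2).sum()           # window of 2; first result is missing
named = s.map({1: "one", 2: "two"})   # arg as a dict, not a Series

print(doubled.tolist(), total)
print(rolled.tolist())
print(named.tolist())
```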
Computations / Descriptive Statistics
DolphinDB pandas supports the following functions related to computations / descriptive statistics of Python pandas.Series: abs, all, any, autocorr, between, corr, count, cov, diff, kurt, kurtosis, mad, max, mean, median, min, mode, nlargest, nsmallest, prod, sem, skew, std, sum, var, unique, nunique, is_unique, is_monotonic_increasing, is_monotonic_decreasing, value_counts.
Note:
-
Parameters axis, skipna, bool_only, fill_value, min_count, and ddof are not supported by any of these functions.
-
Except for the functions count, mad, max, mean, median, min, prod, skew, sum, sem, std, and var, the parameter level is not supported.
-
Some parameters are currently not supported by certain functions, as shown in the following table.
Function | Compatibility Statement |
---|---|
corr | Parameters method and min_periods are not supported. |
cov | Parameters min_periods and ddof are not supported. |
diff | The parameter periods is not supported. |
kurt | No parameter is supported. |
kurtosis | No parameter is supported. |
std | The parameter ddof is not supported. |
value_counts | The parameter bins is not supported. |
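A few of the statistics functions above, called without any of the unsupported parameters. The sketch uses standard pandas as a stand-in and assumes DolphinDB pandas returns the same values for this input.

```python
import pandas as pd  # standard pandas, a stand-in for DolphinDB pandas

s = pd.Series([3, 1, 2, 3])

print(s.max(), s.min(), s.sum())   # simple reductions
print(s.mean(), s.median())        # central tendency
print(s.nunique(), s.is_unique)    # distinct values and uniqueness check
print(s.nlargest(2).tolist())      # the two largest values
counts = s.value_counts()          # occurrences per distinct value
print(counts[3])
```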
Reindexing / Selection / Label Manipulation
DolphinDB pandas supports the following functions related to reindexing / selection / label manipulation of Python pandas.Series: drop, drop_duplicates, first, head, idxmax, idxmin, isin, last, reindex, reset_index.
Some parameters are currently not supported by certain functions, as shown in the following table.
Function | Compatibility Statement |
---|---|
drop | Parameters axis, columns, inplace, and errors are not supported. |
drop_duplicates | Parameters inplace and ignore_index are not supported. |
idxmax/idxmin | The parameter axis is not supported. |
reindex | Parameters copy, level, and tolerance are not supported. |
reset_index | Parameters inplace and allow_duplicates are not supported. |
head | The parameter n cannot be 0. |
take | The parameter axis is not supported. |
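The selection and relabeling functions can be sketched as follows with standard pandas, used as a stand-in. Only parameters not excluded by the table are passed, and head is called with a non-zero n.

```python
import pandas as pd  # standard pandas, a stand-in for DolphinDB pandas

s = pd.Series([20, 21, 12], index=["London", "New York", "Helsinki"])

without_london = s.drop(["London"])         # returns a new Series
first_two = s.head(2)                       # n must not be 0 in DolphinDB pandas
hottest, coldest = s.idxmax(), s.idxmin()   # labels of the extreme values
flags = s.isin([20, 12])                    # membership test per element
reordered = s.reindex(["Helsinki", "London"])
renumbered = s.reset_index(drop=True)       # back to positions 0..n-1

print(without_london.index.tolist(), first_two.tolist())
print(hottest, coldest, flags.tolist())
print(reordered.tolist(), renumbered.index.tolist())
```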
Missing Data Handling
DolphinDB pandas supports the following functions related to missing data handling of Python pandas.Series: backfill, isnull, notnull, pad.
Note: Parameters axis, downcast, inplace, and limit are not supported by any of these functions.
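A minimal sketch of the missing-data functions with standard pandas as a stand-in. Recent standard pandas releases spell pad and backfill as ffill and bfill, so those names are used here; that the DolphinDB pandas pad and backfill functions fill values in the same way is an assumption.

```python
import pandas as pd  # standard pandas, a stand-in for DolphinDB pandas

s = pd.Series([1.0, None, 3.0])

print(s.isnull().tolist())    # True where a value is missing
print(s.notnull().tolist())   # the inverse mask

forward = s.ffill()    # counterpart of pad: carry the last valid value forward
backward = s.bfill()   # counterpart of backfill: use the next valid value
print(forward.tolist(), backward.tolist())
```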
Reshaping / Sorting
DolphinDB pandas supports the following functions related to reshaping / sorting of Python pandas.Series: argsort, argmin, argmax, sort_values, sort_index.
Note: The parameter axis is not supported by any of these functions. Some other parameters are currently not supported by certain functions, as shown in the following table.
Function | Compatibility Statement |
---|---|
argsort | Parameters kind and na_position are not supported. |
sort_values/sort_index | Parameters inplace, kind, and na_position are not supported. |
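The sorting functions, called with default arguments only, can be sketched with standard pandas as a stand-in. It is assumed that DolphinDB pandas produces the same ordering for this input.

```python
import pandas as pd  # standard pandas, a stand-in for DolphinDB pandas

s = pd.Series([20, 21, 12], index=["London", "New York", "Helsinki"])

by_value = s.sort_values()   # ascending by value
by_label = s.sort_index()    # ascending by label
order = s.argsort()          # positions that would sort the values
print(by_value.tolist(), by_label.index.tolist(), order.tolist())
print(s.argmax(), s.argmin())  # positions of the largest and smallest values
```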
Combining / Comparing / Joining / Merging
DolphinDB pandas supports all functions related to combining / comparing / joining / merging of Python pandas.Series. However, some parameters are currently not supported by certain functions, as shown in the following table.
Function | Compatibility Statement |
---|---|
compare | The parameter align_axis can only be 1. |
update | |
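As an illustration of compare with align_axis=1, the sketch below uses standard pandas as a stand-in. The two input Series are made up, and the column names self and other are those produced by standard pandas; whether DolphinDB pandas labels the result identically is an assumption.

```python
import pandas as pd  # standard pandas, a stand-in for DolphinDB pandas

before = pd.Series([1, 2, 3])
after = pd.Series([1, 5, 3])

# align_axis=1 places the two versions side by side, one row per difference
diff = before.compare(after, align_axis=1)
print(diff)
```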
Time Series-Related
DolphinDB pandas supports most functions related to time series of Python pandas.Series, except for the functions tz_convert and tz_localize. However, some parameters are currently not supported by certain functions, as shown in the following table.
Function | Compatibility Statement |
---|---|
asfreq | Parameters method, how, normalize, and fill_value are not supported. |
asof | The parameter subset is not supported. |
shift | The parameter axis is not supported. The parameter freq only takes the values of rule listed in asFreq. |
resample | Parameters on, loffset, base, and group_keys are not supported. |
at_time | Parameters asof and axis are not supported. |
between_time | Parameters include_start, include_end, and axis are not supported. |
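A small time-series sketch with standard pandas as a stand-in, restricted to parameters the table leaves available. The dates and values are made up, and the frequency strings "D" and "2D" are standard pandas aliases that DolphinDB pandas is only assumed to accept.

```python
import pandas as pd  # standard pandas, a stand-in for DolphinDB pandas

days = pd.date_range("2024-01-01", periods=4, freq="D")
s = pd.Series([1, 2, 3, 4], index=days)

shifted = s.shift(1)                 # move values one step later
two_day = s.resample("2D").sum()     # aggregate into two-day buckets
latest = s.asof(pd.Timestamp("2024-01-02 12:00"))  # last value at or before the time

print(shifted.tolist())
print(two_day.tolist())
print(latest)
```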
Creating a Series
-
If the parameter index is not specified, it will default to RangeIndex (0, 1, 2, …, n-1), where n is the number of elements.
s = pd.Series([20, 21, 12])
print(s)
Output:
0 20
1 21
2 12
dtype: INT
-
Specify the parameter index as a single column
s = pd.Series([20, 21, 12], ['London', 'New York', 'Helsinki'])
print(s)
Output:
London 20
New York 21
Helsinki 12
dtype: INT
Accessing Data in a Series
city = ['London', 'New York', 'Helsinki']
s = pd.Series([20, 21, 12], city)
-
Accessing Data by Position
-
Access a single element by integer position
s[0]
s.iloc[0]
s.iat[0]
Output: 20
-
Access multiple rows by integer positions
s[[0, 2]]
s.iloc[[0, 2]]
Output:
London 20
Helsinki 12
dtype: INT
-
Access multiple rows by slice
s[0:3]
s.iloc[0:3]
Output:
London 20
New York 21
Helsinki 12
dtype: INT
s[:2]
s.iloc[:2]
Output:
London 20
New York 21
dtype: INT
s[1:]
s.iloc[1:]
Output:
New York 21
Helsinki 12
dtype: INT
s[:]
s.iloc[:]
Output:
London 20
New York 21
Helsinki 12
dtype: INT
-
Accessing Data by Label
-
Access a single element by index label
s['London']
s.loc['London']
Output: 20
-
Access multiple elements by index labels
s[['London', 'New York']]
s.loc[['London', 'New York']]
Output:
London 20
New York 21
dtype: INT
-
Access multiple elements by slice
s['London':'Helsinki']
s.loc['London':'Helsinki']
Output:
London 20
New York 21
Helsinki 12
dtype: INT
s[:]
s.loc[:]
Output:
London 20
New York 21
Helsinki 12
dtype: INT
Note: Accessing the whole Series using [:] is not supported while using multiple index labels.
s[:, 'A']
s.loc[:, 'A']
Operating a Series
-
Adding elements
Currently not supported.
-
Updating elements
-
Update elements directly: Modification through index is supported, while modification through position is not supported.
s.loc["London"] = 33  # by label: supported
s[1] = 33  # by position: not supported
-
Update elements using the update method
city = ['London', 'New York', 'Helsinki']
s = pd.Series([20, 21, 12], city)
s.update(pd.Series([50, 60], index=['London', 'Helsinki']))
s
Output:
London 50
New York 21
Helsinki 60
dtype: INT
-
Removing elements
-
Remove elements using the drop method
s.drop(["London"])
-
Series Calculation
Basic calculation method: align elements from different Series based on their indices before calculation. If an index exists in one Series but not in the other, the result for that index will be NaN, such as "Helsinki" and "New York" in the following script.
s=pd.Series([12, 15],["London","New York"])
city = ['London', 'Helsinki']
s1 = pd.Series([33, 25], city)
s + s1
Output:
Helsinki NaN
London 45
New York NaN
dtype: INT
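The same alignment rule can be reproduced with standard pandas, used here as a stand-in. One difference to keep in mind: standard pandas reports the result as float64 rather than INT, because it stores the missing entries as NaN.

```python
import pandas as pd  # standard pandas, a stand-in for DolphinDB pandas

s = pd.Series([12, 15], ["London", "New York"])
s1 = pd.Series([33, 25], ["London", "Helsinki"])

# Labels are matched first; only "London" exists on both sides
total = s + s1
print(total.index.tolist())   # union of the labels, sorted
print(total["London"])        # 12 + 33
print(total.isna().tolist())  # True where a label was missing on one side
```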