Connecting to Server
There are two ways to establish a connection to a DolphinDB server:
(1) Passing arguments to the session class constructor when creating a session.
(2) Calling the connect method after a session has been created.
The connect method has the following signature, shown with its default parameter values:
connect(host, port,
userid=None, password=None, startup=None,
highAvailability=False, highAvailabilitySites=None,
keepAliveTime=None, reconnect=False,
*,
tryReconnectNums=None, readTimeout=None, writeTimeout=None)
The following sections explain the parameters of connect in detail.
host, port, userid, password
- host: The server address to connect to.
- port: The port number of the server.
- userid: optional. The username for server login.
- password: optional. The password for server login.
You can connect and log in to a DolphinDB server by specifying the server domain name/IP address, port number and user credentials.
Example
import dolphindb as ddb
s = ddb.Session()
# connect to localhost:8848
s.connect("localhost", 8848)
# connect to localhost:8848 and login as user "admin"
s.connect("localhost", 8848, "admin", "123456")
If the session has timed out, or if no user credentials were specified when connecting to the server, use the login method to log in to the server.
Example
import dolphindb as ddb
s = ddb.Session()
# connect to localhost:8848 without login
s.connect("localhost", 8848)
# log into the DolphinDB server
s.login("admin","123456")
highAvailability, highAvailabilitySites
- highAvailability: bool, default False. Whether to enable the high availability mode for the API.
- highAvailabilitySites: optional. The IP address and port number of all available nodes, in the format of <ip>:<port>.
highAvailability and highAvailabilitySites configure the high availability (HA) of the DolphinDB Python API. In HA mode, when connecting to cluster nodes, the API connects to the node with the lowest load. When you create multiple sessions one after another from a single thread, each connection goes to the node with the lowest load at that moment, which balances the load across all available nodes. However, if the sessions are created simultaneously from multiple threads, all sessions connect to the same node at the same time, overloading that node.
Note: Rapid, continuous session creation from a single thread could lead to load imbalance, as the load status of each node may not have enough time to fully propagate across the cluster. Adding slight delays between session creation can help avoid this scenario and ensure better load distribution.
In the following example, we have 20 connections to create across 3 nodes. For simplicity, we will represent the load of each node by the number of connections it handles. The initial load values of the 3 nodes are [1,9,10] connections respectively.
Scenario 1: Connections are created one by one with delay by a single thread. For each connection, the connection pool identifies the node with the minimum load at that time point and assigns the connection to that node. After all 20 connections have been created, the load values of the nodes are [14,13,13] connections.
Scenario 2: All connections are created at nearly the same time, either by a single thread with no delay or by multiple threads in parallel. All connections are assigned to the node with the initial minimum load. After all 20 connections have been created, the load values of the nodes become [21,9,10] connections.
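The two scenarios above can be reproduced with a small simulation. This is only a sketch: the function names are hypothetical, load is modeled as a plain connection count as in the example, and ties are assumed to go to the first node in the list. It does not call the DolphinDB API.

```python
def assign_sequential(loads, n):
    """Scenario 1: each new connection sees the up-to-date load of every node."""
    loads = list(loads)
    for _ in range(n):
        target = loads.index(min(loads))  # assumption: ties go to the first node
        loads[target] += 1
    return loads


def assign_simultaneous(loads, n):
    """Scenario 2: every connection sees the same initial load snapshot."""
    loads = list(loads)
    target = loads.index(min(loads))
    loads[target] += n
    return loads


initial = [1, 9, 10]
print(assign_sequential(initial, 20))    # [14, 13, 13]
print(assign_simultaneous(initial, 20))  # [21, 9, 10]
```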
Formula for node load calculation:
load = (connectionNum + workerNum + executorNum)/3.0
- connectionNum: the number of connections to the node.
- workerNum: the size of worker pool for regular interactive jobs.
- executorNum: the number of local executors.
To obtain the values of these variables, execute rpc(getControllerAlias(), getClusterPerf) on any node in the cluster. For more information, see DolphinDB User Manual - getClusterPerf.
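As a rough illustration of the formula, the following sketch computes the load of three nodes and picks the least loaded one. The node names and metric values are made up for the example; in practice they would come from getClusterPerf.

```python
def node_load(connection_num, worker_num, executor_num):
    """Load as defined by the formula above."""
    return (connection_num + worker_num + executor_num) / 3.0


# Hypothetical (connectionNum, workerNum, executorNum) values per node.
metrics = {
    "node1": (1, 4, 3),
    "node2": (9, 4, 3),
    "node3": (10, 4, 3),
}

loads = {name: node_load(*values) for name, values in metrics.items()}
least_loaded = min(loads, key=loads.get)
print(least_loaded)  # node1
```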
Example
import dolphindb as ddb
s = ddb.Session()
# create a list which contains the ip addresses and ports of all available nodes
sites = ["192.168.1.2:24120", "192.168.1.3:24120", "192.168.1.4:24120"]
# establish a connection with high availability enabled on all available nodes
s.connect(host="192.168.1.2", port=24120, userid="admin", password="123456", highAvailability=True, highAvailabilitySites=sites)
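Because each entry in highAvailabilitySites must follow the <ip>:<port> format, it can help to check the list before passing it to connect. The helper below is a hypothetical convenience function, not part of the DolphinDB API, and it assumes plain IPv4 addresses or host names.

```python
def parse_site(site):
    """Split an '<ip>:<port>' string into (host, port), raising ValueError if malformed."""
    host, sep, port_text = site.rpartition(":")
    if not sep or not host or not port_text.isdecimal():
        raise ValueError(f"expected '<ip>:<port>', got {site!r}")
    port = int(port_text)
    if not 0 < port < 65536:
        raise ValueError(f"port out of range in {site!r}")
    return host, port


sites = ["192.168.1.2:24120", "192.168.1.3:24120", "192.168.1.4:24120"]
parsed = [parse_site(site) for site in sites]
print(parsed[0])  # ('192.168.1.2', 24120)
```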
Note
- If highAvailability is enabled and highAvailabilitySites is unspecified, the system enables high availability on all cluster nodes by default.
- Enabling high availability also enables auto reconnection. Once disconnected, the Python API tries to reconnect as follows: if it receives the <NotLeader> error message, it keeps attempting to reconnect to the last node it successfully connected to; otherwise, it tries the node that follows the last successfully connected node in highAvailabilitySites.
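The node selection rule described in the note can be sketched as follows. The function is hypothetical and only models the rule; it assumes that the search wraps around to the first site after the last one, which the note does not state explicitly.

```python
def next_reconnect_target(sites, last_site, not_leader):
    """Pick the node to try next after a disconnect.

    sites:      the highAvailabilitySites list
    last_site:  the last node that was successfully connected to
    not_leader: True if the <NotLeader> error message was received
    """
    if not_leader:
        # Keep retrying the last successfully connected node.
        return last_site
    # Otherwise move to the site that follows it (assumption: wrap around).
    index = sites.index(last_site)
    return sites[(index + 1) % len(sites)]


sites = ["192.168.1.2:24120", "192.168.1.3:24120", "192.168.1.4:24120"]
print(next_reconnect_target(sites, "192.168.1.3:24120", not_leader=True))   # 192.168.1.3:24120
print(next_reconnect_target(sites, "192.168.1.3:24120", not_leader=False))  # 192.168.1.4:24120
```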
keepAliveTime
int, default 60 (seconds). The duration between two keepalive transmissions to detect the TCP connection status.
Set this parameter to release half-open TCP connections promptly when the network is unstable. If this parameter is not specified, the system uses the value of session().keepAliveTime, which was specified when the session object was constructed.
Example
import dolphindb as ddb
s = ddb.Session()
# establish a connection with keepAliveTime set to 120 seconds
s.connect(host="localhost", port=8848, keepAliveTime=120)
reconnect, tryReconnectNums
reconnect: bool, default False. Whether to reconnect if the API detects a connection exception when high availability is disabled.
When high availability is enabled, the system automatically reconnects when a connection exception is detected, so reconnect does not need to be specified. When high availability is disabled, set reconnect to True for auto reconnection.
tryReconnectNums: int, default None. The number of reconnection attempts for each connection. If not specified, the number of reconnection attempts is unlimited. If specified, the limit applies in both HA and non-HA modes. In HA mode, each attempt cycles through all nodes once, for up to tryReconnectNums cycles.
Example
import dolphindb as ddb
s = ddb.Session()
# Establish a connection with auto reconnection
s.connect(host="localhost", port=8848, reconnect=True, tryReconnectNums=5)
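The retry semantics of tryReconnectNums in HA mode can be pictured with the loop below. It is a simplified model rather than the API's actual implementation: try_connect is a hypothetical callable that returns True on success, and the function returns None on failure where the real API may behave differently.

```python
def reconnect(sites, try_connect, try_reconnect_nums=None):
    """Cycle through all sites once per attempt, for at most try_reconnect_nums cycles.

    With try_reconnect_nums=None the loop retries without limit.
    Returns the site that accepted the connection, or None if every cycle failed.
    """
    cycle = 0
    while try_reconnect_nums is None or cycle < try_reconnect_nums:
        for site in sites:
            if try_connect(site):
                return site
        cycle += 1
    return None


attempts = []

def always_down(site):
    # Simulated connection attempt that records the site and always fails.
    attempts.append(site)
    return False

sites = ["192.168.1.2:24120", "192.168.1.3:24120", "192.168.1.4:24120"]
print(reconnect(sites, always_down, try_reconnect_nums=2))  # None
print(len(attempts))  # 6, i.e. 2 cycles over 3 nodes
```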
readTimeout, writeTimeout
- readTimeout: int, default None. The read timeout duration in seconds for TCP connections. If not set (None), the system will wait indefinitely. This corresponds to the SO_RCVTIMEO option for TCP connections.
- writeTimeout: int, default None. The write timeout duration in seconds for TCP connections. If not set (None), the system will wait indefinitely. This corresponds to the SO_SNDTIMEO option for TCP connections.
These timeout parameters allow for the configuration of read and write timeout durations for TCP connections. For example, if the read timeout is set to 5 seconds, an exception will be thrown when a script requiring 10 seconds to complete is executed.
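To show what a read timeout means at the socket level, the sketch below uses only Python's standard socket module, not the DolphinDB API. Note that Python's settimeout is used here as an analogy: it is implemented differently from the SO_RCVTIMEO option, but the observable effect (a blocked read fails once the timeout elapses) is the same idea.

```python
import socket


def recv_with_timeout(timeout_s):
    """Try to read from a peer that never sends anything."""
    reader, silent_peer = socket.socketpair()
    try:
        reader.settimeout(timeout_s)
        try:
            reader.recv(1024)  # blocks, because the peer never writes
            return "data"
        except socket.timeout:
            return "timed out"
    finally:
        reader.close()
        silent_peer.close()


result = recv_with_timeout(0.2)
print(result)  # timed out
```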
startup
script, default None. The script to execute once the connection is established. It can be used to load plugins and DFS tables, define and load stream tables, etc.
Example
import dolphindb as ddb
s = ddb.Session()
# establish a connection and execute clearAllCache
s.connect(host="localhost", port=8848, startup="clearAllCache();")