System Freeze
A cluster may freeze during:
- Node startup or initialization: Refer to Node Startup Exception - Process Stuck for solutions.
- System runtime:
  - Check machine metrics for abnormalities to determine whether the cluster load is too high, which can slow system response.
  - Provide the collected stack traces to DolphinDB technical support for further troubleshooting. So that technical support can accurately analyze how the stacks change, collect stack traces at least twice, 3-5 minutes apart.
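One simple way to check whether a machine is under heavy load is to compare its load average with its CPU count. The sketch below assumes a Linux host; the function names `is_overloaded` and `check_load` are illustrative, and the "load greater than CPU count" threshold is a common rule of thumb rather than a DolphinDB recommendation.

```shell
#!/bin/bash
# Sketch: flag a machine whose 1-minute load average exceeds its CPU count.
# The threshold is a rule of thumb (an assumption), not DolphinDB guidance.

# is_overloaded LOAD CORES: succeeds (exit status 0) when LOAD > CORES.
is_overloaded() {
    awk -v load="$1" -v cores="$2" 'BEGIN { exit !(load > cores) }'
}

# Report the current load of this machine (Linux only: reads /proc/loadavg).
check_load() {
    load=$(cut -d ' ' -f 1 /proc/loadavg)
    cores=$(nproc)
    if is_overloaded "$load" "$cores"; then
        echo "High load: $load on $cores cores"
    else
        echo "Load looks normal: $load on $cores cores"
    fi
}
```

Calling `check_load` on each machine in the cluster gives a quick first indication; other metrics such as memory, disk, and network usage still need to be checked separately.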
The following introduces two methods for collecting stack traces: pstack and gdb.
Method 1: pstack
Install pstack, then collect stack traces by running the following shell script on each machine in the cluster:
#!/bin/bash
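# Create the output directory if it does not already exist
mkdir -p /root/output/
# Find the PIDs of the data node and controller processes
dpid=$(ps -ef | grep "mode datanode" | grep -v grep | awk '{print $2}')
cpid=$(ps -ef | grep "mode controller" | grep -v grep | awk '{print $2}')
# Dump the stacks of every data node (adjust the server path to your installation)
for i in $dpid
do
    cd /ddb/software/server
    pstack "$i" > "/root/output/pstack_dnode_${i}.log"
done
# Dump the stacks of every controller
for i in $cpid
do
    cd /ddb/software/server
    pstack "$i" > "/root/output/pstack_ctrl_${i}.log"
done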
Then, send the generated stack traces in the /root/output directory to DolphinDB technical support for further troubleshooting.
Method 2: gdb
Use gdb to collect stack traces:
#!/bin/bash
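# Create the output directory if it does not already exist
mkdir -p /root/output/
# Find the PIDs of the data node and controller processes
dpid=$(ps -ef | grep "mode datanode" | grep -v grep | awk '{print $2}')
cpid=$(ps -ef | grep "mode controller" | grep -v grep | awk '{print $2}')
# Newer GDB releases prefer "set logging enabled on"; "set logging on" is kept here for older versions.
# Attach to every data node and log the backtrace of all threads (adjust the server path to your installation)
for i in $dpid
do
    cd /home/dolphindb/server
    gdb --eval-command "set logging file /root/output/pstack_dnode_$i.log" --eval-command "set logging on" --eval-command "thread apply all bt" --batch --pid "$i"
done
# Attach to every controller and log the backtrace of all threads
for i in $cpid
do
    cd /home/dolphindb/server
    gdb --eval-command "set logging file /root/output/pstack_ctl_$i.log" --eval-command "set logging on" --eval-command "thread apply all bt" --batch --pid "$i"
done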
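Because technical support needs at least two snapshots taken 3-5 minutes apart, the collection can be wrapped in a loop that tags each output file with its round number, so later rounds do not overwrite earlier ones. The following is a minimal sketch based on the pstack method; `OUT_DIR`, `find_pids`, `collect_once`, and `collect_rounds` are illustrative names, not part of DolphinDB or pstack.

```shell
#!/bin/bash
# Sketch: collect pstack output in several rounds so the stacks can be compared.
# All function and variable names here are illustrative assumptions.

OUT_DIR="${OUT_DIR:-/root/output}"

# Print the PIDs of processes whose command line contains the given pattern.
find_pids() {
    ps -ef | grep -- "$1" | grep -v grep | awk '{print $2}'
}

# collect_once ROUND: dump the stacks of all data nodes and controllers once.
collect_once() {
    mkdir -p "$OUT_DIR"
    for pid in $(find_pids "mode datanode"); do
        pstack "$pid" > "$OUT_DIR/pstack_dnode_${pid}_round$1.log"
    done
    for pid in $(find_pids "mode controller"); do
        pstack "$pid" > "$OUT_DIR/pstack_ctrl_${pid}_round$1.log"
    done
}

# collect_rounds ROUNDS INTERVAL_SECONDS: repeat the collection with a pause.
collect_rounds() {
    round=1
    while [ "$round" -le "$1" ]; do
        collect_once "$round"
        # Wait between rounds, but not after the last one.
        if [ "$round" -lt "$1" ]; then
            sleep "$2"
        fi
        round=$((round + 1))
    done
}
```

For example, `collect_rounds 2 240` takes two snapshots four minutes apart and writes them to /root/output unless `OUT_DIR` is set to another directory.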