Technical Analysis Indicator Library

TA-Lib is a Python library, implemented in C, that encapsulates numerous indicators commonly used in the technical analysis of financial market data. To help users calculate these indicators in DolphinDB, the TA-Lib functions have been reimplemented in DolphinDB script as the DolphinDB ta module (ta.dos).

The ta module requires DolphinDB Database Server 1.10.3 or above.

Naming conventions of functions and parameters

  • In TA-Lib, all function names are uppercase and all parameter names are lowercase. In the ta module, function names and parameters use camelCase.

For example, the syntax of function DEMA in TA-Lib is DEMA(close, timeperiod = 30). The corresponding function in the ta module is dema(close, timePeriod).

  • Some TA-Lib functions have optional parameters. In the ta module, all parameters are required.

  • To produce meaningful results, the parameter 'timePeriod' of ta module functions must be at least 2.

Examples

Apply functions directly

Calculate a vector directly with the ta module function wma:

use ta
close = 7.2 6.97 7.08 6.74 6.49 5.9 6.26 5.9 5.35 5.63
x = wma(close, 5);
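
For readers unfamiliar with the indicator, the following pure-Python sketch shows what a linearly weighted moving average computes. It is an illustration only, not the ta module's implementation: it assumes weights 1..period with the newest value weighted most, and it does not handle null values.

```python
def wma(values, period):
    """Linearly weighted moving average; the newest value gets weight `period`."""
    if period < 2:
        raise ValueError("period must be at least 2")
    weights = range(1, period + 1)
    denom = sum(weights)
    out = [None] * len(values)          # the first period-1 results stay empty
    for i in range(period - 1, len(values)):
        window = values[i - period + 1 : i + 1]
        out[i] = sum(w * v for w, v in zip(weights, window)) / denom
    return out


close = [7.2, 6.97, 7.08, 6.74, 6.49, 5.9, 6.26, 5.9, 5.35, 5.63]
x = wma(close, 5)
```

The first four elements of x are empty because a full window of 5 values is not yet available.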

Apply functions in groups in SQL statements

In the following example, we first construct a table with data of 2 stocks, then conduct a calculation within each group:

close = 7.2 6.97 7.08 6.74 6.49 5.9 6.26 5.9 5.35 5.63 3.81 3.935 4.04 3.74 3.7 3.33 3.64 3.31 2.69 2.72
date = (2020.03.02 + 0..4 join 7..11).take(20)
symbol = take(`F,10) join take(`GPRO,10)
t = table(symbol, date, close)

Apply ta module function wma for each of these two stocks:

update t set wma = wma(close, 5) context by symbol
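
The grouping semantics of "context by" can be pictured with a small Python sketch: the function is applied to each group separately, and the results are written back to the original row positions. The helper name context_by and the sample data are hypothetical and serve only as an illustration; this is not how DolphinDB executes the statement.

```python
from itertools import accumulate


def context_by(keys, values, func):
    """Apply func to the values of each group and keep the original row order."""
    groups = {}
    for pos, key in enumerate(keys):
        groups.setdefault(key, []).append(pos)
    out = [None] * len(values)
    for positions in groups.values():
        result = func([values[p] for p in positions])   # one output per input row
        for p, r in zip(positions, result):
            out[p] = r
    return out


symbol = ["F", "F", "GPRO", "GPRO", "F"]
close = [1, 2, 10, 20, 3]
# A running total per symbol stands in for any vectorized function such as wma.
running_total = context_by(symbol, close, lambda v: list(accumulate(v)))
```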

Results with multiple columns

Some ta module functions return results with multiple columns, such as function bBands.

Example 1:

close = 7.2 6.97 7.08 6.74 6.49 5.9 6.26 5.9 5.35 5.63
high, mid, low = bBands(close, 5, 2, 2, 2);

Example 2:

close = 7.2 6.97 7.08 6.74 6.49 5.9 6.26 5.9 5.35 5.63 3.81 3.935 4.04 3.74 3.7 3.33 3.64 3.31 2.69 2.72
date = (2020.03.02 + 0..4 join 7..11).take(20)
symbol = take(`F,10) join take(`GPRO,10)
t = table(symbol, date, close) 
select *, bBands(close, 5, 2, 2, 2) as `high`mid`low from t context by symbol;

symbol date       close high     mid      low
------ ---------- ----- -------- -------- --------
F      2020.03.02 7.2
F      2020.03.03 6.97
F      2020.03.04 7.08
F      2020.03.05 6.74
F      2020.03.06 6.49  7.292691 6.786    6.279309
F      2020.03.09 5.9   7.294248 6.454    5.613752
F      2020.03.10 6.26  7.134406 6.328667 5.522927
F      2020.03.11 5.9   6.789441 6.130667 5.471892
F      2020.03.12 5.35  6.601667 5.828    5.054333
F      2020.03.13 5.63  6.319728 5.711333 5.102939
GPRO   2020.03.02 3.81
GPRO   2020.03.03 3.935
GPRO   2020.03.04 4.04
GPRO   2020.03.05 3.74
GPRO   2020.03.06 3.7   4.069365 3.817333 3.565302
GPRO   2020.03.09 3.33  4.133371 3.645667 3.157962
GPRO   2020.03.10 3.64  4.062941 3.609333 3.155726
GPRO   2020.03.11 3.31  3.854172 3.482667 3.111162
GPRO   2020.03.12 2.69  3.915172 3.198    2.480828
GPRO   2020.03.13 2.72  3.738386 2.993333 2.24828
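
The first non-empty row of the table can be reproduced with a short Python sketch. It rests on two assumptions that are not stated in this document: that maType = 2 selects a weighted moving average for the middle band (as in TA-Lib's MA type numbering), and that the band width is the population standard deviation of the window around its simple mean. The function name bbands_window is hypothetical.

```python
import math


def bbands_window(window, nb_dev_up, nb_dev_dn):
    """Upper, middle and lower band for a single window of prices."""
    k = len(window)
    weights = range(1, k + 1)
    # Middle band: linearly weighted moving average (assumed meaning of maType = 2).
    mid = sum(w * v for w, v in zip(weights, window)) / sum(weights)
    # Band width: population standard deviation around the simple mean.
    mean = sum(window) / k
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / k)
    return mid + nb_dev_up * std, mid, mid - nb_dev_dn * std


# First five closing prices of stock F from the table above.
high, mid, low = bbands_window([7.2, 6.97, 7.08, 6.74, 6.49], 2, 2)
```

Under these assumptions the three values agree with the row for F on 2020.03.06 to the precision shown.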

Performance comparison

Compared with the corresponding TA-Lib functions, ta module functions have similar performance on average when used directly but far superior performance when calculation is conducted in groups. In the examples of this section, we use function wma.

Apply functions directly

In DolphinDB:

use ta
close = 7.2 6.97 7.08 6.74 6.49 5.9 6.26 5.9 5.35 5.63
close = take(close, 1000000)
timer x = wma(close, 5);

The ta module function wma takes 3 milliseconds to calculate for a vector with 1,000,000 elements.

The corresponding Python script is as follows:

import time
import numpy as np
import talib

close = np.array([7.2,6.97,7.08,6.74,6.49,5.9,6.26,5.9,5.35,5.63,5.01,5.01,4.5,4.47,4.33])
close = np.tile(close,100000)

start_time = time.time()
x = talib.WMA(close, 5)
print("--- %s seconds ---" % (time.time() - start_time))

The TA-Lib function WMA takes 11 milliseconds, which is 3.7 times as long as ta module function wma.

Apply functions in groups in SQL statements

n=1000000
close = rand(1.0, n)
date = take(2017.01.01 + 1..1000, n)
symbol = take(1..1000, n).sort!()
t = table(symbol, date, close)
timer update t set wma = wma(close, 5) context by symbol;

The ta module function wma takes 17 milliseconds to calculate for all groups.

The corresponding Python script is as follows:

import time
import numpy as np
import pandas as pd
import talib

close = np.random.uniform(size=1000000)
symbol = np.sort(np.tile(np.arange(1,1001),1000))
date = np.tile(pd.date_range('2017-01-02', '2019-09-28'),1000)
df = pd.DataFrame(data={'symbol': symbol, 'date': date, 'close': close})

start_time = time.time()
df["wma"] = df.groupby("symbol").apply(lambda df: talib.WMA(df.close, 5)).to_numpy()
print("--- %s seconds ---" % (time.time() - start_time))

The TA-Lib function WMA takes 535 milliseconds to calculate for all groups, which is 31.5 times as long as ta module function wma.

Vectorization

Like TA-Lib functions, all ta module functions are vectorized: both the input and the output are vectors of equal length. The core of TA-Lib is implemented in C, which is very efficient. Although the ta module is written in the DolphinDB scripting language, it makes full use of built-in vectorized functions and higher-order functions to avoid loops, so it is also highly efficient.

The implementation of ta module functions is also extremely concise. The file ta.dos has 765 lines of code in total. For each function, the core code is about 4 lines on average. Users can learn how to write efficient vectorized DolphinDB script by reading the ta module function definitions.

Handling of null values

If the input vector of TA-Lib functions starts with null values, the calculation starts from the first non-null position. The ta module uses the same strategy.

In both TA-Lib and ta module, for a rolling or cumulative window function with window length k, the first (k-1) elements of each group in the output vector are null values. If there is a null value after the first non-null element in a group, the result for this null position and all subsequent positions in the group may be null in TA-Lib. In comparison, the results are not null values for these positions in ta module unless the amount of non-null data in the window is not enough for calculation (e.g., when there is only one non-null value in a window to calculate the variance).

DolphinDB script and result:

close = [99.9, NULL, 84.69, 31.38, 60.9, 83.3, 97.26, 98.67]
ta::var(close, 5, 1);

[,,,,670.417819,467.420569,539.753584,644.748976]

Python script and result:

close = np.array([99.9, np.nan, 84.69, 31.38, 60.9, 83.3, 97.26, 98.67])
talib.VAR(close, 5, 1)

array([nan, nan, nan, nan, nan, nan, nan, nan])

In TA-Lib, because the first element of the vector 'close' is not null and the second element is null, the result is null at all positions. In short, the results of TA-Lib and ta module functions are identical at all positions only when all null values, if any, are concentrated at the start of the input.
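
The ta module result above can be reproduced with a pure-Python sketch that skips null values inside each window. This is an illustration of the behavior described in this section, not the module's code: None stands in for NULL, the function name var_skip_nulls is hypothetical, and the rule that at least two non-null values are required is an assumption based on the variance example mentioned above.

```python
def var_skip_nulls(values, period):
    """Rolling population variance over the non-null values of each window."""
    out = [None] * len(values)
    first = next((i for i, v in enumerate(values) if v is not None), None)
    if first is None:
        return out
    # Results start once a full window is available after the first non-null value.
    for i in range(first + period - 1, len(values)):
        obs = [v for v in values[i - period + 1 : i + 1] if v is not None]
        if len(obs) >= 2:               # assumed minimum for a variance
            mean = sum(obs) / len(obs)
            out[i] = sum((v - mean) ** 2 for v in obs) / len(obs)
    return out


close = [99.9, None, 84.69, 31.38, 60.9, 83.3, 97.26, 98.67]
result = var_skip_nulls(close, 5)
```

The windows ending at positions 4 and 5 each contain one null, so their variances are computed from the remaining four values.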

Iteration

Many TA-Lib functions use iterations where the current position value is a linear function of the previous position value and the current input: r[n] = coeff * r[n-1] + input[n]. For this type of calculation, DolphinDB introduces function iterate for vectorized iteration to avoid the use of loops.

def ema(close, timePeriod) {
1 	n = close.size()
2	b = ifirstNot(close)
3	start = b + timePeriod
4	if(b < 0 || start > n) return array(DOUBLE, n, n, NULL)
5	init = close.subarray(:start).avg()
6	coeff = 1 - 2.0/(timePeriod+1)
7	ret = iterate(init, coeff, close.subarray(start:)*(1 - coeff))
8	return array(DOUBLE, start - 1, n, NULL).append!(init).append!(ret)
}

The script above is the definition of ta module function ema. Line 5 calculates the mean of the first window as the initial value of the iterative sequence. Line 6 defines the iteration coefficient. Line 7 uses the highly-efficient DolphinDB built-in function iterate to calculate the ema sequence. To calculate the ema sequence with window length of 10 for a vector with 1,000,000 elements, TA-Lib takes 7.4 milliseconds while ta module takes only 5.0 milliseconds.
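
To make the recurrence concrete, here is a loop-based Python sketch that follows the same steps as the definition above: seed the sequence with the mean of the first window, then apply r[n] = coeff * r[n-1] + (1 - coeff) * close[n]. It is only an illustration, written with an explicit loop where DolphinDB uses iterate, and it assumes there are no null values after the first non-null element.

```python
def ema(close, time_period):
    """Exponential moving average seeded with the mean of the first window."""
    n = len(close)
    out = [None] * n
    # Index of the first non-null value, or -1 if there is none.
    b = next((i for i, v in enumerate(close) if v is not None), -1)
    start = b + time_period
    if b < 0 or start > n:
        return out
    init = sum(close[b:start]) / time_period
    coeff = 1 - 2.0 / (time_period + 1)
    out[start - 1] = init
    prev = init
    for i in range(start, n):
        prev = coeff * prev + (1 - coeff) * close[i]
        out[i] = prev
    return out


smoothed = ema([1, 2, 3, 4, 5], 3)
shifted = ema([None, 1, 2, 3, 4, 5], 3)
```

With a period of 3 the coefficient is 0.5, so each new value is the average of the previous result and the current input.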

Moving window functions

DolphinDB offers various built-in moving window functions including mcount, mavg, msum, mmax, mmin, mimax, mimin, mmed, mpercentile, mrank, mmad, mbeta, mcorr, mcovar, mstd and mvar. These moving window functions have been fully optimized. The complexity of most of them is O(n) which means performance is independent of window length.
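
One common way to make the cost of a moving window function independent of the window length is to update a running aggregate instead of recomputing each window. The Python sketch below shows the idea for a moving sum. It is an assumption about the general technique, not a description of DolphinDB's internal implementation, and it does not handle null values.

```python
def msum(values, window):
    """Moving sum with one addition and at most one subtraction per element."""
    out = [None] * len(values)
    running = 0
    for i, v in enumerate(values):
        running += v                        # the newest value enters the window
        if i >= window:
            running -= values[i - window]   # the oldest value leaves the window
        if i >= window - 1:
            out[i] = running
    return out


totals = msum([1, 2, 3, 4, 5], 3)
```

Each element is touched a constant number of times, so the total work is proportional to the length of the input regardless of the window length.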

Some TA-Lib functions can be implemented with DolphinDB moving window functions. For example, TA-Lib function VAR is population variance, whereas DolphinDB function mvar is sample variance. The ta module function var is implemented in the following script:

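def var(close, timePeriod, nbdev){
	n = close.size()
	// index of the first non-null element
	b = close.ifirstNot()
	if(b < 0 || b + timePeriod > n) return array(DOUBLE, n, n, NULL)
	// number of non-null observations in each window
	mobs =  mcount(close, timePeriod)
	// rescale the sample variance from mvar to a population variance
	return (mvar(close, timePeriod) * (mobs - 1) \ mobs).fill!(timePeriod - 1 + 0:b, NULL)
}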
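
The factor (mobs - 1) \ mobs converts a sample variance, which divides by the number of observations minus one, into a population variance, which divides by the number of observations. A quick check with Python's standard statistics module, using one window from the null-handling example in this document:

```python
import math
import statistics

window = [84.69, 31.38, 60.9, 83.3, 97.26]
n = len(window)

sample_var = statistics.variance(window)      # divides by n - 1
population_var = sample_var * (n - 1) / n     # rescaled so that it divides by n

# The rescaled value agrees with a direct population variance.
same = math.isclose(population_var, statistics.pvariance(window))
```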

Tips for reducing data replication

When conducting operations such as slicing, joining or appending on a vector, a large amount of data may be replicated. Data replication is usually more time consuming than many simple calculations. Here are some examples about how to reduce data replication.

Use subarrays

When only a subset of a vector's elements is needed in a calculation, a script such as close[10:].avg() first generates a new vector 'close[10:]' with data copied from the original vector 'close' and then conducts the calculation. This consumes more memory and also takes time. The DolphinDB function subarray generates a subarray of the original vector instead. It records only a pointer to the original vector together with the starting and ending positions of the subarray. As the system does not allocate a large block of memory to store the subarray, no data is copied. All read-only operations on vectors can be applied directly to a subarray. The implementations of many ta module functions use subarrays.
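
Python offers a loose analogy in memoryview, which also exposes part of a buffer without copying its elements. The sketch below is only meant to illustrate the idea of a view and says nothing about how DolphinDB implements subarray.

```python
from array import array

close = array("d", [float(i) for i in range(20)])

# Slicing a memoryview creates a view on the same buffer; no elements are copied.
view = memoryview(close)[10:]
avg = sum(view) / len(view)

# The view still refers to the original array object.
shares_buffer = view.obj is close
```

Because the view shares memory with the original array, a later change to close[10] is visible through view[0].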

Specify capacity for vectors

In DolphinDB, each vector is allocated a memory capacity. When new data is appended to a vector and the capacity is insufficient, a larger memory space is allocated to the vector. The data is copied to the new memory space and then the old memory space is released. With a large vector, this operation may be time-consuming. If the largest possible length of a vector is known, we can set it as the capacity of the vector in advance to avoid expanding the capacity later. A vector's capacity is set with the parameter 'capacity' of DolphinDB's built-in function array when the vector is created. For example, in the 8th line of the definition of function ema (in the Iteration section), we first create a vector with capacity n, and then append the result of the calculation to the vector.

DolphinDB ta module functions list

Overlap Studies

Function Syntax                                              Description
-------- --------------------------------------------------- --------------------------------------
bBands   bBands(close, timePeriod, nbDevUp, nbDevDn, maType) Bollinger Bands
dema     dema(close, timePeriod)                             Double Exponential Moving Average
ema      ema(close, timePeriod)                              Exponential Moving Average
kama     kama(close, timePeriod)                             Kaufman Adaptive Moving Average
ma       ma(close, timePeriod, maType)                       Moving average
mavp     mavp(inReal, periods, minPeriod, maxPeriod, maType) Moving average with variable period
midPoint midPoint(close, timePeriod)                         MidPoint over period
midPrice midPrice(low, high, timePeriod)                     Midpoint Price over period
sma      sma(close, timePeriod)                              Simple Moving Average
t3       t3(close, timePeriod, vfactor)                      Triple Exponential Moving Average (T3)
tema     tema(close, timePeriod)                             Triple Exponential Moving Average
trima    trima(close, timePeriod)                            Triangular Moving Average
wma      wma(close, timePeriod)                              Weighted Moving Average

Momentum Indicators

Function Syntax                                                                                     Description
-------- ------------------------------------------------------------------------------------------ ------------------------------------------------------
adx      adx(high, low, close, timePeriod)                                                          Average Directional Movement Index
adxr     adxr(high, low, close, timePeriod)                                                         Average Directional Movement Index Rating
apo      apo(close, fastPeriod, slowPeriod, maType)                                                 Absolute Price Oscillator
aroon    aroon(high, low, timePeriod)                                                               Aroon
aroonOsc aroonOsc(high, low, timePeriod)                                                            Aroon Oscillator
bop      bop(open, high, low, close)                                                                Balance Of Power
cci      cci(high, low, close, timePeriod)                                                          Commodity Channel Index
cmo      cmo(close, timePeriod)                                                                     Chande Momentum Oscillator
dx       dx(high, low, close, timePeriod)                                                           Directional Movement Index
macd     macd(close, fastPeriod, slowPeriod, signalPeriod)                                          Moving Average Convergence/Divergence
macdExt  macdExt(close, fastPeriod, fastMaType, slowPeriod, slowMaType, signalPeriod, signalMaType) MACD with controllable MA type
macdFix  macdFix(close, signalPeriod)                                                               Moving Average Convergence/Divergence Fix 12/26
mfi      mfi(high, low, close, volume, timePeriod)                                                  Money Flow Index
minus_di minus_di(high, low, close, timePeriod)                                                     Minus Directional Indicator
minus_dm minus_dm(high, low, timePeriod)                                                            Minus Directional Movement
mom      mom(close, timePeriod)                                                                     Momentum
plus_di  plus_di(high, low, close, timePeriod)                                                      Plus Directional Indicator
plus_dm  plus_dm(high, low, timePeriod)                                                             Plus Directional Movement
ppo      ppo(close, fastPeriod, slowPeriod, maType)                                                 Percentage Price Oscillator
roc      roc(close, timePeriod)                                                                     Rate of change: ((price/prevPrice)-1)*100
rocp     rocp(close, timePeriod)                                                                    Rate of change Percentage: (price-prevPrice)/prevPrice
rocr     rocr(close, timePeriod)                                                                    Rate of change ratio: (price/prevPrice)
rocr100  rocr100(close, timeperiod)                                                                 Rate of change ratio 100 scale: (price/prevPrice)*100
rsi      rsi(close, timePeriod)                                                                     Relative Strength Index
stoch    stoch(high, low, close, fastkPeriod, slowkPeriod, slowkMatype, slowdPeriod, slowdMatype)   Stochastic
stochf   stochf(high, low, close, fastkPeriod, fastdPeriod, fastdMatype)                            Stochastic Fast
stochRsi stochRsi(real, timePeriod, fastkPeriod, fastdPeriod, fastdMatype)                          Stochastic Relative Strength Index
trix     trix(close, timePeriod)                                                                    1-day Rate-Of-Change (ROC) of a Triple Smooth EMA
ultOsc   ultOsc(high, low, close, timePeriod1, timePeriod2, timePeriod3)                            Ultimate Oscillator
willr    willr(high, low, close, timePeriod)                                                        Williams' %R

Volume Indicators

Function Syntax                       Description
-------- ---------------------------- -----------------
ad       ad(high, low, close, volume) Chaikin A/D Line
obv      obv(close, volume)           On Balance Volume

Volatility Indicators

Function Syntax                             Description
-------- ---------------------------------- -----------------------------
atr      atr(high, low, close, timePeriod)  Average True Range
natr     natr(high, low, close, timePeriod) Normalized Average True Range
trange   trange(high, low, close)           True Range

Price Transform

Function Syntax                           Description
-------- -------------------------------- --------------------
avgPrice avgPrice(open, high, low, close) Average Price
medPrice medPrice(high, low)              Median Price
typPrice typPrice(high, low, close)       Typical Price
wclPrice wclPrice(high, low, close)       Weighted Close Price

Statistic Functions

Function            Syntax                                 Description
------------------- -------------------------------------- -------------------------------------
beta                beta(high, low, timePeriod)            Beta
correl              correl(high, low, timePeriod)          Pearson's Correlation Coefficient (r)
linearreg           linearreg(close, timePeriod)           Linear Regression
linearreg_angle     linearreg_angle(close, timePeriod)     Linear Regression Angle
linearreg_intercept linearreg_intercept(close, timePeriod) Linear Regression Intercept
linearreg_slope     linearreg_slope(close, timePeriod)     Linear Regression Slope
stdDev              stdDev(close, timePeriod, nbdev)       Standard Deviation
tsf                 tsf(close, timePeriod)                 Time Series Forecast
var                 var(close, timePeriod, nbdev)          Variance

Other Functions

  • For the Math Transform and Math Operators functions in TA-Lib, you can use the corresponding DolphinDB built-in functions. For example, functions SQRT, LN and SUM in TA-Lib correspond to DolphinDB functions sqrt, log and msum, respectively.
  • The following TA-Lib functions have not been implemented in the ta module: all Pattern Recognition and Cycle Indicators functions, as well as HT_TRENDLINE (Hilbert Transform-Instantaneous Trendline), ADOSC (Chaikin A / D Oscillator), MAMA (MESA Adaptive Moving Average), SAR (Parabolic SAR) and SAREXT (Parabolic SAR-Extended).

Future work

  • The TA-Lib functions that have not yet been implemented will be implemented in the next version.
  • Unlike TA-Lib, DolphinDB user-defined functions do not currently support optional arguments or keyword arguments. They will be implemented in DolphinDB Server 1.20.0.
  • Currently, the ta module must be loaded with "use ta" before its functions can be called. Starting with version 1.20, DolphinDB Server will allow modules to be pre-loaded during system initialization. Functions defined in pre-loaded modules will have the same status as DolphinDB built-in functions, so these modules will no longer need to be loaded explicitly.