Data Analysis Functions

The OperatorImp.h header file declares a set of optimized data analysis functions. They cover numeric computation, data type conversion, cumulative and sliding window operations, row-based processing, and the other categories shown in the examples below.

To use Swordfish's data analysis functions in C++ code, include the Swordfish.h and OperatorImp.h header files with the #include directive.

Before calling any Swordfish function, call DolphinDBLib::initializeRuntime to initialize the runtime. After all operations are complete, call DolphinDBLib::finalizeRuntime to finalize the runtime.

#include "Swordfish.h"
#include "OperatorImp.h"

int main()
{
    DolphinDBLib::initializeRuntime();
    
    // function implementation
        
    DolphinDBLib::finalizeRuntime();
    return 0;
}

The following sections demonstrate several commonly used functions through specific examples. To keep the code short, most examples show only the function calls, which belong between the initializeRuntime and finalizeRuntime calls in the skeleton above.

For the complete list of functions, see the API Reference.

Unary Function

Use OperatorImp::log to calculate the natural logarithm of 5. The function takes two arguments; because only one operand is needed here, Expression::void_ is passed as the second argument.

// define constant a
ConstantSP a = new Int(5);

// calculate natural logarithm of a
ConstantSP result = OperatorImp::log(a, Expression::void_);
std::cout << result->getString() << std::endl;

Numeric Computation

Use OperatorImp::ratio to calculate the ratio of 12 to 5.

// define a and b
ConstantSP a = new Int(5);
ConstantSP b = new Int(12);

// calculate the ratio of b to a
ConstantSP result = OperatorImp::ratio(b, a);
std::cout << result->getString() << std::endl;

Data Type Conversion

Use OperatorImp::asDecimal64 to convert the data to the DECIMAL64 type.

// define a and the number of decimal places to keep
ConstantSP a = new Double(8.6767676);
ConstantSP scale = new Int(6);

// convert data type of a
ConstantSP result = OperatorImp::asDecimal64(a, scale);
std::cout << result->getString() << std::endl;

Data Processing

Use OperatorImp::rand to generate 10 random integers no greater than 100.

// define the upper bound X and the count
ConstantSP X = new Int(100);
ConstantSP count = new Int(10);

// generate random values in a vector 
ConstantSP v = OperatorImp::rand(X, count);
std::cout << v->getString() << std::endl;

Use OperatorImp::diag to build a diagonal matrix from a vector, or to extract the diagonal elements of a square matrix as a vector.

// generate a diagonal matrix from vector v
VectorSP v = Util::createVector(DT_INT, 0, 10);
std::vector<int> newData = {2, 4, 6, 8, 10};
v->appendInt(newData.data(), newData.size());
ConstantSP result1 = OperatorImp::diag(v, Expression::void_);
std::cout << result1->getString() << std::endl;

// extract the diagonal elements of square matrix m as a vector
int *rawData = new int[9]{1, 2, 3, 4, 5, 6, 7, 8, 9};
VectorSP m = Util::createMatrix(DT_INT, 3, 3, 9, 0, rawData);
ConstantSP result2 = OperatorImp::diag(m, Expression::void_);
std::cout << result2->getString() << std::endl;
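
To make the two directions of diag concrete, here is a minimal plain C++ sketch that does not use Swordfish. It assumes the matrix is stored column by column in a flat array, as the rawData buffer above suggests; the helper names diagFromVector and diagOfMatrix are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: build an n x n matrix (flat, column-major assumed)
// whose diagonal holds the elements of v and whose other cells are 0.
std::vector<int> diagFromVector(const std::vector<int>& v) {
    const std::size_t n = v.size();
    std::vector<int> m(n * n, 0);
    for (std::size_t i = 0; i < n; ++i) {
        m[i * n + i] = v[i];  // column i, row i
    }
    return m;
}

// Hypothetical helper: read the diagonal of an n x n flat matrix.
std::vector<int> diagOfMatrix(const std::vector<int>& m, std::size_t n) {
    assert(m.size() == n * n);
    std::vector<int> d(n);
    for (std::size_t i = 0; i < n; ++i) {
        d[i] = m[i * n + i];
    }
    return d;
}
```

For the 3 x 3 buffer {1, ..., 9}, this sketch returns the elements 1, 5, 9.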

Use OperatorImp::deltas to calculate the differences between adjacent elements of a vector.

// define vector v
VectorSP v = Util::createVector(DT_INT, 0, 10);
std::vector<int> newData = {7, 2, 5, 8, 9};
v->appendInt(newData.data(), newData.size());

// calculate X[i] - X[i-n]
ConstantSP result = OperatorImp::deltas(v, Expression::void_);
std::cout << result->getString() << std::endl;
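
The difference X[i] - X[i-n] can be sketched in plain C++ without Swordfish. This is an illustration under stated assumptions: the name deltasSketch is hypothetical, n defaults to 1, and positions that have no predecessor are left empty (std::optional), which is only an approximation of how the library may represent missing values.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper: out[i] = x[i] - x[i - n]; the first n slots stay empty
// because they have no element n positions earlier.
std::vector<std::optional<int>> deltasSketch(const std::vector<int>& x,
                                             std::size_t n = 1) {
    assert(n > 0);
    std::vector<std::optional<int>> out(x.size());  // all entries start empty
    for (std::size_t i = n; i < x.size(); ++i) {
        out[i] = x[i] - x[i - n];
    }
    return out;
}
```

For the input {7, 2, 5, 8, 9} with n = 1, the sketch yields an empty first slot followed by -5, 3, 3, 1.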

Cumulative Window Functions

Use OperatorImp::cumsum to calculate the cumulative sum of the elements in a vector.

// define vector v1
VectorSP v1 = Util::createVector(DT_INT, 0, 10);
std::vector<int> newData = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
v1->appendInt(newData.data(), newData.size());

// calculate the cumulative sum of v1
ConstantSP result = OperatorImp::cumsum(v1, Expression::void_);
std::cout << result->getString() << std::endl;
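
As a reference for what a cumulative sum computes, the following plain C++ sketch reproduces the running total without Swordfish. The name cumsumSketch is hypothetical, and widening the accumulator to long long is a choice made for this sketch, not a statement about the library's result type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: out[i] = x[0] + x[1] + ... + x[i].
std::vector<long long> cumsumSketch(const std::vector<int>& x) {
    std::vector<long long> out(x.size());
    long long running = 0;  // wider than int to reduce overflow risk
    for (std::size_t i = 0; i < x.size(); ++i) {
        running += x[i];
        out[i] = running;
    }
    return out;
}
```

For the input 1 through 10, the sketch returns 1, 3, 6, 10, 15, 21, 28, 36, 45, 55.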

Sliding Window Functions

Use OperatorImp::mmax to calculate the moving maximums of X in a sliding window.

// define X and the window size
VectorSP X = Util::createVector(DT_INT, 0, 10);
std::vector<int> newData = {1, 2, 3, 4, 5, 6, 7};
X->appendInt(newData.data(), newData.size());
ConstantSP window = new Int(4);
std::vector<ConstantSP> parameter = {X, window};

// calculate the moving maximums of X
SessionSP session = DolphinDBLib::createSession();
ConstantSP result = OperatorImp::mmax(session->getHeap().get(), parameter);
std::cout << result->getString() << std::endl;
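
The idea of a moving maximum can be sketched in plain C++ with a monotonic deque, independent of Swordfish. The name mmaxSketch is hypothetical, and the sketch assumes that positions with fewer than window elements available produce an empty result; how the library treats those leading positions should be checked in the API Reference.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <optional>
#include <vector>

// Hypothetical helper: out[i] = max(x[i - window + 1 .. i]) once a full
// window is available; earlier slots stay empty.
std::vector<std::optional<int>> mmaxSketch(const std::vector<int>& x,
                                           std::size_t window) {
    assert(window > 0);
    std::vector<std::optional<int>> out(x.size());
    std::deque<std::size_t> candidates;  // indices; values decrease front to back
    for (std::size_t i = 0; i < x.size(); ++i) {
        // Drop the front index once it falls outside the current window.
        if (!candidates.empty() && candidates.front() + window <= i) {
            candidates.pop_front();
        }
        // Remove candidates that can no longer be the maximum.
        while (!candidates.empty() && x[candidates.back()] <= x[i]) {
            candidates.pop_back();
        }
        candidates.push_back(i);
        if (i + 1 >= window) {
            out[i] = x[candidates.front()];
        }
    }
    return out;
}
```

With the input {1, 2, 3, 4, 5, 6, 7} and a window of 4, the sketch leaves the first three slots empty and then yields 4, 5, 6, 7. The deque keeps each element's work amortized constant, so the whole pass is linear in the input length.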

Row-Based Functions

Use OperatorImp::rowImin to return the index of the minimum in each row.

// define matrix m
double *rawData = new double[12]{4.5, 2.6, 1.5, 3.2, 1.5, 4.8, 5.9, 1.7, 4.9, 2.0, 6.2, 5.5};
VectorSP m = Util::createMatrix(DT_DOUBLE, 3, 4, 12, 0, rawData);
std::vector<ConstantSP> parameter = {m};
SessionSP session = DolphinDBLib::createSession();

// return the index of the minimum in each row of m
ConstantSP result = OperatorImp::rowImin(session->getHeap().get(), parameter);
std::cout << result->getString() << std::endl;
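
A row-wise argmin can be sketched in plain C++ as follows, without Swordfish. The sketch assumes the flat buffer is laid out column by column (so element (r, c) sits at index c * rows + r) and that ties resolve to the first minimum; the name rowIminSketch is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: for each row, return the column index of its minimum.
// data is a flat rows x cols matrix, column-major layout assumed.
std::vector<std::size_t> rowIminSketch(const std::vector<double>& data,
                                       std::size_t rows, std::size_t cols) {
    assert(cols > 0 && data.size() == rows * cols);
    std::vector<std::size_t> out(rows, 0);
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 1; c < cols; ++c) {
            // Strict comparison keeps the first minimum on ties.
            if (data[c * rows + r] < data[out[r] * rows + r]) {
                out[r] = c;
            }
        }
    }
    return out;
}
```

Under the column-major assumption, the 3 x 4 buffer above has rows (4.5, 3.2, 5.9, 2.0), (2.6, 1.5, 1.7, 6.2), and (1.5, 4.8, 4.9, 5.5), so the sketch returns the indices 3, 1, 0.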

Vectorized Functions

Use OperatorImp::move to shift the elements of a vector to the right by a specified number of positions.

// define vector v and shifting step
VectorSP v = Util::createVector(DT_INT, 0, 10);
std::vector<int> newData = {3, 9, 5, 1, 4, 9};
v->appendInt(newData.data(), newData.size());
ConstantSP step = new Int(3);

// shift the elements of v to the right for 3 steps
ConstantSP result = OperatorImp::move(v, step);
std::cout << result->getString() << std::endl;
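
The shift can be pictured with a short plain C++ sketch that does not use Swordfish. The name moveSketch is hypothetical; vacated positions are left empty here, and a negative step shifts to the left in this sketch, which is an assumption rather than a documented property of OperatorImp::move.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical helper: out[i] = x[i - steps] when that index exists,
// otherwise the slot stays empty.
std::vector<std::optional<int>> moveSketch(const std::vector<int>& x,
                                           long long steps) {
    const long long n = static_cast<long long>(x.size());
    std::vector<std::optional<int>> out(x.size());
    for (long long i = 0; i < n; ++i) {
        const long long src = i - steps;
        if (src >= 0 && src < n) {
            out[static_cast<std::size_t>(i)] = x[static_cast<std::size_t>(src)];
        }
    }
    return out;
}
```

For the input {3, 9, 5, 1, 4, 9} and a step of 3, the sketch produces three empty slots followed by 3, 9, 5.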

Aggregate Functions

Use OperatorImp::avg to calculate the average of each column of a matrix.

// define a matrix m
int *rawData = new int[9]{1, 2, 3, 4, 5, 6, 7, 8, 9};
VectorSP m = Util::createMatrix(DT_INT, 3, 3, 9, 0, rawData);

// calculate the average of each column of m
ConstantSP result = OperatorImp::avg(m, Expression::void_);
std::cout << result->getString() << std::endl;
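
For reference, a column-wise average can be sketched in plain C++ without Swordfish. The sketch assumes the flat buffer is stored column by column; the name columnAvgSketch is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: mean of each column of a flat rows x cols matrix
// (column-major layout assumed).
std::vector<double> columnAvgSketch(const std::vector<int>& data,
                                    std::size_t rows, std::size_t cols) {
    assert(rows > 0 && data.size() == rows * cols);
    std::vector<double> out(cols);
    for (std::size_t c = 0; c < cols; ++c) {
        double sum = 0.0;
        for (std::size_t r = 0; r < rows; ++r) {
            sum += data[c * rows + r];
        }
        out[c] = sum / static_cast<double>(rows);
    }
    return out;
}
```

Under that layout assumption, the columns of the 3 x 3 buffer {1, ..., 9} are (1, 2, 3), (4, 5, 6), and (7, 8, 9), and the sketch returns 2, 5, 8.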

Use OperatorImp::beta to calculate the coefficient estimate of an ordinary-least-squares regression of Y on X (with intercept).

// define vectors y and x
VectorSP x = Util::createVector(DT_INT, 0, 10);
std::vector<int> newData1 = {1, 3, 5, 7, 11, 16, 23};
x->appendInt(newData1.data(), newData1.size());
VectorSP y = Util::createVector(DT_DOUBLE, 0, 10);
std::vector<double> newData2 = {0.1, 4.2, 5.6, 8.8, 22.1, 35.6, 77.2};
y->appendDouble(newData2.data(), newData2.size());

// calculate the coefficient estimate
ConstantSP result = OperatorImp::beta(y, x);
std::cout << result->getString() << std::endl;
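
The slope of a simple least-squares regression of Y on X with an intercept equals the sum of (x - mean(x)) * (y - mean(y)) divided by the sum of (x - mean(x))^2. The plain C++ sketch below computes that textbook formula without Swordfish; the name olsSlopeSketch is hypothetical, and the sketch makes no claim about how OperatorImp::beta is implemented internally.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: slope of the least-squares line y = a + b * x.
double olsSlopeSketch(const std::vector<double>& y,
                      const std::vector<double>& x) {
    assert(x.size() == y.size() && x.size() >= 2);
    const double n = static_cast<double>(x.size());
    double meanX = 0.0, meanY = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        meanX += x[i];
        meanY += y[i];
    }
    meanX /= n;
    meanY /= n;
    double sxy = 0.0, sxx = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sxy += (x[i] - meanX) * (y[i] - meanY);
        sxx += (x[i] - meanX) * (x[i] - meanX);
    }
    assert(sxx > 0.0);  // x must not be constant
    return sxy / sxx;
}
```

As a sanity check, points that lie exactly on y = 2x + 1 give a slope of 2.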

Time Access and Conversions

Use OperatorImp::date to convert a value of another temporal type, such as a timestamp, to a date.

// define a timestamp
ConstantSP timestamp = new Timestamp(2024, 8, 19, 11, 13, 29, 326);
std::cout << timestamp->getString() << std::endl;

// convert timestamps to dates
ConstantSP date = OperatorImp::date(timestamp, Expression::void_);
std::cout << date->getString() << std::endl;

String Handling

Use OperatorImp::like to determine whether a string fits a specific pattern.

// define a string scalar x and pattern y
ConstantSP x = new String("ABCDEFG");
ConstantSP y = new String("%DE%");

// determine whether x fits y
ConstantSP result = OperatorImp::like(x, y);
std::cout << result->getString() << std::endl;
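
To illustrate how a pattern such as "%DE%" can be matched, here is a plain C++ sketch that does not use Swordfish. It treats % as "any sequence of characters, including none" and compares all other characters literally and case-sensitively. Both points are assumptions of the sketch; the exact pattern rules of OperatorImp::like should be taken from the API Reference. The name likeSketch is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: true if text matches pattern, where '%' matches any
// (possibly empty) run of characters and everything else is literal.
bool likeSketch(const std::string& text, const std::string& pattern) {
    std::size_t t = 0, p = 0;
    std::size_t lastWildcard = std::string::npos;  // position of the last '%' seen
    std::size_t resumeAt = 0;                      // text position to retry from
    while (t < text.size()) {
        if (p < pattern.size() && pattern[p] == '%') {
            lastWildcard = p++;
            resumeAt = t;
        } else if (p < pattern.size() && pattern[p] == text[t]) {
            ++p;
            ++t;
        } else if (lastWildcard != std::string::npos) {
            // Backtrack: let the last '%' absorb one more character.
            p = lastWildcard + 1;
            t = ++resumeAt;
        } else {
            return false;
        }
    }
    // Trailing '%' characters may match the empty string.
    while (p < pattern.size() && pattern[p] == '%') {
        ++p;
    }
    return p == pattern.size();
}
```

With these rules, "ABCDEFG" matches "%DE%" because the literal DE appears somewhere inside the text.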

Higher-Order Functions

Use OperatorImp::eachFuncCall to apply a user-defined function to each pair of corresponding elements of vectors a and b.

#include <iostream>
#include <string>
#include <vector>

#include "Swordfish.h"
#include "OperatorImp.h"

// define function myAdd: returns 0 when a is even, otherwise a + b
ConstantSP myAdd(const ConstantSP& a, const ConstantSP& b) {
    return (a->getInt() % 2 == 0) ? new Int(0) : new Int(a->getInt() + b->getInt());
}

int main()
{
    DolphinDBLib::initializeRuntime();

    // define arguments
    SessionSP session = DolphinDBLib::createSession();
    VectorSP a = Util::createVector(DT_INT, 0, 5);
    std::vector<int> newData1 = {1, 2, 3, 4, 5};
    a->appendInt(newData1.data(), newData1.size());
    VectorSP b = Util::createVector(DT_INT, 0, 5);
    std::vector<int> newData2 = {1, 2, 3, 4, 5};
    b->appendInt(newData2.data(), newData2.size());
    std::string funcName = "myFunc";
    ConstantSP myFunc1 = Util::createOperatorFunction(funcName, myAdd, 2, 2, true);
    std::vector<ConstantSP> parameter = {myFunc1, a, b};
    
    // call eachFuncCall
    ConstantSP result = OperatorImp::eachFuncCall(session->getHeap().get(), parameter);
    std::cout << result->getString() << std::endl;
    
    DolphinDBLib::finalizeRuntime();
    return 0;
}
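
For comparison, the same element-wise logic can be written in plain C++ with std::transform, without Swordfish. This is only a sketch of the pairing behavior described above: the names myAddInt and eachPairSketch are hypothetical, and the sketch works on int values rather than ConstantSP objects.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Plain-int counterpart of myAdd: 0 when a is even, otherwise a + b.
int myAddInt(int a, int b) {
    return (a % 2 == 0) ? 0 : a + b;
}

// Hypothetical helper: apply f to each pair (a[i], b[i]) and collect results.
template <typename F>
std::vector<int> eachPairSketch(F f, const std::vector<int>& a,
                                const std::vector<int>& b) {
    assert(a.size() == b.size());
    std::vector<int> out(a.size());
    std::transform(a.begin(), a.end(), b.begin(), out.begin(), f);
    return out;
}
```

For a = b = {1, 2, 3, 4, 5}, the sketch returns 2, 0, 6, 0, 10: odd elements of a are added to their partner in b, and even elements map to 0.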