sql

Syntax

sql(select, from, [where], [groupBy], [groupFlag], [csort], [ascSort], [having], [orderBy], [ascOrder], [limit], [hint], [exec=false], [map=false])

Arguments

select is metacode indicating the columns to be selected. Each column is generated by either function sqlCol or sqlColAlias . Use a tuple to select multiple columns.

from is a table object or table name.

where (optional) indicates the "where" conditions. In case of multiple "where" conditions, use an ANY vector with each element corresponding to the metacode of a condition.

groupBy (optional) indicates "group by" or "context by" column(s). In case of multiple "group by" columns, use an ANY vector with each element corresponding to the metacode of a column name.

groupFlag (optional) 1 means "group by"; 0 means "context by"; 2 means "pivot by". The default value is 1.

csort (optional) is metacode or a tuple of metacode that specifies the column name(s) followed by csort. This parameter only works when contextBy is specified.

ascSort (optional) is a scalar or vector indicating whether each csort column is sorted in ascending or descending order. 1 (default) means ascending and 0 means descending.

having (optional) is metacode or a tuple of metacode that specifies the having condition(s). This parameter only works when contextBy is specified.

orderBy (optional) indicates "order by" column(s). In case of multiple "order by" columns, use a tuple with each element corresponding to the metacode of a column name.

ascOrder (optional) is a scalar or vector indicating whether each "order by" column is sorted in ascending or descending order. 1 means sorting in ascending order; 0 means sorting in descending order. The default value is 1.

limit (optional) is an integer or an integral pair indicating the number of rows to select from the result starting from the first row. If groupBy is specified and groupFlag=0, select limit rows from each group starting from the first row in each group. It corresponds to "top" clause in "select" statements.

hint (optional) is a constant that can take the following values:

  • HINT_HASH: use Hashing algorithm to execute "group by" statements.

  • HINT_SNAPSHOT: query data from snapshot engine.

  • HINT_KEEPORDER: the records in the result after executing "context by" statements are in the same order as in the input data.

exec indicates whether to use the exec clause. The default value is false. If set to be true, a scalar or a vector will be generated. If the "pivot by" is used in the exec clause, a matrix can be generated.

map (optional) is a Boolean scalar specifying whether to use the map keyword. The default value is false.

Details

Create a SQL statement dynamically. To execute the generated SQL statement, use function eval.

Examples

symbol = take(`GE,6) join take(`MSFT,6) join take(`F,6)
date=take(take(2017.01.03,2) join take(2017.01.04,4), 18)
price=31.82 31.69 31.92 31.8  31.75 31.76 63.12 62.58 63.12 62.77 61.86 62.3 12.46 12.59 13.24 13.41 13.36 13.17
volume=2300 3500 3700 2100 1200 4600 1800 3800 6400 4200 2300 6800 4200 5600 8900 2300 6300 9600
t1 = table(symbol, date, price, volume);

t1;
symbol date price volume
GE 2017.01.03 31.82 2300
GE 2017.01.03 31.69 3500
GE 2017.01.04 31.92 3700
GE 2017.01.04 31.8 2100
GE 2017.01.04 31.75 1200
GE 2017.01.04 31.76 4600
MSFT 2017.01.03 63.12 1800
MSFT 2017.01.03 62.58 3800
MSFT 2017.01.04 63.12 6400
MSFT 2017.01.04 62.77 4200
MSFT 2017.01.04 61.86 2300
MSFT 2017.01.04 62.3 6800
F 2017.01.03 12.46 4200
F 2017.01.03 12.59 5600
F 2017.01.04 13.24 8900
F 2017.01.04 13.41 2300
F 2017.01.04 13.36 6300
F 2017.01.04 13.17 9600
x=5000
whereConditions = [<symbol=`MSFT>,<volume>x>]
havingCondition = <sum(volume)>200>;

sql(sqlCol("*"), t1);
// output: < select * from t1 >

sql(sqlCol("*"), t1, whereConditions);
// output: < select * from t1 where symbol == "MSFT",volume > x >

sql(select=sqlColAlias(<avg(price)>), from=t1, where=whereConditions, groupBy=sqlCol(`date));
// output: < select avg(price) as avg_price from t1 where symbol == "MSFT",volume > x group by date >

sql(select=sqlColAlias(<avg(price)>), from=t1, groupBy=[sqlCol(`date),sqlCol(`symbol)]);
// output: < select avg(price) as avg_price from t1 group by date,symbol >

sql(select=(sqlCol(`symbol),sqlCol(`date),sqlColAlias(<cumsum(volume)>, `cumVol)), from=t1, groupBy=sqlCol(`date`symbol), groupFlag=0);
// output: < select symbol,date,cumsum(volume) as cumVol from t1 context by date,symbol >

sql(select=(sqlCol(`symbol),sqlCol(`date),sqlColAlias(<cumsum(volume)>, `cumVol)), from=t1, where=whereConditions, groupBy=sqlCol(`date), groupFlag=0);
// output: < select symbol,date,cumsum(volume) as cumVol from t1 where symbol == "MSFT",volume > x context by date >

sql(select=(sqlCol(`symbol),sqlCol(`date),sqlColAlias(<cumsum(volume)>, `cumVol)), from=t1, where=whereConditions, groupBy=sqlCol(`date), groupFlag=0, csort=sqlCol(`volume), ascSort=0);
// output: < select symbol,date,cumsum(volume) as cumVol from t1 where symbol == "MSFT",volume > x context by date csort volume desc >

sql(select=(sqlCol(`symbol),sqlCol(`date),sqlColAlias(<cumsum(volume)>, `cumVol)), from=t1, where=whereConditions, groupBy=sqlCol(`date), groupFlag=0, having=havingCondition);
// output: < select symbol,date,cumsum(volume) as cumVol from t1 where symbol == "MSFT",volume > x context by date having sum(volume) > 200 >

sql(select=sqlCol("*"), from=t1, where=whereConditions, orderBy=sqlCol(`date), ascOrder=0);
// output: < select * from t1 where symbol == "MSFT",volume > x order by date desc >

sql(select=sqlCol("*"), from=t1, limit=1);
// output: < select top 1 * from t1 >

sql(select=sqlCol("*"), from=t1, groupBy=sqlCol(`symbol), groupFlag=0, limit=1);
// output: < select top 1 * from t1 context by symbol >

sql(select=(sqlCol(`symbol),sqlCol(`date),sqlColAlias(<cumsum(volume)>, `cumVol)), from=t1, groupBy=sqlCol(`date`symbol), groupFlag=0, hint=HINT_KEEPORDER);
// output: < select [128] symbol,date,cumsum(volume) as cumVol from t1 context by date,symbol >

whereConditions1 = <symbol=`MSFT or volume>x>
sql(select=sqlCol("*"), from=t1, where=whereConditions1, orderBy=sqlCol(`date), ascOrder=0);
// output: < select * from t14059d76a00000000 where symbol == "MSFT" or volume > x order by date desc > 

sql(select=sqlCol("*"), from=t1, where=whereConditions, orderBy=sqlCol(`date), ascOrder=0, map=true);
// output: < select [256] * from t17092a30500000000 where symbol == "MSFT",volume > x order by date desc map >

A convenient and flexible way to generate complicated queries dynamically is to define a function that calls function sql.

def f1(t, sym, x){
whereConditions=[<symbol=sym>,<volume>x>]
return sql(sqlCol("*"),t,whereConditions).eval()
};

f1(t1, `MSFT, 5000);
symbol date price volume
MSFT 2017.01.04 63.12 6400
MSFT 2017.01.04 62.3 6800
f1(t1, `F, 9000);
symbol date price volume
F 2017.01.04 13.17 9600
def f2(t, sym, colNames, filterColumn, filterValue){
 whereConditions=[<symbol=sym>,expr(sqlCol(filterColumn),>,filterValue)]
    return sql(sqlCol(colNames),t,whereConditions).eval()
};
f2(t1,`GE, `symbol`date`volume, `volume, 3000);
symbol date volume
GE 2017.01.03 3500
GE 2017.01.04 3700
GE 2017.01.04 4600
f2(t1,`F, `symbol`date`volume,`price,13.2);
symbol date volume
F 2017.01.04 8900
F 2017.01.04 2300
F 2017.01.04 6300

Set the parameter exec=true and use exec clause with the pivot by statement to generate a matrix:

date = 2020.09.21 + 0 0 0 0 1 1 1 1
sym = `MS`MS`GS`GS`MS`MS`GS`GS$SYMBOL
factorNum = 1 2 1 2 1 2 1 2
factorValue = 1.2 -3.4 -2.5 6.3 1.1 -3.2 -2.1 5.6
t = table(date, sym, factorNum, factorValue);
sql(select=sqlCol(`factorValue), from=t, groupBy=[sqlCol(`date), sqlCol(`sym)], groupFlag=2, exec=true)

// output: < exec factorValue from t pivot by date,sym >