ols
Syntax
ols(Y, X, [intercept=true], [mode=0], [method='default'], [usePinv=true])
Arguments
Y is the dependent variable; X is the independent variable(s).
Y is a vector; X is a vector/matrix/table/tuple.
When X is a matrix,
- If the number of rows equals the length of Y, each column of X is a factor;
- If the number of rows is not the same as the length of Y, and the number of columns equals the length of Y, each row of X is a factor.
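The orientation rule above can be sketched in plain Python. This is only an illustration of the rule as described, not DolphinDB code: the helper name `split_factors` is hypothetical, and raising an error when neither dimension matches is an assumption, since the text does not say what happens in that case.

```python
# Illustrative sketch only: plain Python, not DolphinDB code.
def split_factors(matrix_rows, n_obs):
    """Return one list per factor from a matrix given as a list of rows."""
    n_rows, n_cols = len(matrix_rows), len(matrix_rows[0])
    if n_rows == n_obs:
        # Rows are observations, so each column is a factor.
        return [list(col) for col in zip(*matrix_rows)]
    if n_cols == n_obs:
        # Columns are observations, so each row is a factor.
        return [list(row) for row in matrix_rows]
    # Assumption: the reference does not describe this case.
    raise ValueError("neither dimension of X matches the length of Y")

# The 5x3 matrix from the Examples section, written row by row.
x = [[1, 1, 1], [4, 4, 5], [8, 2, 1], [2, 3, 1], [3, 8, 5]]
x_t = [list(col) for col in zip(*x)]  # its 3x5 transpose

factors = split_factors(x, 5)
factors_t = split_factors(x_t, 5)
```

With a Y of length 5, both orientations yield the same three factors. Note that the row check comes first, so a square matrix is always read column by column.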
intercept is a Boolean variable indicating whether the regression includes the intercept. If it is true, the system automatically adds a column of 1's to X to generate the intercept. The default value is true.
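To make the role of the column of 1's concrete, here is a minimal plain-Python sketch for a single factor. It solves the 2x2 normal equations by hand, which is a textbook approach assumed here for illustration and not necessarily how `ols` is implemented. The helper name `ols_one_factor` is hypothetical; the data are `x1` and `y` from the Examples section.

```python
# Illustrative sketch only: plain Python, not DolphinDB code.
x1 = [1, 3, 5, 7, 11, 16, 23]
y = [0.1, 4.2, 5.6, 8.8, 22.1, 35.6, 77.2]

def ols_one_factor(y, x, intercept=True):
    """Least-squares fit of y on one factor via the normal equations."""
    if not intercept:
        # No column of 1's: the fitted line is forced through the origin.
        return [sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)]
    # Design matrix columns: a column of 1's, then the factor.
    ones = [1.0] * len(x)
    # Entries of the 2x2 matrix X'X and of the vector X'y.
    s11 = sum(o * o for o in ones)
    s1x = sum(o * a for o, a in zip(ones, x))
    sxx = sum(a * a for a in x)
    s1y = sum(o * b for o, b in zip(ones, y))
    sxy = sum(a * b for a, b in zip(x, y))
    # Solve the 2x2 system with the explicit inverse of X'X.
    det = s11 * sxx - s1x * s1x
    b0 = (sxx * s1y - s1x * sxy) / det
    b1 = (s11 * sxy - s1x * s1y) / det
    return [b0, b1]

coef = ols_one_factor(y, x1)
coef_no_intercept = ols_one_factor(y, x1, intercept=False)
```

The first element of `coef` is the intercept and the second is the slope; both should agree with the first `ols(y, x1)` example to the displayed precision. Without the column of 1's, only a single slope is returned.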
mode is an integer indicating the contents of the output. It can be:
- 0 (default): a vector of the coefficient estimates.
- 1: a table with coefficient estimates, standard errors, t-statistics, and p-values.
- 2: a dictionary with the following keys: ANOVA, RegressionStat, Coefficient, and Residual.
ANOVA: the analysis-of-variance table, where n is the number of observations and p is the number of factors.
Source of Variance | DF (degrees of freedom) | SS (sum of squares) | MS (mean square) | F (F-score) | Significance |
---|---|---|---|---|---|
Regression | p | sum of squares regression, SSR | regression mean square, MSR=SSR/p | MSR/MSE | p-value |
Residual | n-p-1 | sum of squares error, SSE | mean square error, MSE=SSE/(n-p-1) | | |
Total | n-1 | sum of squares total, SST | | | |
RegressionStat: summary statistics of the regression.
Item | Description |
---|---|
R2 | R-squared |
AdjustedR2 | The adjusted R-squared corrected based on the degrees of freedom by comparing the sample size to the number of terms in the regression model. |
StdError | The residual standard error/deviation corrected based on the degrees of freedom. |
Observations | The sample size. |
Coefficient: a table of the coefficient estimates.
Item | Description |
---|---|
factor | Independent variables |
beta | Estimated regression coefficients |
stdError | Standard error of the regression coefficients |
tstat | t statistic, indicating the significance of the regression coefficients |
pvalue | p-value of the t statistic |
Residual: the difference between each predicted value and the actual value.
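The relationships between the ANOVA and RegressionStat entries can be checked with a few lines of arithmetic. The sketch below is plain Python, not DolphinDB code. It assumes a model with an intercept and uses the standard textbook definitions of R-squared, adjusted R-squared, and residual standard error; the SSR and SSE values are copied from the `mode=2` output in the Examples section. The Significance column needs the F-distribution and is left out.

```python
import math

# Illustrative sketch only: values copied from the mode=2 example output.
n, p = 7, 2            # observations and factors
ssr = 4204.416396      # sum of squares regression
sse = 267.220747       # sum of squares error

sst = ssr + sse                                      # sum of squares total
msr = ssr / p                                        # regression mean square
mse = sse / (n - p - 1)                              # mean square error
f_score = msr / mse                                  # F column
r2 = ssr / sst                                       # R2
adjusted_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)   # AdjustedR2
std_error = math.sqrt(mse)                           # StdError

print(round(f_score, 4), round(r2, 6), round(adjusted_r2, 6), round(std_error, 6))
```

Up to rounding, these values should line up with the F, R2, AdjustedR2, and StdError entries shown in that example.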
method (optional) is a string indicating the method for solving the ordinary-least-squares regression problem. It can be:
- "default" (default): ols solves the problem by constructing coefficient matrices and inverse matrices.
- "svd": ols solves the problem by using singular value decomposition.
usePinv (optional) is a Boolean value indicating whether to use the pseudo-inverse method to calculate the inverse of a matrix.
- true (default): compute the pseudo-inverse of the matrix. It must be true for singular matrices.
- false: compute the inverse of the matrix, which is only applicable to non-singular matrices.
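This page does not give the formulas behind method and usePinv. As an assumption, the standard textbook forms that these options most plausibly correspond to are shown below, where X includes the column of 1's when intercept is true and the superscript + denotes the Moore-Penrose pseudo-inverse. Treat this as a reading aid, not a description of the actual implementation.

```latex
% Assumed textbook forms, not confirmed by the reference text.
\begin{align*}
\text{inverse (usePinv=false):}\quad & \hat{\beta} = (X^{\top}X)^{-1}X^{\top}Y \\
\text{pseudo-inverse (usePinv=true):}\quad & \hat{\beta} = (X^{\top}X)^{+}X^{\top}Y \\
\text{SVD (method="svd"):}\quad & X = U\Sigma V^{\top},\qquad \hat{\beta} = V\Sigma^{+}U^{\top}Y
\end{align*}
```

Under these forms, the ordinary inverse does not exist when X'X is singular, which is consistent with the requirement that usePinv be true for singular matrices. Forming X'X also squares the condition number of X, so the two methods can return noticeably different coefficients when the factors are nearly collinear.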
Details
Return the result of an ordinary-least-squares regression of Y on X.
Note that null values in X and Y are treated as 0 in calculations.
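Treating nulls as 0 can shift the fitted coefficients, so it may be worth cleaning the data before calling `ols`. The plain-Python sketch below (not DolphinDB code) uses hypothetical data and a hypothetical helper `fit_line` to compare the two treatments for a single factor: replacing a null with 0, as described above, versus dropping the incomplete observation.

```python
# Illustrative sketch only: plain Python with hypothetical data.
def fit_line(x, y):
    """Closed-form least-squares [intercept, slope] for one factor."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    sxx = sum((a - x_mean) ** 2 for a in x)
    slope = sxy / sxx
    return [y_mean - slope * x_mean, slope]

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, None, 8.0]   # one null in Y; the true relation is y = 2x

# Treatment 1: nulls become 0, matching the behavior described above.
y_zero = [0.0 if v is None else v for v in y]
as_zero = fit_line(x, y_zero)

# Treatment 2: drop incomplete observations before fitting.
pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
dropped = fit_line([a for a, _ in pairs], [b for _, b in pairs])
```

In this toy case the slope falls from 2 to 1.4 when the null is counted as 0, while dropping the observation recovers the slope of 2.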
Examples
x1=1 3 5 7 11 16 23
x2=2 8 11 34 56 54 100
y=0.1 4.2 5.6 8.8 22.1 35.6 77.2;
ols(y, x1);
// output: [-9.912821,3.378632]
ols(y, (x1,x2));
// output: [-9.494813,2.806426,0.13147]
ols(y, (x1,x2), 1, 1);
factor | beta | stdError | tstat | pvalue |
---|---|---|---|---|
intercept | -9.494813 | 5.233168 | -1.814353 | 0.143818 |
x1 | 2.806426 | 1.830782 | 1.532911 | 0.20007 |
x2 | 0.13147 | 0.409081 | 0.321379 | 0.764015 |
ols(y, (x1,x2), 1, 2);
/* output:
ANOVA->
Breakdown DF SS MS F Significance
---------- -- ----------- ----------- --------- ------------
Regression 2 4204.416396 2102.208198 31.467739 0.003571
Residual 4 267.220747 66.805187
Total 6 4471.637143
RegressionStat->
item statistics
------------ ----------
R2 0.940241
AdjustedR2 0.910361
StdError 8.173444
Observations 7
Coefficient->
factor beta stdError tstat pvalue
--------- --------- -------- --------- --------
intercept -9.494813 5.233168 -1.814353 0.143818
x1 2.806426 1.830782 1.532911 0.20007
x2 0.13147 0.409081 0.321379 0.764015
*/
x=matrix(1 4 8 2 3, 1 4 2 3 8, 1 5 1 1 5);
x;
#0 | #1 | #2 |
---|---|---|
1 | 1 | 1 |
4 | 4 | 5 |
8 | 2 | 1 |
2 | 3 | 1 |
3 | 8 | 5 |
ols(1..5, x);
// output: [1.156537,0.105505,0.91055,-0.697821]
ols(1..5, x.transpose());
// output: [1.156537,0.105505,0.91055,-0.697821]
// Both calls return the same result: when the number of rows of X does not match the length of Y but the number of columns does, each row of X is treated as a factor.
x = table([13.9782,13.4688,13.4336,12.9642,12.7905,13.4771,13.0423,12.6588,13.8933,13.9006] as col0, [195.3904,181.4090,180.4627,168.0723,163.5973,181.6342,170.1017,160.2477,193.0241,193.2270] as col1, [2731.2089,2443.3656,2424.2715,2178.9356,2092.4947,2447.9167,2218.5185,2028.5594,2681.7456,2685.9754] as col2)
y = [-0.4002,-0.8004,-0.2002,-1.0002,-0.2001,-0.5001,-0.2501,0.0000,0.0000,0.0000]
ols(y, x, true, 0, "default")
// output: [2.968166,13.023638,-2.016390,0.076485]
ols(y, x, true, 0, "svd")
// output: [3266.457957,-722.195120,53.157806,-1.302769]