

lasso(ds, yColName, xColNames, [alpha=1.0], [intercept=true], [normalize=false], [maxIter=1000], [tolerance=0.0001], [positive=false], [swColName], [checkInput=true])


ds is an in-memory table or a data source usually generated by the sqlDS function.

yColName is a string indicating the column name of the dependent variable in ds.

xColNames is a string scalar/vector indicating the column names of the independent variables in ds.

alpha is a floating-point number representing the constant that multiplies the L1-norm penalty. The default value is 1.0.

intercept is a Boolean value indicating whether to include the intercept in the regression. The default value is true.

normalize is a Boolean value. If true, the regressors are normalized before regression by subtracting the mean and dividing by the L2-norm. If intercept = false, this parameter is ignored. The default value is false.

maxIter is a positive integer indicating the maximum number of iterations. The default value is 1000.

tolerance is a floating-point number. The iterations stop when the improvement in the objective function value is smaller than tolerance. The default value is 0.0001.

positive is a Boolean value indicating whether to force the coefficient estimates to be positive. The default value is false.

swColName is a string indicating a column name in ds. The specified column is used as the sample weights. If it is not specified, every sample is given a weight of 1.

checkInput is a Boolean value indicating whether to validate the input specified by yColName, xColNames, and swColName.
  • If checkInput = true (default), the input is checked for invalid values, and an error is thrown if a null value exists.

  • If checkInput = false, invalid values are not checked.

It is recommended to keep checkInput = true. If it is set to false, you must ensure that the input contains no invalid values and that no invalid values are generated during intermediate calculations; otherwise the returned model may be inaccurate.


Estimate a Lasso regression that performs L1 regularization.

Minimize the following objective function:
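
The formula is not reproduced in this text. As a hedged reconstruction, assuming lasso follows the conventional scaling used by coordinate-descent Lasso implementations such as scikit-learn (an assumption, not confirmed on this page), the objective can be written as:

```latex
% Assumed form: n = number of rows in ds, X = matrix of the xColNames columns,
% y = the yColName column, w = coefficient vector, \alpha = the alpha parameter.
\min_{w} \; \frac{1}{2n} \, \lVert y - Xw \rVert_2^2 \; + \; \alpha \, \lVert w \rVert_1
```

Under this form, a larger alpha shrinks more coefficients to exactly zero, and alpha = 0 reduces the problem to ordinary least squares. Sample weights (swColName) are not shown in this sketch.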


y = [225.720746,-76.195841,63.089878,139.44561,-65.548346,2.037451,22.403987,-0.678415,37.884102,37.308288];
x0 = [2.240893,-0.854096,0.400157,1.454274,-0.977278,-0.205158,0.121675,-0.151357,0.333674,0.410599];
x1 = [0.978738,0.313068,1.764052,0.144044,1.867558,1.494079,0.761038,0.950088,0.443863,-0.103219];
t = table(y, x0, x1);

lasso(t, `y, `x0`x1);

If t is a DFS table, then the input should be a data source:

lasso(sqlDS(<select * from t>), `y, `x0`x1);
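
The optional parameters can also be set explicitly. The following is a minimal sketch that reuses the sample data above and passes the optional arguments positionally, in the order given in the syntax line; the specific values (alpha = 0.5, maxIter = 2000, tolerance = 0.00001) are illustrative choices, not recommendations from this page.

```dolphindb
// Same sample data as in the example above
y = [225.720746,-76.195841,63.089878,139.44561,-65.548346,2.037451,22.403987,-0.678415,37.884102,37.308288];
x0 = [2.240893,-0.854096,0.400157,1.454274,-0.977278,-0.205158,0.121675,-0.151357,0.333674,0.410599];
x1 = [0.978738,0.313068,1.764052,0.144044,1.867558,1.494079,0.761038,0.950088,0.443863,-0.103219];
t = table(y, x0, x1);

// Positional order: alpha, intercept, normalize, maxIter, tolerance
// alpha=0.5 (weaker penalty than the default 1.0), intercept=true, normalize=false,
// maxIter=2000, tolerance=0.00001 (illustrative values only)
model = lasso(t, `y, `x0`x1, 0.5, true, false, 2000, 0.00001);
```

Lowering alpha weakens the L1 penalty, while a smaller tolerance together with a larger maxIter lets the solver run longer before stopping.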