elasticNet

Syntax

elasticNet(ds, yColName, xColNames, [alpha=1.0], [l1Ratio=0.5], [intercept=true], [normalize=false], [maxIter=1000], [tolerance=0.0001], [positive=false], [swColName], [checkInput=true])

Arguments

ds is an in-memory table or a data source usually generated by the sqlDS function.

yColName is a string indicating the column name of the dependent variable in ds.

xColNames is a STRING scalar/vector indicating the column names of the independent variables in ds.

alpha (optional) is a floating-point number representing the constant that multiplies the penalty term. The default value is 1.0.

l1Ratio (optional) is a floating-point number between 0 and 1 indicating the mixing parameter. For l1Ratio = 0 the penalty is an L2 penalty; for l1Ratio = 1 it is an L1 penalty; for 0 < l1Ratio < 1, the penalty is a combination of L1 and L2. The default value is 0.5.

intercept (optional) is a Boolean value indicating whether to include the intercept in the regression. The default value is true.

normalize (optional) is a Boolean value. If true, the regressors will be normalized before regression by subtracting the mean and dividing by the L2-norm. If intercept=false, this parameter will be ignored. The default value is false.

maxIter (optional) is a positive integer indicating the maximum number of iterations. The default value is 1000.

tolerance (optional) is a floating-point number. The iterations stop when the improvement in the objective function value is smaller than tolerance. The default value is 0.0001.

positive (optional) is a Boolean value indicating whether to force the coefficient estimates to be positive. The default value is false.

swColName (optional) is a string indicating a column name of ds. The specified column is used as the sample weight. If it is not specified, the sample weight is treated as 1.
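As a minimal sketch of passing sample weights, the snippet below reuses the data from the Examples section and adds a weight column. The column name w and its values are hypothetical and chosen only for illustration.

```
// Data copied from the Examples section
y = [225.720746,-76.195841,63.089878,139.44561,-65.548346,2.037451,22.403987,-0.678415,37.884102,37.308288]
x0 = [2.240893,-0.854096,0.400157,1.454274,-0.977278,-0.205158,0.121675,-0.151357,0.333674,0.410599]
x1 = [0.978738,0.313068,1.764052,0.144044,1.867558,1.494079,0.761038,0.950088,0.443863,-0.103219]
// w is a hypothetical weight column, one weight per row
w = [1.0,1.0,2.0,2.0,1.0,1.0,0.5,0.5,1.0,1.0]
t = table(y, x0, x1, w)
// Use column w as the sample weight; all other options keep their defaults
elasticNet(t, `y, `x0`x1, swColName=`w);
```

Passing swColName by keyword avoids having to spell out every optional parameter that precedes it in the signature.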

checkInput (optional) is a Boolean value indicating whether to validate the data in the columns specified by yColName, xColNames, and swColName.
  • If checkInput = true (default), the columns are checked for invalid values, and an error is thrown if any NULL value is found.
  • If checkInput = false, invalid values are not checked.

It is recommended to keep checkInput = true. If it is set to false, you must ensure that the input contains no invalid values and that no invalid values are generated during intermediate calculations; otherwise, the returned model may be inaccurate.

Details

Fit a linear regression model with an elastic net penalty, which combines L1 and L2 regularization.

Minimize the following objective function:
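The formula is not reproduced on this page. As a sketch only, assuming the function follows the conventional elastic net parameterization (the one used by scikit-learn's ElasticNet, which matches the alpha and l1Ratio descriptions above), the objective would take the form below. Here n is the number of samples, X is the matrix of independent variables, w is the coefficient vector, and ρ stands for l1Ratio. The exact scaling constants, and how sample weights enter the loss, are assumptions.

```latex
\min_{w} \; \frac{1}{2n} \lVert y - Xw \rVert_2^2
  + \alpha \, \rho \, \lVert w \rVert_1
  + \frac{\alpha \, (1 - \rho)}{2} \, \lVert w \rVert_2^2
```

Under this form, ρ = 1 leaves only the L1 term and ρ = 0 leaves only the L2 term, which agrees with the l1Ratio description in the Arguments section.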

Examples

y = [225.720746,-76.195841,63.089878,139.44561,-65.548346,2.037451,22.403987,-0.678415,37.884102,37.308288]
x0 = [2.240893,-0.854096,0.400157,1.454274,-0.977278,-0.205158,0.121675,-0.151357,0.333674,0.410599]
x1 = [0.978738,0.313068,1.764052,0.144044,1.867558,1.494079,0.761038,0.950088,0.443863,-0.103219]
t = table(y, x0, x1)
elasticNet(t, `y, `x0`x1);

If t is a DFS table, then the input should be a data source:

elasticNet(sqlDS(<select * from t>), `y, `x0`x1);
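
The optional parameters can also be set explicitly. The following sketch reuses the same data with an in-memory table; the values alpha=0.1 and l1Ratio=0.8 are arbitrary choices for illustration, not recommendations.

```
// Same data as in the first example
y = [225.720746,-76.195841,63.089878,139.44561,-65.548346,2.037451,22.403987,-0.678415,37.884102,37.308288]
x0 = [2.240893,-0.854096,0.400157,1.454274,-0.977278,-0.205158,0.121675,-0.151357,0.333674,0.410599]
x1 = [0.978738,0.313068,1.764052,0.144044,1.867558,1.494079,0.761038,0.950088,0.443863,-0.103219]
t = table(y, x0, x1)
// Keyword form: smaller alpha, with the penalty mix shifted toward L1
elasticNet(t, `y, `x0`x1, alpha=0.1, l1Ratio=0.8);
// Equivalent positional form: alpha and l1Ratio are the 4th and 5th parameters
elasticNet(t, `y, `x0`x1, 0.1, 0.8);
```

The keyword form is usually easier to read when only a few of the optional parameters differ from their defaults.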