# lassoBasic {#lassobasic}

## Syntax {#syntax}

`lassoBasic(Y, X, [mode=0], [alpha=1.0], [intercept=true], [normalize=false], [maxIter=1000], [tolerance=0.0001], [positive=false], [swColName], [checkInput=true])`

## Details {#details}

Perform lasso regression, i.e., linear regression with an L1 penalty on the coefficient estimates.

Minimize the following objective function:

![](../../images/lasso.png)
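For readers who cannot view the image, the lasso objective is conventionally written as below. This is a sketch that assumes the image shows the standard formulation, with $n$ the number of observations, $w$ the coefficient vector, and $\alpha$ the *alpha* argument; the image remains the authoritative definition.

```latex
\min_{w} \; \frac{1}{2n} \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_1
```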

## Arguments {#arguments}

**Y** is a numeric vector indicating the dependent variable.

**X** is a numeric vector/tuple/matrix/table, indicating the independent variables.

-   When *X* is a vector/tuple, its length must be equal to the length of *Y*.

-   When *X* is a matrix/table, its number of rows must be equal to the length of *Y*.


**mode** is an integer that determines the content of the result. It can take one of the following three values:

-   0 \(default\) : a vector of the coefficient estimates.

-   1: a table with coefficient estimates, standard error, t-statistics, and p-values.

-   2: a dictionary with the following keys: ANOVA, RegressionStat, Coefficient, and Residual.

    ANOVA: the analysis of variance table, where *n* is the number of observations and *p* is the number of independent variables.

    |Source of Variance|DF \(degrees of freedom\)|SS \(sum of squares\)|MS \(mean square\)|F \(F-score\)|Significance|
    |------------------|------------------------|--------------------|---------------------|-------------|------------|
    |Regression|p|sum of squares regression, SSR|regression mean square, MSR=SSR/p|MSR/MSE|p-value|
    |Residual|n-p-1|sum of squares error, SSE|mean square error, MSE=SSE/\(n-p-1\)|||
    |Total|n-1|sum of squares total, SST||||

    RegressionStat: the summary statistics of the regression.

    |Item|Description|
    |----|-----------|
    |R2|R-squared|
    |AdjustedR2|The R-squared adjusted for the degrees of freedom, i.e., for the sample size relative to the number of terms in the regression model.|
    |StdError|The residual standard error, adjusted for the degrees of freedom.|
    |Observations|The sample size.|

    Coefficient: the table of coefficient estimates.

    |Item|Description|
    |----|-----------|
    |factor|Independent variables|
    |beta|Estimated regression coefficients|
    |stdError|Standard error of the regression coefficients|
    |tstat|t statistic, indicating the significance of the regression coefficients|
    |pvalue|p-value of the t statistic|

    Residual: a vector of the residuals, i.e., each actual value minus the corresponding predicted value.
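The relationships between the *mode* = 2 items can be sketched in Python as follows. This is an illustration, not the function's internal implementation: the helper name `regression_summary` is hypothetical, and the definitions it uses (for example, R2 taken as SSR/SST and a residual taken as actual minus predicted) are inferred from the numbers printed in the Examples section.

```python
import math


def regression_summary(y, residuals, p):
    """Rebuild the mode=2 summary numbers from y and the Residual vector.

    p is the number of independent variables, excluding the intercept.
    """
    n = len(y)
    y_mean = sum(y) / n
    # Assumed convention: residual = actual - predicted.
    predicted = [actual - r for actual, r in zip(y, residuals)]
    ssr = sum((fit - y_mean) ** 2 for fit in predicted)  # regression
    sse = sum(r * r for r in residuals)                  # residual
    sst = sum((actual - y_mean) ** 2 for actual in y)    # total
    msr = ssr / p
    mse = sse / (n - p - 1)
    r2 = ssr / sst
    return {
        "SSR": ssr, "SSE": sse, "SST": sst,
        "MSR": msr, "MSE": mse, "F": msr / mse,
        "R2": r2,
        "AdjustedR2": 1 - (1 - r2) * (n - 1) / (n - p - 1),
        "StdError": math.sqrt(mse),
    }


# Data and Residual vector copied from the Examples section.
y = [0.1, 4.2, 5.6, 8.8, 22.1, 35.6, 77.2]
residuals = [6.319173239708383, 4.21150915569809, -0.028258082380245,
             -6.254004293338318, -7.262321947798779, -6.063400030876729,
             9.077301958987561]
stats = regression_summary(y, residuals, p=2)
```

With the example data, `stats` should agree with the ANOVA and RegressionStat values shown in the Examples section up to rounding.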


**alpha** is a floating-point number representing the constant that multiplies the L1-norm penalty term. The default value is 1.0.

**intercept** is a Boolean value indicating whether the regression includes an intercept. If it is true, the system automatically adds a column of 1s to *X* to generate the intercept. The default value is true.

**normalize** is a Boolean value. If true, the regressors are normalized before regression by subtracting the mean and dividing by the L2-norm. If *intercept* = false, this parameter is ignored. The default value is false.
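The normalization step can be sketched in Python as follows. The helper `normalize_column` is hypothetical, and the sketch assumes the L2-norm is taken over the centered column; the description above does not state this explicitly.

```python
import math


def normalize_column(column):
    """Center a regressor, then scale it to unit L2-norm."""
    mean = sum(column) / len(column)
    centered = [v - mean for v in column]
    # Assumption: the norm is computed after centering.
    norm = math.sqrt(sum(v * v for v in centered))
    if norm == 0.0:
        return centered  # constant column: nothing to scale
    return [v / norm for v in centered]


scaled = normalize_column([1, 3, 5, 7, 11, 16, 23])
```

After this step each regressor has zero mean and unit L2-norm, so the penalty treats regressors measured on different scales more evenly.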

**maxIter** is a positive integer indicating the maximum number of iterations. The default value is 1000.

**tolerance** is a floating-point number. Iteration stops when the improvement in the objective function value is smaller than *tolerance*. The default value is 0.0001.

**positive** is a Boolean value indicating whether to force the coefficient estimates to be positive. The default value is false.
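To show how *alpha*, *maxIter*, *tolerance*, and *positive* can interact, here is a minimal Python sketch of coordinate descent, a common solver for lasso. It is an assumption that a solver of this kind is used; the function `lasso_cd` and its helper are hypothetical, and the sketch minimizes the conventional objective `(1/(2n))*||y - b0 - Xw||^2 + alpha*||w||_1` with an unpenalized intercept.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_cd(y, columns, alpha=1.0, max_iter=1000, tolerance=0.0001,
             positive=False):
    """Coordinate descent; columns is a list of regressor columns.

    Returns (intercept, weights).
    """
    n = len(y)
    y_mean = sum(y) / n
    col_means = [sum(c) / n for c in columns]
    # Centering y and X lets the intercept be recovered after the loop.
    yc = [v - y_mean for v in y]
    xc = [[v - m for v in c] for c, m in zip(columns, col_means)]
    weights = [0.0] * len(columns)
    residual = yc[:]  # centered y minus the current centered fit

    def objective():
        rss = sum(r * r for r in residual)
        return rss / (2 * n) + alpha * sum(abs(w) for w in weights)

    previous = objective()
    for _ in range(max_iter):
        for j, col in enumerate(xc):
            norm_sq = sum(v * v for v in col)
            if norm_sq == 0.0:
                continue  # constant column carries no information
            # Correlation of column j with the residual, with column j's
            # own contribution added back.
            rho = sum(v * r for v, r in zip(col, residual))
            rho += weights[j] * norm_sq
            new_w = soft_threshold(rho / n, alpha) / (norm_sq / n)
            if positive and new_w < 0.0:
                new_w = 0.0  # constrain the estimate to be non-negative
            delta = new_w - weights[j]
            if delta != 0.0:
                residual = [r - delta * v for r, v in zip(residual, col)]
                weights[j] = new_w
        current = objective()
        # Stop when a full sweep improves the objective by less than tolerance.
        if previous - current < tolerance:
            break
        previous = current
    intercept = y_mean - sum(w * m for w, m in zip(weights, col_means))
    return intercept, weights


y = [0.1, 4.2, 5.6, 8.8, 22.1, 35.6, 77.2]
x1 = [1, 3, 5, 7, 11, 16, 23]
x2 = [2, 8, 11, 34, 56, 54, 100]
intercept, weights = lasso_cd(y, [x1, x2], alpha=1.0)
```

Because a solver of this kind stops as soon as the improvement falls below *tolerance*, a smaller *tolerance* or a larger *maxIter* can change the trailing digits of the estimates, especially when the regressors are strongly correlated.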

**swColName** is a STRING indicating a column name of *ds*. The specified column is used as the sample weight. If it is not specified, the sample weight is treated as 1.

**checkInput** is a Boolean value that determines whether to validate the input parameters *Y*, *X*, and *swColName*.

-   If *checkInput* = true \(default\), the function checks the parameters for invalid values and throws an error if a null value exists.

-   If *checkInput* = false, invalid values are not checked.


**Note:** It is recommended to specify *checkInput* = true. If it is false, you must ensure that the input parameters contain no invalid values and that no invalid values are generated during intermediate calculations; otherwise, the returned model may be inaccurate.

## Examples {#examples}

```
x1=1 3 5 7 11 16 23
x2=2 8 11 34 56 54 100
y=0.1 4.2 5.6 8.8 22.1 35.6 77.2;

print(lassoBasic(y, (x1,x2), mode = 0));
// output: [-9.133706333069543,2.535935196073186,0.189298948643987]


print(lassoBasic(y, (x1,x2), mode = 1));
/* output:
factor    beta               stdError          tstat              pvalue
--------- ------------------ ----------------- ------------------ -----------------
intercept -9.133706333069543 5.247492365971091 -1.740584968222107 0.156730846105191
x1        2.535935196073186  1.835793667840723 1.38138356205138   0.239309472176311
x2        0.189298948643987  0.410201227095842 0.461478260277749  0.66843504931137
*/


print(lassoBasic(y, (x1,x2), mode = 2));
/* output:
Coefficient->
factor    beta               stdError          tstat              pvalue
--------- ------------------ ----------------- ------------------ -----------------
intercept -9.133706333069543 5.247492365971091 -1.740584968222107 0.156730846105191
x1        2.535935196073186  1.835793667840723 1.38138356205138   0.239309472176311
x2        0.189298948643987  0.410201227095842 0.461478260277749  0.66843504931137

RegressionStat->
item         statistics
------------ -----------------
R2           0.931480447323074
AdjustedR2   0.897220670984611
StdError     8.195817208870076
Observations 7

ANOVA->
Breakdown  DF SS                   MS                   F                  Significance
---------- -- -------------------- -------------------- ------------------ -----------------
Regression 2  4165.242566095043912 2082.621283047521956 31.004574440904473 0.003672076469395
Residual   4  268.685678884843582  67.171419721210895
Total      6  4471.637142857141952

Residual->
[6.319173239708383,4.21150915569809,-0.028258082380245,-6.254004293338318,-7.262321947798779,-6.063400030876729,9.077301958987561]
*/
```

