kernelRidge

First introduced in version: 3.00.3

Syntax

kernelRidge(ds, yColName, xColNames, [alpha=1.0], [kernel='linear'], [gamma=0], [degree=3], [coef0=1], [swColName])

Details

This function combines ridge regression regularization with the kernel trick to model complex nonlinear relationships in the data and outputs a regression prediction model.

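The objective function is:

$$\min_{w} \; \lVert y - \Phi(X)\,w \rVert_2^2 + \alpha \lVert w \rVert_2^2$$

where Φ is the feature mapping induced by the kernel.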

Parameters

ds is an in-memory table or a data source usually generated by the sqlDS function.

yColName is a string indicating the column name of the dependent variable in ds, corresponding to y in the objective function.

xColNames is a string scalar/vector indicating the column names of the independent variables in ds, corresponding to X in the objective function.

alpha (optional) is a positive floating-point number indicating the regularization strength, corresponding to alpha in the objective function. The default value is 1.0.

kernel (optional) is a string indicating the kernel, corresponding to Φ in the objective function. It can take one of the following values:

  • 'linear' (default): the linear kernel <x, y>
  • 'rbf': the RBF (Gaussian) kernel exp(-gamma * ||x - y||²)
  • 'poly'/'polynomial': the polynomial kernel (gamma * <x, y> + coef0)^degree
  • 'sigmoid': the sigmoid kernel tanh(gamma * <x, y> + coef0)
  • 'laplacian': the Laplacian kernel exp(-gamma * ||x - y||₁)
  • 'cosine': the cosine similarity kernel <x, y> / (||x|| * ||y||)
  • 'additive_chi2': the additive chi-squared kernel -∑[(x - y)² / (x + y)]
  • 'chi2': the chi-squared kernel exp(-gamma * ∑[(x - y)² / (x + y)])
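
The kernel formulas listed above can be written out directly. The following is a minimal Python sketch for illustration only, not the function's internal implementation. The function names mirror the kernel values; skipping terms where x + y is 0 in the chi-squared kernels is an assumption made here to avoid division by zero.

```python
import math

def dot(x, y):
    # Inner product <x, y>
    return sum(a * b for a, b in zip(x, y))

def linear(x, y):
    return dot(x, y)

def rbf(x, y, gamma):
    # exp(-gamma * ||x - y||^2)
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def poly(x, y, gamma, coef0=1.0, degree=3):
    # (gamma * <x, y> + coef0)^degree
    return (gamma * dot(x, y) + coef0) ** degree

def sigmoid(x, y, gamma, coef0=1.0):
    # tanh(gamma * <x, y> + coef0)
    return math.tanh(gamma * dot(x, y) + coef0)

def laplacian(x, y, gamma):
    # exp(-gamma * ||x - y||_1)
    return math.exp(-gamma * sum(abs(a - b) for a, b in zip(x, y)))

def cosine(x, y):
    # <x, y> / (||x|| * ||y||)
    return dot(x, y) / (math.sqrt(dot(x, x)) * math.sqrt(dot(y, y)))

def additive_chi2(x, y):
    # -sum((x - y)^2 / (x + y)); terms with x + y == 0 are skipped (assumption)
    return -sum((a - b) ** 2 / (a + b) for a, b in zip(x, y) if a + b != 0)

def chi2(x, y, gamma):
    # exp(-gamma * sum((x - y)^2 / (x + y)))
    return math.exp(gamma * additive_chi2(x, y))

u, v = [1.0, 2.0], [3.0, 1.0]
print(linear(u, v), poly(u, v, gamma=0.5), laplacian(u, v, gamma=0.5))
```

Only 'rbf', 'poly', 'sigmoid', 'laplacian' and 'chi2' take gamma; coef0 appears only in 'poly' and 'sigmoid', and degree only in 'poly'.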

gamma (optional) is a numeric scalar indicating the gamma parameter of the 'rbf', 'poly', 'sigmoid', 'laplacian' and 'chi2' kernels. The default value is 0, which means that gamma is determined automatically from the number of features.
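
The output in the Examples section reports gamma->0.5 for a model fitted on two features with gamma=0, which is consistent with a default of 1 / (number of features). The Python sketch below encodes that rule as an assumption inferred from the example, not as documented behavior; `resolve_gamma` is a hypothetical helper name.

```python
def resolve_gamma(gamma, n_features):
    # Assumption: gamma == 0 means "use 1 / n_features",
    # as suggested by gamma->0.5 for two features in the example output.
    if gamma == 0:
        return 1.0 / n_features
    return float(gamma)

print(resolve_gamma(0, 2))    # two features, automatic gamma
print(resolve_gamma(0.1, 2))  # explicit gamma is kept as given
```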

degree (optional) is a numeric scalar indicating the degree of the polynomial kernel ('poly'). The default value is 3.

coef0 (optional) is a numeric scalar indicating the constant term coef0 in the 'poly' and 'sigmoid' kernels. The default value is 1.

swColName (optional) is a string indicating the name of the column in ds that holds the sample weights. If it is not specified, every sample has a weight of 1.

Returns

A dictionary with the following keys:

  • modelName: A string indicating the model name, which is 'kernelRidge'.
  • coefficients: A numeric vector indicating the parameters, corresponding to the w in the objective function.
  • xFit: A numeric matrix containing the independent variables used to fit the model.
  • predict: The prediction function of the model.
  • The other keys, including xColNames, alpha, kernel, gamma, degree and coef0, correspond to the arguments with the same names.

Examples

x1 = [-1.5, 2.3, 4.2, 1.6];
x2 = [-2.2, 3.9, 2.8, 0.5];
sw = [2, 5, 8, 1];
y = [0.0, 0.9, 3.2, 3.1];
t = table(y, x1, x2, sw);

m = kernelRidge(t, `y, `x1`x2, 1, 'laplacian', 0, 3, 1, `sw);
m
/*
modelName->kernelRidge
coefficients->#0                
------------------
-0.061436450468451
0.09192514209272  
2.716894143137368 
1.428547958639275 

xColNames->["x1","x2"]
xFit->#0                #1  
----------------- ----
-1.5              -2.2
2.3               3.9 
4.200000000000001 2.8 
1.6               0.5 

alpha->1
kernel->laplacian
gamma->0.5
degree->3
coef0->1
predict->kernelRidgePredict
*/
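
The coefficients in the output above can be cross-checked outside DolphinDB. The Python sketch below assumes the weighted dual formulation commonly used for kernel ridge regression, in which the coefficients c solve (K + alpha * diag(1 / sw)) c = y, where K is the kernel matrix of the training rows. That formulation and the helper names (`laplacian`, `solve`) are assumptions made for this illustration, not a description of the internal implementation.

```python
import math

def laplacian(x, y, gamma):
    # Laplacian kernel: exp(-gamma * ||x - y||_1)
    return math.exp(-gamma * sum(abs(a - b) for a, b in zip(x, y)))

def solve(a, b):
    # Solve a * x = b by Gaussian elimination with partial pivoting.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

# Training data from the example; gamma = 0.5 is the value reported in the output.
x_fit = [(-1.5, -2.2), (2.3, 3.9), (4.2, 2.8), (1.6, 0.5)]
y = [0.0, 0.9, 3.2, 3.1]
sw = [2.0, 5.0, 8.0, 1.0]
alpha, gamma = 1.0, 0.5

n = len(y)
# Kernel matrix plus the weighted ridge term alpha / sw[i] on the diagonal.
system = [
    [laplacian(x_fit[i], x_fit[j], gamma) + (alpha / sw[i] if i == j else 0.0)
     for j in range(n)]
    for i in range(n)
]
coefficients = solve(system, y)
print(coefficients)  # compare with the coefficients key in the output above
```

Note that the result has one coefficient per training row, which matches the length of the coefficients vector and the number of rows in xFit.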



x1 = [-1.2, 4.6, 2.4];
x2 = [-1.0, 2.8, -4.7];
t_test = table(x1, x2);
predict(m, t_test);
/*
#0               
-----------------
0.166070925673076
2.34188781136904 
0.095783250622374
*/
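
The predictions above can be reproduced from the returned coefficients and xFit alone. The Python sketch below assumes that each prediction is the kernel-weighted sum of the coefficients over the training rows, that is, the sum of coefficients[i] * k(x_new, xFit[i]). This is an assumption for illustration; `predict_one` is a hypothetical helper, not the DolphinDB predict function.

```python
import math

def laplacian(x, y, gamma):
    # Laplacian kernel: exp(-gamma * ||x - y||_1)
    return math.exp(-gamma * sum(abs(a - b) for a, b in zip(x, y)))

# Values copied from the model dictionary shown in the example output.
x_fit = [(-1.5, -2.2), (2.3, 3.9), (4.2, 2.8), (1.6, 0.5)]
coefficients = [-0.061436450468451, 0.09192514209272,
                2.716894143137368, 1.428547958639275]
gamma = 0.5

def predict_one(x_new):
    # Kernel-weighted sum over the training rows (assumed prediction rule).
    return sum(c * laplacian(x_new, xi, gamma)
               for c, xi in zip(coefficients, x_fit))

x_test = [(-1.2, -1.0), (4.6, 2.8), (2.4, -4.7)]
predictions = [predict_one(x) for x in x_test]
print(predictions)  # compare with the predict(m, t_test) output above
```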