# adaBoostClassifier {#adaboostclassifier}

## Syntax {#syntax}

`adaBoostClassifier(ds, yColName, xColNames, numClasses, [maxFeatures=0], [numTrees=10], [numBins=32], [maxDepth=10], [minImpurityDecrease=0.0], [learningRate=0.1], [algorithm='SAMME.R'], [randomSeed])`

## Arguments {#arguments}

**ds** is the data source(s) containing the training data. It can be generated with the function [sqlDS](../s/sqlDS.md).

**yColName** is a string indicating the name of the category column in the data sources.

**xColNames** is a string scalar/vector indicating the names of the feature columns in the data sources.

**numClasses** is a positive integer indicating the number of categories in the category column. The values in the category column must be integers in \[0, *numClasses*\).
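
The label constraint can be checked before training. The following is a minimal sketch; the table `t` and its values are made up for illustration and are not part of the function's documentation.

```
// hypothetical toy data: labels 0, 1 and 2 for a 3-class problem
t = table(0 0 1 1 2 as cls, 0.1 0.2 0.3 0.4 0.5 as x0)
numClasses = 3
// true only if every label lies in [0, numClasses)
labelsOk = (min(t.cls) >= 0) && (max(t.cls) < numClasses)
```

If `labelsOk` is false, the labels would need to be remapped to 0, 1, ..., *numClasses*-1 before calling `adaBoostClassifier`.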

**maxFeatures** \(optional\) is an integer or a floating-point number indicating the number of features to consider when looking for the best split. The default value is 0.

-   If *maxFeatures* is a positive integer, then *maxFeatures* features are considered at each split.
-   If *maxFeatures* is 0, then `sqrt(the number of feature columns)` features are considered at each split.
-   If *maxFeatures* is a floating-point number between 0 and 1, then `int(maxFeatures * the number of feature columns)` features are considered at each split.
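
The rules above can be worked through by hand. The sketch below assumes a data source with 16 feature columns (a hypothetical count, chosen as a perfect square because this page does not state how a non-integer square root is rounded).

```
// assumed number of feature columns, for illustration only
numFeatureCols = 16
// maxFeatures = 0: sqrt(16) = 4 features per split
nDefault = sqrt(numFeatureCols)
// maxFeatures = 0.5: int(0.5 * 16) = 8 features per split
nFraction = int(0.5 * numFeatureCols)
```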

**numTrees** \(optional\) is a positive integer indicating the number of trees. The default value is 10.

**numBins** \(optional\) is a positive integer indicating the number of bins used when discretizing continuous features. The default value is 32. Increasing *numBins* allows the algorithm to consider more split candidates and make fine-grained split decisions. However, it also increases computation and communication time.

**maxDepth** \(optional\) is a positive integer indicating the maximum depth of a tree. The default value is 10.

**minImpurityDecrease** \(optional\) is a floating-point number. A node is split only if the split decreases the Gini impurity by at least this value. The default value is 0.

**learningRate** \(optional\) is a positive floating-point number indicating the contribution of a classifier to the next classifier. The default value is 0.1.

**algorithm** \(optional\) is a string indicating the algorithm used. It can take the value of either "SAMME.R" or "SAMME". The default value is "SAMME.R".

**randomSeed** \(optional\) is the seed used by the random number generator.
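
As an illustration of how the optional arguments line up with the syntax above, the sketch below passes all of them by position. The data and the chosen values (20 trees, depth 5, learning rate 0.5, seed 42) are arbitrary assumptions, not recommendations.

```
// hypothetical two-class data: x0 is shifted by the class label
n = 100
cls = take(0 1, n)
t = table(cls as cls, norm(0, 1, n) + cls as x0, norm(0, 1, n) as x1)
// positional order: ds, yColName, xColNames, numClasses, maxFeatures, numTrees,
// numBins, maxDepth, minImpurityDecrease, learningRate, algorithm, randomSeed
model = adaBoostClassifier(sqlDS(<select * from t>), `cls, `x0`x1, 2, 0, 20, 32, 5, 0.0, 0.5, "SAMME", 42)
```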

## Details {#details}

Fit an AdaBoost classification model. The result is a dictionary with the following keys: numClasses, minImpurityDecrease, maxDepth, numBins, numTrees, maxFeatures, model, modelName, xColNames, learningRate and algorithm. *model* is a tuple holding the trained trees; *modelName* is "AdaBoost Classifier".
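
Since the result is an ordinary dictionary, it can be inspected like any other. A minimal sketch, using made-up training data:

```
// hypothetical toy data with two classes and two features
n = 20
cls = take(0 1, n)
t = table(cls as cls, norm(0, 1, n) + cls as x0, norm(0, 1, n) as x1)
model = adaBoostClassifier(sqlDS(<select * from t>), `cls, `x0`x1, 2)
// list the keys stored in the result
model.keys()
// look up individual entries by key
name = model["modelName"]
k = model["numClasses"]
```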

The fitted model can be used as an input to the function [predict](../p/predict.md).

## Examples {#examples}

Fit an AdaBoost classification model with simulated data:

```
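// empty table with a label column and two feature columns
t = table(100:0, `cls`x0`x1, [INT,DOUBLE,DOUBLE])
n=5
// class 0: features drawn from a normal distribution with mean 0 and standard deviation 10
cls = take(0, n)
x0 = norm(0, 10, n)
x1 = norm(0, 10, n)
insert into t values (cls, x0, x1)
// class 1: features drawn from a normal distribution with mean 1 and standard deviation 10
cls = take(1, n)
x0 = norm(1, 10, n)
x1 = norm(1, 10, n)
insert into t values (cls, x0, x1)
// train on the whole table; the labels take 2 distinct values
model = adaBoostClassifier(sqlDS(<select * from t>), `cls, `x0`x1, 2);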
```

Use the fitted model for prediction:

```
t1 = table(-0.5 0 1 2 as x0, -2 0 1 3 as x1)
predict(model, t1);
```

Save the fitted model to disk and load it back:

```
saveModel(model, "C:/DolphinDB/data/classifierModel.bin");
// assign the loaded model so that it can be passed to predict
loadedModel = loadModel("C:/DolphinDB/data/classifierModel.bin");
```

Related functions: [adaBoostRegressor](adaBoostRegressor.md), [randomForestClassifier](../r/randomForestClassifier.md), [randomForestRegressor](../r/randomForestRegressor.md)

