lgbm
The DolphinDB lgbm plugin is used for efficient classification and regression based on the LightGBM library. Key features include:
- Fast training: Train models using the LightGBM algorithm.
- Prediction: Make predictions on new data using trained models.
- Model saving and loading: Save trained models for future prediction, eliminating the need for retraining.
Installation (with installPlugin)
Required server version: DolphinDB 2.00.13 or higher
OS: Linux x86-64.
Installation Steps:
(1) Use listRemotePlugins to check plugin information in the plugin repository.
Note: Plugins not included in the returned list can be installed from precompiled binaries or compiled from source. These files are available in our GitHub repository; switch to the branch that matches your server version.
login("admin", "123456")
listRemotePlugins()
(2) Use installPlugin to install the plugin.
installPlugin("lgbm")
(3) Use loadPlugin to load the plugin before using the plugin methods.
loadPlugin("lgbm")
Method References
train
Syntax
train(X,Y,[numIteration=100],[params])
Details
Perform regression training on the lgbm model using the specified dataset. Return a trained lgbm model.
Parameters
- X: A table with numeric columns indicating the input features. Rows with null values are excluded from training.
- Y: A numeric vector indicating the label column. Rows in X with corresponding nulls in Y are excluded from training.
- numIteration (optional): A non-negative integer indicating the number of boosting iterations used for training. The default value is 100.
- params (optional): A dictionary used for parameter configuration. For details, refer to Parameters. Common parameters include:
- objective: A STRING scalar indicating the name of the objective function. The default value is 'regression'.
- boosting: A STRING scalar indicating the boosting algorithm of the training. Currently only 'gbdt' is supported.
- learning_rate: A floating-point number indicating the learning rate. The default value is 0.1.
- min_data_in_leaf: A non-negative integer indicating the minimal number of data points in a leaf node. The default value is 20.
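The following is a minimal sketch of a train call that sets the common parameters listed above. The data is synthetic and the chosen values (200 iterations, a learning rate of 0.05) are illustrative assumptions, not recommendations. Parameter values are passed as strings, mirroring the Usage Examples section; whether numeric values are also accepted is not stated in this document.

```dolphindb
loadPlugin("lgbm")

// Synthetic regression data (illustrative only)
x1 = rand(10.0, 200)
x2 = rand(10.0, 200)
X = table(x1, x2)
Y = 2*x1 + 3*x2

// Common parameters from the list above; values are strings,
// following the convention used in the Usage Examples section (assumption)
params = {objective:"regression", boosting:"gbdt", learning_rate:"0.05", min_data_in_leaf:"5"}

// Train for 200 iterations instead of the default 100
model = lgbm::train(X, Y, 200, params)
```

A smaller learning rate is usually paired with more iterations, which is why numIteration is raised above its default in this sketch.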
predict
Syntax
predict(model, X)
Details
Perform a regression prediction on the input feature set. Return a vector indicating the prediction results.
Parameters
- model: A trained lgbm model.
- X: A table indicating the input feature set. Rows with null values lead to unreliable prediction results.
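Because rows with null values lead to unreliable predictions, it may help to drop them before calling predict. The sketch below assumes a two-column feature table with the hypothetical column names x1 and x2 and uses the built-in isValid function to keep only complete rows. The predict call is left commented out because it requires a previously trained model.

```dolphindb
// Hypothetical feature table; the second row has a null in x1
x1 = 1.5 NULL 3.2 4.8
x2 = 2.0 0.7 5.1 1.1
X = table(x1, x2)

// Keep only rows in which every feature is non-null
Xclean = select * from X where isValid(x1), isValid(x2)

// model is assumed to be an lgbm model returned by lgbm::train or lgbm::loadModel
// pred = lgbm::predict(model, Xclean)
```

Note that filtering changes the row count, so any label vector compared against the predictions has to be filtered with the same condition.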
saveModel
Syntax
saveModel(model, path)
Details
Save the trained model to a file.
Parameters
- model: The lgbm model to be saved.
- path: A STRING scalar indicating the path to save the model, e.g., "XXX/model.txt".
loadModel
Syntax
loadModel(path)
Details
Load an lgbm model from a file.
Parameters
- path: A STRING scalar indicating the path of the model file, e.g., "XXX/model.txt".
Usage Examples
loadPlugin("lgbm")
// Create a dataset for training
x1=rand(10,100)
x2=rand(10,100)
Y=2*x1+3*x2
X=table(x1,x2)
// Set model parameters
num_iteration=500
params = {task:"train",min_data_in_leaf:"5"}
// Train the lgbm model
model=lgbm::train(X,Y,num_iteration,params);
// Create a dataset for prediction
x1=rand(10,10)
x2=rand(10,10)
X=table(x1,x2)
Y=2*x1+3*x2
// Make predictions on new data using the lgbm model
pred=lgbm::predict(model,X);
// Calculate the goodness of fit (R-squared)
fitGoodness = 1 - (pred - Y).sum2() / (Y - Y.avg()).sum2()
// Save the model
path="/path/to/model.txt";
lgbm::saveModel(model, path)
// Load the model
model1=lgbm::loadModel(path);
// Make predictions using the loaded model
pred=lgbm::predict(model1, X);
// Calculate the goodness of fit (R-squared)
fitGoodness = 1 - (pred - Y).sum2() / (Y - Y.avg()).sum2()
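The fitGoodness expression above is the coefficient of determination: one minus the ratio of the residual sum of squares to the total sum of squares. Since the example computes it twice, it could be wrapped in a small helper. The function name rSquared is a hypothetical choice for this sketch and is not part of the plugin.

```dolphindb
// Hypothetical helper, not part of the lgbm plugin:
// 1 - (residual sum of squares) / (total sum of squares)
def rSquared(pred, actual) {
    return 1 - sum2(pred - actual) / sum2(actual - avg(actual))
}

// A perfect prediction has zero residuals, so the result is 1
perfect = rSquared(1.0 2.0 3.0, 1.0 2.0 3.0)

// Predicting the mean of the labels everywhere gives 0
baseline = rSquared(2.0 2.0 2.0, 1.0 2.0 3.0)
```

With this helper, the two calculations in the example reduce to rSquared(pred, Y). Values close to 1 indicate that the predictions explain most of the variance in Y.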