gmm
Syntax
gmm(X, k, [maxIter=300], [tolerance=1e-4], [randomSeed], [mean], [sigma])
Arguments
X is the training data set. For univariate data, X is a vector; for multivariate data, X is a matrix/table where each column is a sample.
k is an integer indicating the number of independent Gaussians in a mixture model.
maxIter (optional) is a positive integer indicating the maximum number of EM iterations to perform. The default value is 300.
tolerance (optional) is a floating-point number indicating the convergence tolerance. EM iterations stop when the average gain of the lower bound falls below this threshold. The default value is 1e-4.
randomSeed (optional) is the random seed given to the method.
mean (optional) is the initial means of the submodels.
- For univariate data, it is a vector of length k;
- For multivariate data, it is a matrix with k columns and as many rows as there are variables in X;
- If mean is unspecified, k values are randomly selected from X as the initial means.
sigma (optional) is the initial variance or covariance of the submodels. It is
- a vector indicating the initial variance of each submodel if X is univariate data;
- a tuple of length k indicating the initial covariance matrix of each submodel if X is multivariate data;
- a vector with element values of 1 or an identity matrix if sigma is unspecified.
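As a minimal sketch of how the optional arguments line up, the following univariate call supplies all of them, including sigma. The names initMean and initSigma and their values are illustrative assumptions, not recommended settings.

```
// Univariate training data: a plain vector
x = 6.8 7.2 5.3 9.4 6.5 11.2 25.6 0.6 8.9 4.3 2.2 1.9 8.7 0.2 1.5

// Hypothetical starting points: one mean and one variance per submodel (k = 2)
initMean = [2.0, 10.0]
initSigma = [4.0, 4.0]

// Arguments in positional order: X, k, maxIter, tolerance, randomSeed, mean, sigma
model = gmm(x, 2, 300, 1e-4, 42, initMean, initSigma)
model
```

Both initMean and initSigma have length k, as required for univariate data.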
Details
Trains a Gaussian Mixture Model (GMM) on the given data set and returns a dictionary with the following keys:
- modelName: the string "Gaussian Mixture Model"
- prior: the prior probability of each submodel
- mean: the expectation of each submodel
- sigma: if X is univariate data, the variance of each submodel; if X is multivariate data, the covariance matrix of each submodel.
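The returned keys can be combined to score a new observation. The sketch below is one way to do this for univariate data, assuming sigma holds variances as described above. The names model, newPoint, density and posterior are illustrative; the density formula is the standard Gaussian one, not a helper provided by gmm.

```
// Fit a two-component model on a univariate sample
x = 6.8 7.2 5.3 9.4 6.5 11.2 25.6 0.6 8.9 4.3 2.2 1.9 8.7 0.2 1.5
model = gmm(x, 2)

prior = model["prior"]       // mixing weight of each submodel
mu = model["mean"]           // mean of each submodel
variance = model["sigma"]    // variance of each submodel (univariate case)

// Gaussian density of a new point under each submodel
newPoint = 7.0
twoPi = 2 * 3.141592653589793
density = exp(-pow(newPoint - mu, 2) \ (2 * variance)) \ sqrt(twoPi * variance)

// Posterior probability that newPoint belongs to each submodel
posterior = (prior * density) \ sum(prior * density)
posterior
```

The elements of posterior sum to 1, and the largest one identifies the most likely submodel for newPoint.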
Examples
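// Univariate data: X is a vector
dataT = 6.8 7.2 5.3 9.4 6.5 11.2 25.6 0.6 8.9 4.3 2.2 1.9 8.7 0.2 1.5
// Both initial means are identical, so the two submodels stay identical
// and both converge to the sample mean, as the output shows
mean = [2, 2]
re = gmm(dataT, 2, 300, 1e-4, 42, mean)
re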
/* output:
sigma->[36.759822,36.759822]
modelName->Gaussian Mixture Model
prior->[0.5,0.5]
mean->[6.686667,6.686667]
*/
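// Multivariate data: 2 variables (rows) and 10 samples (columns)
dataT = transpose(matrix(3.2 1.5 2.6 7.8 6.3 4.2 5.1 8.9 11.2 25.8, 25.6 4.6 8.9 4.3 2.2 1.9 8.7 0.2 1.5 9.3))
// Initial means: a 2 x 2 matrix with one column per submodel
mean = transpose(matrix([1, 0], [0, 1]))
re = gmm(dataT, 2, 300, 1e-4, 42, mean)
re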
/* output:
sigma->(#0 #1
51.001369 18.273032
18.273032 9.34789
,#0 #1
1.718475 0.629584
0.629584 67.713701
)
modelName->Gaussian Mixture Model
prior->[0.558683,0.441317]
mean->
#0 #1
11.152841 3.238262
3.341493 10.996997
*/