homework7_ZhankunLuo
Homework 7
Zhankun Luo
PUID: 0031195279
Email: [email protected]
Fall-2018-ECE-59500-009
Instructor: Toma Hentea
文章目录
Function
plot_cluster.m
function [] = plot_cluster(X2, bel, theta)
plot(X2(1,bel==1),X2(2,bel==1),'r.',...
X2(1,bel==2),X2(2,bel==2),'g*',X2(1,bel==3),X2(2,bel==3),'bo',...
X2(1,bel==4),X2(2,bel==4),'cx',X2(1,bel==5),X2(2,bel==5),'md',...
X2(1,bel==6),X2(2,bel==6),'yp',X2(1,bel==7),X2(2,bel==7),'ks')
hold on, plot(theta(1,:), theta(2,:),'k+')
end
k_means.m
Hard Clustering: k-means
Where
because is not related to
We have
u_{ij}(t - 1) = (bel==j) % 1xN vector
\sum u_{ij}(t - 1) * x_i = sum(X'.*((bel==j)'*ones(1,l))) % 1xm vector
\sum u_{ij} = sum(bel==j)
Thus
% in function k_means.m
theta(:,j)=sum(X'.*((bel==j)'*ones(1,l))) / sum(bel==j);
drawbacks
- Different initial partitions may lead k-means to produces different final clusterings, each one corresponding to a different local minimum.
- Knowledge of the number of clusters m is required a priori.
- k-means is sensitive to outliers and noise.
- k-means is not suitable for data with nominal (categorical) coordinates.
code
function [theta,bel,J]=k_means(X,theta)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% FUNCTION
% [theta,bel,J]=k_means(X,theta)
% This function implements the k-means algorithm, which requires
% as input the number of clusters underlying the data set. The algorithm
% starts with an initial estimation of the cluster representatives and
% iteratively tries to move them into regions that are "dense" in data
% vectors, so that a suitable cost function is minimized. This is
% achieved by performing (usually) a few passes on the data set. The
% algorithm terminates when the values of the cluster representatives
% remain unaltered between two successive iterations.
%
% INPUT ARGUMENTS:
% X: lxN matrix, each column of which corresponds to
% an l-dimensional data vector.
% theta: a matrix, whose columns contain the l-dimensional (mean)
% representatives of the clusters.
%
% OUTPUT ARGUMENTS:
% theta: a matrix, whose columns contain the final estimations of
% the representatives of the clusters.
% bel: N-dimensional vector, whose i-th element contains the
% cluster label for the i-th data vector.
% J: the value of the cost function (sum of squared Euclidean
% distances of each data vector from its closest parameter
% vector) that corresponds to the estimated clustering.
%
% (c) 2010 S. Theodoridis, A. Pikrakis, K. Koutroumbas, D. Cavouras
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
[l,N]=size(X);
[l,m]=size(theta);
e=1;
iter=0;
while(e~=0)
iter=iter+1;
theta_old=theta;
dist_all=[];
for j=1:m
dist=sum(((ones(N,1)*theta(:,j)'-X').^2)'); % sum of l dimensions
dist_all=[dist_all; dist];
end
[q1,bel]=min(dist_all); % get min value of every col of dist_all: mxN
J=sum(min(dist_all));
% bel =
% 1 2 1
% (bel==1)=
% 1 0 1 % choose 1st class
for j=1:m
if(sum(bel==j)~=0)
theta(:,j)=sum(X'.*((bel==j)'*ones(1,l))) / sum(bel==j);
end
end
e=sum(sum(abs(theta-theta_old)));
end
fuzzy_c_means.m
Fuzzy Clustering
In fcm.m, define
because is not related to
code
function [theta,U,obj_fun]=fuzzy_c_means(X,m,q)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% FUNCTION
% [w, U, obj_fun]=fuzzy_c_means(X,m,q)
% This function applies the FCM algorithm by calling the corresponding
% MATLAB function, called "fcm".
%
% INPUT ARGUMENTS:
% X: lxN matrix, each column of which corresponds to
% an l-dimensional data vector.
% m: the number of clusters.
% q: the fuzzifier.
%
% OUTPUT ARGUMENTS:
% theta: lxm matrix, each column of which corresponds to
% a cluster representative, after the convergence of the
% algorithm.
% U: Nxm matrix containing in its i-th row
% the ``grade of membership'' of the data vector xi to the m
% clusters.
% obj_fun: a vector, whose t-th coordinate is the value of the cost
% function, J, for the clustering produced at the t-th teration.
%
% (c) 2010 S. Theodoridis, A. Pikrakis, K. Koutroumbas, D. Cavouras
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
X=X';
options(1)=q;
% options(1): Exponent for the partition matrix U. Default: 2.0
% options(2): Maximum number of iterations. Default: 100
% options(3): Minimum amount of improvement. Default: 1e-5
% options(4): Information displayed during iteration. Default: 1.
[theta,U,obj_fun] = fcm(X,m,options);
theta=theta';
U=U';
where function: [center, U, obj_fcn] = fcm(data, cluster_n)
[center, U, obj_fcn] = fcm(data, cluster_n)
- applies the fuzzy c-means clustering method to a given data set.
The input arguments of this function are
-
data:
data set to be clustered;
each row is a sample data point
-
cluster_n:
number of clusters (greater than one)
The output arguments of this function are
-
center:
matrix of final cluster centers where each row provides the center coordinates
-
U:
final fuzzy partition matrix (or membership function matrix)
-
obj_fcn:
values of the objective function during iterations
Exercise
Exercise 7.5.1
-
Generate and plot a data set , that consists of 300 2-dimensional stemming from the normal distribution with mean and covariance matrix equal to the 2 x 2 identity matrix.
repeat step I of Example 7.4.2
randn('seed', 0); m = [0 0; 4 0; 0 4; 5 4]; S(:,:,1) = eye(2); S(:,:,2) = [1.0 .2; .2 1.5]; S(:,:,3) = [1.0 .4; .4 1.1]; S(:,:,4) = [.3 .2; .2 .5]; n_points = 300 * ones(1,4); % Number of points per group
-
Apply the k-means algorithm for and random initialization if 's, and draw conclusions.
exercise_751.m
% Exercise 7.5.1
% NOTE: this example generates a total of six figures.
close('all');
clear
% 1. To generate and plot X2, work as in Example 7.4.2, where now the means
% of the Gaussians are different. The plot shows that X3 contains four
% clearly separated compact clusters.
randn('seed', 0);
m = [0 0; 4 0; 0 4; 5 4];
S(:,:,1) = eye(2);
S(:,:,2) = [1.0 .2; .2 1.5];
S(:,:,3) = [1.0 .4; .4 1.1];
S(:,:,4) = [.3 .2; .2 .5];
n_points = 300 * ones(1,4); % Number of points per group
X2=[];
for i=1:4
X2=[X2; mvnrnd(m(i,:),S(:,:,i),n_points(i))];
end
X2=X2'; % X3: lxN=2x1200
figure(1), plot(X2(1,:),X2(2,:),' b.')
figure(1), axis equal
% 2. To apply the k-means algorithm, for m = 4 and random initialization
m = 4;
[l, N] = size(X2);
rand('seed', 0)
theta_ini = rand(l, m);
[theta, bel, J] = k_means(X2, theta_ini);
% Plot X3, using different colors for points from different clusters
figure(2), plot_cluster(X2, bel, theta)
figure(2), axis equal
% 3. Work as in step 2, for m = 2
m = 2;
[l, N] = size(X2);
rand('seed', 0)
theta_ini = rand(l, m);
[theta, bel, J] = k_means(X2, theta_ini);
figure(3), plot_cluster(X2, bel, theta)
figure(3), axis equal
% 4. Work as in step 2, for m = 3
m = 3;
[l, N]=size(X2);
rand('seed', 0)
theta_ini = rand(l, m);
[theta,bel,J] = k_means(X2, theta_ini);
figure(4), plot_cluster(X2, bel, theta)
figure(4), axis equal
% 5. Work as in step 2 with different theta_ini set1
m = 4;
[l, N] = size(X2);
theta_ini = [-2 -2; -2.1 -2.1; -2 -2.2; -2.1 -2.2]';
[theta, bel, J] = k_means(X2, theta_ini);
figure(5), plot_cluster(X2, bel, theta)
figure(5), axis equal
% 6. Work as in step 2 with different theta_ini set2
m = 4;
[l,N] = size(X2);
rand('seed', 0)
theta_ini = rand(l, m);
theta_ini(:, m) = [5 5];
[theta, bel, J] = k_means(X2, theta_ini);
figure(6), plot_cluster(X2, bel, theta)
figure(6), axis equal
Result
- original data
- m = 4, theta_ini = rand(l, m)
- m = 2, theta_ini = rand(l, m)
- m = 3, theta_ini = rand(l, m)
- m = 4, theta_ini = [-2 -2; -2.1 -2.1; -2 -2.2; -2.1 -2.2]’
- m = 4, theta_ini = rand(l, m), theta_ini(:, m) = [12 12];
Conclusion
-
When , divisions of clusters would be different
So, for different , it could get different divisions.
-
Sometimes when are not set properly,
Division would be trapped in local best value,
Thus, it would not get global best value ( which is unique for = certain value)
Exercise 7.5.9
- Apply the FCM on the data set generated in Example 7.5.1 for and .
- Define and plot the three corresponding hard clustering, as discussed, as discussed previously.
- Compare the parameters and the 's for the three cases and draw conclusions.
exercise_759.m
% Exercise 7.5.9
% NOTE: this example generates a total of 4 figures: 1 original + 3 cases
close('all');
clear
% 1. To generate and plot X2, work as in Example 7.4.2, where now the means
% of the Gaussians are different. The plot shows that X3 contains four
% clearly separated compact clusters.
randn('seed', 0);
mu = [0 0; 4 0; 0 4; 5 4];
S(:,:,1) = eye(2);
S(:,:,2) = [1.0 .2; .2 1.5];
S(:,:,3) = [1.0 .4; .4 1.1];
S(:,:,4) = [.3 .2; .2 .5];
n_points = 300 * ones(1,4); % Number of points per group
m = 4; % cluster number
X2=[];
for i=1:4
X2=[X2; mvnrnd(mu(i,:),S(:,:,i),n_points(i))];
end
X2=X2'; % X3: lxN=2x1200
[l, N] = size(X2); % l: deminsion N: point number
rand('seed', 0)
theta_ini = rand(l, m);
figure(1), plot(X2(1,:),X2(2,:),' b.')
figure(1), axis equal
% 2. To apply the k-means algorithm, for q = 2 and random initialization
q = 2;
[theta, U, obj_fun] = fuzzy_c_means(X2, m, q);
[U_max, bel] = max(U');
% Plot X3, using different colors for points from different clusters
figure(2), hold on, plot_cluster(X2, bel, theta)
figure(2), axis equal
% 3. Work as in step 2, for q = 10
q = 10;
[theta, U, obj_fun] = fuzzy_c_means(X2, m, q);
[U_max, bel] = max(U');
figure(3), hold on, plot_cluster(X2, bel, theta)
figure(3), axis equal
% 4. Work as in step 2, for q = 25
q = 25;
[theta, U, obj_fun] = fuzzy_c_means(X2, m, q);
[U_max, bel] = max(U');
figure(4), hold on, plot_cluster(X2, bel, theta)
figure(4), axis equal
Result
- original data
- q = 2
- q = 10
- q = 25
>> exercise_759
% q = 2
Iteration count = 1, obj. fcn = 4374.311623
Iteration count = 2, obj. fcn = 3344.540096
Iteration count = 3, obj. fcn = 3335.790961
Iteration count = 4, obj. fcn = 3304.365018
Iteration count = 5, obj. fcn = 3195.008604
Iteration count = 6, obj. fcn = 2884.099872
Iteration count = 7, obj. fcn = 2364.754387
Iteration count = 8, obj. fcn = 2037.567145
Iteration count = 9, obj. fcn = 1834.607024
Iteration count = 10, obj. fcn = 1606.009571
Iteration count = 11, obj. fcn = 1483.160811
Iteration count = 12, obj. fcn = 1454.601969
Iteration count = 13, obj. fcn = 1450.174378
Iteration count = 14, obj. fcn = 1449.552804
Iteration count = 15, obj. fcn = 1449.463887
Iteration count = 16, obj. fcn = 1449.450440
Iteration count = 17, obj. fcn = 1449.448266
Iteration count = 18, obj. fcn = 1449.447888
Iteration count = 19, obj. fcn = 1449.447816
Iteration count = 20, obj. fcn = 1449.447801
Iteration count = 21, obj. fcn = 1449.447798
% q = 10
Iteration count = 1, obj. fcn = 14.151425
Iteration count = 2, obj. fcn = 0.051120
Iteration count = 3, obj. fcn = 0.050908
Iteration count = 4, obj. fcn = 0.050800
Iteration count = 5, obj. fcn = 0.050680
Iteration count = 6, obj. fcn = 0.050538
Iteration count = 7, obj. fcn = 0.050373
Iteration count = 8, obj. fcn = 0.050195
Iteration count = 9, obj. fcn = 0.049981
Iteration count = 10, obj. fcn = 0.049709
Iteration count = 11, obj. fcn = 0.049374
Iteration count = 12, obj. fcn = 0.048968
Iteration count = 13, obj. fcn = 0.048510
Iteration count = 14, obj. fcn = 0.048036
Iteration count = 15, obj. fcn = 0.047543
Iteration count = 16, obj. fcn = 0.047002
Iteration count = 17, obj. fcn = 0.046502
Iteration count = 18, obj. fcn = 0.046041
Iteration count = 19, obj. fcn = 0.045677
Iteration count = 20, obj. fcn = 0.045404
Iteration count = 21, obj. fcn = 0.045249
Iteration count = 22, obj. fcn = 0.045160
Iteration count = 23, obj. fcn = 0.045097
Iteration count = 24, obj. fcn = 0.045053
Iteration count = 25, obj. fcn = 0.045025
Iteration count = 26, obj. fcn = 0.045009
Iteration count = 27, obj. fcn = 0.044999
% q = 25
Iteration count = 1, obj. fcn = 1.678658
Iteration count = 2, obj. fcn = 0.000000
Iteration count = 3, obj. fcn = 0.000000
Conclusion
-
When , divisions of clusters are different
So, for different , it could get different divisions.
-
When , its division has great difference to the original division
So, we need choose proper to certain clustering problems.