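MATLAB code for a K-means algorithm with cluster-center optimization based on the distribution density of a multidimensional sample space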
Date: 2023-07-15 10:14:52
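The following MATLAB code is a K-means variant that uses the distribution density of samples in a multidimensional space, provided for reference: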
```matlab
function [label, center, obj_fcn] = kmeans_density(X, k)
%KMEANS_DENSITY K-means clustering based on density distribution.
% [LABEL, CENTER, OBJ_FCN] = KMEANS_DENSITY(X, K) partitions the points in the N-by-P
% data matrix X into K clusters. Rows of X correspond to points, columns correspond
% to variables. KMEANS_DENSITY returns an N-by-1 vector LABEL containing the cluster
% indices of each point, a K-by-P matrix CENTER containing the coordinates of each
% cluster center, and the objective function value OBJ_FCN of the final partition.
%
% KMEANS_DENSITY does not handle missing data. Remove rows of X that contain NaNs
% before calling it.
%
% Example:
%
% % Generate some data points.
% X = [randn(100,2)*0.75+ones(100,2);
% randn(100,2)*0.5-ones(100,2)];
%
% % Cluster the data points using K-means clustering based on density distribution.
% [label, center] = kmeans_density(X, 2);
%
% % Plot the clustering result.
% figure;
% plot(X(label==1,1), X(label==1,2), 'r.', 'MarkerSize', 12);
% hold on;
% plot(X(label==2,1), X(label==2,2), 'b.', 'MarkerSize', 12);
% plot(center(:,1), center(:,2), 'kx', 'MarkerSize', 15, 'LineWidth', 3);
% legend('Cluster 1', 'Cluster 2', 'Centroids', 'Location', 'NW');
% title('K-means Clustering based on Density Distribution');
% hold off;
%
% See also KMEANS, KMEANS_DENSITY_D, KMEANS_DENSITY_N.
% Copyright (c) 2013, Guangdi Li
% Copyright (c) 2014, Guangdi Li
% All rights reserved.
%
% Redistribution and use in source and binary forms, with or without modification,
% are permitted provided that the following conditions are met:
%
% 1. Redistributions of source code must retain the above copyright notice,
% this list of conditions and the following disclaimer.
%
% 2. Redistributions in binary form must reproduce the above copyright notice,
% this list of conditions and the following disclaimer in the documentation
% and/or other materials provided with the distribution.
%
% THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
% ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
% WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
% IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
% INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
% BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA,
% OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
% WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
% ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY
% OF SUCH DAMAGE.
% Author: Guangdi Li, 2013-12-28
% Check inputs.
narginchk(2,2);
% n points in p-dimension space
[n, p] = size(X);
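% Initialize some variables.
label = zeros(n,1);
% LAST must differ from LABEL here, otherwise the loop body would never run.
last = -ones(n,1);
% Pick k distinct rows of X as initial centers (randperm is in base MATLAB).
center = X(randperm(n,k),:);
obj_fcn = Inf;
% Iterate until the labels stop changing.
while any(label ~= last)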
% Save the last labels.
last = label;
% Compute the distance from each point to each center.
distance = pdist2(X,center);
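% Compute the density distribution of each point with a Gaussian kernel.
% std(distance(:)) equals std2(distance) without the Image Processing Toolbox;
% eps guards against a zero bandwidth.
sigma = max(std(distance(:)), eps);
density = sum(exp(-distance.^2/(2*sigma^2)),2);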
% Assign points to the nearest cluster.
[~,label] = min(distance,[],2);
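% Reassign empty clusters: move the densest point of a cluster that has more
% than one member into each empty cluster (labels are never 0 at this point).
counts = accumarray(label,1,[k 1]);
for j = find(counts==0)'
[~,i] = max(density.*(counts(label)>1));
counts(label(i)) = counts(label(i))-1;
label(i) = j;
counts(j) = 1;
end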
% Compute the new centers.
for j = 1:k
center(j,:) = mean(X(label==j,:),1);
end
% Compute the objective function.
obj_fcn = sum(sum((X - center(label,:)).^2));
end
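% Sort the centers by density distribution and remap the labels so that
% label == j still refers to row j of CENTER.
[~,order] = sort(sum(exp(-distance.^2/(2*std(distance(:))^2)),1),'descend');
center = center(order,:);
newIndex = zeros(k,1);
newIndex(order) = 1:k;   % newIndex(j) is the new row of old cluster j
label = newIndex(label);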
end
```
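Here `X` is the $n \times p$ data matrix and `k` is the number of clusters. The function returns the cluster labels `label`, the cluster centers `center`, and the objective function value `obj_fcn`, which is the sum of squared distances from each point to its assigned center. Note that `pdist2` requires the Statistics and Machine Learning Toolbox.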
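
The function above starts from randomly chosen rows of `X`. As a rough illustration of how a density-driven choice of initial centers might look, the script below picks the densest point first and then repeatedly picks the point that maximizes density times distance to the nearest center already chosen. This is a sketch under stated assumptions, not the method of the original author: the names `center0`, `idx`, and `dmin`, the bandwidth heuristic (mean pairwise distance), and the selection rule are all hypothetical. It uses base MATLAB only and needs R2016b or later for implicit expansion.

```matlab
% Sketch: density-based selection of k initial centers (an assumption,
% not part of the original listing, which picks initial centers at random).
rng(0);                                  % fixed seed for reproducibility
X = [randn(100,2)*0.75 + 1;              % same toy data as in the help text
     randn(100,2)*0.5  - 1];
k = 2;

% Pairwise squared Euclidean distances using base MATLAB only.
sq = sum(X.^2, 2);
D2 = max(sq + sq' - 2*(X*X'), 0);        % clamp tiny negative round-off to 0
D  = sqrt(D2);

% Gaussian-kernel density of every point.
% Bandwidth: mean pairwise distance (a heuristic chosen for this sketch).
sigma   = mean(D(:));
density = sum(exp(-D2 / (2*sigma^2)), 2);

% First center: the densest point. Each further center: the point that
% maximizes density times distance to the nearest center chosen so far.
idx = zeros(k,1);
[~, idx(1)] = max(density);
for j = 2:k
    dmin = min(D(:, idx(1:j-1)), [], 2); % distance to nearest chosen center
    [~, idx(j)] = max(density .* dmin);
end
center0 = X(idx, :);                     % k-by-p matrix of initial centers
```

To try it with the listing, one could replace the random initialization by `center = center0;`, for example by passing `center0` in as an extra (hypothetical) input argument.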