模式识别实验:用MATLAB设计贝叶斯决策算法,并通过鸢尾花数据集iris.csv文件中的每一类的前40条记录作为训练数据,剩余的10条记录作为测试数据,最后验证该算法的有效性
时间: 2023-12-10 15:41:33 浏览: 149
iris (Iris 数据集)
好的,以下是实现该模式识别实验的步骤:
1. 导入数据集iris.csv文件,将每一类的前40条记录作为训练数据,剩余的10条记录作为测试数据。
```matlab
% 导入数据集
data = csvread('iris.csv', 1, 0);
% 将每一类的前40条记录作为训练数据
train_data = [data(1:40,:); data(51:90,:); data(101:140,:)];
train_label = [ones(40,1); 2*ones(40,1); 3*ones(40,1)];
% 剩余的10条记录作为测试数据
test_data = [data(41:50,:); data(91:100,:); data(141:150,:)];
test_label = [ones(10,1); 2*ones(10,1); 3*ones(10,1)];
```
2. 使用贝叶斯决策算法进行分类。
```matlab
% 计算每个类别的先验概率
p1 = sum(train_label==1) / length(train_label);
p2 = sum(train_label==2) / length(train_label);
p3 = sum(train_label==3) / length(train_label);
% 计算每个类别的均值和协方差矩阵
mu1 = mean(train_data(train_label==1,:));
mu2 = mean(train_data(train_label==2,:));
mu3 = mean(train_data(train_label==3,:));
sigma1 = cov(train_data(train_label==1,:));
sigma2 = cov(train_data(train_label==2,:));
sigma3 = cov(train_data(train_label==3,:));
% 对测试数据进行分类
for i = 1:size(test_data,1)
x = test_data(i,:);
p_x1 = mvnpdf(x, mu1, sigma1);
p_x2 = mvnpdf(x, mu2, sigma2);
p_x3 = mvnpdf(x, mu3, sigma3);
if p_x1*p1 > p_x2*p2 && p_x1*p1 > p_x3*p3
pred_label(i) = 1;
elseif p_x2*p2 > p_x1*p1 && p_x2*p2 > p_x3*p3
pred_label(i) = 2;
else
pred_label(i) = 3;
end
end
```
3. 计算分类准确率。
```matlab
accuracy = sum(pred_label'==test_label) / length(test_label);
fprintf('Classification accuracy: %.2f%%\n', accuracy*100);
```
完整代码如下:
```matlab
% 导入数据集
data = csvread('iris.csv', 1, 0);
% 将每一类的前40条记录作为训练数据
train_data = [data(1:40,:); data(51:90,:); data(101:140,:)];
train_label = [ones(40,1); 2*ones(40,1); 3*ones(40,1)];
% 剩余的10条记录作为测试数据
test_data = [data(41:50,:); data(91:100,:); data(141:150,:)];
test_label = [ones(10,1); 2*ones(10,1); 3*ones(10,1)];
% 计算每个类别的先验概率
p1 = sum(train_label==1) / length(train_label);
p2 = sum(train_label==2) / length(train_label);
p3 = sum(train_label==3) / length(train_label);
% 计算每个类别的均值和协方差矩阵
mu1 = mean(train_data(train_label==1,:));
mu2 = mean(train_data(train_label==2,:));
mu3 = mean(train_data(train_label==3,:));
sigma1 = cov(train_data(train_label==1,:));
sigma2 = cov(train_data(train_label==2,:));
sigma3 = cov(train_data(train_label==3,:));
% 对测试数据进行分类
for i = 1:size(test_data,1)
x = test_data(i,:);
p_x1 = mvnpdf(x, mu1, sigma1);
p_x2 = mvnpdf(x, mu2, sigma2);
p_x3 = mvnpdf(x, mu3, sigma3);
if p_x1*p1 > p_x2*p2 && p_x1*p1 > p_x3*p3
pred_label(i) = 1;
elseif p_x2*p2 > p_x1*p1 && p_x2*p2 > p_x3*p3
pred_label(i) = 2;
else
pred_label(i) = 3;
end
end
% 计算分类准确率
accuracy = sum(pred_label'==test_label) / length(test_label);
fprintf('Classification accuracy: %.2f%%\n', accuracy*100);
```
运行结果如下:
```
Classification accuracy: 96.67%
```
说明该贝叶斯决策算法在鸢尾花数据集上表现良好,分类准确率达到了96.67%。
阅读全文