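Implement the following in Python:
1. Read the samples used for parameter estimation from iris.csv, take the first 40 samples of each class, and compute the mean of each class.
2. For each class, compute the covariance matrix $\Sigma$, its inverse $\Sigma^{-1}$, and its determinant $|\Sigma|$.
3. For the 3 classes, take the remaining 10 samples of each class and classify them two classes at a time. Because every class has the same number of samples, and the same number of training samples is drawn from each class, the class prior in each pairwise classification is $P(\omega_i) = 0.5$. Substitute each sample $X$ into the discriminant function

$$g_i(X) = -\frac{1}{2}(X - \mu_i)^T \Sigma_i^{-1} (X - \mu_i) + \ln P(\omega_i) - \frac{1}{2}\ln|\Sigma_i|$$

By the decision rule, if $g_i(X) > g_j(X)$ holds for all $j \neq i$, then $X$ is assigned to class $\omega_i$. As an example, classify the last 10 samples of the first class against those of the second class.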
Posted: 2024-02-01 22:16:07
The following Python code implements this; the comments explain each step:
``` python
import numpy as np
import pandas as pd

# Read iris.csv (assumes four feature columns followed by a 'class' label column)
iris_data = pd.read_csv('iris.csv')

# Select each class; the first 40 samples of each are used for training
setosa_all = iris_data[iris_data['class'] == 'Iris-setosa']
versicolor_all = iris_data[iris_data['class'] == 'Iris-versicolor']
virginica_all = iris_data[iris_data['class'] == 'Iris-virginica']
iris_setosa = setosa_all[:40]
iris_versicolor = versicolor_all[:40]
iris_virginica = virginica_all[:40]

# Mean vector of each class
mean_setosa = iris_setosa.iloc[:, :-1].mean(axis=0).to_numpy()
mean_versicolor = iris_versicolor.iloc[:, :-1].mean(axis=0).to_numpy()
mean_virginica = iris_virginica.iloc[:, :-1].mean(axis=0).to_numpy()

# Covariance matrix Σ, its inverse Σ^-1, and its determinant |Σ| for each class
cov_setosa = np.cov(iris_setosa.iloc[:, :-1], rowvar=False)
inv_cov_setosa = np.linalg.inv(cov_setosa)
det_cov_setosa = np.linalg.det(cov_setosa)
cov_versicolor = np.cov(iris_versicolor.iloc[:, :-1], rowvar=False)
inv_cov_versicolor = np.linalg.inv(cov_versicolor)
det_cov_versicolor = np.linalg.det(cov_versicolor)
cov_virginica = np.cov(iris_virginica.iloc[:, :-1], rowvar=False)
inv_cov_virginica = np.linalg.inv(cov_virginica)
det_cov_virginica = np.linalg.det(cov_virginica)

# Test set: the last 10 samples of the first and second classes.
# These must come from the full class, not from the 40-row training slice.
iris_test = pd.concat([setosa_all[40:], versicolor_all[40:]])

# Discriminant function g_i(x) with equal priors P(ω_i) = 0.5
def discriminant_func(x, mean, inv_cov, det_cov):
    diff = x - mean
    return -0.5 * diff @ inv_cov @ diff + np.log(0.5) - 0.5 * np.log(det_cov)

# Classify each test sample and count the correct predictions
correct_cnt = 0
for i in range(len(iris_test)):
    x = iris_test.iloc[i, :-1].to_numpy(dtype=float)
    g_setosa = discriminant_func(x, mean_setosa, inv_cov_setosa, det_cov_setosa)
    g_versicolor = discriminant_func(x, mean_versicolor, inv_cov_versicolor, det_cov_versicolor)
    # Assign the sample to the class with the larger discriminant value
    predicted = 'Iris-setosa' if g_setosa > g_versicolor else 'Iris-versicolor'
    if predicted == iris_test.iloc[i, -1]:
        correct_cnt += 1
print('Accuracy:', correct_cnt / len(iris_test))  # classification accuracy
```
The output is:
```
Accuracy: 1.0
```
That is, the classification accuracy on the 20 held-out samples of these two classes is 100%.
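The question asks for every pair of classes, while the code above only covers the first and second. The following is a minimal sketch of how the same discriminant could be looped over all pairs. Several things here are assumptions and not part of the original answer: the helper names `fit_gaussian`, `discriminant`, and `pairwise_accuracy` are hypothetical, the demo data are synthetic Gaussian clusters standing in for iris.csv so that the block runs on its own, and `np.linalg.solve` and `np.linalg.slogdet` are swapped in for the explicit inverse and determinant because they tend to be more numerically stable.

```python
from itertools import combinations

import numpy as np


def fit_gaussian(train):
    """Estimate the mean vector and covariance matrix from training rows."""
    return train.mean(axis=0), np.cov(train, rowvar=False)


def discriminant(x, mean, cov, prior=0.5):
    """g_i(x) = -1/2 (x-mu)^T Sigma^-1 (x-mu) + ln P - 1/2 ln|Sigma|."""
    diff = x - mean
    # solve() and slogdet() avoid forming the inverse and determinant explicitly
    maha = diff @ np.linalg.solve(cov, diff)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * maha + np.log(prior) - 0.5 * logdet


def pairwise_accuracy(classes, n_train=40):
    """Classify the held-out rows of every pair of classes.

    `classes` maps a label to an (n_samples, n_features) array.
    Returns {(label_a, label_b): accuracy}.
    """
    params = {k: fit_gaussian(v[:n_train]) for k, v in classes.items()}
    results = {}
    for a, b in combinations(classes, 2):
        correct = total = 0
        for true_label in (a, b):
            for x in classes[true_label][n_train:]:
                g_a = discriminant(x, *params[a])
                g_b = discriminant(x, *params[b])
                predicted = a if g_a > g_b else b
                correct += predicted == true_label
                total += 1
        results[(a, b)] = correct / total
    return results


# Synthetic stand-in data (NOT the real Iris measurements): three
# well-separated 4-D Gaussian clusters with 50 rows each.
rng = np.random.default_rng(0)
demo_classes = {
    name: rng.normal(loc=center, scale=0.3, size=(50, 4))
    for name, center in [("A", 0.0), ("B", 3.0), ("C", 6.0)]
}
demo_results = pairwise_accuracy(demo_classes)
for pair, acc in demo_results.items():
    print(pair, acc)
```

To try this on the real data, the `classes` dictionary could be built from the CSV, for example with `{k: g.iloc[:, :-1].to_numpy(dtype=float) for k, g in iris_data.groupby('class')}`, assuming the same `class` column name used in the answer above. Versicolor and virginica overlap more than setosa and versicolor do, so their pairwise accuracy may come out below 100%.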