Implementing Naive Bayes by Hand on the Iris Dataset
Posted: 2023-10-24 08:00:54
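You want to implement the naive Bayes algorithm by hand and apply it to the iris dataset. Here is a simple hand-written example that models each feature with a per-class normal distribution: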
```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load the iris dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Compute the prior probability of each class
prior_probabilities = np.zeros(3)
for label in y_train:
    prior_probabilities[label] += 1
prior_probabilities /= len(y_train)

# Compute the mean and standard deviation of each feature within each class
mean = np.zeros((3, X_train.shape[1]))
std = np.zeros((3, X_train.shape[1]))
for label in range(3):
    X_label = X_train[y_train == label]
    mean[label] = np.mean(X_label, axis=0)
    std[label] = np.std(X_label, axis=0)

# Prediction function
def predict(X):
    predictions = []
    for x in X:
        probabilities = []
        for label in range(3):
            # Product of the per-feature Gaussian densities (the "naive" independence assumption)
            likelihood = 1.0
            for feature in range(X.shape[1]):
                exponent = np.exp(-((x[feature] - mean[label][feature]) ** 2) / (2 * (std[label][feature] ** 2)))
                likelihood *= exponent / (np.sqrt(2 * np.pi) * std[label][feature])
            # Unnormalized posterior: prior times likelihood
            probabilities.append(prior_probabilities[label] * likelihood)
        predictions.append(np.argmax(probabilities))
    # Return an array so the result compares element-wise with y_test
    return np.array(predictions)

# Predict on the test set and compute the accuracy
y_pred = predict(X_test)
accuracy = np.sum(y_pred == y_test) / len(y_test)
print("Accuracy:", accuracy)
```
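This code implements naive Bayes by hand for the iris dataset. It first splits the data into a training set and a test set. It then estimates, from the training set, the prior probability of each class and the mean and standard deviation of every feature within each class. Finally, it defines a prediction function that scores each class as its prior multiplied by the product of the per-feature Gaussian densities, picks the highest-scoring class, and reports the accuracy on the test set.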
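Multiplying many small densities can underflow to zero when there are many features, so one common variant works with log probabilities instead. The following is a minimal sketch of that idea, not part of the original answer: the function names `fit_gaussian_nb` and `predict_log_space` and the `var_smoothing` floor are assumptions introduced here for illustration, and the three nested loops are replaced with NumPy broadcasting.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split


def fit_gaussian_nb(X, y, var_smoothing=1e-9):
    """Estimate per-class log priors, feature means, and feature variances."""
    classes = np.unique(y)
    log_prior = np.log(np.array([np.mean(y == c) for c in classes]))
    mean = np.array([X[y == c].mean(axis=0) for c in classes])
    var = np.array([X[y == c].var(axis=0) for c in classes])
    # Assumed safeguard (hypothetical parameter): a small floor keeps a
    # zero variance from causing a division by zero.
    var = var + var_smoothing * X.var(axis=0).max()
    return classes, log_prior, mean, var


def predict_log_space(X, classes, log_prior, mean, var):
    """Pick the class with the largest log prior + summed log densities."""
    # diff has shape (n_samples, n_classes, n_features) via broadcasting.
    diff = X[:, None, :] - mean[None, :, :]
    log_likelihood = -0.5 * np.sum(
        np.log(2 * np.pi * var)[None, :, :] + diff ** 2 / var[None, :, :],
        axis=2,
    )
    scores = log_prior[None, :] + log_likelihood
    return classes[np.argmax(scores, axis=1)]


iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)
params = fit_gaussian_nb(X_train, y_train)
y_pred = predict_log_space(X_test, *params)
print("Accuracy:", np.mean(y_pred == y_test))
```

Because the logarithm is monotonic, taking the argmax over summed log densities selects the same class as taking it over the product of densities, as long as the product has not underflowed.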