Writing a Bayes classifier algorithm in Python
Date: 2023-09-17 14:15:07
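A Bayes classifier is a classification method based on Bayes' theorem: it assigns a sample to a class by computing the probability of each class given the observed features. Below is a simple Python implementation, a Gaussian naive Bayes classifier that models each feature as an independent normal distribution within each class: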
```python
import numpy as np


class NaiveBayesClassifier:
    """Gaussian naive Bayes: one independent normal distribution per feature and class."""

    def fit(self, X, y):
        n_samples, n_features = X.shape
        self.classes = np.unique(y)
        n_classes = len(self.classes)

        # Parameters are stored by position in self.classes, so labels
        # do not have to be the integers 0..n_classes-1.
        self.priors = np.zeros(n_classes)
        self.means = np.zeros((n_classes, n_features))
        self.variances = np.zeros((n_classes, n_features))
        for idx, c in enumerate(self.classes):
            X_c = X[y == c]
            # Prior probability of the class
            self.priors[idx] = X_c.shape[0] / float(n_samples)
            # Parameters of the class-conditional Gaussian for each feature
            self.means[idx, :] = X_c.mean(axis=0)
            self.variances[idx, :] = X_c.var(axis=0)

    def predict(self, X):
        y_pred = np.empty(X.shape[0], dtype=self.classes.dtype)
        for i, x in enumerate(X):
            posteriors = []
            for idx in range(len(self.classes)):
                # Sum logs instead of multiplying probabilities
                prior = np.log(self.priors[idx])
                likelihood = np.sum(np.log(self._pdf(x, self.means[idx, :], self.variances[idx, :])))
                posteriors.append(prior + likelihood)
            y_pred[i] = self.classes[np.argmax(posteriors)]
        return y_pred

    def _pdf(self, x, mean, variance):
        eps = 1e-4  # keeps the denominators away from zero
        coef = 1.0 / np.sqrt(2.0 * np.pi * variance + eps)
        exponent = np.exp(-(np.power(x - mean, 2) / (2 * variance + eps)))
        return coef * exponent
```
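Here, `fit` trains the model: it takes the feature matrix `X` and the label vector `y` and computes each class's prior probability and the parameters of its conditional distribution (the per-feature mean and variance). `predict` takes a feature matrix `X` and returns the corresponding vector of predicted labels. `_pdf` evaluates the Gaussian probability density function.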
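The prediction step picks the class that maximizes log prior plus the sum of per-feature log densities. One possible variation, shown here as a minimal sketch rather than as part of the implementation above, is to evaluate the Gaussian log-density directly instead of taking the log of the density, which stays finite even when the density itself underflows to zero. The function names `gaussian_log_pdf` and `log_posteriors`, the `eps` value, and the toy parameters are all assumptions made for illustration.

```python
import numpy as np


def gaussian_log_pdf(x, mean, variance, eps=1e-9):
    """Elementwise log N(x | mean, variance).

    eps is an assumed small constant that guards against zero variance.
    """
    var = variance + eps
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)


def log_posteriors(x, priors, means, variances):
    """Unnormalized log posterior of every class for a single sample x.

    priors: shape (n_classes,); means, variances: shape (n_classes, n_features).
    """
    # Broadcasting evaluates all classes at once; summing over features
    # applies the naive independence assumption.
    log_likelihood = gaussian_log_pdf(x, means, variances).sum(axis=1)
    return np.log(priors) + log_likelihood


# Toy parameters (made up for illustration): two classes, two features.
priors = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [5.0, 5.0]])
variances = np.ones((2, 2))

scores = log_posteriors(np.array([4.5, 5.2]), priors, means, variances)
predicted = int(np.argmax(scores))  # position of the winning class

# A point 100 standard deviations from the mean: the density underflows
# to 0.0 in float64, but the log-density remains finite.
far_sample = gaussian_log_pdf(100.0, 0.0, 1.0)
```

The sample `[4.5, 5.2]` lies much closer to the second class mean, so `predicted` is the index of that class.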
Usage example:
```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load the dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train the model
clf = NaiveBayesClassifier()
clf.fit(X_train, y_train)

# Predict on the test set
y_pred = clf.predict(X_test)

# Compute the accuracy
acc = accuracy_score(y_test, y_pred)
print("Accuracy:", acc)
```
Output:
```
Accuracy: 1.0
```
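On this split the classifier labels all 30 test samples correctly, which shows that our Bayes classifier performs well on the iris dataset.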
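A single train/test split with 30 test samples can give an optimistic picture, so it may be worth averaging the accuracy over several splits. The following is a minimal k-fold sketch using only NumPy. The helper name `k_fold_accuracy` and its parameters are hypothetical, and it assumes only that the classifier exposes `fit(X, y)` and `predict(X)` as above.

```python
import numpy as np


def k_fold_accuracy(make_clf, X, y, k=5, seed=0):
    """Return the accuracy on each of k shuffled folds.

    make_clf is any zero-argument callable that builds a fresh
    classifier with fit(X, y) and predict(X) methods.
    """
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(y))
    folds = np.array_split(indices, k)

    scores = []
    for i in range(k):
        test_idx = folds[i]
        # Every fold except the i-th one is used for training.
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        clf = make_clf()
        clf.fit(X[train_idx], y[train_idx])
        y_pred = clf.predict(X[test_idx])
        scores.append(np.mean(y_pred == y[test_idx]))
    return np.array(scores)
```

With the names from the usage example, a call such as `k_fold_accuracy(NaiveBayesClassifier, X, y, k=5).mean()` would give the average accuracy over five folds.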