Implementing the Naive Bayes Algorithm in Python
Time: 2023-12-30 07:24:31
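Naive Bayes is a probabilistic classification algorithm that applies Bayes' theorem under the assumption that features are conditionally independent given the class. It is commonly used for tasks such as text classification and spam filtering. Below is a simple example implementation in Python: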
```python
import numpy as np


class NaiveBayes:
    """Multinomial naive Bayes for count features (e.g. bag-of-words)."""

    def __init__(self):
        self.prior_prob = {}  # class -> prior probability P(c)
        self.cond_prob = {}   # class -> {feature index -> P(word | c)}

    def train(self, X, y):
        X, y = np.asarray(X), np.asarray(y)
        # Prior probabilities: fraction of training samples in each class
        classes, counts = np.unique(y, return_counts=True)
        total_samples = len(y)
        for i in range(len(classes)):
            self.prior_prob[classes[i]] = counts[i] / total_samples
        # Conditional probabilities with Laplace (add-one) smoothing
        for c in classes:
            self.cond_prob[c] = {}
            c_X = X[y == c]            # rows belonging to class c
            total_words = np.sum(c_X)  # total word count in class c
            for j in range(X.shape[1]):
                word_count = np.sum(c_X[:, j])
                self.cond_prob[c][j] = (word_count + 1) / (total_words + X.shape[1])

    def predict(self, X):
        y_pred = []
        for x in np.asarray(X):
            # Log-probabilities are negative, so start from -inf rather than -1
            max_prob = -np.inf
            pred_class = None
            for c in self.prior_prob:
                prob = np.log(self.prior_prob[c])
                for i in range(len(x)):
                    if x[i] > 0:
                        # Weight each word's log-probability by its count
                        prob += x[i] * np.log(self.cond_prob[c][i])
                if prob > max_prob:
                    max_prob = prob
                    pred_class = c
            y_pred.append(pred_class)
        return y_pred
```
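The code above implements a simple naive Bayes classifier with two methods, one for training and one for prediction. During training, it first computes the prior probability of each class, then the smoothed conditional probability of each word given each class. During prediction, it combines the log prior with the log conditional probabilities to score each class for a sample, and selects the class with the highest score as the prediction.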
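To make the training and prediction steps concrete, here is a minimal, self-contained sketch of the same idea in vectorized NumPy form. It is not part of the original answer. The function names `fit_multinomial_nb` and `predict_multinomial_nb`, the `alpha` smoothing parameter, and the four-word toy dataset are all made up for illustration. It assumes the multinomial form of the model, in which each word's log-probability is weighted by its count in the sample.

```python
import numpy as np


def fit_multinomial_nb(X, y, alpha=1.0):
    """Return (classes, log_prior, log_cond) estimated from count matrix X.

    Hypothetical helper for illustration; alpha=1.0 is add-one smoothing.
    """
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    classes, counts = np.unique(y, return_counts=True)
    log_prior = np.log(counts / len(y))
    # Per-class word totals, one row per class: shape (n_classes, n_features)
    word_counts = np.array([X[y == c].sum(axis=0) for c in classes])
    smoothed = word_counts + alpha
    # Each row is normalized so the word probabilities of a class sum to 1
    log_cond = np.log(smoothed / smoothed.sum(axis=1, keepdims=True))
    return classes, log_prior, log_cond


def predict_multinomial_nb(model, X):
    """Pick the class with the largest log-posterior score for each row of X."""
    classes, log_prior, log_cond = model
    # score[n, c] = log P(c) + sum_j X[n, j] * log P(word j | c)
    scores = np.asarray(X, dtype=float) @ log_cond.T + log_prior
    return classes[np.argmax(scores, axis=1)]


# Toy data (made up): columns are counts of "free", "win", "meeting", "report"
X_train = np.array([
    [3, 2, 0, 0],
    [2, 3, 0, 1],
    [0, 0, 3, 2],
    [0, 1, 2, 3],
])
y_train = np.array(["spam", "spam", "ham", "ham"])

model = fit_multinomial_nb(X_train, y_train)
X_test = np.array([[2, 1, 0, 0], [0, 0, 1, 2]])
predictions = predict_multinomial_nb(model, X_test)
print(predictions)  # → ['spam' 'ham']
```

On this toy data, the first test sample (dominated by "free" and "win") is classified as spam and the second (dominated by "meeting" and "report") as ham. Working in log space avoids numerical underflow when many small probabilities are multiplied together, and the matrix product replaces the explicit loops over classes and features.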