Write a naive Bayes classification function in Python
Date: 2024-05-16 09:17:52
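Sure. Below is a simple naive Bayes classifier for binary classification over categorical (discrete) features. It uses Laplace (add-one) smoothing and sums log probabilities to avoid numerical underflow: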
```python
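import math


def naive_bayes_classifier(train_data, train_labels, test_data):
    """
    Naive Bayes classifier for binary classification with categorical features.

    :param train_data: training data, 2-D list; each row is a sample, each column a feature value
    :param train_labels: training labels, 1-D list; each value is 0 or 1
    :param test_data: test data, 2-D list with the same layout as train_data
    :return: 1-D list of 0/1 predictions, one per test sample
    """
    # Collect the set of values each feature takes in the training data
    feature_values = {}
    for i in range(len(train_data)):
        for j in range(len(train_data[0])):
            if j not in feature_values:
                feature_values[j] = set()
            feature_values[j].add(train_data[i][j])

    # Prior probability of class 0 and class 1
    num_0 = sum(1 for label in train_labels if label == 0)
    num_1 = sum(1 for label in train_labels if label == 1)
    if num_0 == 0 or num_1 == 0:
        # math.log(0) would fail below, so require both classes up front
        raise ValueError("train_labels must contain both class 0 and class 1")
    prior_0 = num_0 / len(train_labels)
    prior_1 = num_1 / len(train_labels)

    # Conditional probability of each (feature, value) pair given the class,
    # with Laplace (add-one) smoothing
    conditional_probs_0 = {}
    conditional_probs_1 = {}
    for feature in feature_values:
        feature_values_list = list(feature_values[feature])
        for value in feature_values_list:
            count_0 = 0
            count_1 = 0
            for i in range(len(train_data)):
                if train_data[i][feature] == value:
                    if train_labels[i] == 0:
                        count_0 += 1
                    else:
                        count_1 += 1
            conditional_probs_0[(feature, value)] = (count_0 + 1) / (num_0 + len(feature_values_list))
            conditional_probs_1[(feature, value)] = (count_1 + 1) / (num_1 + len(feature_values_list))

    # Classify each test sample by comparing log posteriors
    result = []
    for data in test_data:
        prob_0 = math.log(prior_0)
        prob_1 = math.log(prior_1)
        for feature in feature_values:
            key = (feature, data[feature])
            num_values = len(feature_values[feature])
            # A value never seen in training is treated as having a count of 0
            # (an added fallback; without it the lookup raises KeyError)
            prob_0 += math.log(conditional_probs_0.get(key, 1 / (num_0 + num_values)))
            prob_1 += math.log(conditional_probs_1.get(key, 1 / (num_1 + num_values)))
        # Ties go to class 1
        if prob_0 > prob_1:
            result.append(0)
        else:
            result.append(1)
    return result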
```
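The decision rule the function applies can be written out as follows. This is a restatement of the code above in notation introduced here for illustration: $N_c$ is the number of training samples with label $c$, $N_{c,j,v}$ is the number of those samples whose feature $j$ equals $v$, and $V_j$ is the set of distinct values feature $j$ takes in the training data.

$$
\hat{P}(x_j = v \mid c) = \frac{N_{c,j,v} + 1}{N_c + |V_j|},
\qquad
\hat{y} = \arg\max_{c \in \{0, 1\}} \left[ \log \frac{N_c}{N_0 + N_1} + \sum_{j} \log \hat{P}(x_j \mid c) \right]
$$

When the two scores are equal, the code returns class 1.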
Usage:
```python
# Prepare the training and test data
train_data = [[1, 1, 1], [1, 0, 1], [0, 1, 0], [0, 0, 1], [1, 1, 0], [0, 1, 1], [0, 0, 0], [1, 0, 0]]
train_labels = [1, 1, 0, 0, 1, 0, 0, 1]
test_data = [[1, 1, 0], [0, 0, 0], [1, 0, 1], [0, 1, 1]]
# Classify the test data with the naive Bayes classifier
result = naive_bayes_classifier(train_data, train_labels, test_data)
# Print the predictions
print(result)  # [1, 0, 1, 0]
```
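In the sample data above, each training label equals the first feature, while the other two features are split evenly between the classes. The sketch below recomputes the Laplace-smoothed conditional probabilities for that data so the effect is visible. The helper name `smoothed_conditional` is hypothetical and not part of the function above; it assumes the same smoothing formula, (count + 1) / (class size + number of distinct values).

```python
from fractions import Fraction


def smoothed_conditional(train_data, train_labels, feature, value, label):
    """Laplace-smoothed estimate of P(feature == value | label).

    Hypothetical helper for illustration only.
    """
    # Distinct values this feature takes anywhere in the training set
    distinct = {row[feature] for row in train_data}
    # Training samples that belong to the requested class
    in_class = [row for row, y in zip(train_data, train_labels) if y == label]
    # How many of those samples have the requested feature value
    matches = sum(1 for row in in_class if row[feature] == value)
    return Fraction(matches + 1, len(in_class) + len(distinct))


train_data = [[1, 1, 1], [1, 0, 1], [0, 1, 0], [0, 0, 1],
              [1, 1, 0], [0, 1, 1], [0, 0, 0], [1, 0, 0]]
train_labels = [1, 1, 0, 0, 1, 0, 0, 1]

# Print P(feature == 1 | class 0) and P(feature == 1 | class 1) per feature
for feature in range(3):
    p0 = smoothed_conditional(train_data, train_labels, feature, 1, 0)
    p1 = smoothed_conditional(train_data, train_labels, feature, 1, 1)
    print(feature, p0, p1)
# 0 1/6 5/6
# 1 1/2 1/2
# 2 1/2 1/2
```

Only feature 0 separates the classes here, so a test sample is assigned to class 1 exactly when its first feature is 1.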