Implementing Naive Bayes Classification by Hand, Without sklearn
To implement a naive Bayes classifier by hand, you need to understand its basic principle: Bayes' theorem combines the prior probability of each class with the likelihood of the feature vector given that class, and under the "naive" assumption that features are conditionally independent, the posterior satisfies P(y|x) ∝ P(y) · P(x1|y) · ... · P(xn|y). Here is a simple example for the binary case, using Gaussian naive Bayes (each per-feature likelihood P(xi|y) is modeled as a Gaussian):
```python
import numpy as np

class NaiveBayesClassifier:
    def __init__(self):
        self.classes_ = None
        self.class_priors = None
        self.mean_vectors = None
        self.variance_vectors = None

    def fit(self, X, y):
        # Accept plain lists as well as arrays
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.classes_ = np.unique(y)
        # Prior P(y): fraction of training samples belonging to each class
        self.class_priors = {c: np.mean(y == c) for c in self.classes_}
        self.mean_vectors = {}
        self.variance_vectors = {}
        for class_label in self.classes_:
            class_samples = X[y == class_label]
            # Per-feature mean and variance; the "naive" assumption means we
            # only need the per-feature variances, not a full covariance matrix
            self.mean_vectors[class_label] = np.mean(class_samples, axis=0)
            self.variance_vectors[class_label] = np.var(class_samples, axis=0)

    def _gaussian_probability(self, x, mean, variance):
        # Gaussian density N(x; mean, variance), evaluated per feature
        return (1 / np.sqrt(2 * np.pi * variance)) * np.exp(-((x - mean) ** 2) / (2 * variance))

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        predictions = []
        for sample in X:
            posteriors = []
            for class_label in self.classes_:
                mean = self.mean_vectors[class_label]
                variance = self.variance_vectors[class_label]
                # Likelihood P(x|y): product of the per-feature densities
                likelihood = np.prod(self._gaussian_probability(sample, mean, variance))
                # Unnormalized posterior P(y|x) ∝ P(x|y) * P(y)
                posteriors.append(likelihood * self.class_priors[class_label])
            # Choose the class with the largest posterior
            predictions.append(self.classes_[np.argmax(posteriors)])
        return predictions

# Usage example with a tiny 2D dataset (each row is a feature vector)
data = [[0, 0], [1, 1], [0, 1], [1, 0]]
labels = [0, 0, 1, 1]
model = NaiveBayesClassifier()
model.fit(data, labels)
test_data = [[0, 0], [0.5, 0.5]]
predictions = model.predict(test_data)
```
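One practical caveat: `predict` multiplies many small densities, which can underflow to zero as the number of features grows. A common remedy, sketched below against the class attributes defined above (it is not part of the original code), is to compare summed log-densities instead:
```python
import numpy as np

def predict_log(model, X):
    # Log-space variant of predict(); assumes a fitted NaiveBayesClassifier
    # with the classes_, mean_vectors, variance_vectors and class_priors
    # attributes defined above.
    X = np.asarray(X, dtype=float)
    predictions = []
    for sample in X:
        log_posteriors = []
        for c in model.classes_:
            mean = model.mean_vectors[c]
            var = model.variance_vectors[c]
            # Log of the Gaussian density, summed over features
            log_likelihood = np.sum(
                -0.5 * np.log(2 * np.pi * var) - (sample - mean) ** 2 / (2 * var)
            )
            log_posteriors.append(log_likelihood + np.log(model.class_priors[c]))
        predictions.append(model.classes_[np.argmax(log_posteriors)])
    return predictions
```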
In this example, the `fit` method learns each class's prior plus per-feature means and variances, while the `predict` method computes the Gaussian likelihood of each new sample under every class and selects the most probable one.
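Note also that if some feature has zero variance within a class, the Gaussian density divides by zero. A common safeguard, similar in spirit to the variance smoothing used by library implementations, is to add a small epsilon when estimating the variances in `fit`; the helper below is an illustrative sketch (the epsilon value is an assumption, not from the original):
```python
import numpy as np

def smoothed_variance(class_samples, eps=1e-9):
    # Per-feature variance with a small additive epsilon so that
    # zero-variance features cannot cause division by zero in the
    # Gaussian density; eps=1e-9 is an illustrative default.
    return np.var(class_samples, axis=0) + eps
```
In `fit`, this would replace the `np.var(class_samples, axis=0)` line.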