Naive Bayes spam email classification: algorithm code
Date: 2023-09-14 20:02:36
A naive Bayes spam email classifier can be implemented with scikit-learn as follows:
```
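import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Load the data; the file is expected to have 'text' and 'label' columns
data = pd.read_csv('spam.csv', encoding='latin-1')

# Split into training and test sets (default split: 75% train, 25% test)
X_train, X_test, y_train, y_test = train_test_split(
    data['text'], data['label'], random_state=42)

# Feature extraction: bag-of-words counts, vocabulary fitted on the training set only
cv = CountVectorizer(stop_words='english')
X_train_cv = cv.fit_transform(X_train)
X_test_cv = cv.transform(X_test)

# Build and train the model
model = MultinomialNB()
model.fit(X_train_cv, y_train)

# Evaluate on the held-out test set
print('Test accuracy:', model.score(X_test_cv, y_test))

# Predict whether a new email is spam
new_email = ['Hello, this is a message from the bank.']
new_email_cv = cv.transform(new_email)
print(model.predict(new_email_cv))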
```
Note: the dataset used here is spam.csv, which contains both normal emails and spam emails.
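For readers who do not have spam.csv at hand, the same steps (count features, then a multinomial naive Bayes model) can be tried on a few in-memory messages. This is only a minimal sketch: the six messages, their labels, and the helper `classify` are hypothetical and invented for illustration; they are not taken from the dataset above, and a model trained on so little text says nothing about real-world accuracy.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy data, invented for illustration (not from spam.csv)
texts = [
    "win a free prize now",
    "free cash offer click now",
    "claim your free prize",
    "meeting at noon tomorrow",
    "please review the attached report",
    "lunch tomorrow with the team",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# Turn each message into a vector of word counts
cv = CountVectorizer()
X = cv.fit_transform(texts)

# Fit multinomial naive Bayes (Laplace smoothing, alpha=1.0, is the default)
model = MultinomialNB()
model.fit(X, labels)


def classify(message):
    """Return (predicted_label, {label: probability}) for one message."""
    features = cv.transform([message])
    probs = model.predict_proba(features)[0]
    return model.predict(features)[0], dict(zip(model.classes_, probs))


label, probs = classify("free prize offer")
print(label, probs)
```

Using `predict_proba` alongside `predict` shows how confident the model is in each class, which can help when choosing a stricter threshold for flagging a message as spam.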