Naive Bayes classification of text data with sklearn, and computing the accuracy of each prediction

Date: 2023-12-20 13:07:59

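First, convert the text into numerical features, using either a bag-of-words model (`CountVectorizer`) or a TF-IDF model (`TfidfVectorizer`). Then train a naive Bayes classifier on those features, predict labels for the test set, and compute the accuracy. Here is a simple example:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score

# Training and test data
train_data = ['this is good', 'that is bad', 'happy day', 'unhappy day']
train_label = ['positive', 'negative', 'positive', 'negative']
test_data = ['good day', 'bad day', 'happy and good']

# Convert text to numerical features
# (swap in CountVectorizer for plain bag-of-words counts)
vectorizer = TfidfVectorizer()
# Fit the vocabulary and IDF weights on the training set only,
# then reuse them unchanged on the test set
train_features = vectorizer.fit_transform(train_data)
test_features = vectorizer.transform(test_data)

# Train the naive Bayes classifier and predict the test labels
clf = MultinomialNB()
clf.fit(train_features, train_label)
pred_label = clf.predict(test_features)

# Compute accuracy: the fraction of test samples predicted correctly
true_label = ['positive', 'negative', 'positive']
accuracy = accuracy_score(true_label, pred_label)
print('Accuracy:', accuracy)
```

On this toy data the classifier labels all three test sentences correctly, so the script prints `Accuracy: 1.0`, an accuracy of 100%. Note that `accuracy_score` returns a single number for the whole test set, not one value per prediction.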
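The question asks for a score on each individual prediction. One possible reading, assumed here rather than stated in the original answer, is that this means the probability the classifier assigns to the label it chose, which `MultinomialNB.predict_proba` exposes. The sketch below reuses the same toy data and prints that probability for each test sentence, alongside whether the prediction matches the true label; `classification_report` then adds a per-class breakdown. The `per_sample` list and the name `confidence` are illustrative choices, not part of sklearn.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import classification_report
from sklearn.naive_bayes import MultinomialNB

# Same toy data as the example above
train_data = ['this is good', 'that is bad', 'happy day', 'unhappy day']
train_label = ['positive', 'negative', 'positive', 'negative']
test_data = ['good day', 'bad day', 'happy and good']
true_label = ['positive', 'negative', 'positive']

vectorizer = TfidfVectorizer()
clf = MultinomialNB()
clf.fit(vectorizer.fit_transform(train_data), train_label)

test_features = vectorizer.transform(test_data)
pred_label = clf.predict(test_features)
# One row per test sample, one column per class, in the order of clf.classes_
proba = clf.predict_proba(test_features)

# Per-sample view: predicted label, its probability, and whether it is correct
# (per_sample is a hypothetical helper list used only for this illustration)
per_sample = []
for text, truth, pred, row in zip(test_data, true_label, pred_label, proba):
    confidence = row.max()  # probability of the predicted (most likely) class
    per_sample.append((text, pred, confidence, pred == truth))
    print(f'{text!r}: predicted={pred}, '
          f'confidence={confidence:.3f}, correct={pred == truth}')

# Per-class view: precision, recall, and F1 for each label
print(classification_report(true_label, pred_label))
```

These probabilities come from the naive Bayes independence assumption and tend to be poorly calibrated, so they are better treated as a relative ranking of how sure the model is than as literal chances of being right.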