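In data mining and machine learning, write code for the following: load and inspect the data, encode the data's attributes, create a classifier, and make classification predictions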
Time: 2024-05-27 18:13:36 Views: 33
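1. Load and inspect the data
import pandas as pd
# Load the data
data = pd.read_csv('data.csv')
# Show the first 5 rows
print(data.head())
# Show column types and non-null counts (info() prints its report itself and returns None)
data.info()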
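Since data.csv itself is not shown, the loading step can be tried on a small in-memory stand-in. The sketch below is only an illustration: the CSV text and the `value` column are made up, and the `attribute` and `label` column names are assumed to match the ones used in the later steps.

```python
import io

import pandas as pd

# Hypothetical stand-in for data.csv; the rows and the 'value' column are invented.
csv_text = """attribute,value,label
red,1.0,yes
green,2.5,no
blue,0.7,yes
red,3.1,no
"""

# read_csv accepts any file-like object, so StringIO works in place of a path
data = pd.read_csv(io.StringIO(csv_text))

# Basic checks before any encoding: size, column types, missing values
print(data.shape)
print(data.dtypes)
missing_per_column = data.isnull().sum()
print(missing_per_column)
```

Checking for missing values at this point is useful because both LabelEncoder and the decision tree are easier to work with once gaps have been handled.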
2. Encode the data's attributes
from sklearn.preprocessing import LabelEncoder
# Encode the label column as integers
label_encoder = LabelEncoder()
data['label'] = label_encoder.fit_transform(data['label'])
# One-hot encode the attribute column
data = pd.get_dummies(data, columns=['attribute'])
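To see what the two encodings actually produce, here is a minimal sketch on a toy DataFrame. The rows are invented for illustration; only the column names `attribute` and `label` come from the code above.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy data, invented for illustration
toy = pd.DataFrame({
    "attribute": ["red", "green", "blue", "red"],
    "label": ["yes", "no", "yes", "no"],
})

# LabelEncoder sorts the distinct labels and maps them to 0..n-1
label_encoder = LabelEncoder()
toy["label"] = label_encoder.fit_transform(toy["label"])
print(list(label_encoder.classes_))

# get_dummies replaces 'attribute' with one indicator column per category
encoded = pd.get_dummies(toy, columns=["attribute"])
print(list(encoded.columns))

# The fitted encoder can map integer codes back to the original labels
decoded = label_encoder.inverse_transform(toy["label"].to_numpy())
print(list(decoded))
```

Keeping the fitted `label_encoder` around matters later: predictions come back as integer codes, and `inverse_transform` is the way to turn them back into readable class names.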
3. Create the classifier
from sklearn.tree import DecisionTreeClassifier
# Create a decision tree classifier (default hyperparameters)
clf = DecisionTreeClassifier()
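The classifier above uses the default settings. As a hedged variation, the sketch below sets a few constructor parameters explicitly; the specific values are assumptions chosen for illustration, not something the original answer prescribes.

```python
from sklearn.tree import DecisionTreeClassifier

# Assumed settings for illustration; the original code uses the defaults.
clf = DecisionTreeClassifier(
    criterion="gini",  # split quality measure (this is also the default)
    max_depth=3,       # cap the tree depth to limit overfitting
    random_state=1,    # make the choice between equally good splits repeatable
)

# get_params returns the configuration as a plain dict
params = clf.get_params()
print(params["criterion"], params["max_depth"], params["random_state"])
```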
4. Make classification predictions
# Separate the features from the label
X = data.drop('label', axis=1)
y = data['label']
# Split into training and test sets (30% held out for testing)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)
# Train the classifier
clf.fit(X_train, y_train)
# Predict on the test set
y_pred = clf.predict(X_test)
# Compute the classifier's accuracy
from sklearn.metrics import accuracy_score
accuracy = accuracy_score(y_test, y_pred)
print("Classifier accuracy:", accuracy)
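The four steps can also be combined into one runnable script. The sketch below is an alternative arrangement, not the method used above: it swaps `pd.get_dummies` for scikit-learn's `OneHotEncoder` inside a `Pipeline`, so the encoder is fitted on the training rows only. The data is synthetic and the `size` column is hypothetical, added only so there is more than one feature.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

# Synthetic data, invented for illustration: the label depends only on 'attribute'
colors = ["red", "green", "blue"]
sizes = ["small", "large"]
data = pd.DataFrame({
    "attribute": [colors[i % 3] for i in range(60)],
    "size": [sizes[i % 2] for i in range(60)],  # hypothetical extra feature
})
data["label"] = ["yes" if c == "red" else "no" for c in data["attribute"]]

X = data.drop("label", axis=1)
y = data["label"]  # scikit-learn classifiers accept string class labels directly

# stratify=y keeps the yes/no ratio the same in both splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y
)

# The encoder learns its categories from the training rows only;
# handle_unknown="ignore" encodes categories unseen in training as all zeros
model = Pipeline([
    ("encode", ColumnTransformer([
        ("onehot", OneHotEncoder(handle_unknown="ignore"), ["attribute", "size"]),
    ])),
    ("tree", DecisionTreeClassifier(random_state=1)),
])

model.fit(X_train, y_train)
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Classifier accuracy:", accuracy)
```

Because the synthetic label is a deterministic function of `attribute`, the score here says nothing about real data; the point is only the structure, where encoding and classification are fitted together on the training split.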