The following code raises an error:

    X = pd.concat([card['Contacts_Count_12_mon'], card['Months_Inactive_12_mon'],
                   card['Total_Amt_Chng_Q4_Q1'], card['Total_Relationship_Count'],
                   card['Total_Trans_Amt'], card['Avg_Utilization_Ratio'],
                   card['Total_Revolving_Bal'], card['Total_Trans_Ct'], card['Gender'],
                   Education_Level_onehot, Income_Category_onehot, Marital_Status_onehot], axis=1)
    y = card['Attrition_Flag']
    from sklearn.tree import DecisionTreeClassifier
    #X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
    clf = DecisionTreeClassifier()
    clf.fit(X, y)
    y_pred = clf.predict(X)
    #plt.scatter(y, y_pred)
    #plt.xlabel('True Values')
    #plt.ylabel('Predictions')
    #plt.show()
    plt.scatter(y, y_pred, color='red', label='Predicted Values')
    plt.scatter(y, clf.predict(y), color='blue', label='True Values')
    plt.xlabel('X')
    plt.ylabel('Y')
    plt.title('Decision Tree Regression')
    plt.legend()
    plt.show()
    plt.show()

Error: Expected 2D array, got 1D array instead: array=[0. 0. 0. ... 1. 1. 1.]. Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
Time: 2024-01-20 09:04:10
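The error does not come from `plt.scatter`, which accepts 1D arrays. It comes from `clf.predict(y)` in the second `scatter` call. scikit-learn's `predict` expects a 2D feature matrix of shape `(n_samples, n_features)`, but `y` is the 1D label vector, which is the array of 0s and 1s shown in the message. Reshaping `y` as the message suggests would not help either, because the tree was fitted on all the columns of `X`, not on a single feature. The fix is to pass the feature matrix `X` to `predict`:
```python
# Wrong: y is the 1D label vector, not a feature matrix
# plt.scatter(y, clf.predict(y), color='blue', label='True Values')
# Right: predict on the same feature matrix the model was fitted on
plt.scatter(y, clf.predict(X), color='blue', label='True Values')
```
The revised code is as follows:
```python
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.tree import DecisionTreeClassifier
# Load the data
card = pd.read_csv('creditcard.csv')
# One-hot encode the categorical features
Education_Level_onehot = pd.get_dummies(card['Education_Level'], prefix='Education_Level')
Income_Category_onehot = pd.get_dummies(card['Income_Category'], prefix='Income_Category')
Marital_Status_onehot = pd.get_dummies(card['Marital_Status'], prefix='Marital_Status')
# Build the feature matrix and label vector.
# Assumes Gender and Attrition_Flag are already numeric in card, as in the
# question (fit succeeded there and the error message shows 0/1 labels).
X = pd.concat([card['Contacts_Count_12_mon'],card['Months_Inactive_12_mon'],card['Total_Amt_Chng_Q4_Q1'],card['Total_Relationship_Count'],card['Total_Trans_Amt'],card['Avg_Utilization_Ratio'],card['Total_Revolving_Bal'],card['Total_Trans_Ct'],card['Gender'],Education_Level_onehot,Income_Category_onehot,Marital_Status_onehot],axis=1)
y = card['Attrition_Flag']
# Fit the decision tree and predict on the training data
clf = DecisionTreeClassifier()
clf.fit(X, y)
y_pred = clf.predict(X)
# Draw the scatter plot; plt.scatter accepts 1D arrays, so no reshape is needed
plt.scatter(y, y_pred, color='red', label='Predicted Values')
# clf.predict(X) equals y_pred, so these points coincide with the red ones
plt.scatter(y, clf.predict(X), color='blue', label='True Values')
plt.xlabel('True Values')
plt.ylabel('Predictions')
plt.title('Decision Tree Classification')
plt.legend()
plt.show()
```
Note that this is a classification problem, not regression, so the plot title and axis labels were changed accordingly. Because both the labels and the predictions take only the values 0 and 1, the scatter points fall on at most four distinct positions.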
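For a classifier, a confusion matrix on held-out data is usually easier to read than a scatter plot of 0/1 values. The sketch below is only an illustration under stated assumptions: it uses synthetic data from `make_classification` as a stand-in for the `card` features (the real columns are not available here), and it reuses the `train_test_split(X, y, test_size=0.3, random_state=0)` call that is commented out in the question. It also shows that `predict` rejects a 1D label vector, which is the error discussed above.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the card data (assumption: binary 0/1 labels)
X, y = make_classification(n_samples=500, n_features=8, random_state=0)

# Hold out 30% of the rows so the model is scored on data it has not seen
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)

# predict needs a 2D array of shape (n_samples, n_features);
# passing the 1D label vector raises the ValueError from the question
try:
    clf.predict(y_test)
    raised = False
except ValueError:
    raised = True

# Correct usage: predict on the held-out feature matrix
y_pred = clf.predict(X_test)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred)
acc = accuracy_score(y_test, y_pred)

print("1D input rejected:", raised)
print(cm)
print("accuracy:", acc)
```

The diagonal of the matrix counts correct predictions and the off-diagonal cells count the two kinds of mistakes, so the accuracy equals the diagonal sum divided by the total. With the real data, `X` and `y` from the answer above would take the place of the synthetic arrays.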