How can I replace the dataset in the following Python code with my own .csv file? My .csv file has four columns of data, one for each of four variables.

```python
# code-4-3.py
# Simple Linear Regression
from sklearn.datasets import load_boston
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split

dataset = load_boston()
x_data = dataset.data               # all feature variables
y_data = dataset.target             # target values (house prices)
name_data = dataset.feature_names   # feature names
x_train, x_test, y_train, y_test = train_test_split(x_data, y_data, test_size=0.25, random_state=1001)
x_data_train = x_train[:, 5].reshape(-1, 1)   # feature column 5 of the training split
y_data_train = y_train.reshape(-1, 1)
x_data_test = x_test[:, 5].reshape(-1, 1)     # feature column 5 of the test split
y_data_test = y_test.reshape(-1, 1)
simple_model = LinearRegression()             # create the linear regression estimator
simple_model.fit(x_data_train, y_data_train)  # fit the model on the training data
y_data_test_p = simple_model.predict(x_data_test)  # predict on the test set with the fitted model
plt.subplot(1, 1, 1)
plt.scatter(x_data_test, y_data_test, s=20, color="r")
plt.scatter(x_data_test, y_data_test_p, s=20, color="b")
plt.xlabel('Room Number')
plt.ylabel('Price')
plt.title(name_data[5])
plt.show()
r_squared = simple_model.score(x_data_test, y_data_test)
print('R2')
print(r_squared)
```
Posted: 2023-12-24 14:15:06 Views: 82
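You can read your .csv file with the read_csv() function from pandas, convert the result to NumPy arrays, and use those arrays as x_data and y_data in the code. This also removes the dependency on load_boston, which was removed from scikit-learn in version 1.2. Here is an example: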
```python
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
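# Read the CSV file and convert it to NumPy arrays.
# Assumption: the last column is the target, the other columns are features.
data = pd.read_csv('your_file.csv')
x_data = np.array(data.iloc[:, :-1])
y_data = np.array(data.iloc[:, -1])
# A 4-column CSV has only 3 feature columns, so valid indices are 0, 1 and 2
# (index 5 from the Boston example would raise an IndexError here).
feature_idx = 0
x_train, x_test, y_train, y_test = train_test_split(x_data, y_data, test_size=0.25, random_state=1001)
# Simple linear regression uses one feature, reshaped into a 2-D column
x_data_train = x_train[:, feature_idx].reshape(-1, 1)
y_data_train = y_train.reshape(-1, 1)
x_data_test = x_test[:, feature_idx].reshape(-1, 1)
y_data_test = y_test.reshape(-1, 1)
simple_model = LinearRegression()
simple_model.fit(x_data_train, y_data_train)       # fit on the training split
y_data_test_p = simple_model.predict(x_data_test)  # predict on the test split
plt.subplot(1, 1, 1)
plt.scatter(x_data_test, y_data_test, s=20, color="r")    # actual values
plt.scatter(x_data_test, y_data_test_p, s=20, color="b")  # predicted values
plt.xlabel(data.columns[feature_idx])  # axis labels taken from the CSV header
plt.ylabel(data.columns[-1])
plt.title('your_feature_name')
plt.show()
r_squared = simple_model.score(x_data_test, y_data_test)
print('R2')
print(r_squared)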
```
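Note that you need to replace "your_file.csv" and "your_feature_name" in the code with your own .csv file name and feature name. The code treats the last column of the file as the target and the remaining columns as features, so with four columns there are three feature columns and any feature column index must be 0, 1, or 2.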
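To check the column handling before pointing the code at a real file, the same steps can be wrapped in a small function and run on synthetic data. This is only a sketch: the function name `fit_single_feature`, the column names `a`, `b`, `c`, `target`, and the generated data are all hypothetical stand-ins, and it assumes, like the answer above, that the last column is the target.

```python
import io

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split


def fit_single_feature(csv_source, feature_idx=0):
    """Fit a one-feature linear regression on a CSV whose last column is the target.

    Returns the fitted model and the R^2 score on the held-out test split.
    """
    data = pd.read_csv(csv_source)
    n_features = data.shape[1] - 1  # every column except the target
    if not 0 <= feature_idx < n_features:
        raise IndexError(f"feature_idx must be between 0 and {n_features - 1}")
    x = data.iloc[:, [feature_idx]].to_numpy()  # 2-D array, shape (n_samples, 1)
    y = data.iloc[:, -1].to_numpy()
    x_train, x_test, y_train, y_test = train_test_split(
        x, y, test_size=0.25, random_state=1001
    )
    model = LinearRegression().fit(x_train, y_train)
    return model, model.score(x_test, y_test)


# Synthetic stand-in for a 4-column CSV (hypothetical column names)
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.normal(size=(200, 3)), columns=["a", "b", "c"])
demo["target"] = 3.0 * demo["a"] + 1.0 + rng.normal(scale=0.1, size=200)
csv_text = demo.to_csv(index=False)

model, r_squared = fit_single_feature(io.StringIO(csv_text), feature_idx=0)
print("slope:", model.coef_[0], "intercept:", model.intercept_, "R2:", r_squared)
```

For a real file, pass its path instead of the `io.StringIO` object, for example `fit_single_feature('your_file.csv', feature_idx=1)`. Selecting the column with `iloc[:, [feature_idx]]` keeps the result two-dimensional, so no `reshape(-1, 1)` is needed.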