import numpy as np
import statsmodels.api as stats  # assumed import; the snippet uses `stats` without showing it
from sklearn.metrics import mean_squared_error

train = housing.sample(frac=0.7)
test = housing[~housing.index.isin(train.index)]
mdl = stats.OLS.from_formula('price~sqft_living', train).fit()
ypred, ytrue = mdl.predict(test['sqft_living']), test['price']
frmse = np.sqrt(np.dot((ypred-ytrue).T, ypred-ytrue)/len(ytrue))
mdl_ = stats.OLS.from_formula('np.log(price)~np.log(sqft_living)', train).fit()
ypred, ytrue = np.exp(mdl_.predict(test['sqft_living'])), test['price']
frmse = np.sqrt(np.dot((ypred-ytrue).T, ypred-ytrue)/len(ytrue))
mean_squared_error(ytrue, ypred)**0.5

Posted: 2023-06-23 12:08:17 Views: 414

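This code fits a linear regression model and computes its root mean squared error (RMSE) on a test set. The steps are:

1. Randomly sample 70% of the original dataset as the training set; the remaining 30% becomes the test set.
2. Fit a linear regression model on the training set, with house price (price) as the dependent variable and living area (sqft_living) as the independent variable.
3. Use the fitted model to predict on the test set, keeping the predictions (ypred) and the true values (ytrue).
4. Compute the model's RMSE on the test set. The code shows two equivalent ways to do this: one uses NumPy's dot and sqrt functions, the other takes the square root of mean_squared_error from sklearn.metrics. In the snippet, the sklearn call appears only once, at the end, so it is applied to the second model's predictions.
5. Repeat steps 2 to 4 with the log-transformed dependent and independent variables. The predictions are passed through np.exp to return them to the price scale before the RMSE is computed, so the two RMSE values are in the same units and can be compared directly.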
