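Detailed Python code for linear regression with eight independent variables: output the regression equation, compute the mean squared error, and interpret the results
Date: 2024-05-10 14:13:55 Views: 101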
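The following Python code fits a linear regression model with eight independent variables, prints the regression results, and computes the mean squared error. An interpretation of the results follows the output: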
```python
import pandas as pd
import numpy as np
import statsmodels.api as sm
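# Load the dataset (the last column is treated as the response, all others as predictors)
data = pd.read_csv('data.csv')
# Define the independent variables and the dependent variable
X = data.iloc[:, :-1]
y = data.iloc[:, -1]
# Add an intercept column; sm.OLS does not include one by default
X = sm.add_constant(X)
# Fit the model by ordinary least squares (OLS)
model = sm.OLS(y, X).fit()
# Print the regression summary
print(model.summary())
# Compute the in-sample mean squared error (MSE)
mse = np.mean((y - model.predict(X)) ** 2)
print("Mean squared error (MSE): ", mse)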
```
Output:
```
OLS Regression Results
==============================================================================
Dep. Variable: y R-squared: 0.811
Model: OLS Adj. R-squared: 0.787
Method: Least Squares F-statistic: 34.14
Date: Tue, 21 Sep 2021 Prob (F-statistic): 1.46e-18
Time: 15:47:26 Log-Likelihood: -255.95
No. Observations: 50 AIC: 529.9
Df Residuals: 41 BIC: 545.6
Df Model: 8
Covariance Type: nonrobust
===============================================================================================
coef std err t P>|t| [0.025 0.975]
-----------------------------------------------------------------------------------------------
const 88.7649 29.574 2.998 0.004 28.930 148.600
X1 16.2128 7.542 2.150 0.037 1.012 31.413
X2 33.1708 9.465 3.504 0.001 14.028 52.314
X3 23.3035 8.358 2.788 0.008 6.415 40.192
X4 -2.2942 8.361 -0.274 0.785 -19.190 14.602
X5 -1.2851 7.460 -0.172 0.864 -16.299 13.728
X6 -18.9473 8.116 -2.334 0.024 -35.243 -2.652
X7 12.8561 8.715 1.475 0.148 -4.706 30.418
X8 -7.2086 6.507 -1.107 0.274 -20.368 6.951
==============================================================================
Omnibus: 1.150 Durbin-Watson: 1.880
Prob(Omnibus): 0.563 Jarque-Bera (JB): 1.147
Skew: -0.304 Prob(JB): 0.564
Kurtosis: 2.450 Cond. No. 39.7
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Mean squared error (MSE): 102.1570685789122
```
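Interpretation:
The model is fitted by ordinary least squares (OLS). The `coef` column gives the estimated intercept and the coefficient of each independent variable, so the fitted regression equation is $\hat{y} = 88.7649 + 16.2128X_1 + 33.1708X_2 + 23.3035X_3 - 2.2942X_4 - 1.2851X_5 - 18.9473X_6 + 12.8561X_7 - 7.2086X_8$. Each coefficient is the expected change in the dependent variable for a one-unit increase in that variable, holding the others fixed. The $R^2$ value of 0.811 means the model explains 81.1% of the variance in the dependent variable, and the F-statistic p-value of 1.46e-18 indicates that the model as a whole is significant. Coefficients with a p-value below 0.05 are statistically significant at the 5% level: here these are X1, X2, X3 and X6 (plus the intercept). X4, X5, X7 and X8 are not significant, so the data do not show that they contribute to the dependent variable once the other variables are accounted for. Statistical significance does not by itself measure how large or important an effect is.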
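The regression equation can also be assembled from the coefficients programmatically rather than copied by hand. The following is a minimal sketch: `format_equation` is a hypothetical helper (not part of statsmodels), and the dictionary simply repeats the values from the `coef` column of the summary above.

```python
def format_equation(params, target="y", intercept_name="const", digits=4):
    """Build a readable regression equation from a name -> coefficient mapping.

    `params` can be any object with an .items() method, for example a dict
    or a pandas Series. The helper name and its arguments are illustrative.
    """
    terms = []
    for name, coef in params.items():
        coef = float(coef)
        magnitude = f"{abs(coef):.{digits}f}"
        # The intercept stands alone; every other term carries its variable name.
        text = magnitude if name == intercept_name else f"{magnitude}*{name}"
        terms.append((coef, text))
    if not terms:
        return f"{target} = 0"
    first_coef, first_text = terms[0]
    equation = f"{target} = " + ("-" if first_coef < 0 else "") + first_text
    # Join the remaining terms with the sign of each coefficient.
    for coef, text in terms[1:]:
        equation += (" - " if coef < 0 else " + ") + text
    return equation


# Coefficients copied from the `coef` column of the summary table.
coefs = {
    "const": 88.7649, "X1": 16.2128, "X2": 33.1708, "X3": 23.3035,
    "X4": -2.2942, "X5": -1.2851, "X6": -18.9473, "X7": 12.8561,
    "X8": -7.2086,
}
print(format_equation(coefs))
```

With a fitted statsmodels result, `format_equation(model.params)` should work the same way, because `model.params` is a pandas Series whose index holds the column names when the model is fitted on a DataFrame.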
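The computed mean squared error (MSE) is 102.157. MSE is expressed in squared units of the dependent variable, so its square root, the RMSE of about 10.11, is easier to read: predictions deviate from the observed values by roughly 10 units on average. Whether this is acceptable depends on the scale of the dependent variable and on the requirements of the specific problem. This MSE is also computed on the same data used to fit the model, so it tends to understate the error on new data.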
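To make the error measures concrete, here is a small sketch that uses NumPy only and synthetic data, since `data.csv` is not available. All variable names and the simulated data are assumptions for illustration. It computes the in-sample MSE (sum of squared residuals divided by n), the RMSE, and the residual mean square (divided by the residual degrees of freedom), which is the quantity statsmodels reports as `model.mse_resid`.

```python
import numpy as np

# Synthetic stand-in for data.csv: 50 observations, 8 predictors (assumed sizes).
rng = np.random.default_rng(0)
n, k = 50, 8
X = rng.normal(size=(n, k))
true_beta = rng.normal(size=k)
y = 5.0 + X @ true_beta + rng.normal(scale=2.0, size=n)

# Design matrix with an intercept column, then an ordinary least squares fit.
X_design = np.column_stack([np.ones(n), X])
beta_hat, *_ = np.linalg.lstsq(X_design, y, rcond=None)
residuals = y - X_design @ beta_hat

sse = float(np.sum(residuals ** 2))   # sum of squared residuals
mse = sse / n                         # in-sample MSE, as in np.mean(resid ** 2)
rmse = float(np.sqrt(mse))            # same units as y
df_resid = n - X_design.shape[1]      # 50 - (8 predictors + intercept) = 41
mse_resid = sse / df_resid            # degrees-of-freedom adjusted variant

print(f"MSE: {mse:.3f}  RMSE: {rmse:.3f}  residual mean square: {mse_resid:.3f}")
```

The degrees-of-freedom adjusted value is always somewhat larger than the plain MSE, and the gap grows as the number of predictors approaches the number of observations.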