Studying the relationship between energy consumption and industrial development with a cointegration model in Python (code included)
Posted: 2023-12-06 20:41:01
First, import the required Python libraries: numpy, pandas, statsmodels, and matplotlib:
```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt
```
Next, read in the data. Here pandas reads a CSV file containing energy consumption and an industrial development index into a DataFrame:
```python
# read_csv already returns a DataFrame, so no extra pd.DataFrame() wrapper is needed
df = pd.read_csv('energy_industry.csv')
```
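The CSV file itself is not included with this walkthrough. For readers who want to run the later steps anyway, here is a minimal sketch that builds a stand-in dataset with the same two column names. The function name `make_synthetic_energy_industry`, the coefficients, and the noise levels are all assumptions chosen for illustration, not real data.

```python
import numpy as np
import pandas as pd


def make_synthetic_energy_industry(n_obs=60, seed=0):
    """Build a hypothetical two-column dataset that shares one stochastic trend."""
    rng = np.random.default_rng(seed)
    # Random walk with drift: the common non-stationary trend behind both series.
    trend = np.cumsum(0.5 + rng.normal(scale=1.0, size=n_obs))
    industrial = 100.0 + trend + rng.normal(scale=0.5, size=n_obs)
    # Energy use follows the index up to stationary noise,
    # so the pair is cointegrated by construction.
    energy = 20.0 + 2.0 * industrial + rng.normal(scale=1.0, size=n_obs)
    return pd.DataFrame(
        {"Energy Consumption": energy, "Industrial Index": industrial}
    )


synthetic_df = make_synthetic_energy_industry()
print(synthetic_df.head())
```

Saving it with `synthetic_df.to_csv('energy_industry.csv', index=False)` would let the rest of the code run unchanged, although any results then describe the simulated data only.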
We can then plot the data with Matplotlib to get a sense of its trends:
```python
plt.plot(df['Energy Consumption'], label='Energy Consumption')
plt.plot(df['Industrial Index'], label='Industrial Index')
plt.legend(loc='best')
plt.show()
```
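The chart shows how energy consumption and the industrial development index move over time. If the two lines trend together, that is a first hint of a possible long-run relationship, which still has to be tested formally.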
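Next, use the ADF (Augmented Dickey-Fuller) test to check each time series for stationarity. The null hypothesis is that the series has a unit root, i.e. is non-stationary. A non-stationary series needs to be differenced.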
```python
adf_test_energy = sm.tsa.stattools.adfuller(df['Energy Consumption'])
adf_test_industry = sm.tsa.stattools.adfuller(df['Industrial Index'])
print("ADF Test for Energy Consumption: p-value = ", adf_test_energy[1])
print("ADF Test for Industrial Index: p-value = ", adf_test_industry[1])
```
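If both p-values are below 0.05, the unit-root null is rejected and the series can be treated as stationary, with no differencing needed. Note, however, that cointegration analysis is meant for the opposite case: series that are non-stationary in levels but stationary after first differencing, i.e. I(1). If the p-values are above 0.05, difference the series and rerun the test to confirm this before moving on.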
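Next, run an OLS (Ordinary Least Squares) regression to estimate the relationship between the variables, with energy consumption as the dependent variable and the industrial development index as the independent variable.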
```python
Y = df['Energy Consumption']
X = df['Industrial Index']
X = sm.add_constant(X)
model = sm.OLS(Y, X).fit()
predictions = model.predict(X)
print(model.summary())
```
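A positive and statistically significant coefficient on the industrial index in the summary indicates a positive relationship between the two variables. With non-stationary series such a regression can be spurious, though, so the result needs to be backed up by a cointegration test.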
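Next, use the Johansen test to check whether the variables are cointegrated. Cointegrated variables share a long-run equilibrium relationship, which can be exploited when forecasting future values.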
```python
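from statsmodels.tsa.vector_ar.vecm import coint_johansen
# det_order=0 includes a constant term; k_ar_diff=1 uses one lagged difference
result = coint_johansen(df[['Energy Consumption', 'Industrial Index']].values, det_order=0, k_ar_diff=1)
print("Eigenvalues from largest to smallest:")
print(result.eig)
print("\nTrace statistics:")
print(result.lr1)
print("\nCritical values of trace statistic (90%, 95%, 99%):")
print(result.cvt)
print("\nMaximum eigenvalue statistics:")
print(result.lr2)
print("\nCritical values of maximum eigenvalue statistic (90%, 95%, 99%):")
print(result.cvm)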
```
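If the trace statistic for the null of no cointegration (r = 0) exceeds its 95% critical value while the statistic for r ≤ 1 does not, the Johansen test indicates exactly one cointegrating relationship between the variables.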
Finally, we can use a VAR (Vector Autoregression) model to forecast future values.
```python
from statsmodels.tsa.api import VAR
model = VAR(df[['Energy Consumption', 'Industrial Index']])
results = model.fit(maxlags=2, ic='aic')
lag_order = results.k_ar
print("Lag order:", lag_order)
# forecast() starts from the last lag_order observations, so slice by the selected order
forecast_input = df[['Energy Consumption', 'Industrial Index']].values[-lag_order:]
fc = results.forecast(y=forecast_input, steps=5)
df_forecast = pd.DataFrame(fc, columns=['Energy Consumption', 'Industrial Index'])
print(df_forecast)
```
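This gives forecasts of energy consumption and the industrial development index for the next five periods (five years if the data are annual).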
The complete code is given below:
```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt
from statsmodels.tsa.api import VAR
from statsmodels.tsa.vector_ar.vecm import coint_johansen

# Load the data (read_csv already returns a DataFrame)
df = pd.read_csv('energy_industry.csv')

# Plot both series
plt.plot(df['Energy Consumption'], label='Energy Consumption')
plt.plot(df['Industrial Index'], label='Industrial Index')
plt.legend(loc='best')
plt.show()

# ADF unit-root tests
adf_test_energy = sm.tsa.stattools.adfuller(df['Energy Consumption'])
adf_test_industry = sm.tsa.stattools.adfuller(df['Industrial Index'])
print("ADF Test for Energy Consumption: p-value = ", adf_test_energy[1])
print("ADF Test for Industrial Index: p-value = ", adf_test_industry[1])

# OLS regression of energy consumption on the industrial index
Y = df['Energy Consumption']
X = df['Industrial Index']
X = sm.add_constant(X)
model = sm.OLS(Y, X).fit()
predictions = model.predict(X)
print(model.summary())

# Johansen cointegration test
result = coint_johansen(df[['Energy Consumption', 'Industrial Index']].values, det_order=0, k_ar_diff=1)
print("Eigenvalues from largest to smallest:")
print(result.eig)
print("\nTrace statistics:")
print(result.lr1)
print("\nCritical values of trace statistic (90%, 95%, 99%):")
print(result.cvt)
print("\nMaximum eigenvalue statistics:")
print(result.lr2)
print("\nCritical values of maximum eigenvalue statistic (90%, 95%, 99%):")
print(result.cvm)

# VAR model and five-step forecast
model = VAR(df[['Energy Consumption', 'Industrial Index']])
results = model.fit(maxlags=2, ic='aic')
lag_order = results.k_ar
print("Lag order:", lag_order)
# forecast() starts from the last lag_order observations
forecast_input = df[['Energy Consumption', 'Industrial Index']].values[-lag_order:]
fc = results.forecast(y=forecast_input, steps=5)
df_forecast = pd.DataFrame(fc, columns=['Energy Consumption', 'Industrial Index'])
print(df_forecast)
```