Partial dependence plots and individual conditional expectation (ICE) plots for a multilayer perceptron regressor on the California housing dataset in Python
Date: 2025-01-01 14:37:45
To fit a multilayer perceptron (MLP) regressor in Python on the California Housing dataset and produce partial dependence plots (PDP) and individual conditional expectation (ICE) plots, follow the steps below:
First, install the required libraries, such as `pandas`, `numpy`, and `scikit-learn`, plus plotting libraries such as `matplotlib`, `seaborn`, or `plotly`.
1. Import the required libraries:
```python
import pandas as pd
import numpy as np
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPRegressor
import seaborn as sns
import plotly.express as px
```
2. Load and preprocess the California Housing dataset:
```python
data = fetch_california_housing()
X = data.data
y = data.target
# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Standardize the features (fit on the training set only)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
```
3. Create and train the MLP regressor:
```python
mlp = MLPRegressor(hidden_layer_sizes=(64,), max_iter=500, random_state=42)
mlp.fit(X_train, y_train)
```
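As an optional variation on the step above, the scaler and the regressor can be wrapped in a single pipeline, so that any later inspection plot can be drawn against unscaled feature values. This is a minimal sketch, not part of the original steps: the names `pipe`, `X_demo`, and `y_demo` are hypothetical, and the data is synthetic so the block runs without downloading the dataset.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (hypothetical): two features on very different scales.
rng = np.random.default_rng(0)
X_demo = np.column_stack([rng.uniform(0, 15, 300), rng.uniform(0, 1000, 300)])
y_demo = 0.4 * X_demo[:, 0] + 0.001 * X_demo[:, 1] + rng.normal(0, 0.1, 300)

# The pipeline standardizes inputs internally, so callers pass raw values.
pipe = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(64,), max_iter=500, random_state=42),
)
pipe.fit(X_demo, y_demo)

# Predictions take unscaled rows; the scaler is applied inside the pipeline.
preds = pipe.predict(X_demo[:5])
print(preds.shape)
```

With the real data, the same pattern would mean fitting `pipe` on the unscaled training split and passing `pipe` instead of `mlp` to the plotting calls.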
4. Compute and plot the partial dependence (PDP):
```python
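import matplotlib.pyplot as plt
from sklearn.inspection import PartialDependenceDisplay
# scikit-learn's inspection API is used here in place of pdpbox.
# X_train is a standardized NumPy array, so feature names are passed
# explicitly and the x-axes are in standardized units.
display = PartialDependenceDisplay.from_estimator(
    mlp, X_train, ['MedInc', 'AveRooms'],
    feature_names=data.feature_names,
    kind='average',  # average curve only, i.e. the PDP
)
display.figure_.suptitle('Partial dependence of predicted house value')
plt.show()
# Add other feature names to the list to plot them as well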
```
5. Compute and plot the individual conditional expectation (ICE) curves:
```python
from sklearn.inspection import PartialDependenceDisplay
# kind='both' draws one ICE curve per sampled row plus their average (the PDP).
# Only a random subsample of 100 rows is drawn to keep the plot readable.
display = PartialDependenceDisplay.from_estimator(
    mlp, X_train, ['MedInc', 'AveRooms'],
    feature_names=data.feature_names,
    kind='both',
    subsample=100,
    random_state=42,
)
display.figure_.suptitle('ICE curves and their average (PDP)')
plt.show()
```
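The code above shows how to analyze partial dependence and individual conditional expectation for an MLP regressor on the California Housing dataset. The PDP shows the average effect of a feature on the predicted house value, while the ICE curves show how the prediction for each individual sample changes as that feature varies.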
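To make the relationship between the two plots concrete, here is a minimal sketch of how ICE curves and the PDP can be computed by hand. It is an illustration only: `ice_curves` and `toy_predict` are hypothetical names, and `toy_predict` is a stand-in for a fitted model's `predict` method, chosen to contain an interaction between two features.

```python
import numpy as np

def ice_curves(predict, X, feature_idx, grid):
    """Return an (n_samples, n_grid) array: row i is sample i's ICE curve."""
    curves = np.empty((X.shape[0], len(grid)))
    for j, value in enumerate(grid):
        X_mod = X.copy()
        X_mod[:, feature_idx] = value  # force the feature to this grid value in every row
        curves[:, j] = predict(X_mod)
    return curves

# Hypothetical stand-in for a fitted model's predict method.
def toy_predict(X):
    return 2.0 * X[:, 0] + X[:, 0] * X[:, 1]

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 2))
grid = np.linspace(-2.0, 2.0, 9)

ice = ice_curves(toy_predict, X_demo, feature_idx=0, grid=grid)
pdp_curve = ice.mean(axis=0)  # the PDP is the pointwise average of the ICE curves
```

Because of the interaction term, each ICE curve here has a different slope, which the averaged PDP curve hides. With the model from the steps above, the same function could presumably be called with `mlp.predict` and `X_train`.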