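Use Python to run a linear regression analysis on the diabetes dataset bundled with sklearn, output the parameters of the linear regression equation, reduce the dataset to 2 dimensions with a dimensionality-reduction algorithm, and display the regression results graphically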
Time: 2024-10-19 17:11:46 Views: 88
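The following shows how to use Python and the scikit-learn library to run a linear regression analysis on the diabetes dataset and reduce it to 2 dimensions: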
```python
# Import the required libraries
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt
# Load the diabetes dataset (10 features, 1 continuous target)
diabetes = datasets.load_diabetes()
X = diabetes.data
y = diabetes.target
# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create and train the linear regression model
regressor = LinearRegression()
regressor.fit(X_train, y_train)
# Output the parameters of the linear regression equation
intercept = regressor.intercept_
coefficients = regressor.coef_
print(f"Intercept: {intercept}")
print("Coefficients (feature weights):", coefficients)
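# Reduce the features to 2 dimensions
pca = PCA(n_components=2)
principal_components = pca.fit_transform(X_train)
x_pca_1 = principal_components[:, 0]
y_pca_1 = principal_components[:, 1]
# The coefficients above belong to the 10 original features and cannot be
# applied to the principal components, so fit a second model on the 2 components
regressor_pca = LinearRegression().fit(principal_components, y_train)
# Evaluate the fitted plane on a grid covering the PCA space
grid_1, grid_2 = np.meshgrid(np.linspace(x_pca_1.min(), x_pca_1.max(), 50),
                             np.linspace(y_pca_1.min(), y_pca_1.max(), 50))
grid_points = np.column_stack([grid_1.ravel(), grid_2.ravel()])
grid_pred = regressor_pca.predict(grid_points).reshape(grid_1.shape)
# Plot the samples colored by target, with contour lines of the fitted plane
plt.figure(figsize=(10, 6))
points = plt.scatter(x_pca_1, y_pca_1, c=y_train, cmap='viridis', label='Training data')
plt.colorbar(points, label='Target value')
contours = plt.contour(grid_1, grid_2, grid_pred, colors='red')
plt.clabel(contours, fmt='%.0f')
plt.xlabel('PCA Component 1')
plt.ylabel('PCA Component 2')
plt.legend()
plt.title('Linear Regression on PCA-transformed Diabetes Dataset')
plt.show()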
```
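As a possible follow-up, the fitted parameters can be printed as a readable equation, and the 10-feature model can be compared with a model trained on the two principal components. The sketch below is only an illustration: `format_equation` is a hypothetical helper, not part of scikit-learn, and using the test-set R² from `score` for the comparison is an assumption about what is worth checking.

```python
from sklearn import datasets
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Same data and split as in the answer above
diabetes = datasets.load_diabetes()
X_train, X_test, y_train, y_test = train_test_split(
    diabetes.data, diabetes.target, test_size=0.2, random_state=42)

# Model on all 10 original features
model = LinearRegression().fit(X_train, y_train)


def format_equation(intercept, coefficients, names):
    """Hypothetical helper: render the fitted model as a readable equation."""
    terms = [f"{coef:+.2f}*{name}" for coef, name in zip(coefficients, names)]
    return f"y = {intercept:.2f} " + " ".join(terms)


equation = format_equation(model.intercept_, model.coef_, diabetes.feature_names)
print(equation)

# R^2 of the full model on the held-out test set
r2_full = model.score(X_test, y_test)

# Fit PCA on the training data only, then reuse it to transform the test data
pca = PCA(n_components=2).fit(X_train)
model_pca = LinearRegression().fit(pca.transform(X_train), y_train)
r2_pca = model_pca.score(pca.transform(X_test), y_test)

# Share of the feature variance kept by each of the two components
explained = pca.explained_variance_ratio_

print(f"R^2 with 10 features:     {r2_full:.3f}")
print(f"R^2 with 2 PCA components: {r2_pca:.3f}")
print("Explained variance ratio:", explained)
```

Comparing the two scores gives a rough sense of how much predictive information is lost by projecting onto two components, which helps when interpreting the 2D plot.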