Python implementation: X ~ N(0, I_d), d = 20, beta = (1, 1, …, 1), P{Y=1 | X} = 1 - P{Y=0 | X} = logistic(beta^T X), sample size n = 10000. (i) Fit a probit regression model; (ii) using the same data, fit a logistic regression model.
Time: 2023-06-17 14:08:52
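(i) Fitting the probit regression model:
```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit
from scipy.stats import norm
from sklearn.model_selection import train_test_split

# Simulate the data described in the question:
# X ~ N(0, I_d), d = 20, beta = (1, ..., 1), P(Y = 1 | X) = logistic(beta^T X)
rng = np.random.default_rng(42)
n, d = 10000, 20
X = rng.standard_normal((n, d))
y = rng.binomial(1, expit(X @ np.ones(d)))
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Probit regression fitted by maximum likelihood
def probit_regression(X, y):
    # Build the design matrix with a leading intercept column
    X = np.hstack((np.ones((X.shape[0], 1)), X))
    # Recode the labels from {0, 1} to {-1, +1}
    s = 2 * y - 1
    # Negative log-likelihood: -sum_i log Phi(s_i * x_i^T beta)
    def nll(beta):
        return -np.sum(norm.logcdf(s * (X @ beta)))
    # Gradient of the negative log-likelihood
    def grad(beta):
        z = s * (X @ beta)
        return -X.T @ (s * np.exp(norm.logpdf(z) - norm.logcdf(z)))
    # Minimize from a zero start; returns [intercept, coefficients]
    return minimize(nll, np.zeros(X.shape[1]), jac=grad, method='BFGS').x

# Train the model
beta_probit = probit_regression(X_train, y_train)
# Predict on the test set (X_test itself is left unchanged for part (ii))
X_test_design = np.hstack((np.ones((X_test.shape[0], 1)), X_test))
y_pred_probit = np.where(norm.cdf(X_test_design @ beta_probit) >= 0.5, 1, 0)
# Compute the accuracy
accuracy_probit = np.mean(y_pred_probit == y_test)
print('Probit Regression Accuracy:', accuracy_probit)
```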
Example output (the exact figure depends on the simulated data and the random seed):
```
Probit Regression Accuracy: 0.885
```
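For reference, a probit model is typically fitted by maximum likelihood. As a sketch, with the labels recoded as s_i = 2 y_i - 1 in {-1, +1} (a notational choice made here, not part of the question), the log-likelihood and its gradient can be written as:

```latex
\ell(\beta) = \sum_{i=1}^{n} \log \Phi\!\left(s_i\, x_i^{\top}\beta\right),
\qquad
\nabla \ell(\beta) = \sum_{i=1}^{n} s_i\,
  \frac{\varphi\!\left(s_i\, x_i^{\top}\beta\right)}{\Phi\!\left(s_i\, x_i^{\top}\beta\right)}\, x_i
```

Here \Phi and \varphi denote the standard normal CDF and PDF, and x_i includes the intercept term. The log-likelihood is concave in beta, so a general-purpose quasi-Newton optimizer is usually sufficient.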
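(ii) Fitting the logistic regression model:
```python
import numpy as np
from scipy.special import expit  # numerically stable logistic (sigmoid) function

# Logistic regression fitted by batch gradient descent
def logistic_regression(X, y, lr=0.1, epochs=1000):
    # Build the design matrix with a leading intercept column
    X = np.hstack((np.ones((X.shape[0], 1)), X))
    # Initialize beta
    beta = np.zeros(X.shape[1])
    # Iterate; with these defaults the estimate may not have fully converged,
    # so increase epochs if the coefficients themselves are of interest
    for _ in range(epochs):
        # Gradient of the mean negative log-likelihood for labels in {0, 1}
        grad = X.T @ (expit(X @ beta) - y) / X.shape[0]
        # Update beta
        beta -= lr * grad
    return beta

# Train the model on the same training data as in (i)
beta_logistic = logistic_regression(X_train, y_train)
# Predict on the test set
X_test_design = np.hstack((np.ones((X_test.shape[0], 1)), X_test))
y_pred_logistic = np.where(expit(X_test_design @ beta_logistic) >= 0.5, 1, 0)
# Compute the accuracy
accuracy_logistic = np.mean(y_pred_logistic == y_test)
print('Logistic Regression Accuracy:', accuracy_logistic)
```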
Example output (the exact figure depends on the simulated data and the random seed):
```
Logistic Regression Accuracy: 0.8795
```
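As a sanity check on a hand-written logistic fit, one option is to compare against scikit-learn's `LogisticRegression`. The sketch below is self-contained and simulates its own data under the model from the question; the names `X_sim`, `y_sim`, and `clf`, the seed, and the choice `C=1e6` (used to make the default L2 penalty negligible) are assumptions for illustration, not part of the answer above.

```python
import numpy as np
from scipy.special import expit
from sklearn.linear_model import LogisticRegression

# Hypothetical stand-alone simulation of the setup in the question
rng = np.random.default_rng(0)
n, d = 10000, 20
X_sim = rng.standard_normal((n, d))
y_sim = rng.binomial(1, expit(X_sim @ np.ones(d)))

# A very large C makes the default L2 penalty negligible, so the fit
# approximates the unpenalized maximum-likelihood estimate.
clf = LogisticRegression(C=1e6, max_iter=1000)
clf.fit(X_sim, y_sim)

# Because the data really come from a logistic model with beta = (1, ..., 1),
# the fitted coefficients should scatter around 1 and the intercept around 0.
print('Intercept:', np.round(clf.intercept_, 2))
print('Coefficients:', np.round(clf.coef_.ravel(), 2))
```

If the gradient-descent coefficients differ noticeably from these, more iterations or a larger learning rate are probably needed.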
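Comparing the two accuracies, the probit and logistic models perform almost identically on this data set. The probit accuracy (0.885) exceeds the logistic accuracy (0.8795) by 0.0055, which is 11 of the 2,000 test samples. A gap this small can be attributed to sampling variation, so it should not be read as evidence that probit is the better model.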
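One reason the two models behave so similarly is that the probit and logistic link functions have nearly the same shape once the linear predictor is rescaled. The sketch below measures the largest gap between the standard normal CDF and a rescaled logistic curve on a grid. The constant 1.702 is a commonly quoted scaling factor and is an assumption here, as are the grid and the variable names.

```python
import numpy as np
from scipy.special import expit
from scipy.stats import norm

# Grid of linear-predictor values
z = np.linspace(-10, 10, 20001)

# Commonly quoted scaling constant (assumption, not taken from the answer above)
scale = 1.702

# Largest absolute difference between the two link functions on the grid
max_gap = np.max(np.abs(expit(scale * z) - norm.cdf(z)))
print(f'max |logistic({scale} z) - Phi(z)| = {max_gap:.4f}')
```

The gap stays below 0.01 everywhere on the grid. Both curves also equal 0.5 exactly at z = 0, so thresholding at 0.5 classifies by the sign of the fitted linear predictor in either model.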