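Using the Scikit-learn library, perform logistic regression (on training data) and classification prediction (on test data). (1) Compute the prediction accuracy for 1000 samples at noise levels 0.1, 0.3, 0.5, 0.8, 1, and 1.5, and plot the prediction accuracy as a function of noise. (2) Plot the samples and the decision boundary for 1000 samples at a noise level of 0.3.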
Date: 2024-06-06 07:06:38
First, import the required libraries and functions:
```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
```
Next, generate the sample data and define the noise levels:
```python
np.random.seed(0)
n_samples = 1000
X = np.random.randn(n_samples, 2)
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # class label: 1 if x1 + x2 > 0, else 0
noises = [0.1, 0.3, 0.5, 0.8, 1, 1.5]
```
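The labels come from the noise-free rule `x1 + x2 > 0`, while the classifier only sees the perturbed features, so points close to that line can land on the other side once noise is added. As a rough reference for the accuracies measured later, the sketch below checks how often the same rule, applied to the noisy features, still matches the label. It also compares the result with the closed-form value `1 - arctan(noise) / pi`, which holds for this particular setup (standard normal features plus independent Gaussian noise). The names `rng`, `rule_accuracy`, and `expected_accuracy` are illustrative and not part of the original answer.

```python
import numpy as np

# Separate generator for this sketch; it does not affect np.random.seed(0) above.
rng = np.random.default_rng(0)
n_samples = 1000
X = rng.standard_normal((n_samples, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
noises = [0.1, 0.3, 0.5, 0.8, 1, 1.5]


def rule_accuracy(X, y, noise, rng):
    """Fraction of samples whose label survives when the rule sees noisy features."""
    X_noisy = X + noise * rng.standard_normal(X.shape)
    y_rule = (X_noisy[:, 0] + X_noisy[:, 1] > 0).astype(int)
    return float(np.mean(y_rule == y))


def expected_accuracy(noise):
    """Closed-form agreement probability for this Gaussian setup."""
    return 1.0 - np.arctan(noise) / np.pi


reference = {
    noise: (rule_accuracy(X, y, noise, rng), expected_accuracy(noise))
    for noise in noises
}
for noise, (empirical, expected) in reference.items():
    print(f"noise={noise}: empirical={empirical:.3f}, expected={expected:.3f}")
```

A logistic regression trained on the noisy features should land near these values, since a linear boundary cannot recover labels that the noise has already pushed across the line.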
Then define the training, prediction, and plotting functions:
```python
def train_and_predict(X_train, y_train, X_test, y_test):
    """Fit a logistic regression model and return its accuracy on the test set."""
    clf = LogisticRegression()
    clf.fit(X_train, y_train)
    return clf.score(X_test, y_test)


def plot_accuracy(noises, accuracies):
    """Plot test accuracy against noise level."""
    plt.plot(noises, accuracies)
    plt.title('Accuracy vs Noise')
    plt.xlabel('Noise')
    plt.ylabel('Accuracy')
    plt.show()


def plot_boundary(X, y, clf):
    """Shade the predicted class regions and overlay the samples."""
    # Build a grid that covers the samples with a small margin
    x_min, x_max = X[:, 0].min() - 0.5, X[:, 0].max() + 0.5
    y_min, y_max = X[:, 1].min() - 0.5, X[:, 1].max() + 0.5
    xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.1),
                         np.arange(y_min, y_max, 0.1))
    # Predict the class at every grid point and reshape back to the grid
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    plt.contourf(xx, yy, Z, alpha=0.4)
    plt.scatter(X[:, 0], X[:, 1], c=y, alpha=0.8)
    plt.title('Boundary')
    plt.xlabel('X1')
    plt.ylabel('X2')
    plt.show()
```
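`plot_boundary` shades the predicted regions on a grid, so the boundary appears only as the edge between two colors. For a two-feature logistic regression the boundary is the straight line `w1*x1 + w2*x2 + b = 0`, which can be read off the fitted `coef_` and `intercept_` attributes. The following is a minimal sketch under that assumption; `boundary_x2`, `x1_line`, and `x2_line` are hypothetical names, and the data is regenerated here only to keep the block self-contained.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
X_noisy = X + 0.3 * rng.standard_normal(X.shape)

clf = LogisticRegression().fit(X_noisy, y)
w1, w2 = clf.coef_[0]      # one weight per feature
b = clf.intercept_[0]      # bias term


def boundary_x2(x1):
    """Return x2 on the line w1*x1 + w2*x2 + b = 0 (assumes w2 != 0)."""
    return -(w1 * x1 + b) / w2


# Two end points are enough to draw a straight line across the data range
x1_line = np.array([X_noisy[:, 0].min(), X_noisy[:, 0].max()])
x2_line = boundary_x2(x1_line)
print(f"slope={-w1 / w2:.3f}, intercept={-b / w2:.3f}")
```

To overlay the line on the existing figure, `plt.plot(x1_line, x2_line, 'k--')` could be called before `plt.show()`. Since the labels were generated from `x1 + x2 > 0`, a slope close to -1 is the expected outcome.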
Finally, train, predict, and plot the results:
```python
accuracies = []  # test accuracy for each noise level
for noise in noises:
    # Perturb the features with Gaussian noise scaled by the current level
    X_noisy = X + noise * np.random.randn(n_samples, 2)
    X_train, X_test, y_train, y_test = train_test_split(X_noisy, y, test_size=0.3)
    accuracy = train_and_predict(X_train, y_train, X_test, y_test)
    accuracies.append(accuracy)
plot_accuracy(noises, accuracies)

# Train and predict on the samples with noise 0.3, then plot the samples and the decision boundary
X_noisy = X + 0.3 * np.random.randn(n_samples, 2)
X_train, X_test, y_train, y_test = train_test_split(X_noisy, y, test_size=0.3)
clf = LogisticRegression()
clf.fit(X_train, y_train)
plot_boundary(X_noisy, y, clf)
```
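Each accuracy above comes from a single random 70/30 split, so the curve can wobble from one run to the next. One possible way to smooth it, shown here as a sketch rather than as part of the original answer, is to replace the single split with 5-fold cross-validation via `cross_val_score` and plot the mean score. The name `mean_accuracies` and the choice of `cv=5` are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_samples = 1000
X = rng.standard_normal((n_samples, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
noises = [0.1, 0.3, 0.5, 0.8, 1, 1.5]

mean_accuracies = []
for noise in noises:
    X_noisy = X + noise * rng.standard_normal(X.shape)
    # Five train/test folds instead of one split; each score is a test accuracy
    scores = cross_val_score(LogisticRegression(), X_noisy, y, cv=5)
    mean_accuracies.append(float(scores.mean()))
    print(f"noise={noise}: mean accuracy={scores.mean():.3f} (std={scores.std():.3f})")
```

The resulting list can be passed to `plot_accuracy(noises, mean_accuracies)` in place of `accuracies`.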
Prediction accuracy as a function of noise:
![accuracy_vs_noise](https://img-blog.csdnimg.cn/20211203144306987.png)
Samples and decision boundary:
![boundary](https://img-blog.csdnimg.cn/2021120314433639.png)