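Logistic regression for binary classification. Use Python to apply logistic regression to the breast cancer dataset, make predictions, analyze the results, and plot them.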
Posted: 2024-02-03 13:14:03
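First, import the required libraries: numpy, pandas, sklearn, and matplotlib. (numpy and pandas are imported for convenience; the code that follows does not call them directly.)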
``` python
import numpy as np
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
import matplotlib.pyplot as plt
```
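Next, load the breast cancer dataset and split it into a training set (80%) and a test set (20%). Fixing `random_state` makes the split reproducible.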
``` python
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, test_size=0.2, random_state=42)
```
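Before fitting, it can help to check what the loader actually returned. The following is a minimal sketch that prints the shape of the feature matrix and the class balance. The `stratify=y` argument is an addition that is not part of the original split; it is included here only to show one way of keeping the class ratio similar in both subsets.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
X, y = data.data, data.target

# One row per sample, one column per numeric feature
print("Feature matrix shape:", X.shape)

# target_names is ['malignant', 'benign'], so label 0 = malignant and 1 = benign
counts = np.bincount(y)
class_counts = {str(name): int(n) for name, n in zip(data.target_names, counts)}
print("Class counts:", class_counts)

# Assumption: stratify=y is added for illustration; the original split does not use it
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
print("Train size:", X_train.shape[0], "Test size:", X_test.shape[0])
```

The classes are moderately imbalanced, which is worth keeping in mind when reading a plain accuracy score later on.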
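Then create and train the logistic regression model. The features are not scaled, so the default limit of 100 iterations usually triggers a `ConvergenceWarning`; `max_iter` is raised here to give the solver room to converge.
``` python
# Raise max_iter so the solver can converge on the unscaled features
model = LogisticRegression(max_iter=10000)
model.fit(X_train, y_train)
```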
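To make the model itself less of a black box: logistic regression computes a linear score z = w · x + b and passes it through the sigmoid function to obtain the probability of class 1. The sketch below recomputes that probability by hand and compares it with `predict_proba`. The use of `scipy.special.expit` as the sigmoid and the value `max_iter=10000` are choices made for this illustration.

```python
import numpy as np
from scipy.special import expit  # numerically stable sigmoid 1 / (1 + exp(-z))
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42
)

# Assumption: max_iter is raised so the solver has room to converge on unscaled data
model = LogisticRegression(max_iter=10000)
model.fit(X_train, y_train)

# Linear score z = w . x + b for every test sample
z = X_test @ model.coef_[0] + model.intercept_[0]

# sigmoid(z) is the modelled probability of class 1 (benign)
p_manual = expit(z)
p_sklearn = model.predict_proba(X_test)[:, 1]
print("Largest difference:", np.max(np.abs(p_manual - p_sklearn)))

# A probability above 0.5 corresponds to z > 0, which yields class 1
manual_pred = (z > 0).astype(int)
print("Agreement with model.predict:", np.mean(manual_pred == model.predict(X_test)))
```

The two probability vectors should agree up to floating-point error, which shows that the fitted `coef_` and `intercept_` fully describe the binary model.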
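Now evaluate the model on the test set. For a classifier, `score` returns the mean accuracy, that is, the fraction of test samples that are classified correctly.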
``` python
accuracy = model.score(X_test, y_test)
print("Test Accuracy:", accuracy)
```
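Accuracy alone does not show which kind of mistake the model makes, and for a medical dataset a missed malignant case is not equivalent to a false alarm. As a possible complement, the sketch below prints a confusion matrix and a per-class report using `sklearn.metrics`. The `max_iter=10000` setting is an assumption made for this example.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42
)

# Assumption: max_iter is raised so the solver has room to converge on unscaled data
model = LogisticRegression(max_iter=10000)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Rows are true classes, columns are predicted classes (order: malignant, benign)
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Precision, recall and F1 score for each class
print(classification_report(y_test, y_pred, target_names=data.target_names))
```

The diagonal of the matrix holds the correct predictions, so its sum divided by the total equals the accuracy reported by `model.score`.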
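Finally, use Matplotlib to draw a bar chart of the feature coefficients to see how each feature contributes to the prediction. Because the features are on very different scales, the raw coefficient magnitudes are not directly comparable; the features would need to be standardized before the bars can be read as a ranking of importance.
``` python
coef = model.coef_[0]
names = data.feature_names
plt.figure(figsize=(10, 6))
plt.bar(range(len(coef)), coef)
plt.xticks(range(len(coef)), names, rotation=90)
plt.ylabel("Coefficient")
plt.tight_layout()  # keep the rotated feature names inside the figure
plt.show()
```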
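Coefficients only become comparable across features when the features share a common scale. One way to get there, sketched below, is to standardize the inputs with `StandardScaler` inside a pipeline and then sort the coefficients by absolute value. The pipeline, the `max_iter=1000` setting, and the output filename `logreg_coefficients.png` are all assumptions made for this illustration and are not part of the original code.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42
)

# The scaler is fitted on the training data only, then applied to the test data
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
pipe.fit(X_train, y_train)
print("Test accuracy (standardized features):", pipe.score(X_test, y_test))

# make_pipeline names each step after its lowercased class name
coef = pipe.named_steps["logisticregression"].coef_[0]

# Ascending order by magnitude, so the largest bar ends up at the top of the chart
order = np.argsort(np.abs(coef))
names = data.feature_names[order].tolist()

fig, ax = plt.subplots(figsize=(8, 10))
ax.barh(names, coef[order])
ax.axvline(0, color="black", linewidth=0.8)
ax.set_xlabel("Coefficient (per standard deviation of the feature)")
ax.set_title("Logistic regression coefficients on standardized features")
fig.tight_layout()
fig.savefig("logreg_coefficients.png", dpi=150)  # hypothetical output filename
```

In this dataset label 1 means benign, so a positive bar pushes the prediction toward benign and a negative bar toward malignant. Replace `fig.savefig(...)` with `plt.show()` to view the figure interactively.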
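The complete code:
``` python
import numpy as np
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
import matplotlib.pyplot as plt

# Load the data and hold out 20% of the samples for testing
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, test_size=0.2, random_state=42)

# Train the model; max_iter is raised so the solver can converge on the unscaled features
model = LogisticRegression(max_iter=10000)
model.fit(X_train, y_train)

# Evaluate the mean accuracy on the test set
accuracy = model.score(X_test, y_test)
print("Test Accuracy:", accuracy)

# Plot the coefficient of each feature
coef = model.coef_[0]
names = data.feature_names
plt.figure(figsize=(10, 6))
plt.bar(range(len(coef)), coef)
plt.xticks(range(len(coef)), names, rotation=90)
plt.ylabel("Coefficient")
plt.tight_layout()  # keep the rotated feature names inside the figure
plt.show()
```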