编程实现对率回归,并给出西瓜数据集3.0、
时间: 2024-01-11 07:02:34 浏览: 109
机器学习基于python实现对数几率回归模型对西瓜、鸢尾花进行分类源码+项目说明+实验报告.zip
5星 · 资源好评率100%
好的,对率回归是一种二分类算法,可以用于解决分类问题。下面是对率回归的实现过程:
1. 数据预处理:读取数据集,将数据集分为训练集和测试集,对数据进行标准化处理。
2. 模型训练:使用训练集训练模型,得到模型参数。
3. 模型预测:使用测试集进行预测,计算预测准确率。
具体实现代码如下:
```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
class LogisticRegression:
def __init__(self, lr=0.01, num_iter=100000, fit_intercept=True, verbose=False):
self.lr = lr
self.num_iter = num_iter
self.fit_intercept = fit_intercept
self.verbose = verbose
def __add_intercept(self, X):
intercept = np.ones((X.shape[0], 1))
return np.concatenate((intercept, X), axis=1)
def __sigmoid(self, z):
return 1 / (1 + np.exp(-z))
def __loss(self, h, y):
return (-y * np.log(h) - (1 - y) * np.log(1 - h)).mean()
def fit(self, X, y):
if self.fit_intercept:
X = self.__add_intercept(X)
self.theta = np.zeros(X.shape[1])
for i in range(self.num_iter):
z = np.dot(X, self.theta)
h = self.__sigmoid(z)
gradient = np.dot(X.T, (h - y)) / y.size
self.theta -= self.lr * gradient
if(self.verbose == True and i % 10000 == 0):
z = np.dot(X, self.theta)
h = self.__sigmoid(z)
print(f'loss: {self.__loss(h, y)} \t')
def predict_prob(self, X):
if self.fit_intercept:
X = self.__add_intercept(X)
return self.__sigmoid(np.dot(X, self.theta))
def predict(self, X, threshold=0.5):
return self.predict_prob(X) >= threshold
# 读取数据集
data = pd.read_csv('watermelon_3.csv')
# 划分训练集和测试集
X_train, X_test, y_train, y_test = train_test_split(data.iloc[:, :-1], data.iloc[:, -1], test_size=0.2, random_state=0)
# 数据标准化处理
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
# 训练模型
model = LogisticRegression(lr=0.1, num_iter=300000)
model.fit(X_train, y_train)
# 预测结果
y_pred = model.predict(X_test)
# 计算准确率
accuracy = np.mean(y_pred == y_test)
print(f'Accuracy: {accuracy}')
```
阅读全文