Stochastic Gradient Descent with L2 Regularization: Python Code
Below is a Python implementation of linear regression trained with stochastic gradient descent and L2 regularization:
```python
import numpy as np
from sklearn.linear_model import SGDRegressor

# Generate a simple synthetic linear dataset: y = 4 + 3x + noise
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)

# SGDRegressor fits a linear model with stochastic gradient descent.
# penalty selects the regularization term, alpha sets its strength,
# and max_iter caps the number of passes over the training data.
sgd_reg = SGDRegressor(penalty='l2', alpha=0.1, max_iter=1000)
sgd_reg.fit(X, y.ravel())

print(sgd_reg.predict([[1.5]]))  # prediction for x = 1.5
print(sgd_reg.intercept_)        # learned intercept
print(sgd_reg.coef_)             # learned coefficient
```
Here, `penalty='l2'` selects L2 regularization, `alpha` sets the regularization strength, and `max_iter` caps the number of training iterations. The `fit` method trains the model and `predict` produces predictions for new data.
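For intuition about what `penalty='l2'` does, here is a minimal hand-rolled sketch (not part of the original answer) of one-sample SGD on the same kind of linear model, where the L2 penalty simply adds `alpha * w` to each gradient; the names `sgd_l2`, `lr`, and `n_epochs` are illustrative.

```python
import numpy as np

def sgd_l2(X, y, alpha=0.1, lr=0.01, n_epochs=50, seed=0):
    """Plain SGD for linear regression with an L2 penalty on the weights."""
    rng = np.random.default_rng(seed)
    Xb = np.c_[np.ones((X.shape[0], 1)), X]   # prepend a bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_epochs):
        for i in rng.permutation(len(y)):     # visit samples in random order
            xi, yi = Xb[i], y[i]
            grad = (xi @ w - yi) * xi          # gradient of the squared error
            grad[1:] += alpha * w[1:]          # L2 term (bias left unregularized)
            w -= lr * grad
    return w

X = 2 * np.random.rand(100, 1)
y = (4 + 3 * X + np.random.randn(100, 1)).ravel()
print(sgd_l2(X, y))   # roughly [intercept, slope]
```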
Related questions
Python code for logistic regression with L1 and L2 regularization using gradient descent
Below is Python code implementing logistic regression with L1 and L2 regularization using gradient descent:
```python
import numpy as np

class LogisticRegression:
    def __init__(self, lr=0.1, num_iter=1000, fit_intercept=True, regularization=None, lambda_=0.1):
        self.lr = lr
        self.num_iter = num_iter
        self.fit_intercept = fit_intercept
        self.regularization = regularization
        self.lambda_ = lambda_

    def __add_intercept(self, X):
        intercept = np.ones((X.shape[0], 1))
        return np.concatenate((intercept, X), axis=1)

    def __sigmoid(self, z):
        return 1 / (1 + np.exp(-z))

    def __loss(self, h, y):
        return (-y * np.log(h) - (1 - y) * np.log(1 - h)).mean()

    def __l1_regularization(self, w):
        return self.lambda_ * np.abs(w[1:]).sum()

    def __l2_regularization(self, w):
        return self.lambda_ * np.sum(w[1:] ** 2)

    def fit(self, X, y):
        if self.fit_intercept:
            X = self.__add_intercept(X)
        self.theta = np.zeros(X.shape[1])
        for i in range(self.num_iter):
            z = np.dot(X, self.theta)
            h = self.__sigmoid(z)
            if self.regularization == 'l1':
                # L1 regularization: add the subgradient lambda * sign(theta)
                # (note that the intercept theta[0] is also penalized here)
                grad = np.dot(X.T, (h - y)) / y.size + self.lambda_ * np.sign(self.theta)
            elif self.regularization == 'l2':
                # L2 regularization: add lambda * theta
                grad = np.dot(X.T, (h - y)) / y.size + self.lambda_ * self.theta
            else:
                grad = np.dot(X.T, (h - y)) / y.size
            self.theta -= self.lr * grad

    def predict_prob(self, X):
        if self.fit_intercept:
            X = self.__add_intercept(X)
        return self.__sigmoid(np.dot(X, self.theta))

    def predict(self, X, threshold=0.5):
        return self.predict_prob(X) >= threshold
```
Here, `lr` is the learning rate, `num_iter` the number of iterations, `fit_intercept` whether to fit an intercept, `regularization` the regularization method, and `lambda_` the regularization coefficient. In `fit`, the value of `regularization` determines whether the L1 or L2 penalty is added to the gradient: for L1 the subgradient is computed with `np.sign`, while for L2 the penalty on the squared parameters contributes a term proportional to the parameters themselves. `predict_prob` adds the intercept column to `X`, applies the sigmoid, and returns predicted probabilities; `predict` thresholds those probabilities to produce class labels.
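As a quick sanity check (not part of the original answer), the class can be exercised on a small synthetic two-class dataset; the variable names below are illustrative:

```python
import numpy as np

# Two Gaussian blobs with labels 0 and 1
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1, size=(50, 2)), rng.normal(1, 1, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)

model = LogisticRegression(lr=0.1, num_iter=2000, regularization='l2', lambda_=0.01)
model.fit(X, y)
print(model.theta)                      # learned parameters [intercept, w1, w2]
print((model.predict(X) == y).mean())   # training accuracy
```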
Python code for L2 regularization
### Implementing L2 Regularization in Python
Adding L2 regularization to a neural network is a common way to reduce overfitting. A term proportional to the sum of squared weights is added to the loss function, penalizing large weight values so that the model generalizes better[^4].
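Concretely, using the three-layer notation of the snippet below, the regularized cost adds a penalty on the squared Frobenius norms of the weight matrices to the usual cross-entropy term:

$$
J_{\text{regularized}} \;=\; \underbrace{-\frac{1}{m}\sum_{i=1}^{m}\Big( y^{(i)}\log a^{[3](i)} + \big(1-y^{(i)}\big)\log\big(1-a^{[3](i)}\big)\Big)}_{\text{cross-entropy cost}} \;+\; \underbrace{\frac{\lambda}{2m}\sum_{l=1}^{3}\big\lVert W^{[l]}\big\rVert_F^{2}}_{\text{L2 regularization cost}}
$$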
The following Python snippet shows how to incorporate L2 regularization into both the cost computation and the backpropagation step:
```python
import numpy as np

def compute_cost_with_regularization(A3, Y, parameters, lambd):
    """
    Implement the cost function with L2 regularization

    Arguments:
    A3 -- post-activation, output of forward propagation, of shape (output size, number of examples)
    Y -- "true" labels vector, of shape (output size, number of examples)
    parameters -- python dictionary containing parameters of the model
    lambd -- regularization hyperparameter, scalar

    Returns:
    cost -- value of the regularized loss function
    """
    m = Y.shape[1]
    W1 = parameters["W1"]
    W2 = parameters["W2"]
    W3 = parameters["W3"]

    # Cross-entropy part of the cost
    cross_entropy_cost = -np.sum(np.multiply(Y, np.log(A3)) + np.multiply(1 - Y, np.log(1 - A3))) / m
    # L2 penalty: (lambda / 2m) * sum of squared weights over all layers
    L2_regularization_cost = (lambd / (2 * m)) * (np.sum(np.square(W1)) + np.sum(np.square(W2)) + np.sum(np.square(W3)))

    cost = cross_entropy_cost + L2_regularization_cost
    return cost


def backward_propagation_with_regularization(X, Y, cache, lambd):
    """
    Implements the backward propagation of our baseline model to which we added an L2 regularization.

    Arguments:
    X -- input dataset, of shape (input size, number of examples)
    Y -- "true" labels vector, of shape (output size, number of examples)
    cache -- cache output from forward_propagation()
    lambd -- regularization hyperparameter, scalar

    Returns:
    gradients -- A dictionary with the gradients with respect to each parameter, activation and pre-activation variables
    """
    m = X.shape[1]
    (Z1, A1, W1, b1, Z2, A2, W2, b2, Z3, A3, W3, b3) = cache

    dZ3 = A3 - Y
    # Each dW picks up an extra (lambd/m) * W term from the L2 penalty
    dW3 = 1. / m * np.dot(dZ3, A2.T) + (lambd / m) * W3
    db3 = 1. / m * np.sum(dZ3, axis=1, keepdims=True)

    dA2 = np.dot(W3.T, dZ3)
    dZ2 = np.multiply(dA2, np.int64(A2 > 0))   # ReLU derivative
    dW2 = 1. / m * np.dot(dZ2, A1.T) + (lambd / m) * W2
    db2 = 1. / m * np.sum(dZ2, axis=1, keepdims=True)

    dA1 = np.dot(W2.T, dZ2)
    dZ1 = np.multiply(dA1, np.int64(A1 > 0))   # ReLU derivative
    dW1 = 1. / m * np.dot(dZ1, X.T) + (lambd / m) * W1
    db1 = 1. / m * np.sum(dZ1, axis=1, keepdims=True)

    gradients = {"dZ3": dZ3, "dW3": dW3, "db3": db3, "dA2": dA2,
                 "dZ2": dZ2, "dW2": dW2, "db2": db2, "dA1": dA1,
                 "dZ1": dZ1, "dW1": dW1, "db1": db1}
    return gradients
```
The code above provides two main functions: `compute_cost_with_regularization()` computes the L2-regularized cost, and `backward_propagation_with_regularization()` computes the backpropagation gradients with the extra L2 terms included[^1].
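To close the loop, here is a minimal sketch of how the returned gradients would be applied in a gradient-descent step, assuming the parameter/gradient dictionary layout used above; `update_parameters` and `learning_rate` are illustrative names, not part of the original.

```python
def update_parameters(parameters, gradients, learning_rate=0.01):
    """One gradient-descent step; the L2 effect is already baked into dW1..dW3."""
    for l in (1, 2, 3):
        parameters["W" + str(l)] -= learning_rate * gradients["dW" + str(l)]
        parameters["b" + str(l)] -= learning_rate * gradients["db" + str(l)]
    return parameters
```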