PGD attack: generating and saving adversarial examples for the MNIST dataset
Posted: 2023-05-29 18:02:43
PGD (Projected Gradient Descent) is an iterative gradient-based method for generating adversarial examples. The steps for running a PGD attack on the MNIST dataset are as follows:
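Concretely, a standard formulation of the PGD update repeats a signed-gradient ascent step on the loss $L$ and projects the result back onto the $L_{\infty}$ ball of radius $\epsilon$ around the original input $x$:

```latex
x^{(t+1)} = \Pi_{\|x' - x\|_\infty \le \epsilon}
\left( x^{(t)} + \alpha \cdot \operatorname{sign}\!\left( \nabla_{x} L\big(\theta, x^{(t)}, y\big) \right) \right)
```

Here $\alpha$ is the step size, $\theta$ the model parameters, and $\Pi$ the projection (implemented below by clipping).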
1. Load the MNIST dataset.
```python
import tensorflow as tf
from tensorflow.keras.datasets import mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255
```
2. Define the model.
```python
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# Train the model first: attacking an untrained model is meaningless
model.fit(x_train, y_train, epochs=3, batch_size=128)
```
3. Generate adversarial examples for the test samples.
```python
import numpy as np

def pgd_attack(model, x, y, epsilon=0.3, alpha=0.01, num_iter=40):
    # epsilon: maximum L_inf perturbation
    # alpha: step size per iteration
    # num_iter: number of iterations
    x = tf.convert_to_tensor(x, dtype=tf.float32)
    y = tf.convert_to_tensor(y)
    # Random start: initialize inside the epsilon ball around x
    x_adv = x + tf.random.uniform(tf.shape(x), -epsilon, epsilon)
    x_adv = tf.clip_by_value(x_adv, 0, 1)
    for i in range(num_iter):
        with tf.GradientTape() as tape:
            tape.watch(x_adv)
            logits = model(x_adv)
            loss = tf.keras.losses.sparse_categorical_crossentropy(y, logits)
        grad = tape.gradient(loss, x_adv)
        # Ascend the loss along the gradient sign
        x_adv = x_adv + alpha * tf.sign(grad)
        # Project back into the L_inf epsilon ball, then the valid pixel range
        x_adv = tf.clip_by_value(x_adv, x - epsilon, x + epsilon)
        x_adv = tf.clip_by_value(x_adv, 0, 1)
    return x_adv.numpy(), y.numpy()
```
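The two `tf.clip_by_value` calls above implement the projection step. A minimal NumPy sketch, with made-up toy values, shows their effect:

```python
import numpy as np

epsilon = 0.3
x = np.array([0.1, 0.5, 0.9])        # original pixel values
x_adv = np.array([0.6, 0.7, 1.4])    # perturbed point that drifted too far

# Project back into the L_inf ball of radius epsilon around x
x_adv = np.clip(x_adv, x - epsilon, x + epsilon)
# Then back into the valid pixel range [0, 1]
x_adv = np.clip(x_adv, 0.0, 1.0)
print(x_adv)  # [0.4 0.7 1. ]
```

After both clips, every pixel is within epsilon of the original and stays a valid image value.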
4. Attack the test set and save the adversarial examples.
```python
x_adv_list = []
y_adv_list = []
# Attack in batches rather than one sample at a time for speed
batch_size = 128
for i in range(0, len(x_test), batch_size):
    x_adv, y_adv = pgd_attack(model, x_test[i:i+batch_size], y_test[i:i+batch_size])
    x_adv_list.append(x_adv)
    y_adv_list.append(y_adv)
x_adv = np.concatenate(x_adv_list)
y_adv = np.concatenate(y_adv_list)
np.save('x_adv_pgd.npy', x_adv)
np.save('y_adv_pgd.npy', y_adv)
```
After saving the adversarial examples, you can load them and evaluate the model's performance on them with the following code.
```python
x_adv = np.load('x_adv_pgd.npy')
y_adv = np.load('y_adv_pgd.npy')
model.evaluate(x_adv, y_adv)
```