Iterative methods [9] apply the fast gradient update multiple times
with a small step size α. The iterative version
of FGSM (I-FGSM) can be expressed as:
x*_0 = x,   x*_{t+1} = x*_t + α · sign(∇_x J(x*_t, y)).   (3)
To make the generated adversarial examples satisfy the L∞ (or L2) bound,
one can clip x*_t into the ε-vicinity of x or simply set α = ε/T with T
being the number of iterations.
It has been shown that iterative methods are stronger white-
box adversaries than one-step methods at the cost of worse
transferability [10, 24].
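For concreteness, a minimal sketch of the iterative update in Eq. (3) follows, assuming a user-supplied function grad_loss(x, y) that returns ∇_x J(x, y) (the function name, the default ε, and the pixel-range clipping are our own choices, not specified in the papers cited above):

    import numpy as np

    def i_fgsm(x, y, grad_loss, eps=16/255., T=10):
        # Iterative FGSM (Eq. (3)) with alpha = eps / T, so the final
        # perturbation automatically satisfies the L_inf bound.
        alpha = eps / T
        x_adv = x.copy()
        for _ in range(T):
            g = grad_loss(x_adv, y)                     # assumed to return grad_x J(x, y)
            x_adv = x_adv + alpha * np.sign(g)
            x_adv = np.clip(x_adv, x - eps, x + eps)    # clip into the eps-vicinity of x
            x_adv = np.clip(x_adv, 0.0, 1.0)            # keep a valid pixel range (assumption)
        return x_adv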
Optimization-based methods [23] directly optimize the
distance between the real and adversarial examples sub-
ject to the misclassification of adversarial examples. Box-
constrained L-BFGS can be used to solve such a problem.
A more sophisticated way [1] is solving:
arg min_{x*}  λ · ‖x* − x‖_p − J(x*, y).   (4)
Since it directly optimizes the distance between an adversarial example
and the corresponding real example, there is no guarantee that the L∞ (L2)
distance is less than the required value. Like iterative methods,
optimization-based methods also lack efficacy in black-box attacks.
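As an illustrative sketch only (not the exact procedure of [1] or [23]), the objective in Eq. (4) with p = 2 can be minimized by plain gradient descent on x*; the values of λ, the step size, and the iteration count below are arbitrary, and grad_loss(x, y) is again a placeholder for ∇_x J(x, y):

    import numpy as np

    def optimization_attack(x, y, grad_loss, lam=0.1, lr=0.01, steps=100):
        # Gradient descent on  lam * ||x* - x||_2 - J(x*, y)  (Eq. (4) with p = 2).
        x_adv = x.copy()
        for _ in range(steps):
            diff = x_adv - x
            dist = np.linalg.norm(diff) + 1e-12          # avoid division by zero at the start
            grad_obj = lam * diff / dist - grad_loss(x_adv, y)
            x_adv = np.clip(x_adv - lr * grad_obj, 0.0, 1.0)
        return x_adv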
2.2. Defense methods
Among many attempts [13, 3, 15, 10, 24, 17, 11], adver-
sarial training is the most extensively investigated way to
increase the robustness of DNNs [5, 10, 24]. By injecting
adversarial examples into the training procedure, the adver-
sarially trained models learn to resist the perturbations in
the gradient direction of the loss function. However, they do
not confer robustness to black-box attacks due to the cou-
pling of the generation of adversarial examples and the pa-
rameters being trained. Ensemble adversarial training [24]
augments the training data with the adversarial samples pro-
duced not only from the model being trained, but also from
other hold-out models. Therefore, the ensemble adversari-
ally trained models are robust against one-step attacks and
black-box attacks.
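A schematic, hedged sketch of one adversarial-training step follows; make_adv and loss_grad are placeholder callables introduced here for illustration, and the equal clean/adversarial mix is an assumption rather than the exact recipe of [5, 10, 24]:

    def adversarial_training_step(params, x_batch, y_batch, make_adv, loss_grad, lr=0.1):
        # One SGD step on a mix of clean and adversarial examples.
        # make_adv(params, x, y): returns adversarial versions of x (e.g. via FGSM on the
        # current model, or on a hold-out model for ensemble adversarial training).
        # loss_grad(params, x, y): gradient of the training loss w.r.t. the parameters.
        x_adv = make_adv(params, x_batch, y_batch)
        g_clean = loss_grad(params, x_batch, y_batch)
        g_adv = loss_grad(params, x_adv, y_batch)
        return {k: params[k] - lr * 0.5 * (g_clean[k] + g_adv[k]) for k in params}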
3. Methodology
In this paper, we propose a broad class of momentum
iterative gradient-based methods to generate adversar-
ial examples, which can fool white-box models as well as
black-box models. In this section, we elaborate on the pro-
posed algorithms. We first illustrate how to integrate mo-
mentum into iterative FGSM, which induces a momentum
iterative fast gradient sign method (MI-FGSM) to generate
adversarial examples satisfying the L∞ norm restriction in
the non-targeted attack fashion. We then present several
methods on how to efficiently attack an ensemble of mod-
els. Finally, we extend MI-FGSM to the L2 norm bound and
targeted attacks, yielding a broad class of attack methods.
Algorithm 1 MI-FGSM
Input: A classifier f with loss function J; a real example x and its ground-truth label y;
Input: The size of perturbation ε; number of iterations T; decay factor µ.
Output: An adversarial example x* with ‖x* − x‖_∞ ≤ ε.
1: α = ε/T;
2: g_0 = 0; x*_0 = x;
3: for t = 0 to T − 1 do
4:    Input x*_t to f and obtain the gradient ∇_x J(x*_t, y);
5:    Update g_{t+1} by accumulating the velocity vector in the gradient direction as
          g_{t+1} = µ · g_t + ∇_x J(x*_t, y) / ‖∇_x J(x*_t, y)‖_1;   (6)
6:    Update x*_{t+1} by applying the sign gradient as
          x*_{t+1} = x*_t + α · sign(g_{t+1});   (7)
7: end for
8: return x* = x*_T.
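A minimal NumPy sketch of Algorithm 1 follows; grad_loss(x, y) is a placeholder for a framework-provided gradient of J with respect to the input, and the clipping to the valid pixel range is an extra practical detail not listed in the pseudocode:

    import numpy as np

    def mi_fgsm(x, y, grad_loss, eps=16/255., T=10, mu=1.0):
        # Algorithm 1 (MI-FGSM): accumulate L1-normalized gradients into a velocity
        # vector (Eq. (6)) and step in its sign direction (Eq. (7)).
        alpha = eps / T                       # line 1
        g = np.zeros_like(x)                  # line 2: g_0 = 0
        x_adv = x.copy()                      #         x*_0 = x
        for _ in range(T):                    # lines 3-7
            grad = grad_loss(x_adv, y)        # line 4: grad_x J(x*_t, y)
            g = mu * g + grad / (np.sum(np.abs(grad)) + 1e-12)   # Eq. (6)
            x_adv = x_adv + alpha * np.sign(g)                    # Eq. (7)
            x_adv = np.clip(x_adv, 0.0, 1.0)  # valid pixel range (extra, not in Alg. 1)
        return x_adv                          # line 8: x* = x*_T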
3.1. Momentum iterative fast gradient sign method
The momentum method [18] is a technique for accelerat-
ing gradient descent algorithms by accumulating a velocity
vector in the gradient direction of the loss function across
iterations. The memorization of previous gradients helps to
barrel through narrow valleys, small humps and poor local
minima or maxima [4]. The momentum method also shows
its effectiveness in stochastic gradient descent to stabilize
the updates [20]. We apply the idea of momentum to gener-
ate adversarial examples and obtain tremendous benefits.
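For reference, a bare-bones sketch of the classical momentum update described above (variable names are ours):

    def momentum_step(theta, v, grad, mu=0.9, lr=0.01):
        # Classical momentum for gradient descent: accumulate a velocity vector
        # in the gradient direction, then move the parameters along it.
        v_new = mu * v + grad
        theta_new = theta - lr * v_new
        return theta_new, v_new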
To generate a non-targeted adversarial example x* from a real example x,
which satisfies the L∞ norm bound, gradient-based approaches seek the
adversarial example by solving the constrained optimization problem
arg max_{x*} J(x*, y),   s.t.   ‖x* − x‖_∞ ≤ ε,   (5)
where ε is the size of adversarial perturbation. FGSM gen-
erates an adversarial example by applying the sign of the
gradient to a real example only once (in Eq. (1)) by the
assumption of linearity of the decision boundary around
the data point. In practice, however, the linear assumption may not hold
when the distortion is large [12], which makes the adversarial example
generated by FGSM “underfit” the model, limiting its attack ability. In contrast, it-
erative FGSM greedily moves the adversarial example in
the direction of the sign of the gradient in each iteration (in
Eq. (3)). Therefore, the adversarial example can easily drop
into poor local maxima and “overfit” the model, making it
unlikely to transfer across models.
In order to break such a dilemma, we integrate momen-
tum into the iterative FGSM for the purpose of stabiliz-
ing update directions and escaping from poor local max-
ima. Therefore, the momentum-based method retains the
transferability of adversarial examples when increasing it-