TABLE 1
The involved concepts and notations for equivariant convolutions (eConv) in the continuous and discrete domains, respectively.

| Concept | Continuous | Discrete |
|---|---|---|
| Input Image | r(x) | I |
| Transformation Group | O(2) | S |
| Group element/Index | A, B ∈ O(2) | A, B ∈ S |
| Feature Map | e(x, A) | F^A |
| Filter (Input) | ϕ^in(A⁻¹x) | Ψ̃^A |
| Filter (Intermediate) | ϕ^A(B⁻¹x) | Φ̃^{B,A} |
| Filter (Output) | ϕ^out(B⁻¹x) | Υ̃^B |
| eConv (Input) | Ψ[r] | Ψ̃ ⋆ I |
| eConv (Intermediate) | Φ[e] | Φ̃ ⋆ F |
| eConv (Output) | Υ[e] | Υ̃ ⋆ F |
convolution on digital images. Correspondingly, the continuous versions of I and F^A are denoted as the 2-dimensional functions r(x) and e(x, A), x ∈ ℝ². Although the continuous F-Conv cannot be directly utilized in practice, theoretical analysis in the continuous domain facilitates an easier understanding of the insights underlying F-Conv, while alleviating the complexity and approximation error caused by discretization.
In Section 4.3, we will formally introduce the discretized F-Conv, which is the practically available version. To avoid possible confusion caused by the different notations in the continuous and discrete domains, we list the major notations used in the two domains in Table 1.
4.2 Equivariant Convolution on Continuous Functions
For convolutional networks, the input image and intermediate feature map can be naturally modeled as functions defined in the continuous domain.⁷
Following the previous works [4], [5], we denote the orthogonal group by O(2), and consider equivariance on it.⁸ Formally, O(2) = {A ∈ ℝ^{2×2} | AᵀA = I_{2×2}}, which contains all rotation and reflection matrices. Without ambiguity, we use A to parameterize O(2). We further consider the Euclidean group E(2) = ℝ² ⋊ O(2) (⋊ denotes the semidirect product), whose elements are represented as (x, A). By restricting the domains of A and x, we can also use this representation to parametrize any subgroup of E(2), such as the rotation group.
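As a concrete sanity check of these definitions, the group structure above can be sketched numerically. The function names below (`rot`, `e2_compose`, `e2_act`) are our own illustrative choices, not notation from the paper:

```python
import numpy as np

def rot(theta):
    # An element of O(2) with det = +1 (a pure rotation matrix).
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def e2_compose(g1, g2):
    # Semidirect-product rule on E(2) = R^2 ⋊ O(2):
    # (x1, A1)(x2, A2) = (x1 + A1 x2, A1 A2).
    (x1, A1), (x2, A2) = g1, g2
    return (x1 + A1 @ x2, A1 @ A2)

def e2_act(g, p):
    # An element (x, A) moves a point p of the plane to A p + x.
    x, A = g
    return A @ p + x
```

One can check that composing then acting agrees with acting twice, i.e. `e2_act(e2_compose(g1, g2), p)` equals `e2_act(g1, e2_act(g2, p))`, which is exactly the semidirect-product law.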
We then model the network input image as a function defined on ℝ², denoted by r(x), and model the intermediate feature map as a function defined on E(2), denoted by e(x, A) (as shown in Table 1, r(x) and e(x, A) are the continuous versions of I and F^A in Fig. 4, respectively). A feature map e can also be viewed as a set of functions defined on ℝ² with infinitely many channels indexed by A. We denote the function spaces of r and e by C^∞(ℝ²) and C^∞(E(2)), respectively.⁹
⁷ In this paper, we only consider the 2D case, which can be easily generalized to the n-dimensional case.
⁸ S is a subgroup of O(2), and it is also regarded as the discretization of O(2) in this paper.
⁹ The smoothness of e means that the feature map e(x, A) is smooth with respect to x when A is fixed. For simplicity, we set the function space as C^∞(ℝ²); actually, in implementation, we only require that r ∈ C²(ℝ²). The requirement on e is the same.

Then, transformations on inputs and feature maps can be mathematically formulated. For an input r ∈ C^∞(ℝ²) and a transformation Ã ∈ O(2), Ã acts on r by

$$\pi^{\mathbb{R}}_{\tilde{A}}[r](x) = r(\tilde{A}^{-1}x), \quad \forall x \in \mathbb{R}^2. \tag{11}$$
For a feature map e ∈ C^∞(E(2)) and a transformation Ã ∈ O(2), Ã acts on e by

$$\pi^{E}_{\tilde{A}}[e](x, A) = e(\tilde{A}^{-1}x, \tilde{A}^{-1}A), \quad \forall (x, A) \in E(2). \tag{12}$$
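The two actions can be made concrete in the discrete setting. The sketch below assumes, purely for illustration, that S is the 90°-rotation subgroup C4 and that a discrete feature map is stored as an array of shape (|S|, H, W) with the group channel on axis 0; the function names are ours:

```python
import numpy as np

def act_on_input(r, k):
    # Discrete analogue of Eq. (11) for A~ = rotation by k*90 degrees:
    # pi[r](x) = r(A~^{-1} x) rotates the image grid.
    return np.rot90(r, k)

def act_on_feature(e, k):
    # Discrete analogue of Eq. (12): e has shape (|S|, H, W), group
    # channel a <-> rotation by a*90 degrees. The action rotates each
    # channel spatially (the A~^{-1} x part) and cyclically shifts the
    # group axis, since A~^{-1} A has index (a - k) mod |S|.
    return np.roll(np.rot90(e, k, axes=(1, 2)), k, axis=0)
```

Note that acting twice with k = 1 agrees with acting once with k = 2, mirroring the group law of C4.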
In this way, we can construct parameterized convolutions in the form of Eq. (9) for the input layer, intermediate (group-convolutional) layers, and the output layer, respectively. Note that the discrete versions of these convolutions are shown in Fig. 4 for easy understanding.
Input layer. We use Ψ to denote the convolution imposed on the input layer, which maps an input r ∈ C^∞(ℝ²) to a feature map defined on E(2). Specifically, for any (y, A) ∈ E(2), we define:

$$\Psi[r](y, A) = \int_{\mathbb{R}^2} \phi^{\mathrm{in}}\!\left(A^{-1}x\right) r(y - x)\, d\sigma(x), \tag{13}$$
where σ is a measure on ℝ² and ϕ^in is a parameterized filter defined by the formulation in (9). Note that (13) can be roughly viewed as convolving r(x) with a set of transformed filters, {ϕ^in(A⁻¹x) | A ∈ O(2)}.
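A minimal discrete sketch of Eq. (13) under the C4 (90°-rotation) subgroup follows. It uses a raw filter array rather than the parameterized form (9), and the names `conv2d_valid` and `lifting_conv` are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def conv2d_valid(img, f):
    # True 2D convolution ('valid' output size):
    # out(y) = sum_x f(x) * img(y - x), i.e. cross-correlation
    # with the doubly flipped filter.
    k = f.shape[0]
    ff = f[::-1, ::-1]
    out = np.zeros((img.shape[0] - k + 1, img.shape[1] - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + k, j:j + k] * ff)
    return out

def lifting_conv(img, f, n_rot=4):
    # Discrete analogue of Eq. (13): convolve the image with the set of
    # transformed filters {phi(A^{-1} x)} -- here the 4 rotated copies
    # of f -- producing one group channel per rotation.
    return np.stack([conv2d_valid(img, np.rot90(f, a)) for a in range(n_rot)])
```

The output stacks one planar response per group element, which is exactly the lifted feature map indexed by (y, A).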
Intermediate layers. We use Φ to denote the convolution on an intermediate layer, which maps a feature map e ∈ C^∞(E(2)) to another feature map defined on E(2). Specifically, for any (y, B) ∈ E(2), we define:

$$\Phi[e](y, B) = \int_{\mathbb{R}^2} \int_{O(2)} \phi^{A}\!\left(B^{-1}x\right) e(y - x, BA)\, d\sigma(x)\, d\nu(A), \tag{14}$$
where ν is a measure on O(2), A, B ∈ O(2) denote orthogonal transformations in the considered group, and ϕ^A indicates the filter with respect to the channel of the feature map indexed by A, which is parameterized by (9).
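Discretized to the C4 subgroup, Eq. (14) becomes the familiar group convolution: each output channel B sums, over A, a planar convolution of the rotated sub-filter ϕ^A(B⁻¹x) with the channel e(·, BA). The sketch below uses raw filter arrays and our own function names, not the paper's parameterization (9):

```python
import numpy as np

def conv2d_valid(img, f):
    # Plain 'valid'-mode true 2D convolution helper.
    k = f.shape[0]
    ff = f[::-1, ::-1]
    out = np.zeros((img.shape[0] - k + 1, img.shape[1] - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + k, j:j + k] * ff)
    return out

def group_conv(e, filt, n_rot=4):
    # Discrete analogue of Eq. (14): e has shape (n_rot, H, W), filt has
    # shape (n_rot, k, k) with one sub-filter phi^A per input channel.
    # For output channel B (index b): rotate phi^A by B (phi^A(B^{-1} x))
    # and pair it with the input channel BA (index (b + a) mod n_rot).
    outs = []
    for b in range(n_rot):
        acc = 0.0
        for a in range(n_rot):
            acc = acc + conv2d_valid(e[(b + a) % n_rot], np.rot90(filt[a], b))
        outs.append(acc)
    return np.stack(outs)
```

The simultaneous rotation of the sub-filter and cyclic shift of the input channel is what makes this layer equivariant rather than merely invariant.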
Output layer. We use Υ to denote the convolution on the final layer, which maps a feature map e ∈ C^∞(E(2)) to a function in C^∞(ℝ²). Specifically, for any y ∈ ℝ², we define:

$$\Upsilon[e](y) = \int_{\mathbb{R}^2} \int_{O(2)} \phi^{\mathrm{out}}\!\left(B^{-1}x\right) e(y - x, B)\, d\sigma(x)\, d\nu(B), \tag{15}$$
where B ∈ O(2) and ϕ^out is a parameterized filter.
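In the C4-discretized setting, Eq. (15) collapses the group axis: each channel e(·, B) is convolved with the correspondingly rotated filter and the results are summed, returning a planar function. As before, raw filter arrays and the function names are our illustrative assumptions:

```python
import numpy as np

def conv2d_valid(img, f):
    # Plain 'valid'-mode true 2D convolution helper.
    k = f.shape[0]
    ff = f[::-1, ::-1]
    out = np.zeros((img.shape[0] - k + 1, img.shape[1] - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + k, j:j + k] * ff)
    return out

def output_conv(e, f, n_rot=4):
    # Discrete analogue of Eq. (15): sum over the group channel B of the
    # convolution of e(., B) with the rotated filter phi_out(B^{-1} x),
    # mapping the feature map on E(2) back to a function on the plane.
    return sum(conv2d_valid(e[b], np.rot90(f, b)) for b in range(n_rot))
```

Because the rotated-filter pattern matches the group channel it is applied to, rotating the input feature map simply rotates the planar output, as Theorem 2 states for the continuous case.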
We now show that the above operators are equivariant under orthogonal transformations (O(2)), and describe how the outputs transform when the inputs are transformed. Specifically, inspired by the theoretical result presented in [5], we can deduce the following theorem.
Theorem 2. For r ∈ C^∞(ℝ²), e ∈ C^∞(E(2)) and Ã ∈ O(2), the following rules are satisfied:

$$\Psi\!\left[\pi^{\mathbb{R}}_{\tilde{A}}[r]\right] = \pi^{E}_{\tilde{A}}\left[\Psi[r]\right], \qquad \Phi\!\left[\pi^{E}_{\tilde{A}}[e]\right] = \pi^{E}_{\tilde{A}}\left[\Phi[e]\right], \qquad \Upsilon\!\left[\pi^{E}_{\tilde{A}}[e]\right] = \pi^{\mathbb{R}}_{\tilde{A}}\left[\Upsilon[e]\right], \tag{16}$$

where π^ℝ_Ã, π^E_Ã, Ψ, Φ and Υ are defined by (11), (12), (13), (14) and (15), respectively.
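The first rule of (16) can be checked numerically in the discrete setting. The self-contained sketch below assumes the C4 (90°-rotation) subgroup as the discretization of O(2), with raw filter arrays and our own function names; the other two rules can be verified analogously:

```python
import numpy as np

def conv2d_valid(img, f):
    # Plain 'valid'-mode true 2D convolution.
    k = f.shape[0]
    ff = f[::-1, ::-1]
    out = np.zeros((img.shape[0] - k + 1, img.shape[1] - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + k, j:j + k] * ff)
    return out

def lifting_conv(img, f, n_rot=4):
    # Discrete Eq. (13): one group channel per rotated filter copy.
    return np.stack([conv2d_valid(img, np.rot90(f, a)) for a in range(n_rot)])

def act_on_input(r, k):
    # Discrete Eq. (11) for a rotation by k*90 degrees.
    return np.rot90(r, k)

def act_on_feature(e, k):
    # Discrete Eq. (12): rotate spatially and shift the group axis.
    return np.roll(np.rot90(e, k, axes=(1, 2)), k, axis=0)

rng = np.random.default_rng(0)
r, f = rng.random((8, 8)), rng.random((3, 3))
lhs = lifting_conv(act_on_input(r, 1), f)    # Psi[pi[r]]
rhs = act_on_feature(lifting_conv(r, f), 1)  # pi[Psi[r]]
assert np.allclose(lhs, rhs)  # first rule of Eq. (16)
```

For 90° rotations the equality holds exactly (up to floating-point error), since no interpolation is needed on the pixel grid; for finer rotation angles a discretization error appears, which is precisely what the paper's filter parametrization is designed to control.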
Furthermore, since the convolution operators are nat-
urally translation equivariant, it is easy to verify that the