If this optimization problem can be solved, the minimizer x is by definition a texture image. However, there is no reason why this process should generate fair samples from the distribution p(x). In fact, the only reason why eq. (3) may not simply always return the same image is that the optimization algorithm is randomly initialized, the loss function is highly non-convex, and the search is local. Only because of this may eq. (3) land on different samples x on different runs.
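To make the procedure concrete, here is a minimal sketch of generation-by-optimization in PyTorch; the optimizer choice, step count, and the texture_loss callable (standing in for a differentiable L built from the filter bank) are assumptions, not the paper's implementation:

```python
import torch

def synthesize(texture_loss, size=(1, 3, 256, 256), steps=500, lr=0.05):
    """Minimize L(x) over image pixels, starting from random noise (eq. (3))."""
    x = torch.randn(size, requires_grad=True)  # random initialization
    opt = torch.optim.Adam([x], lr=lr)         # local search on a non-convex loss
    for _ in range(steps):
        opt.zero_grad()
        texture_loss(x).backward()
        opt.step()
    return x.detach()
```

Because each run starts from a fresh random x and only descends locally, repeated calls can land on different local minimizers, which is the sole source of sample variability noted above.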
Deep filter banks. Constructing a Julesz ensemble requires choosing a filter bank F. Originally, researchers considered the obvious candidates: Gaussian derivative filters, Gabor filters, wavelets, histograms, and similar [20, 16, 21]. More recently, the work of Gatys et al. [4, 5] demonstrated that much superior filters are automatically learned by deep convolutional neural networks (CNNs), even when trained for apparently unrelated problems such as image classification. In this paper, in particular, we choose for L(x) the style loss proposed by [4]. The latter is the distance between the empirical correlation matrices of deep filter responses in a CNN.¹

¹Note that such matrices are obtained by averaging local non-linear filters: these are the outer products of filters in a certain layer of the neural network. Hence, the style loss of Gatys et al. is in the same form as eq. (1).
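As a reference point, a sketch of this style loss in PyTorch: the Gram matrix below is exactly the empirical correlation of filter responses, an average of outer products over spatial positions, and the loss compares the matrices of the generated image and the texture example. The choice of layers and the unweighted sum over them are assumptions:

```python
import torch

def gram(features):
    """Empirical correlation of filter responses: the outer product of the
    per-position activation vectors, averaged over spatial positions.
    features: activations of shape (B, C, H, W) from one CNN layer."""
    b, c, h, w = features.shape
    f = features.reshape(b, c, h * w)
    return f @ f.transpose(1, 2) / (h * w)  # (B, C, C)

def style_loss(feats_x, feats_texture):
    """Squared Frobenius distance between Gram matrices, summed over layers."""
    return sum(((gram(fx) - gram(ft)) ** 2).sum()
               for fx, ft in zip(feats_x, feats_texture))
```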
Stylization. The texture generation method of Gatys et al. [4] can be considered as a direct extension of the texture generation-by-minimization technique (3) of Portilla and Simoncelli [16]. Later, Gatys et al. [5] demonstrated that the same technique can be used to generate an image that mixes the statistics of two other images, one used as a texture template and one used as a content template. Content is captured by introducing a second loss L_cont(x; x0) that compares the responses of deep CNN filters extracted from the generated image x and a content image x0. Minimizing the combined loss L(x) + L_cont(x; x0) yields impressive artistic images, where a texture, defining the artistic style, is fused with the content image x0.
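A sketch of this combined objective, reusing style_loss from above; the features extractor (returning a list of layer activations from a pretrained CNN), the single content layer, and the weight alpha balancing the two terms are assumptions:

```python
def content_loss(feats_x, feats_x0):
    """Squared distance between deep CNN responses of the generated image x
    and the content image x0 (a single deeper layer is typical)."""
    return ((feats_x - feats_x0) ** 2).sum()

def stylization_objective(x, x0, texture, features, alpha=1e-3):
    """Combined loss L(x) + L_cont(x; x0): style statistics come from the
    texture template, content from x0. features(img) -> list of activations."""
    fx, f0, ft = features(x), features(x0), features(texture)
    return style_loss(fx, ft) + alpha * content_loss(fx[-1], f0[-1])
```

Minimizing this objective over x with the optimization loop sketched earlier yields the stylized image.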
Feed-forward generator networks. For all its simplicity and efficiency compared to Markov sampling techniques, generation-by-optimization (3) is still relatively slow, and certainly too slow for real-time applications. Therefore, in the past few months several authors [8, 19] have proposed to learn generator neural networks g(z) that can directly map random noise samples $z \sim p_z = N(0, I)$ to a local minimizer of eq. (3). Learning the neural network g amounts to minimizing the objective

$$g = \operatorname*{argmin}_g \; \mathbb{E}_{z \sim p_z} \, \mathcal{L}(g(z)). \tag{4}$$
While this approach works well in practice, it shares the same important limitation as the original work of Portilla and Simoncelli: there is no guarantee that samples generated by g would be fair samples of the texture distribution (2). In practice, as we show in the paper, such samples tend in fact to be not diverse enough.

Both [8, 19] have also shown that similar generator networks work for stylization. In this case, the generator g(x0; z) is a function of the content image x0 and of the random noise z. The network g is learned to minimize the sum of the texture loss and the content loss:

$$g = \operatorname*{argmin}_g \; \mathbb{E}_{x_0 \sim p_{x_0},\, z \sim p_z} \left[ \mathcal{L}(g(x_0; z)) + \mathcal{L}_{\mathrm{cont}}(g(x_0; z);\, x_0) \right]. \tag{5}$$
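To ground objective (4), a minimal sketch of such a training loop; the generator architecture, noise shape, optimizer, and batch size are assumptions (the stylization variant of [8, 19] would additionally feed content images x0 to g):

```python
import torch

def train_generator(g, texture_loss, steps=10_000, batch=4, zshape=(3, 256, 256)):
    """Learn g by minimizing E_{z ~ p_z} L(g(z)), cf. eq. (4).
    g: feed-forward network mapping noise images to texture samples."""
    opt = torch.optim.Adam(g.parameters(), lr=1e-3)
    for _ in range(steps):
        z = torch.randn(batch, *zshape)  # z ~ p_z = N(0, I)
        loss = texture_loss(g(z))
        opt.zero_grad()
        loss.backward()
        opt.step()
    return g  # sampling is now a single forward pass g(z)
```

Note that nothing in this objective rewards mapping different z to different minimizers, which is precisely the diversity issue raised above.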