Real-Time Neural Video Style Transfer: Maintaining Consistency
"Real-Time Neural Style Transfer for Videos" is a research paper on applying deep learning to style transfer for real-time video. The authors, Haozhi Huang and colleagues from Tsinghua University and Tencent AI Lab, investigate how to perform video style transfer with feed-forward convolutional neural networks (CNNs) while maintaining temporal consistency between frames.

Image style transfer research has already demonstrated the potential of deep learning; this work extends the idea to video. The authors propose a feed-forward network architecture designed to stylize video frames quickly while keeping the content and style of adjacent frames coherent. To this end, they develop a hybrid loss function that combines the content information of the input frames, the style information of a given style image, and the temporal information of consecutive frames.

A key innovation is a novel two-frame synergic training mechanism that computes a temporal loss during training, optimizing the model for consistent stylization across consecutive frames. The network must learn not only to stylize each frame on its own, but also to carry the style coherently from one frame to the next, preserving the visual flow of the video.

Unlike directly applying an existing image style transfer method frame by frame, this approach is tailored to the characteristics of video data, which is of real practical value in real-time scenarios. It lets stylized video retain its artistic effect while looking natural and coherent, which matters for video content creation, film effects, and real-time video editing.

In short, the paper provides a new deep-learning framework for video style transfer that advances both speed and consistency, opening up new possibilities for creative expression and real-time processing of video content.
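To make the hybrid loss concrete, here is a minimal PyTorch sketch of the three terms. This is an illustration of the general recipe rather than the authors' implementation: the `vgg` and `warp` helpers, the choice of feature layers, and the loss weights are all assumptions made for this example.

```python
import torch
import torch.nn.functional as F

def gram_matrix(feat):
    # feat: (B, C, H, W) -> normalized Gram matrix (B, C, C), used for style.
    b, c, h, w = feat.size()
    f = feat.view(b, c, h * w)
    return torch.bmm(f, f.transpose(1, 2)) / (c * h * w)

def hybrid_loss(out_t, out_prev, frame_t, style_feats, vgg, warp,
                alpha=1.0, beta=10.0, gamma=100.0):
    """Content + style + temporal loss for one pair of consecutive frames.

    out_t, out_prev: stylized frames t and t-1 from the feed-forward network.
    frame_t:         input frame t (the content target).
    style_feats:     precomputed feature maps of the style image.
    vgg:             frozen feature extractor returning a list of feature
                     maps (hypothetical helper).
    warp:            warps the previous output to frame t via optical flow
                     (hypothetical helper).
    alpha/beta/gamma are illustrative weights, not the paper's values.
    """
    feats_out = vgg(out_t)
    feats_in = vgg(frame_t)

    # Content loss: match features of the input frame at one chosen layer.
    content_loss = F.mse_loss(feats_out[2], feats_in[2])

    # Style loss: match Gram matrices of the style image across layers.
    style_loss = sum(F.mse_loss(gram_matrix(fo), gram_matrix(fs))
                     for fo, fs in zip(feats_out, style_feats))

    # Temporal loss: the current output should agree with the previous
    # output warped to the current frame.
    temporal_loss = F.mse_loss(out_t, warp(out_prev))

    return alpha * content_loss + beta * style_loss + gamma * temporal_loss
```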
Real-Time Neural Style Transfer for Videos
Haozhi Huang†‡∗, Hao Wang‡, Wenhan Luo‡, Lin Ma‡, Wenhao Jiang‡, Xiaolong Zhu‡, Zhifeng Li‡, Wei Liu‡∗
†Tsinghua University  ‡Tencent AI Lab
∗Correspondence: huanghz08@gmail.com, wliu@ee.columbia.edu
Abstract
Recent research endeavors have shown the potential of using feed-forward convolutional neural networks to accomplish fast style transfer for images. In this work, we take one step further to explore the possibility of exploiting a feed-forward network to perform style transfer for videos and simultaneously maintain temporal consistency among stylized video frames. Our feed-forward network is trained by enforcing the outputs of consecutive frames to be both well stylized and temporally consistent. More specifically, a hybrid loss is proposed to capitalize on the content information of input frames, the style information of a given style image, and the temporal information of consecutive frames. To calculate the temporal loss during the training stage, a novel two-frame synergic training mechanism is proposed. Compared with directly applying an existing image style transfer method to videos, our proposed method employs the trained network to yield temporally consistent stylized videos which are much more visually pleasant. In contrast to the prior video style transfer method which relies on time-consuming optimization on the fly, our method runs in real time while generating competitive visual results.
1. Introduction
Recently, great progress has been achieved by applying deep convolutional neural networks (CNNs) to image transformation tasks, where a feed-forward CNN receives an input image, possibly equipped with some auxiliary information, and transforms it into a desired output image. This kind of task includes style transfer [12, 27], semantic segmentation [19], super-resolution [12, 7], colorization [11, 31], etc.
A natural way to extend image processing techniques to videos is to perform a certain image transformation frame by frame. However, this scheme inevitably brings temporal inconsistencies and thus causes severe flicker artifacts. The second row in Fig. 1 shows an example of directly applying the feed-forward network based image style transfer method of Johnson et al. [12] to videos. It can be observed that the zoom-in content marked by white rectangles is stylized into different appearances between two consecutive frames, therefore creating flicker artifacts. The reason is that slight variations between adjacent video frames may be amplified by the frame-based feed-forward network and thus result in obviously different stylized frames. In the literature, one solution to retain temporal coherence after video transformation is to explicitly consider temporal consistency during the frame generation or optimization process [18, 1, 14, 22]. While effective, these are case-specific methods and thus cannot be easily generalized to other problems. Among them, the method of Ruder et al. [22] is specifically designed for video style transfer. However, it relies on time-consuming optimization on the fly, and takes about three minutes to process a single frame even with pre-computed optical flows. Another solution to maintaining temporal consistency is to apply post-processing [15, 2]. A draw-

Figure 1: Video style transfer without and with temporal consistency. The first row displays two consecutive input frames and a given style image. The second row shows the stylized results generated by the method of Johnson et al. [12]. The zoom-in regions in the middle show that the stylized patterns are of different appearances between the consecutive frames, which creates flicker artifacts. The third row shows the stylized results of our method, where the stylized patterns maintain the same appearance.