Cascaded Deep Video Deblurring Using Temporal Sharpness Prior
Jinshan Pan, Haoran Bai, and Jinhui Tang
Nanjing University of Science and Technology
[Figure 1: (a) Input frame; (b) Kim and Lee [12]; (c) STFAN [32]; (d) EDVR [27]; (e) Ours; (f) Sharpness prior of (a).]
Figure 1. Deblurred result on a real challenging video. Our algorithm is motivated by the success of variational model-based methods. It exploits sharp pixels from adjacent frames via a temporal sharpness prior (see (f)) and restores sharp videos through a cascaded inference process. As our analysis shows, enforcing the temporal sharpness prior in a deep convolutional neural network (CNN) and learning the deep CNN in a cascaded inference manner make the network more compact and thus generate better deblurred results than both the CNN-based methods [27, 32] and the variational model-based method [12].
Abstract
We present a simple and effective deep convolutional
neural network (CNN) model for video deblurring. The pro-
posed algorithm consists of two main steps: it first estimates optical flow from intermediate latent frames with a deep CNN model and then restores the latent frames based on the estimated flow.
To better explore the temporal information from videos, we
develop a temporal sharpness prior to constrain the deep
CNN model to help the latent frame restoration. We de-
velop an effective cascaded training approach and jointly
train the proposed CNN model in an end-to-end manner. We
show that exploiting the domain knowledge of video deblurring makes the deep CNN model more compact and
efficient. Extensive experimental results show that the pro-
posed algorithm performs favorably against state-of-the-art
methods on the benchmark datasets as well as real-world
videos. The training code and test model are available at
https://github.com/csbhr/CDVD-TSP.
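To make the cascaded inference described above concrete, here is a minimal sketch of such a pipeline, assuming hypothetical flow_net, restore_net, and warp callables (an illustration only, not the released implementation):

def cascaded_deblur(blurred_frames, flow_net, restore_net, warp, num_stages=2):
    # Illustrative sketch of the cascaded pipeline described in the
    # abstract; flow_net, restore_net, and warp are hypothetical
    # stand-ins, not the authors' released modules.
    latent = list(blurred_frames)   # stage 0: start from the blurred inputs
    for _ in range(num_stages):
        restored = latent[:]        # boundary frames pass through unchanged
        for i in range(1, len(latent) - 1):
            # Estimate optical flow from intermediate latent frames,
            # not from the blurred inputs.
            flow_prev = flow_net(latent[i], latent[i - 1])
            flow_next = flow_net(latent[i], latent[i + 1])
            # Align both neighbors to frame i, then restore it.
            aligned_prev = warp(latent[i - 1], flow_prev)
            aligned_next = warp(latent[i + 1], flow_next)
            restored[i] = restore_net(aligned_prev, latent[i], aligned_next)
        latent = restored           # feed results into the next cascade stage
    return latent

Each cascade stage can reuse the same flow-estimation and restoration networks, which is consistent with the compactness claim above.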
1. Introduction
Video deblurring, as a fundamental problem in the vision
and graphics communities, aims to estimate latent frames
from a blurred sequence. As more videos are taken us-
ing hand-held and onboard video capturing devices, this
problem has attracted considerable research attention over the last
decade. The blur in videos is usually caused by camera
shake, object motion, and depth variation. Recovering la-
tent frames is highly ill-posed as only the blurred videos are
given.
To recover the latent frames from a blurred sequence,
conventional methods usually make assumptions on mo-
tion blur and latent frames [12, 2, 4, 11, 5, 29]. Among
these methods, the motion blur is usually modeled as op-
tical flow [12, 2, 5, 29]. The key to the success of these methods is to jointly estimate the optical flow and the latent frames under constraints from hand-crafted priors. These
algorithms are physically inspired and generate promising
results. However, the assumptions on motion blur and la-
tent frames usually lead to complex energy functions which
are difficult to solve.
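To see why, consider a representative joint energy from this family (an illustrative generic form, not the exact objective of any single method in [12, 2, 5, 29]):

\min_{\{l_i\},\{u_i\}} \sum_i \left( \| B_i - A(u_i)\, l_i \|_2^2 + \lambda\, \rho(\nabla l_i) + \mu\, \phi(\nabla u_i) \right),

where B_i is the observed blurred frame, A(u_i) denotes the blur operator induced by the optical flow u_i, and \rho and \phi are hand-crafted priors on the latent frames and the flow. Because the data term couples l_i and u_i and the priors are typically non-convex, such energies are usually minimized by alternating between flow estimation and deconvolution, with no guarantee of reaching a good solution.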
The deep convolutional neural network (CNN), as one of the most promising approaches, has been developed to
solve video deblurring. Motivated by the success of deep
CNNs in single image deblurring, Su et al. [24] concatenate
consecutive frames and develop a deep CNN based on an
encoder-decoder architecture to directly estimate the latent
frames. Kim et al. [13] develop a deep recurrent network to
recurrently restore latent frames by concatenating multi-
frame features. To better capture the temporal information,
Zhang et al. [31] develop spatial-temporal 3D convolutions
to help latent frame restoration. These methods perform
well when the motion blur is not significant and displace-
ment among input frames is small. However, they are less