A CNN Cascade Model for Quality Enhancement of Compressed Depth Images


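This paper, "A CNN Cascade for Quality Enhancement of Compressed Depth Images", proposes a convolutional neural network (CNN) cascade for enhancing the quality of compressed depth images. As demand for three-dimensional (3D) applications grows, transmitting depth images together with texture data is becoming increasingly important. Because each pixel of a depth image carries geometric information about the 3D scene, distortion introduced by compression can cause severe geometric distortion and degrade visual perception.

To address this problem, the work designs a strategy specifically for suppressing compression artifacts in depth images. The CNN learns to restore detail lost during compression, and the cascade design lets the model refine its output stage by stage, progressively improving depth image quality. Reflecting the characteristics of depth images, the network is trained with a weighted loss function that adaptively improves learning efficiency and accuracy. To cope with limited training data, the network is first trained on texture images and then fine-tuned on the target depth images. The trained network is intended to reduce jagged edges, blocking, and other common compression distortions when restoring compressed depth images.

The work offers a learning-based solution to the quality problem of compressed depth images, and is relevant to 3D visual communication, virtual reality, and augmented reality.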

A CNN Cascade for Quality Enhancement of Compressed Depth Images

Zhi JIN∗, Lei LUO†, Yi TANG∗, Wenbin ZOU∗, Xia LI∗

∗ College of Information Engineering, Shenzhen University, Shenzhen, P.R. China.
E-mail: jinzhi 126@163.com; wzouszu@sina.com

† College of Telecommunication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing, P.R. China.

Abstract—Transmitting depth images along with the corresponding textures enables a wide range of receiver-side 3D applications. Since each pixel of a depth image represents geometric information about the 3D scene, compression artifacts introduced during transmission lead to severe geometry distortions and visual perceptual degradation. To solve this problem, we propose a convolutional neural network (CNN) cascade for suppressing compression artifacts in depth images. Based on the characteristics of depth images, we further adopt a weighted loss function for network training, which adaptively improves learning efficiency and accuracy. Meanwhile, to overcome the problem of limited training data, we first train our network on textures and then fine-tune it on the target depth images. To the best of our knowledge, few works have applied CNNs to depth images for compression artifacts reduction (CAR). In extensive experiments, our proposed solution achieves higher quality for both reconstructed depth images and synthesized virtual views than the state-of-the-art methods.

Index Terms—Convolutional neural network, Depth images, JPEG compression, Compression artifacts reduction, Quality enhancement.

I. INTRODUCTION

Texture images paired with per-view depth images not only provide depth perception of real scenes but also support free navigation to other viewpoints through view synthesis techniques such as depth-image-based rendering (DIBR) [1]. However, compared with traditional 2D images, this format puts more pressure on the acquisition, storage and transmission units of multimedia systems, so image compression schemes are in high demand for both texture and depth images. On the one hand, lossy compression (e.g. JPEG [2]) has been widely employed in social media networks due to its high compression efficiency. On the other hand, any lossy compression inevitably degrades image quality. This is especially true for depth images, whose values represent 3D scene geometry: compression causes severe geometry distortions and visual perceptual degradation over discontinuous regions, such as blocking artifacts and blurring, which affect the quality of both the depth image itself and the synthesized views in stereoscopic image applications.
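To make the effect on depth discontinuities concrete, the following is a minimal sketch rather than anything taken from the paper. It assumes a simplified JPEG-like pipeline: an orthonormal 8x8 DCT followed by uniform quantization with a single step size. Real JPEG uses a per-frequency quantization table, a level shift, and entropy coding. The block contents, the step size, and the function names are illustrative assumptions.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    i = np.arange(n).reshape(1, -1)   # sample index (columns)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)        # DC row has its own normalization
    return m

def quantize_block(block, step):
    """Transform a square block, quantize its coefficients uniformly, and invert."""
    d = dct_matrix(block.shape[0])
    coeffs = d @ block @ d.T                     # forward 2D DCT
    coeffs_q = np.round(coeffs / step) * step    # uniform quantization (assumption)
    return d.T @ coeffs_q @ d                    # inverse 2D DCT

# Hypothetical depth block: far background (50) next to a near object (200).
block = np.full((8, 8), 50.0)
block[:, 5:] = 200.0

recon = quantize_block(block, step=80.0)
error = np.abs(recon - block)
print("max abs depth error in block:", error.max())
print("per-column mean error:", error.mean(axis=0).round(2))
```

Because a depth value encodes geometry, any nonzero error near the step moves 3D points to the wrong distance, which is how quantization of a block that straddles an object boundary ends up distorting synthesized views.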

Recently, many proposals have focused on denoising and super-resolution of depth images corrupted by estimation and acquisition noise. In terms of the methods adopted, they can be classified into filter-based, model-based and, currently the most popular, learning-based methods. Among filter-based methods, a typical representative is joint bilateral upsampling (JBU) [3], where the bilateral weights are based on guidance from textures. Starting from it, more complex and sophisticated filters have been proposed, for example the joint trilateral filter (JTF) [4]. Based on the structural similarity between textures and depth images, filter-based methods transfer the salient structure from the intensity image to the enhanced depth image, while for model-based methods, modeling the dependency between texture and depth images plays an important role, as in Markov random field (MRF) [5] and nonlocal means (NLM) [6] models. Motivated by the success of deep learning in object detection and classification, it has also been applied to low-level vision tasks. Dong et al. [7] proposed a 3-layer CNN for image super-resolution and, by adding one more layer, successfully reduced compression artifacts on textures [8]. Inspired by [7], Zhang et al. [9] proposed a light 3-layer convolutional network that uses texture assistance for depth denoising. They also proposed a weighted loss function to emphasize the influence of edges in depth images, which inspires our work as well.
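The idea of a loss that emphasizes edges can be sketched as follows. This excerpt does not give the weighting used in [9] or in this paper, so the gradient-magnitude weight map, the parameter `alpha`, and the function names below are assumptions chosen only to illustrate the principle: pixels near depth discontinuities contribute more to the training loss than pixels in flat regions.

```python
import numpy as np

def edge_weight_map(depth, alpha=4.0):
    """Per-pixel weights: 1 in flat regions, 1 + alpha at the strongest edge.

    The gradient-based definition is an illustrative assumption.
    """
    gy, gx = np.gradient(depth.astype(np.float64))
    mag = np.hypot(gx, gy)
    peak = mag.max()
    if peak > 0:
        mag = mag / peak          # normalize gradient magnitude to [0, 1]
    return 1.0 + alpha * mag

def weighted_mse(pred, target, weights):
    """Weighted mean squared error, normalized by the sum of the weights."""
    diff = pred.astype(np.float64) - target.astype(np.float64)
    return float(np.sum(weights * diff ** 2) / np.sum(weights))

# Hypothetical ground-truth depth with one vertical discontinuity.
target = np.zeros((8, 8))
target[:, 4:] = 100.0
weights = edge_weight_map(target)

# Two predictions with the same plain MSE: one wrong at the edge, one in a flat area.
pred_edge = target.copy()
pred_edge[0, 4] -= 10.0
pred_flat = target.copy()
pred_flat[0, 0] += 10.0

loss_edge = weighted_mse(pred_edge, target, weights)
loss_flat = weighted_mse(pred_flat, target, weights)
print("edge error loss:", loss_edge, "flat error loss:", loss_flat)
```

Both predictions have identical unweighted error, but the weighted loss penalizes the mistake at the discontinuity more heavily, which steers training toward preserving object boundaries.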

Compared with acquisition and estimation noise, artifacts caused by lossy compression are more complex, including not only noise but also blocking and blurring effects. Therefore, the reduction of compression artifacts is more challenging and more in demand. Xu et al. [10] presented a low-complexity adaptive depth truncation filter in which all edge pixels are replaced by a mean value in each block to reduce the artifacts in a compressed depth image. However, such direct region-based replacement often leads to distortions in non-flat regions, such as sloped or curved surfaces. Zhao et al. [11] proposed a fast candidate-value-based boundary filtering (CVBF) method to reduce the boundary distortions of compressed depth images.

Motivated by the above methods, we consider that learning-based methods would be helpful in extracting and mapping the hidden information in compressed depth images so as to reduce compression artifacts. Meanwhile, the majority of depth enhancement methods in the literature require assistance from textures; however, aligned textures cannot always be guaranteed to be accessible. With this concern, we introduce a cascaded fully convolutional network (FCN) which directly

VCIP 2017, Dec. 10 – 13, 2017, St Petersburg, U.S.A.
