2. Related works
As the focus of early research, single depth map enhancement extends color image super-resolution (SR) and does not use guidance from a corresponding color image. Xie et al. [38] enhance the LR depth map with a joint bilateral filter guided by an HR edge map, which is constructed from the edges of the LR depth map through MRF inference. Following the success of sparse coding in color image SR [36,47], Ferstl et al. [8] design the Markov Random Field (MRF) regularization term with an anisotropic diffusion tensor extracted from an HR edge map predicted by sparse coding. Xie et al. [39] perform single depth map SR and denoising simultaneously: they train robust coupled dictionaries with locality coordinate constraints, and the HR depth map is reconstructed from the sparse vectors over the learned dictionaries together with an adaptively regularized shock filter and an L0 gradient smoothness constraint. Recently, motivated by developments in color image SR based on deep convolutional neural networks (DCNNs) [20], Riegler et al. [34] propose a variational method for single depth map enhancement whose data term and regularization term are learned by a DCNN. Chen et al. [3] adopt a DCNN to predict a high-quality edge map from the LR depth map, and the HR depth map is reconstructed via MRF inference that embeds the predicted edge map. Song et al. [35] represent depth map super-resolution as a series of novel view synthesis sub-tasks. Such methods perform well for small up-sampling factors, e.g., 2× and 4×. However, as the up-sampling factor rises, their drawbacks become increasingly apparent because of the inherent limitation of relying on the depth map alone. To improve the performance, the corresponding HR color image is introduced to provide guidance. According to how they use training data, the existing color-guided methods can be classified into filter-based, optimization-based and learning-based categories. Filter-based and optimization-based methods explicitly exploit the co-occurrence of edges between the color image and the depth map via predefined models, while learning-based counterparts extract the guidance from the HR color image in a data-driven way. The following subsections review these three types of methods in turn.
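Several of the sparse-coding methods above, as well as those reviewed in Section 2.3, share the same reconstruction step: the LR patch is coded over an LR dictionary and the HR patch is synthesized by applying the same sparse code to a jointly trained HR dictionary. The following is a minimal sketch of that step, assuming coupled dictionaries D_lr and D_hr with column-normalized atoms have already been learned; the function names, the simple OMP coder and the sparsity level are illustrative and not taken from any cited method.

```python
import numpy as np

def omp(D, y, k, tol=1e-8):
    """Plain orthogonal matching pursuit: greedily sparse-code the signal y
    over dictionary D (columns assumed to have unit l2 norm)."""
    residual, support = y.astype(float), []
    coef, sol = np.zeros(D.shape[1]), np.zeros(0)
    for _ in range(k):
        if np.linalg.norm(residual) < tol:
            break
        support.append(int(np.argmax(np.abs(D.T @ residual))))   # best-matching atom
        sub = D[:, support]
        sol, *_ = np.linalg.lstsq(sub, y, rcond=None)             # re-fit on the support
        residual = y - sub @ sol
    coef[support] = sol
    return coef

def reconstruct_hr_patch(patch_lr, D_lr, D_hr, sparsity=5):
    """Coupled-dictionary synthesis: code the LR patch over D_lr and rebuild
    the (flattened) HR patch with the same sparse code applied to D_hr."""
    alpha = omp(D_lr, patch_lr.ravel(), sparsity)
    return D_hr @ alpha
```

In practice the reconstructed patches overlap and are averaged (or constrained for consistency, as in [16]) to form the full HR depth map.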
2.1. Filter-based methods
Filter-based methods use only local information and compute the depth value of each pixel independently. In the first such work, Kopf et al. [15] propose the Joint Bilateral Up-sampling (JBU) framework, which uses HR color edges to refine the LR depth edges through a bilateral filter. Many variants of JBU further improve the performance. Liu et al. [21] compute the filter weights from geodesic distances to preserve depth edges. Yang et al. [43] construct a cost volume over depth candidates by using JBU [15], and the coarsely up-sampled depth map is iteratively refined within this cost volume. He et al. [9] enhance the LR depth map by assuming a linear relation between corresponding patches of the output and the guidance image. Min et al. [27] use the joint histogram of depth candidates to up-sample the LR depth map. Barron and Poole [1] propose a fast bilateral solver which can be used for color-guided depth map enhancement.
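A minimal sketch of the JBU idea behind [15] and its variants is given below, written in plain Python for clarity. The parameter names, the Gaussian kernels and the nearest-neighbor handling of the guidance image are illustrative assumptions (the guidance is assumed to be a float image normalized to [0, 1]), not the exact formulation of any cited paper.

```python
import numpy as np

def joint_bilateral_upsample(depth_lr, color_hr, factor,
                             sigma_spatial=1.0, sigma_range=0.1, radius=2):
    """Joint bilateral up-sampling: each HR depth value is a weighted average
    of nearby LR depth samples, where the weights combine spatial proximity
    (measured on the LR grid) with color similarity from the HR guidance."""
    h_hr, w_hr = color_hr.shape[:2]
    h_lr, w_lr = depth_lr.shape
    depth_hr = np.zeros((h_hr, w_hr))

    for y in range(h_hr):
        for x in range(w_hr):
            yl, xl = y / factor, x / factor                 # HR pixel in LR coordinates
            num = den = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    qy, qx = int(round(yl)) + dy, int(round(xl)) + dx
                    if not (0 <= qy < h_lr and 0 <= qx < w_lr):
                        continue
                    # spatial weight on the LR grid
                    ws = np.exp(-((qy - yl) ** 2 + (qx - xl) ** 2)
                                / (2 * sigma_spatial ** 2))
                    # range weight from the HR color guidance
                    gy = min(int(qy * factor), h_hr - 1)
                    gx = min(int(qx * factor), w_hr - 1)
                    diff = color_hr[y, x] - color_hr[gy, gx]
                    wr = np.exp(-float(np.dot(diff, diff)) / (2 * sigma_range ** 2))
                    num += ws * wr * depth_lr[qy, qx]
                    den += ws * wr
            depth_hr[y, x] = num / max(den, 1e-12)
    return depth_hr
```

The variants above keep this weighted-averaging structure and mainly differ in how the weights or depth candidates are formed, e.g., geodesic weights [21], joint histograms [27] or cost volumes [43].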
2.2. Optimization-based methods
Optimization is widely used in many fields, e.g., human health [31], image search [29] and hash code learning [25]. For depth map enhancement, optimization-based methods introduce hand-crafted priors and compute the depth values of all pixels simultaneously. Compared with filter-based methods, they typically perform better at depth map de-noising. Diebel et al. [5] are the first to model depth map enhancement as a Markov Random Field (MRF) inference problem. Following this work, Park et al. [32] integrate edge, gradient and segmentation information from the HR color image to design the anisotropic affinities of the regularization term. Ferstl et al. [7] regularize the HR depth map with a second-order total generalized variation constraint guided by an anisotropic diffusion tensor extracted from the HR color image. Liu et al. [22] implicitly mitigate texture-copying artifacts and maintain depth edges by designing the regularization term with a robust M-estimator. Zuo et al. [49] explicitly evaluate the edge inconsistency between the color image and the depth map, which is further embedded into the MRF inference. By considering the structure of the depth map, Zuo et al. [50] compute the anisotropic affinities in a distance space built from minimum spanning trees, whose edge weights embed the edge inconsistency of [49]. Li et al. [18] propose a hierarchical global optimization framework in which the depth map is iteratively refined by a fast weighted least squares solver [26]. Yu et al. [46] propose intensity-guided depth up-sampling based on edge sparsity and weighted L0 gradient minimization. Beyond the MRF model, Yang et al. [41,42] propose color-guided depth map enhancement via an auto-regressive model.
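For reference, most of the MRF-based formulations above minimize a quadratic energy of the following generic form; the notation is illustrative and is not taken verbatim from any cited paper:

\[
E(D) = \sum_{p}\big(d_p - \tilde{d}_p\big)^2 + \lambda \sum_{p}\sum_{q \in \mathcal{N}(p)} w_{pq}\,\big(d_p - d_q\big)^2,
\qquad
w_{pq} = \exp\!\left(-\frac{\lVert I_p - I_q\rVert^2}{2\sigma_c^2}\right),
\]

where \(\tilde{d}_p\) is the coarsely up-sampled LR depth at pixel \(p\), \(\mathcal{N}(p)\) is its neighborhood, and \(I_p\) is the HR color value. The cited methods differ mainly in how the affinities \(w_{pq}\) and the regularizer are designed, e.g., anisotropic affinities [32], total generalized variation [7], robust M-estimators [22], edge-inconsistency weights [49,50] or weighted least squares solvers [18,26].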
2.3. Learning-based methods
As more and more RGB-D datasets become available, sparse coding, which has shown great success in low-level computer vision, has been introduced into color-guided depth map enhancement. In a pioneering work, Li et al. [19] jointly train three dictionaries for corresponding patches from the LR depth map, the HR depth map and the HR color image. The sparse vector is shared across the dictionaries to reconstruct each HR depth patch independently. Kwon et al. [16] further improve the performance with a multi-scale dictionary training scheme, where a consistency constraint is defined on overlapping patches in the reconstruction phase. In addition to the synthesis model of sparse coding [16,19], based on the analysis