CREStereo: A Cascaded Recurrent Network with Adaptive Correlation for Consumer-Grade Stereo Matching


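"CREStereo: Practical Stereo Matching via Cascaded Recurrent Network with Adaptive Correlation"

CREStereo is a deep-learning stereo matching method aimed at the accuracy problems that arise when processing real-world image pairs captured by consumer-level devices such as smartphones. The paper, "Practical Stereo Matching via Cascaded Recurrent Network with Adaptive Correlation", was written by researchers from Megvii Research, Tencent, and the University of Electronic Science and Technology of China.

Stereo matching is a core computer vision task: given a rectified image pair (typically a left and a right view), it computes the disparity between corresponding pixels, from which the 3D structure of the scene can be inferred. Convolutional neural networks (CNNs) have brought significant progress, but practical factors such as thin structures, non-ideal rectification, camera module inconsistencies, and various hard-case scenes remain challenging.

According to the abstract, the paper makes three contributions. First, a hierarchical network with recurrent refinement updates disparities in a coarse-to-fine manner, and a stacked cascaded architecture is used for inference, so that each stage refines the result of the previous one and fine depth details are better recovered. Second, an adaptive group correlation layer mitigates the impact of erroneous rectification, where matching pixels no longer lie exactly on the same image row. Third, a new synthetic dataset pays special attention to difficult cases so that the model generalizes better to real-world scenes.

The authors report that the method ranks first on both the Middlebury and ETH3D benchmarks, and they show predictions on images from the Holopix50K dataset that preserve high-quality details for fine-structured objects. The code is publicly available on GitHub.

In short, CREStereo offers a practical approach to real-world stereo matching, particularly on consumer-level devices, and may help advance stereo vision in applications such as autonomous driving, robot navigation, and augmented reality.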
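The effect of imperfect rectification on correlation-based matching can be illustrated with a small sketch. The code below is not CREStereo's adaptive group correlation layer; it is a simplified pure-Python illustration under stated assumptions. The helper names (`group_correlation`, `local_search`, `unit_vector`), the feature sizes, and the choice to sum the per-group correlations into one score are all hypothetical. It shows that when the right view is offset by one row, a search restricted to the same row cannot reach the true match, while a small 2D search window can.

```python
import random

def group_correlation(f_left, f_right, groups):
    """Split two feature vectors into `groups` chunks and return the mean
    dot product of each chunk (one correlation value per group)."""
    size = len(f_left) // groups
    return [
        sum(a * b for a, b in zip(f_left[g * size:(g + 1) * size],
                                  f_right[g * size:(g + 1) * size])) / size
        for g in range(groups)
    ]

def local_search(feat_left, feat_right, y, x, radius_y, radius_x, groups=2):
    """Score every offset (dy, dx) in a window around (y, x) of the right
    feature map and return the offset with the highest summed correlation."""
    h, w = len(feat_right), len(feat_right[0])
    best, best_score = None, float("-inf")
    for dy in range(-radius_y, radius_y + 1):
        for dx in range(-radius_x, radius_x + 1):
            yy, xx = y + dy, x + dx
            if not (0 <= yy < h and 0 <= xx < w):
                continue  # skip offsets that fall outside the feature map
            score = sum(group_correlation(feat_left[y][x], feat_right[yy][xx], groups))
            if score > best_score:
                best, best_score = (dy, dx), score
    return best

def unit_vector(channels):
    """Random feature vector normalised to unit length (toy stand-in for
    learned features)."""
    v = [random.uniform(-1.0, 1.0) for _ in range(channels)]
    norm = sum(a * a for a in v) ** 0.5
    return [a / norm for a in v]

random.seed(0)
H, W, C = 6, 10, 8
feat_left = [[unit_vector(C) for _ in range(W)] for _ in range(H)]
# Assumed misalignment: the match for left (y, x) sits at right (y + 1, x - 2),
# i.e. a disparity of 2 pixels plus a 1-row rectification error.
feat_right = [[feat_left[(y - 1) % H][(x + 2) % W] for x in range(W)]
              for y in range(H)]

offset_2d = local_search(feat_left, feat_right, 2, 5, radius_y=1, radius_x=3)
offset_1d = local_search(feat_left, feat_right, 2, 5, radius_y=0, radius_x=3)
print("2D window search:", offset_2d)
print("row-only search: ", offset_1d)
```

In this toy setup the 2D search recovers the offset `(1, -2)`, whereas the row-only search is forced to return some offset with `dy = 0`. A learned layer would typically pass the per-group correlations on to later network stages rather than summing them.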

Practical Stereo Matching via Cascaded Recurrent Network with Adaptive Correlation

Jiankun Li, Peisen Wang, Pengfei Xiong, Tao Cai, Ziwei Yan, Lei Yang, Jiangyu Liu, Haoqiang Fan, Shuaicheng Liu (3,1†)

1 Megvii Research, 2 Tencent, 3 University of Electronic Science and Technology of China

https://github.com/megvii-research/CREStereo

Figure 1. Examples of our predictions on images from Holopix50K [16] dataset. We show left images of the stereo pairs and their corresponding predicted disparities. Our results achieve high accuracy and exhibit high-quality details for fine-structured objects.

Abstract

With the advent of convolutional neural networks, stereo matching algorithms have recently gained tremendous progress. However, it remains a great challenge to accurately extract disparities from real-world image pairs taken by consumer-level devices like smartphones, due to practical complicating factors such as thin structures, non-ideal rectification, camera module inconsistencies and various hard-case scenes. In this paper, we propose a set of innovative designs to tackle the problem of practical stereo matching: 1) to better recover fine depth details, we design a hierarchical network with recurrent refinement to update disparities in a coarse-to-fine manner, as well as a stacked cascaded architecture for inference; 2) we propose an adaptive group correlation layer to mitigate the impact of erroneous rectification; 3) we introduce a new synthetic dataset with special attention to difficult cases for better generalizing to real-world scenes. Our results not only rank 1st on both Middlebury and ETH3D benchmarks, outperforming existing state-of-the-art methods by a notable margin, but also exhibit high-quality details for real-life photos, which clearly demonstrates the efficacy of our contributions.

Equal contribution. † Corresponding author.

1. Introduction

Stereo matching is a classical research topic of computer vision, the goal of which, given a pair of rectified images, is to compute the displacement between two corresponding pixels, namely "disparity" [34]. It plays an important role in many applications, including autonomous driving, augmented reality, simulated bokeh rendering and so forth.
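To make the definition of disparity concrete, the following is a minimal sketch of classical block matching on a rectified pair. It is not the method proposed in this paper; the function names (`sad`, `block_match_disparity`), the window size, and the synthetic test images are assumptions chosen for illustration. For each left-image pixel, the sketch searches along the same row of the right image and keeps the horizontal shift with the lowest sum of absolute differences.

```python
import random

def sad(left, right, y, x, d, half_win):
    """Sum of absolute differences between the left patch centred at (y, x)
    and the right patch centred at (y, x - d)."""
    total = 0.0
    for dy in range(-half_win, half_win + 1):
        for dx in range(-half_win, half_win + 1):
            total += abs(left[y + dy][x + dx] - right[y + dy][x - d + dx])
    return total

def block_match_disparity(left, right, max_disp=8, half_win=2):
    """Brute-force disparity search on a rectified grayscale pair.
    Border pixels without a full window or search range are left at 0."""
    h, w = len(left), len(left[0])
    disp = [[0] * w for _ in range(h)]
    for y in range(half_win, h - half_win):
        for x in range(half_win + max_disp, w - half_win):
            costs = [sad(left, right, y, x, d, half_win)
                     for d in range(max_disp + 1)]
            disp[y][x] = costs.index(min(costs))  # shift with the lowest cost
    return disp

# Synthetic rectified pair: a random texture seen with a constant disparity,
# so that left[y][x] == right[y][x - TRUE_DISP].
random.seed(0)
H, W, TRUE_DISP = 12, 32, 3
right_img = [[random.random() for _ in range(W)] for _ in range(H)]
left_img = [[row[(x - TRUE_DISP) % W] for x in range(W)] for row in right_img]

disparity = block_match_disparity(left_img, right_img)
print(disparity[6][16])
```

On this synthetic pair every interior pixel should recover the constant shift `TRUE_DISP`. Real image pairs are much harder: thin structures, textureless regions, and rows that are not perfectly aligned all break the assumptions of such a simple row-wise search.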

Recently, with the support of large synthetic datasets [5, 27, 46], convolutional neural network (CNN) based stereo matching methods have taken the accuracy of disparity estimation to a new height [8, 23, 44]. However, to make the algorithm truly practical in the scenario of everyday consumer photography, we are still faced with three major obstacles.

Firstly, it remains a complicated issue for most existing algorithms to precisely recover the disparity of fine image details, or thin structures such as nets and wire frames. The fact that consumer photos are being produced in higher resolutions only serves to worsen the problem. In computational bokeh, for instance, disparity error around fine details would result in degraded rendering results that are unpleasing to human perception [32]. Secondly, perfect rectification [24, 56] is hard to obtain for real-world stereo image pairs, as they are often produced by camera modules with
