contrast, our method is free from object segmentation and hence circumvents the difficulty associated with motion segmentation in a dynamic setting.
The template-based approach is yet another method for
deformable surface reconstruction. Yu et al. [40] proposed
a direct approach to capture dense, detailed 3D geometry
of generic, complex non-rigid meshes using a single RGB
camera. While it works for generic surfaces, the requirement of a template prevents its wider application to more general scenes. Wang et al. [41] introduced a template-
free approach to reconstruct a poorly-textured, deformable
surface. Nevertheless, its success is restricted to a single
deforming surface rather than the entire dynamic scene.
Varol et al. [42] reconstructed deformable surfaces via piecewise reconstruction, assuming overlapping patches to be consistent over the entire surface, but this too is limited to the reconstruction of a single deformable surface.
While the conceptual idea of our work appeared in ICCV 2017, this journal version provides (i) an in-depth realization of our overall optimization; (ii) qualitative comparisons with [1] and Video-PopUp [39], as well as a statistical comparison with a deep-learning method [43]; (iii) a comprehensive ablation study showing the importance of each term in the overall optimization; (iv) an extensive performance analysis showing the effect of varying the number of superpixels, the choice of k-nearest neighbors, the choice of dense optical flow algorithm, and the shape of the superpixels; and (v) a detailed discussion of the failure cases, the choice of the Euclidean metric for nearest-neighbor graph construction, and the limitations of our work with possible directions for improvement.
3 MOTIVATION AND CONTRIBUTION
The formulation proposed in this work is motivated by the following observations about dense structure from motion for a dynamic scene.
3.1 Object-level motion segmentation
To solve dense reconstruction of an entire dynamic scene from perspective images, the first step usually practiced is to perform object-level motion segmentation to infer distinct motion models for the multiple rigidly moving objects in the scene. As alluded to before, dense segmentation of moving objects in a dynamic scene is in itself a challenging task. Moreover, non-rigidly moving objects may themselves be composed of a union of distinct motion models. Therefore, object-level segmentation built upon the assumption of per-object rigid motion will fail to describe a general dynamic scene. This motivates us to develop an algorithm that can recover a dense, detailed 3D model of a complex dynamic scene from its two perspective images, without object-level motion segmentation as an essential intermediate step.
3.2 Separate treatment for rigid SfM and non-rigid SfM
Our investigation shows that algorithms for deformable object 3D reconstruction often differ from those for rigidly moving objects. Not only the solutions, but even the assumptions vary significantly, e.g., orthographic projection and low-rank shape [11] [12] [13] [15]. The reason for such inadequacy is perfectly valid due to the under-constrained nature of the problem itself. This motivated us to develop an algorithm that can provide “3D reconstruction of the entire dynamic scene and the non-rigidly deforming objects under similar assumptions and formulation.”
Although accomplishing this goal for arbitrary non-rigid deformation remains an open problem, experiments suggest that our framework, under the aforementioned assumptions about the scene and the deformation, can reconstruct a general dynamic scene irrespective of the scene rigidity type. This is thanks to recent advances in dense optical flow algorithms [44] [45], which can reliably capture smooth non-rigid deformation over frames. These robust dense optical flow algorithms allow us to exploit the local motion of deforming surfaces. Thus, our formulation is able to bridge the gap between rigid and non-rigid SfM.
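To make this concrete, below is a minimal sketch of how a dense flow field between two consecutive frames can be obtained. It uses OpenCV's Farneback method purely as an illustrative stand-in for the flow algorithms of [44] [45]; the file names and parameter values are assumptions, not the settings used in our experiments.

```python
import cv2

def dense_flow(img_ref, img_next):
    """Per-pixel displacement field from the reference image to the next image."""
    gray_ref = cv2.cvtColor(img_ref, cv2.COLOR_BGR2GRAY)
    gray_next = cv2.cvtColor(img_next, cv2.COLOR_BGR2GRAY)
    # Farneback dense flow: flow[y, x] = (dx, dy) for pixel (x, y) of the reference.
    flow = cv2.calcOpticalFlowFarneback(gray_ref, gray_next, None,
                                        0.5, 4, 21, 3, 7, 1.5, 0)
    return flow

# Hypothetical usage: every pixel of the reference image receives a correspondence
# in the next image, which is what lets us exploit local motion per surface patch.
# I      = cv2.imread("frame_0.png")
# I_next = cv2.imread("frame_1.png")
# flow   = dense_flow(I, I_next)
# x_next = (x_ref + flow[y_ref, x_ref, 0], y_ref + flow[y_ref, x_ref, 1])
```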
The main contributions of our work are as follows:
1) A framework for dense 3D reconstruction of a complex dynamic scene that dispenses with object-level motion segmentation.
2) A common framework for dense two-frame 3D recon-
struction of a complex dynamic scene (including de-
formable objects), which achieves state-of-the-art per-
formance.
3) A new idea to resolve the inherent relative scale ambiguity problem in monocular 3D reconstruction by exploiting the as-rigid-as-possible (ARAP) constraint [46]; an illustrative sketch of this idea follows this list.
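The intuition behind contribution 3) can be caricatured as follows: each superpixel's local reconstruction carries an unknown scale, and an ARAP-style term asks neighboring reconstructions to preserve inter-point distances across the two frames once those scales are applied, which couples the scales together. The sketch below is only an illustration under these assumptions; the variable names, the use of anchor points, and the simple distance residual are ours, not the paper's exact energy.

```python
import numpy as np

def arap_residuals(X_ref, X_next, scales, neighbors):
    """Illustrative ARAP-style residuals between neighboring superpixels.

    X_ref[i], X_next[i]: 3D anchor point of superpixel i in the reference and
    next frame, each local reconstruction known only up to the scale scales[i].
    neighbors: list of (i, j) index pairs from a nearest-neighbor graph.
    """
    res = []
    for i, j in neighbors:
        d_ref = np.linalg.norm(scales[i] * X_ref[i] - scales[j] * X_ref[j])
        d_next = np.linalg.norm(scales[i] * X_next[i] - scales[j] * X_next[j])
        # As-rigid-as-possible: the distance between neighbors should be preserved
        # between the two frames, which constrains the ratio of their scales.
        res.append(d_next - d_ref)
    return np.asarray(res)

# Minimizing the sum of squared residuals over the per-superpixel scales
# (with one scale fixed, e.g. scales[0] = 1, to remove the remaining global
# ambiguity) ties the otherwise independent local reconstructions together.
```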
4 OUTLINE OF THE ALGORITHM
Before providing the details of our algorithm, we would
like to introduce some common notations that are used
throughout the paper.
4.1 Notation
We represent two consecutive images as $I, I' : \Omega \rightarrow \mathbb{R}^3$, $\Omega \subset \mathbb{Z}^2$, also referred to as the reference image and the next image respectively. Vectors are represented by bold lower-case letters, such as '$\mathbf{x}$', and matrices are represented by bold upper-case letters, such as '$\mathbf{X}$'. The subscripts '$a$' and '$b$' denote an anchor point and a boundary point respectively; e.g., $\mathbf{x}_{a_i}$, $\mathbf{x}_{b_i}$ represent an anchor point and a boundary point corresponding to the $i^{th}$ superpixel in the image space. The 1-norm and 2-norm of a vector are denoted as $\|\cdot\|_1$ and $\|\cdot\|_2$ respectively. For matrices, the Frobenius norm is denoted as $\|\cdot\|_F$.
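As a concrete reference for this notation (purely illustrative, not part of the method), the corresponding norms in numpy are:

```python
import numpy as np

x = np.array([3.0, -4.0])             # a vector, bold 'x' in the text
X = np.array([[1.0, 2.0],
              [3.0, 4.0]])            # a matrix, bold 'X' in the text

norm_1 = np.linalg.norm(x, ord=1)     # ||x||_1 = 7.0
norm_2 = np.linalg.norm(x, ord=2)     # ||x||_2 = 5.0
norm_f = np.linalg.norm(X, ord='fro') # ||X||_F = sqrt(30) ≈ 5.48
```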
4.2 Overview
We first over-segment the reference image into superpixels, then model the deformation of the scene by a union of piecewise rigid motions of these superpixels. Specifically, we divide the overall non-rigid reconstruction into a local rigid reconstruction of each superpixel, followed by an assembly process that glues all these individual local reconstructions together in a globally coherent manner. A minimal sketch of the over-segmentation step is given below.
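The sketch assumes scikit-image's SLIC as the superpixel algorithm and an illustrative segment count; the specific choices made in our experiments may differ.

```python
import numpy as np
from skimage import io
from skimage.segmentation import slic

# Over-segment the reference image I into superpixels; each superpixel will
# later receive its own local rigid reconstruction before the global assembly.
img = io.imread("reference_frame.png")  # reference image I (hypothetical path)
labels = slic(img, n_segments=1000, compactness=10.0, start_label=0)

superpixel_ids = np.unique(labels)
print(f"{len(superpixel_ids)} superpixels to reconstruct rigidly and then glue")
```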
While the concept of this divide-and-conquer procedure looks simple, there is however a fundamental difficulty (of scale indeterminacy) in its implementation. Scale indeterminacy refers to the well-known fact that, using a moving camera, one can only recover the 3D structure up to an unknown scale. In our method,