6186 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 30, 2021
A. Feature-Based Image Stitching
According to their strategies for eliminating artifacts,
feature-based image stitching algorithms can be divided
into the following two categories:
1) Adaptive Warping Methods: Considering that a single
transformation model is not enough to accurately align images
with parallax, the idea of combining multiple parametric
alignment models to align the images as much as possible
is introduced. In [11], the dual-homography warping (DHW)
is presented to align the foreground and the background,
respectively. This method works well in the scene composed
of two p redominating planes but shows poor performance in
more complex scenes. Lin et al. [12] apply multiple smoothly
varying affine (SVA) transformations in different regions,
enhancing local deformation and alignment performance.
Zaragoza et al. [13] propose the as-projective-as-possible
(APAP) approach, which partitions an image into dense grids
and allocates each grid a corresponding homography by
weighting the features. In practice, APAP still exhibits
parallax artifacts in the vicinity of object boundaries,
where dramatic depth changes can occur. To address this
problem, warping residual vectors are proposed in [19] to
distinguish matching features from different depth planes,
contributing to more naturally stitched images.
2) Seam-Driven Methods: Seam-driven image stitching
methods are also influential, acquiring natural stitched images
by hiding the artifacts. Inspired by the idea of interactive
digital photomontage [39], Gao et al. [24] propose to choose
the best homography with the lowest seam-related cost from
candidate homography matrices. Then the artifacts are hidden
through seam cutting. Referring to the optimization strategy of
content-preserving warps (CPW) [40], Zhang and Liu [22] pro-
pose a seam-based local alignment approach while maintaining
the global image structure using an optimal homography. This
work was also extended to stereoscopic image stitching [41].
Using iterative warp and seam estimation, Lin et al. [23]
find the optimal local area for stitching, which preserves
curve and line structures during image stitching.
These feature-based algorithms contribute to perceptually
natural stitched results. However, they rely heavily on the
quality of feature detection, often failing in scenes with few
features or at low resolution.
B. Learning-Based Image Stitching
Obtaining a real dataset for stitching is difficult. In addition,
deep stitching is quite challenging for scenes with a low
overlap rate and large parallax. Hampered by these two
problems, learning-based image stitching is still in development.
1) View-Fixed Methods: View-fixed image stitching methods
are task-driven, designed for specific application scenarios
such as autonomous driving [6], [7] and surveillance
video [4]. In these works, end-to-end networks are proposed
to stitch images from fixed views, but they cannot be
extended to stitch images from arbitrary views.
2) View-Free Methods: To stitch images from arbitrary
views using CNNs, some researchers propose to adopt CNNs
in the stage of feature detection [32], [33]. However, these
methods cannot strictly be regarded as complete learning-based
frameworks. The first complete learning-based framework
to stitch images from arbitrary views was proposed
in [35]. The images can be stitched through three stages:
homography estimation, spatial transformation, and content
refinement. Nevertheless, this work cannot handle input
images with arbitrary resolutions due to the fully connected
layers in the network, and the stitching quality in real
applications is unsatisfactory. Following this deep stitching pipeline,
an edge-preserved deep image stitching solution was proposed
in [36], removing the limitation on input resolution and
significantly improving stitching performance in real scenes.
C. Deep Homography Schemes
The first deep homography method was put forward in [42],
where a VGG-style [27] network was used to predict the eight
offsets of the four vertices of an image, thus uniquely
determining a corresponding homography. Nguyen et al. [37] proposed
the first unsupervised deep homography approach with the
same architecture as [42], together with an effective unsupervised loss.
Introducing spatial attention into the deep homography network,
Zhang et al. [38] propose a content-aware unsupervised
network, contributing to SOTA performance in small-baseline
deep homography. In [43], multi-scale features are extracted
to predict the homography from coarse to fine using image
pyramids.
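To make the 4-point parameterization concrete: once the network predicts the eight corner offsets, the implied 3×3 homography follows from solving the standard 8×8 direct linear transform (DLT) system. The sketch below is our own illustrative NumPy formulation, not code from [42]; the function and variable names are assumptions:

```python
import numpy as np

def homography_from_offsets(corners, offsets):
    """Recover the 3x3 homography that maps the four image corners to
    corners + offsets (the 4-point parameterization). Solves the 8x8
    DLT system under the normalization h_33 = 1."""
    src = np.asarray(corners, dtype=float)        # (4, 2) corner coordinates
    dst = src + np.asarray(offsets, dtype=float)  # (4, 2) displaced corners
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u * (h7*x + h8*y + 1) = h1*x + h2*y + h3, and similarly for v
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.asarray(A), np.asarray(b))
    return np.append(h, 1.0).reshape(3, 3)
```

With all-zero offsets this recovers the identity homography, and a common offset shared by the four corners yields a pure translation, which is why the eight offsets uniquely determine the warp.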
Besides that, the deep homography network is usually
adopted as a part of the view-free image stitching frameworks
[35], [36]. Different from [37], [38], [42], [43], deep
homography in image stitching is more challenging, since the
baseline between input images is usually 2×∼3× larger.
III. UNSUPERVISED COARSE IMAGE ALIGNMENT
Given two high-resolution input images, we first estimate
the homography using a deep homography network in an
unsupervised manner. Then the input images can be warped
to align each other coarsely in the proposed stitching-domain
transformer layer.
A. Unsupervised Homography
The existing unsupervised deep homography methods [37],
[38] take image patches as the input, shown as the white
squares in Fig. 3(a). The objective function of these
methods can be expressed as Eq. (1):
L_PW = ||P(I_A) − P(H(I_B))||_1, (1)
where I_A, I_B represent the full images of the reference image
and the target image, respectively. P(·) is the operation of
extracting an image patch from a full image, and H(·) warps
one image to align with the other using the estimated homography.
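Eq. (1) can be sketched numerically as follows. This is a minimal NumPy illustration rather than the networks' actual differentiable implementation: the warp uses nearest-neighbor sampling, invalid pixels are filled with zeros, and the patch location arguments are hypothetical parameters of our own:

```python
import numpy as np

def warp_homography(img, H):
    """H(.): warp img by homography H via inverse mapping with
    nearest-neighbor sampling; out-of-bounds pixels become 0 (invalid)."""
    h, w = img.shape[:2]
    Hinv = np.linalg.inv(H)
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([xs, ys, np.ones_like(xs)], axis=-1).reshape(-1, 3).T
    src = Hinv @ pts                       # back-project output coordinates
    src = src[:2] / src[2]
    sx = np.rint(src[0]).astype(int).reshape(h, w)
    sy = np.rint(src[1]).astype(int).reshape(h, w)
    valid = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out = np.zeros_like(img)
    out[valid] = img[sy[valid], sx[valid]]
    return out

def patch(img, top, left, size):
    """P(.): extract a size x size patch from a full image."""
    return img[top:top + size, left:left + size]

def loss_pw(I_A, I_B, H, top, left, size):
    """L_PW = || P(I_A) - P(H(I_B)) ||_1, as in Eq. (1)."""
    return np.abs(patch(I_A, top, left, size)
                  - patch(warp_homography(I_B, H), top, left, size)).sum()
```

For identical images under an identity homography the loss is exactly zero; as the estimated homography deviates, content surrounding the target patch is pulled into the warped patch, which is the padding behavior analyzed below.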
From Eq. (1), we can see that to make the warped target
patch close to the reference patch, the extra content around
the target patch is utilized to pad the invalid pixels in the
warped target patch. We call this a padding-based constraint
strategy. This strategy works well in small-baseline [38],