J. Vis. Commun. Image R. 73 (2020) 102950
Available online 4 November 2020
1047-3203/© 2020 Elsevier Inc. All rights reserved.
A view-free image stitching network based on global homography ☆
Lang Nie, Chunyu Lin *, Kang Liao, Meiqin Liu, Yao Zhao
Institute of Information Science, Beijing Jiaotong University, Beijing Key Laboratory of Advanced Information Science and Network, Beijing 100044, China
ARTICLE INFO
Keywords:
41A05
41A10
65D05
65D17
ABSTRACT
Image stitching is a traditional but challenging computer vision task, aiming to obtain a seamless panoramic
image. Recently, researchers have begun to study the image stitching task using deep learning. However, the existing
learning methods assume a relatively fixed view during image capture and thus generalize poorly
to flexible-view cases. To address this problem, we present a cascaded view-free image stitching
network based on a global homography. This novel image stitching network does not have any restriction on the
view of images and it can be implemented in three stages. In particular, we first estimate a global homography
between two input images from different views. Then we propose a structure stitching layer to obtain the
coarse stitching result using the global homography. In the last stage, we design a content revision network to
eliminate ghosting effects and refine the content of the stitching result. To enable efficient learning on various
views, we also present a method to generate synthetic datasets for network training. Experimental results
demonstrate that, compared with traditional methods, our method can achieve almost 100% elimination of
artifacts in overlapping areas at the cost of acceptable slight distortions in non-overlapping areas. In addition, the
proposed method is view-free and more robust, especially in scenes where feature points are difficult to detect.
1. Introduction
Image stitching is a technology that creates a seamless panorama
or high-resolution image by stitching images with overlapping parts.
The images may be captured at different moments, from different
perspectives, or by different sensors. In recent years, it has received increasing
attention and has become a popular topic in photographic graphics,
surveillance videos [1], VR [2], and other fields.
Classical image stitching follows these steps. First, a 3 × 3
homography matrix, covering translation, rotation, scaling and vanishing
point transformation, is estimated after feature extraction and
feature matching between a pair of images. Then the homography is
used to warp the original image into alignment with the other one.
Finally, the original image and the warped image are fused to produce the
stitching result. However, this basic algorithm relies on a basic
assumption: the scene should be nearly planar [3]. In practice,
the depth of image contents varies, which violates this
assumption. As a result, ghosting effects or misalignments easily
appear in the overlapping parts of the stitched image. To
mitigate ghosting effects and improve stitching quality, some existing
image stitching algorithms calculate multiple content-aware local
warpings [4–11] to align the overlapping parts of images, and some
reduce the artifacts generated by projective transformation by
finding the optimal seams [12–15] around objects. As for deep stitching
methods, some [16–20] are capable of stitching images from
arbitrary views, but only some steps of their frameworks, such as
feature extraction or feature matching, are achieved by deep learning,
so they cannot be called complete deep image stitching models. Some
other methods [21–23] are implemented entirely using deep learning, but
they are designed only for specific conditions, such as
fixed views.
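To make the role of the 3 × 3 homography in the classical pipeline concrete, the following is a minimal NumPy sketch, not the method proposed in this paper. It composes a homography from translation, rotation, scaling and two perspective terms, then maps 2D points through it with a homogeneous divide. The function names `make_homography` and `warp_points` and the parameters `px`, `py` are illustrative assumptions that do not appear in the original text.

```python
import numpy as np

def make_homography(tx=0.0, ty=0.0, theta=0.0, scale=1.0, px=0.0, py=0.0):
    """Compose a 3x3 homography (hypothetical helper, for illustration only).

    tx, ty : translation
    theta  : rotation angle in radians
    scale  : isotropic scaling
    px, py : perspective terms (bottom row), which move vanishing points
    """
    c, s = np.cos(theta), np.sin(theta)
    # Similarity part: rotation and scaling followed by translation.
    similarity = np.array([[scale * c, -scale * s, tx],
                           [scale * s,  scale * c, ty],
                           [0.0,        0.0,       1.0]])
    # Perspective part: a non-trivial bottom row makes the map projective.
    perspective = np.array([[1.0, 0.0, 0.0],
                            [0.0, 1.0, 0.0],
                            [px,  py,  1.0]])
    # The perspective part is applied to a point first, then the similarity.
    return similarity @ perspective

def warp_points(H, pts):
    """Map an (N, 2) array of points through the homography H."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])  # lift to (x, y, 1)
    mapped = homog @ H.T                              # apply H to every point
    return mapped[:, :2] / mapped[:, 2:3]             # divide by the third coordinate

# A pure translation moves the origin to (10, 5).
H = make_homography(tx=10.0, ty=5.0)
print(warp_points(H, [[0.0, 0.0]]))
```

A single matrix of this kind aligns two views exactly only when the scene is close to planar, which is why depth variation leads to the ghosting and misalignment described above.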
Different from these deep stitching methods, we aim to establish a
complete deep learning model that can handle images captured from
arbitrary views. In this paper, we present a cascaded view-free image
stitching network based on the global homography, which can eliminate
the ghosting effects as much as possible.
An overview of our approach is illustrated in Fig. 1(e). Specifically,
the first stage is homography estimation. Compared with the settings of existing
deep homography estimation methods [24–26], the proportion of overlapping
parts between the two images in our image stitching task is much lower, which
makes stitching considerably more challenging. To address this
problem, we introduce a global correlation layer [27,28] into this stage
☆ This paper has been recommended for acceptance by Zicheng Liu.
* Corresponding author.
E-mail address: cylin@bjtu.edu.cn (C. Lin).
Journal of Visual Communication and Image Representation
https://doi.org/10.1016/j.jvcir.2020.102950
Received 19 April 2020; Received in revised form 17 July 2020; Accepted 10 October 2020