GPU-Accelerated Parallel Implementation of Image Mosaicing: Improving Real-Time Performance


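This article discusses image mosaicing, a technique widely used in many areas of computer science. Its feature matching, image warping, and blending steps involve a large amount of computation, which makes it hard to meet the performance requirements of some real-time applications. To address this, researchers have developed parallel operations on the graphics processing unit (GPU) to accelerate mosaicing. CUDA (Compute Unified Device Architecture) is one GPU programming model, and the paper's authors use it to build an efficient parallel image mosaicing algorithm.

CUDA's advantage is its parallel processing capability: it can work on large amounts of data at the same time, which greatly improves execution efficiency. Compared with a conventional central processing unit (CPU) implementation, image mosaicing on the GPU is markedly faster. In experiments with an integrated NVIDIA GeForce GTX 745 GPU, the parallel mosaicing method achieves a speedup of up to 27.6 times on large input images. The GPU can therefore process more image data in the same amount of time, which improves the response time and throughput of real-time applications.

The article covers the following points:

1. Feature matching: the key step of image mosaicing. It detects similarities between different images, usually with template matching or feature-point detection algorithms such as SIFT and SURF. On the GPU, these compute-intensive tasks can be parallelized, which reduces the load on each individual compute unit.
2. Image warping: based on the feature matching results, regions of the source image are mapped onto the target image according to a fixed rule, which involves a large number of per-pixel operations. On the GPU, operations such as matrix multiplication and texture sampling make large-scale pixel transformations efficient.
3. Blending: the warped image regions are merged seamlessly with the rest of the target image to produce the final mosaic. The GPU's floating-point throughput and texture blending operations make it well suited to this high-precision step.
4. CUDA programming: the article describes a CUDA-based parallel implementation of image mosaicing, including data loading, task scheduling, parallel computation, and result merging. These steps are carried out at the level of GPU thread blocks and grids to make full use of the GPU's parallelism.
5. Experimental evaluation: performance tests confirm the advantage of the GPU implementation on large images and compare its efficiency with a CPU implementation, showing a large performance gain for mosaicing tasks.

In summary, the paper proposes a fast GPU implementation of image mosaicing that uses parallel processing to raise computational efficiency significantly. This matters for fields that need real-time image processing and rendering, such as virtual reality, drone aerial photography, and real-time map updating.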
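As a rough illustration of the warping and blending steps summarized above, the following Python sketch maps each output pixel back into the source image with a 3x3 matrix, samples it bilinearly, and mixes it with the target image. It is a sequential CPU sketch, not the paper's CUDA code. The names `bilinear_sample`, `warp_pixel`, `warp_and_blend`, and the fixed weight `alpha` are hypothetical, and the constant-weight blend is an assumption, since the summary does not say which blending rule the authors use.

```python
def bilinear_sample(img, x, y):
    """Sample a grayscale image (list of rows) at a real-valued (x, y).

    Returns None when the position falls outside the image.
    """
    h, w = len(img), len(img[0])
    if x < 0 or y < 0 or x > w - 1 or y > h - 1:
        return None
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * img[y0][x0] + fx * img[y0][x1]
    bottom = (1 - fx) * img[y1][x0] + fx * img[y1][x1]
    return (1 - fy) * top + fy * bottom


def warp_pixel(H_inv, u, v):
    """Map output pixel (u, v) back into source coordinates (inverse warping)."""
    x = H_inv[0][0] * u + H_inv[0][1] * v + H_inv[0][2]
    y = H_inv[1][0] * u + H_inv[1][1] * v + H_inv[1][2]
    w = H_inv[2][0] * u + H_inv[2][1] * v + H_inv[2][2]
    return x / w, y / w


def warp_and_blend(src, dst, H_inv, alpha=0.5):
    """Warp src into the frame of dst and blend where the two overlap.

    Every output pixel is computed independently of the others, which is
    the property a GPU version could exploit with one thread per pixel.
    alpha is an assumed constant blend weight.
    """
    out = [row[:] for row in dst]
    for v in range(len(dst)):
        for u in range(len(dst[0])):
            x, y = warp_pixel(H_inv, u, v)
            s = bilinear_sample(src, x, y)
            if s is not None:
                out[v][u] = alpha * s + (1 - alpha) * dst[v][u]
    return out
```

With `H_inv` set to a pure translation, for example `[[1, 0, -1], [0, 1, 0], [0, 0, 1]]`, the source image appears shifted one pixel to the right in the output, and pixels that map outside the source keep their target values.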

Fast Implementation of Image Mosaicing on GPU

Yixiang Lu, Qingwei Gao 1,∗, Shuai Chen, Dong Sun, Yi Xia, Xueming Peng 1,2

School of Electrical Engineering and Automation, Anhui University, Hefei 230601, China

Shanghai Huawei Technology Co., Ltd, Shanghai 200120, China

Abstract: Image mosaicing has been studied and widely used in many fields of computer science, but its feature matching, warping, and blending steps involve a huge amount of computation, so it cannot meet the real-time demands of some applications. Fortunately, related parallel operations that can speed up mosaicing have been developed and implemented on the Graphics Processor Unit (GPU). In this paper, we present a parallel implementation of image mosaicing on the GPU using the Compute Unified Device Architecture (CUDA). It achieves better execution times than an implementation on the central processing unit (CPU). With an integrated GTX745 GPU in the experiment, we achieved a speedup of up to 27.6 times for large input images.

Index Terms: Image mosaicing; Matching; Parallel; Graphics Processor Unit (GPU).

I. INTRODUCTION

Image mosaicing is an active area of research in the fields of photogrammetry, computer vision, image processing, and computer graphics. It can be defined as the process of constructing panoramic image mosaics from a sequence of partial images obtained from different views [1]. The initial applications of image mosaicing focused mainly on constructing large aerial and satellite photographs from collections of images [2]. Nowadays, a variety of new applications of mosaicing have emerged, including scene stabilization and change detection [3], increasing the field of view and resolution [4], video compression [5], wide-area video surveillance [6], the construction of virtual environments [7], and image-based rendering [8]. A typical mosaicing process consists of three image processing steps: registration, warping and interpolation, and blending. Image registration is the key task of image mosaicing [9]. Registration refers to establishing a geometric transformation between a pair of images depicting the same scene, and the transformation is described by a planar homography with 8 degrees of freedom. If the homography contains errors, the images will be misaligned, which makes the subsequent blending difficult. To make the elements of the homography more accurate, we must search for the best correctly matched feature points, which are used to estimate the homography. However, this search is computationally very expensive, especially for large images. Moreover, when mosaicing is applied to video processing (e.g. video indexing and wide-area video surveillance), which involves a very large number of images, the mosaicing speed is very important in practice.
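To make the 8 degrees of freedom concrete: a 3x3 homography is defined only up to scale, so fixing its bottom-right element to 1 leaves eight unknowns, and four point correspondences supply the eight linear equations needed to solve for them. The sketch below does this in plain Python. The function names `solve_linear`, `estimate_homography`, and `apply_homography` are hypothetical, and this minimal four-point solve is an illustration rather than the estimation procedure used in the paper.

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def estimate_homography(src_pts, dst_pts):
    """Estimate H (with H[2][2] fixed to 1) from exactly 4 correspondences.

    Each pair (x, y) -> (u, v) contributes two linear equations in the
    eight unknown elements of H.
    """
    A, b = [], []
    for (x, y), (u, v) in zip(src_pts, dst_pts):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(A, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(H, x, y):
    """Map (x, y) through H, dividing by the projective coordinate."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)
```

In practice many more than four matches are available and some of them are wrong, which is why the choice of matching feature points matters so much for the accuracy of the result.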

In recent years, the Graphics Processor Unit (GPU) has attracted researchers' attention in many fields for its massively parallel computational power. Using the GPU as a coprocessor to accelerate algorithms with a heavy computational burden has become an important practical approach, and many image processing algorithms have already been successfully implemented on GPU. For example, Luo and Duraiswami [10] implemented a complete version (including all stages of the algorithm) of the Canny edge detector under CUDA and achieved a speedup of more than 3 times over a straightforward CPU implementation. In their work, the authors included the hysteresis labeling connected-component stage, which was not included in previous GPU versions; this is the main reason that they could not achieve a faster implementation. For image matching and mosaicing, many related applications are also available on GPU. In [11], Schatz and Trapnell implemented a string-matching program that runs on the GPU and achieved a speedup of as much as 35x over the equivalent CPU-bound version. They presented a string-matching kernel for use in CUDA, which performs a parallelized search of a suffix tree to find exact matches for a set of query strings. M. Adam et al. [12] presented a novel approach to the local alignment of images for a real-time video stitching application on GPU. To achieve a nearly double-sized panorama, they focused mainly on stitching the margin regions of high-definition stereo images. To accelerate the assembly of large mosaics of electron microscope images, K. U. Venkataraju [13] proposed using texture memory lookups to speed up access to microscopy image tiles, together with data-parallel computing, which leads to the root of the complexity of the calculation. Because unsigned char is used as the image data type, the pixel values in the mosaic are calculated slightly inaccurately. Even though the papers mentioned above achieved good results, they all avoided two extremely time-consuming steps, namely feature matching and random sample consensus (RANSAC). As two key processes in image registration, these should be considered in GPU-accelerated parallel algorithms.
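The two steps named above can be sketched in a few lines of Python to show why they are costly. Brute-force matching compares every descriptor in one image with every descriptor in the other, and RANSAC repeatedly fits a model to a random minimal sample and counts how many matches agree with it. This is a simplified sketch under stated assumptions: the function names and parameters (`ratio`, `tol`, `iters`, `seed`) are hypothetical, the ratio test is a common heuristic that the text does not attribute to the paper, and RANSAC here fits a pure translation instead of the full homography to keep the example short.

```python
import random


def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Brute-force nearest-neighbour matching with a ratio test.

    Needs at least two descriptors in desc_b. Each query descriptor is
    handled independently of the others, so the outer loop is a natural
    candidate for one GPU thread (or block) per query.
    """
    matches = []
    for i, da in enumerate(desc_a):
        # Squared Euclidean distance to every candidate descriptor.
        dists = sorted((sum((p - q) ** 2 for p, q in zip(da, db)), j)
                       for j, db in enumerate(desc_b))
        best, second = dists[0], dists[1]
        # Distances are squared, so the ratio threshold is squared too.
        if best[0] < (ratio ** 2) * second[0]:
            matches.append((i, best[1]))
    return matches


def ransac_translation(pts_a, pts_b, matches, tol=2.0, iters=50, seed=0):
    """RANSAC with a translation-only model (a stand-in for a homography).

    One match is enough to hypothesize a translation; the hypothesis
    supported by the most matches wins, and its inliers are returned.
    """
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iters):
        i, j = rng.choice(matches)
        tx = pts_b[j][0] - pts_a[i][0]
        ty = pts_b[j][1] - pts_a[i][1]
        inliers = [(a, b) for a, b in matches
                   if abs(pts_a[a][0] + tx - pts_b[b][0]) <= tol
                   and abs(pts_a[a][1] + ty - pts_b[b][1]) <= tol]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return best_inliers
```

The matching loop costs on the order of the product of the two descriptor counts, and each RANSAC iteration scans all matches, so the work grows with the number of feature points rather than directly with the number of pixels.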

In this paper, a parallel image mosaicing method implemented on GPU using the Compute Unified Device Architecture (CUDA) programming model is presented. To reduce computation time efficiently, this paper mainly focuses on the most time-consuming part of mosaicing. In fact, for most precision mosaicing, the execution time mainly depends on the number of matched point pairs in the overlapping images, not on the image size. Thus, our method starts with feature matching and

2017 10th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI 2017)
