performance of the feature description of these architectures is not reported.
In addition, most existing works [16–19] have focused on partial implementations of the SIFT algorithm. More specifically, [16,17] focus on the feature point detection step of SIFT, and [18,19] on the pyramid construction, gradient orientation and magnitude computation, and descriptor extraction, respectively. In contrast, this paper presents a complete SIFT implementation that accommodates all components and steps of the original SIFT algorithm. It is a truly low-cost (specifically, in terms of power consumption and hardware resources), real-time implementation of the complete SIFT algorithm.
3. A brief introduction to SIFT
To make this paper self-contained, we briefly review the SIFT
algorithm [2] in this section. As mentioned in Section 1, SIFT can
be divided into four components, which we describe in turn below. For full details, please see [2].
3.1. DoG pyramid construction
The first step in SIFT is to determine the image positions that
exhibit significant local changes of the visual appearance. Such
places are the candidates for the SIFT features. In order to find
these feature candidates, we need to construct a DoG image pyra-
mid which approximates the image gradient field. Firstly, we need
to convolve the input image I(x, y) with a Gaussian kernel K(x, y; σ), where σ is the scale of the Gaussian kernel. The result is the Gaussian-filtered image, denoted as G(x, y; σ), i.e.

G(x, y; \sigma) = \mathrm{conv2}(I(x, y), K(x, y; \sigma))   (1)

where conv2(·,·) represents the 2-D convolution operation, and

K(x, y; \sigma) = \frac{1}{2\pi\sigma^2} \, e^{-(x^2 + y^2)/(2\sigma^2)}   (2)
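As a concrete illustration of Eq. (2), the following NumPy sketch samples the kernel on a discrete grid; the 3σ window radius and the normalization step are our own assumptions, not part of the original formulation.

```python
import numpy as np

def gaussian_kernel(sigma, radius=None):
    """Sample K(x, y; sigma) of Eq. (2) on an odd-sized discrete grid.

    The 3*sigma radius and the final normalization are illustrative
    assumptions; a hardware design would fix the window size up front.
    """
    if radius is None:
        radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(x, x)
    kernel = np.exp(-(xx**2 + yy**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)
    return kernel / kernel.sum()  # normalize so filtering preserves brightness
```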
The DoG image is the difference of two Gaussian-filtered images over two consecutive scales, denoted by

D(x, y; \sigma) = G(x, y; k\sigma) - G(x, y; \sigma)   (3)

where k is a constant multiplicative factor.
To detect the candidate feature points, we need to construct the
DoG image pyramid. The process is illustrated in Fig. 1 (from [2]). As can be seen, there are 3 octaves (groups of images with the same resolution) and 6 scales per octave. In each octave, the Gaussian-filtered image G(x, y; k^(i+1)σ), i = 0, ..., 4, is generated by convolving G(x, y; k^i σ) with K(x, y; √(k² − 1) · k^i σ). Once a complete octave has been processed, we down-sample the Gaussian-filtered image by taking every other pixel in its rows and columns to form the first image of the next octave. The first image of Octave 0 is the original image.
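The octave-and-scale loop described above can be summarized by the following sketch (our own illustration, reusing the gaussian_kernel helper from the previous sketch; the scale step k, the initial σ, the choice of which image to down-sample, and the use of SciPy's convolution are assumptions, and a hardware pipeline would of course not be organized this way):

```python
import numpy as np
from scipy.ndimage import convolve

def build_dog_pyramid(image, num_octaves=3, scales_per_octave=6,
                      sigma=1.6, k=2 ** (1.0 / 3)):
    """Sketch of Section 3.1: incremental Gaussian blurring per octave,
    DoG images as differences of consecutive scales (Eq. (3)),
    and 2x down-sampling between octaves."""
    gaussians, dogs = [], []
    base = image.astype(np.float32)
    for _ in range(num_octaves):
        octave = [base]
        for i in range(scales_per_octave - 1):
            # incremental kernel K(x, y; sqrt(k^2 - 1) * k^i * sigma)
            inc_sigma = np.sqrt(k ** 2 - 1) * (k ** i) * sigma
            octave.append(convolve(octave[-1], gaussian_kernel(inc_sigma)))
        gaussians.append(octave)
        # Eq. (3): D = G(k*sigma) - G(sigma) for consecutive scales
        dogs.append([g2 - g1 for g1, g2 in zip(octave[:-1], octave[1:])])
        # take every other pixel in rows and columns for the next octave
        # (down-sampling the last image of the octave is our assumption)
        base = octave[-1][::2, ::2]
    return gaussians, dogs
```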
3.2. Stable feature detection
After the DoG image pyramid has been constructed, we detect
the local maxima and minima in the DoG images by comparing a
pixel to its 26 neighbors in 3 × 3 regions at the current and adjacent scales. These local maxima and minima are treated as candidate features.
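A minimal sketch of this 26-neighbor comparison (our own illustration, not the paper's hardware logic) is:

```python
import numpy as np

def is_extremum(dog_prev, dog_cur, dog_next, x, y):
    """Check whether dog_cur[y, x] is a local maximum or minimum among
    its 26 neighbors in the 3x3x3 cube over the adjacent scales."""
    cube = np.stack([d[y - 1:y + 2, x - 1:x + 2]
                     for d in (dog_prev, dog_cur, dog_next)])
    center = dog_cur[y, x]
    # the cube contains the center itself, so >= / <= tolerate ties
    return center >= cube.max() or center <= cube.min()
```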
Once the feature candidates have been located, we should eliminate low-contrast points and strong edge response points to make the features robust to noise. To eliminate low-contrast points, we compare the DoG image pixel value against a contrast threshold. A poorly defined peak in the DoG image has a large principal curvature across the edge but a small one in the perpendicular direction. The principal curvatures can be computed from the Hessian matrix H, that is
H = \begin{pmatrix} D_{xx} & D_{xy} \\ D_{xy} & D_{yy} \end{pmatrix}   (4)

where D_{xx}, which is estimated by taking differences of neighboring points, is the second-order derivative in the x direction.
The eigenvalues of H are proportional to the principal curvatures of D. Let γ be the ratio between the larger eigenvalue α and the smaller one β, so that α = γβ; then

\frac{\mathrm{Tr}(H)^2}{\mathrm{Det}(H)} = \frac{(\alpha + \beta)^2}{\alpha\beta} = \frac{(\gamma\beta + \beta)^2}{\gamma\beta^2} = \frac{(1 + \gamma)^2}{\gamma}   (5)
where Tr(H) denotes the trace of H and Det(H) its determinant. It is clear that γ represents the condition number of the principal curvature matrix, which indicates how degenerate the local appearance is.
Therefore, to eliminate strong edge response points, we only need to set the threshold γ and check whether (6) is satisfied, that is

\frac{\mathrm{Tr}(H)^2}{\mathrm{Det}(H)} < \frac{(1 + \gamma)^2}{\gamma}   (6)
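A sketch of the edge-response test of Eqs. (4)–(6) may make the finite-difference estimation of H concrete; the threshold value γ = 10 is an assumption chosen for illustration, not a value taken from the paper.

```python
def passes_edge_test(dog, x, y, gamma=10.0):
    """Keep a candidate only if Eq. (6) holds; the Hessian entries are
    estimated by finite differences of neighboring DoG pixels."""
    d_xx = dog[y, x + 1] + dog[y, x - 1] - 2 * dog[y, x]
    d_yy = dog[y + 1, x] + dog[y - 1, x] - 2 * dog[y, x]
    d_xy = (dog[y + 1, x + 1] - dog[y + 1, x - 1]
            - dog[y - 1, x + 1] + dog[y - 1, x - 1]) / 4.0
    tr = d_xx + d_yy                     # Tr(H)
    det = d_xx * d_yy - d_xy * d_xy      # Det(H)
    if det <= 0:
        return False                     # curvatures of opposite sign: reject
    return tr * tr / det < (1 + gamma) ** 2 / gamma   # Eq. (6)
```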
3.3. Gradient magnitude and orientation assignment
To extract the SIFT descriptor for the detected features, we have to compute the image gradient magnitude and orientation, denoted by m(x, y) and θ(x, y) respectively, for the points close to the feature point. The formulas for m(x, y) and θ(x, y) are

\theta(x, y) = \arctan\frac{G(x, y+1) - G(x, y-1)}{G(x+1, y) - G(x-1, y)}   (7)

m(x, y) = \sqrt{(G(x+1, y) - G(x-1, y))^2 + (G(x, y+1) - G(x, y-1))^2}   (8)
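Eqs. (7) and (8) translate directly into the following sketch; using atan2 to obtain a full 360° orientation is our assumption (the equations as written use a plain arctangent), and g is the Gaussian-filtered image at the feature's scale.

```python
import numpy as np

def gradient_mag_ori(g, x, y):
    """Gradient magnitude m(x, y) and orientation theta(x, y) per Eqs. (7)-(8)."""
    dx = g[y, x + 1] - g[y, x - 1]     # G(x+1, y) - G(x-1, y)
    dy = g[y + 1, x] - g[y - 1, x]     # G(x, y+1) - G(x, y-1)
    m = np.hypot(dx, dy)               # Eq. (8)
    theta = np.arctan2(dy, dx)         # Eq. (7), extended to the full circle
    return m, theta
```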
3.4. SIFT descriptor representation
In this stage, there are two main tasks: dominant orientation
computation and descriptor representation. Dominant orientation
computation needs to form an orientation histogram, which has
36 bins in this paper, from a circular region centered at the feature
point. Each sample added to the histogram is weighted by its gra-
dient magnitude and by a Gaussian-weighted circular window, as
presented in Fig. 2(a). The highest bin in the histogram defines the dominant orientation of the feature, as presented in Fig. 2(b). The image region is then rotated relative to the dominant orientation, as presented in Fig. 2(c). Finally, in the descriptor representation stage, the image gradient orientations and magnitudes of the points in a circular image region centered at the feature are used to extract a 16 × 8 = 128 dimensional SIFT feature descriptor, as presented in Fig. 2(d).
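The 36-bin dominant-orientation histogram described above can be sketched as follows; it reuses gradient_mag_ori from the previous sketch, and the neighborhood radius, the Gaussian window width, and the assumption that the region lies fully inside the image are ours, chosen only for illustration.

```python
import numpy as np

def dominant_orientation(g, x, y, radius=8, sigma_w=4.0, num_bins=36):
    """Build the 36-bin orientation histogram around a feature (Fig. 2(a))
    and return the center angle of its highest bin (Fig. 2(b))."""
    hist = np.zeros(num_bins)
    for j in range(-radius, radius + 1):
        for i in range(-radius, radius + 1):
            if i * i + j * j > radius * radius:
                continue                                   # circular region only
            m, theta = gradient_mag_ori(g, x + i, y + j)
            w = np.exp(-(i * i + j * j) / (2 * sigma_w ** 2))  # Gaussian window
            b = int((theta + np.pi) / (2 * np.pi) * num_bins) % num_bins
            hist[b] += w * m                               # magnitude-weighted vote
    peak = int(np.argmax(hist))
    return (peak + 0.5) * 2 * np.pi / num_bins - np.pi     # dominant orientation (rad)
```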
4. The proposed hardware architecture
SIFT has shown great success in many computer vision applications, such as 3D reconstruction, target tracking, and object recognition. However, its large computational complexity has been a challenge for most embedded implementations in real-time and resource-limited application scenarios.