decreases, which has the same effect as increasing the
amount of occlusion.
The method proposed in [18] tries to overcome these
limitations by considering the image gradients in contrast to
the image contours. It relies on the dot product as a
similarity measure between the template gradients and
those in the image. Unfortunately, this measure rapidly
declines with the distance to the object location or when the
object appearance is even slightly distorted. As a result, the
similarity measure must be evaluated densely and with
many templates to handle appearance variations, making
the method computationally costly. Using image pyramids
provides some speed improvements; however, fine but
important structures tend to be lost if one does not carefully
sample the scale space.
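The dot-product similarity described above can be illustrated with a minimal sketch. This is our own illustration, not the exact formulation of [18] (which may normalize and accumulate differently); the function and parameter names are hypothetical:

```python
import numpy as np

def gradient_similarity(template_grads, image_grads):
    """Mean dot product between corresponding normalized gradients.

    template_grads, image_grads: arrays of shape (N, 2) holding the
    (gx, gy) gradient vectors at N template locations and at the
    corresponding image locations. Normalizing each vector makes the
    score depend on orientation only, so it peaks when the template
    and image orientations align and drops quickly otherwise.
    """
    t = template_grads / (np.linalg.norm(template_grads, axis=1, keepdims=True) + 1e-8)
    i = image_grads / (np.linalg.norm(image_grads, axis=1, keepdims=True) + 1e-8)
    return float(np.mean(np.sum(t * i, axis=1)))
```

Because the score falls off sharply when the sampled image gradients no longer coincide with the object, such a measure has to be evaluated at a dense set of locations, which is the cost the text points out.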
Contrary to the above-mentioned methods, there are also
approaches addressing the general visual recognition problem: They are based on statistical learning and aim at detecting object categories rather than a priori known object
instances. While they are better at category generalization,
they are usually much slower during learning and runtime,
which makes them unsuitable for online applications.
For example, Amit et al. [19] proposed a coarse-to-fine
approach that spreads gradient orientations in local
neighborhoods. The amount of spreading is learned for
each object part in an initial stage. While this approach—
used for license plate reading—achieves high recognition
rates, it is not real-time capable.
Histograms of Oriented Gradients (HOG) [1] is another
related and very popular method. It statistically describes
the distribution of intensity gradients in localized portions
of the image. The descriptor is computed on a dense grid at
uniform intervals and uses overlapping local histogram
normalization for better performance. It has proven to give
reliable results but tends to be slow due to its
computational complexity.
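The core of this descriptor, orientation histograms accumulated over a dense grid of cells, can be sketched as follows. This is a simplified illustration: it omits the overlapping block normalization mentioned above, and the cell size and bin count are arbitrary choices, not the values from [1]:

```python
import numpy as np

def hog_cell_histograms(image, cell=8, bins=9):
    """Gradient-orientation histograms over a dense grid of cells.

    Each pixel votes into the orientation bin of its (unsigned)
    gradient direction, weighted by gradient magnitude. Full HOG
    would additionally normalize these histograms over overlapping
    blocks of cells.
    """
    gy, gx = np.gradient(image.astype(float))   # axis 0 = rows (y)
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)     # unsigned orientation in [0, pi)
    bin_idx = np.minimum((ang / np.pi * bins).astype(int), bins - 1)
    h, w = image.shape
    ch, cw = h // cell, w // cell
    hist = np.zeros((ch, cw, bins))
    for y in range(ch * cell):
        for x in range(cw * cell):
            hist[y // cell, x // cell, bin_idx[y, x]] += mag[y, x]
    return hist
```

The per-pixel gradient computation and dense accumulation over every cell are exactly what makes the method expensive at runtime.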
Ferrari et al. [4] provided a learning-based method that
recognizes objects via a Hough-style voting scheme with a
nonrigid shape matcher on object boundaries of a binary
edge image. The approach applies statistical methods to
learn the model from few images that are only constrained
within a bounding box around the object. While giving very
good classification results, the approach is neither appro-
priate for object tracking in real time due to its expensive
computation nor is it precise enough to return the accurate
pose of the object. Additionally, it is sensitive to the results of
the binary edge detector, an issue that we discussed before.
Kalal et al. [20] very recently developed an online
learning-based approach. They showed how a classifier
can be trained online in real time, with a training set
generated automatically. However, as we will see in the
experiments, this approach is only suitable for smooth
background transitions and not appropriate to detect
known objects over unknown backgrounds.
In contrast to the above-mentioned learning-based
methods, there are also approaches that are specifically trained
on different viewpoints. As with our template-based
approach, they can detect objects under different poses,
but typically require a large amount of training data and a
long offline training phase. For example, in [5], [21], [22],
one or several classifiers are trained to detect faces or cars
under various views.
More recent approaches for 3D object detection are
related to object class recognition. Stark et al. [23] rely on 3D
CAD models and generate a training set by rendering them
from different viewpoints. Liebelt and Schmid [24] combine
a geometric shape and pose prior with natural images. Su
et al. [25] use a dense, multiview representation of the
viewing sphere combined with a part-based probabilistic
representation. While these approaches are able to general-
ize to the object class, they are not real-time capable and
require expensive training.
Among the related works that also take depth data into
account, most approaches are related to
pedestrian detection [26], [27], [28], [29]. They use three
kinds of cues: image intensity, depth, and motion (optical
flow). The most recent approach of Enzweiler et al. [26]
builds part-based models of pedestrians in order to handle
occlusions caused by other objects and not only self-
occlusions modeled in other approaches [27], [29]. Besides
pedestrian detection, there has been an approach to object
classification, pose estimation, and reconstruction intro-
duced by Sun et al. [30]. The training data set is composed
of depth and image intensities, while the object classes
are detected using a modified Hough transform. While
quite effective in real applications, these approaches still
require exhaustive training using large training data sets.
This is usually prohibitive in robotic applications, where
the robot has to explore an unknown environment and
learn new objects online.
As mentioned in the introduction, we recently proposed
a method to detect textureless 3D object instances from
different viewpoints based on templates [7]. Each object is
represented as a set of templates, relying on local dominant
gradient orientations to build a representation of the input
images and the templates. Extracting the dominant
orientations is useful to tolerate small translations and
deformations. It is fast to perform and, most of the time,
discriminative enough to avoid generating too many false
positive detections.
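The idea of keeping a dominant quantized gradient orientation per small neighborhood can be sketched as follows. This is only a rough illustration of the representation used in [7]; the bin count, region size, and magnitude threshold here are assumptions, not the values from that paper:

```python
import numpy as np

def dominant_orientations(gx, gy, n_bins=8, region=3, mag_thresh=1.0):
    """Dominant quantized gradient orientation per region x region cell.

    Orientations are quantized into n_bins (ignoring gradient sign),
    and within each cell the most frequent quantized orientation
    among sufficiently strong gradients is retained. Cells with no
    strong gradient get -1. Pooling over a neighborhood is what makes
    the representation tolerant to small translations and deformations.
    """
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)              # unsigned orientation
    q = np.minimum((ang / np.pi * n_bins).astype(int), n_bins - 1)
    h, w = mag.shape
    out = -np.ones((h // region, w // region), dtype=int)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            m = mag[i*region:(i+1)*region, j*region:(j+1)*region]
            o = q[i*region:(i+1)*region, j*region:(j+1)*region]
            strong = m > mag_thresh
            if strong.any():
                vals, counts = np.unique(o[strong], return_counts=True)
                out[i, j] = vals[np.argmax(counts)]
    return out
```

The failure mode discussed next follows directly from this pooling: a strong clutter gradient inside a cell can win the vote and replace the object's orientation.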
However, we noticed that this approach degrades
significantly when the gradient orientations are disturbed
by stronger gradients of different orientations coming from
background clutter in the input images. In practice, this
often happens in the neighborhood of the silhouette of an
object, which is unfortunate as the silhouette is a very
important cue, especially for textureless objects. The
method we propose in this paper does not suffer from
this problem while running at the same speed. Addition-
ally, we show how to extend our approach to handle 3D
surface normals at the same time if a dense depth sensor
like the Kinect is available. As we will see, this increases
the robustness significantly.
3 PROPOSED APPROACH
In this section, we describe our template representation and
show how a new representation of the input image can be
built and used to parse the image to quickly find objects. We
will start by deriving our similarity measure, emphasizing
the contribution of each aspect of it. We also show how we
implement our approach to efficiently use modern proces-
sor architectures. Additionally, we demonstrate how to
878 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 34, NO. 5, MAY 2012