by DoG, and filter them by the key point filtering algorithm. Next, the RoI is generated from these key points and represented by the BoW model. At the same time, the Non-RoI is also represented by the BoW model. Finally, the visual words of the RoI and Non-RoI are concatenated into one visual word, which is used to represent the visual features of the whole image.
Next, we introduce the RoI-BoW model in detail.
Let the $i$-th image in the dataset be $I_i \in \mathbb{R}^{N_1 \times N_2}$, $i = 1, 2, \ldots, N$. The original image dataset is denoted as $D$ as follows:
\[
D = \{I_1, I_2, \ldots, I_N\},
\]
where $N$ is the number of images, and $N_1$ and $N_2$ represent the size of the images. An input image can be seen as a function of two variables on the rectangle:
\[
I_i(x, y), \quad (x, y) \in \{1, 2, \ldots, N_1\} \times \{1, 2, \ldots, N_2\}, \quad i = 1, 2, \ldots, N.
\]
2.1. Key points filtering
For each image, initial key points are first detected by the difference-of-Gaussian (DoG) algorithm [44]. The difference-of-Gaussian [44] with scale $\sigma$ and constant multiplicative factor $k$ can be computed by
\[
D(x, y, \sigma, k) = L(x, y, k\sigma) - L(x, y, \sigma), \tag{1}
\]
in which $L(x, y, \sigma)$ is the scale space of an input image $I(x, y)$. It can be obtained by (see page 94 in [44])
\[
L(x, y, \sigma) = \frac{1}{2\pi\sigma^2}\, e^{-\frac{x^2 + y^2}{2\sigma^2}} * I(x, y), \tag{2}
\]
where $*$ denotes convolution.
In practice, the scale is usually chosen as $\sigma = 1.6$ and the constant multiplicative factor as $k = \sqrt{2}$.
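The DoG computation in Eqs. (1) and (2) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the Gaussian convolution is realized here as a separable 1D blur with edge padding, and the kernel radius (three standard deviations) is an assumption of the sketch.

```python
import numpy as np

def gaussian_kernel1d(sigma):
    """Normalized 1D Gaussian kernel truncated at ~3 standard deviations."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def gaussian_blur(image, sigma):
    """Separable 2D Gaussian blur, i.e. L(x, y, sigma) from Eq. (2)."""
    k = gaussian_kernel1d(sigma)
    pad = len(k) // 2
    padded = np.pad(image, pad, mode="edge")
    # Convolve rows, then columns; 'valid' mode restores the original size.
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

def difference_of_gaussian(image, sigma=1.6, k=np.sqrt(2.0)):
    """D(x, y, sigma, k) = L(x, y, k*sigma) - L(x, y, sigma), as in Eq. (1)."""
    image = np.asarray(image, dtype=np.float64)
    return gaussian_blur(image, k * sigma) - gaussian_blur(image, sigma)

rng = np.random.default_rng(0)
img = rng.random((64, 64))
dog = difference_of_gaussian(img)
print(dog.shape)  # (64, 64)
```

Note that $D$ is identically zero on a constant image, since both blurs leave it unchanged.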
A sampled pixel (except for border pixels) is selected as a key point only if its value of $D(x, y, \sigma, k)$ is larger than those of all of its neighbors, or smaller than all of them. The neighbors are taken over a $3 \times 3$ region with the sampled pixel at the center and its eight neighbors around it; for example, if the coordinate of the sampled pixel is $(i, j)$, the $3 \times 3$ region includes the nine points $(i-1, j-1)$, $(i-1, j)$, $(i-1, j+1)$, $(i, j-1)$, $(i, j)$, $(i, j+1)$, $(i+1, j-1)$, $(i+1, j)$, $(i+1, j+1)$.
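The $3 \times 3$ extremum test above can be sketched as follows; this is an assumed illustration of the criterion, not the authors' code. A pixel is marked when it is strictly larger, or strictly smaller, than all eight surrounding values, and border pixels are skipped.

```python
import numpy as np

def local_extrema(D):
    """Boolean mask of 3x3 local maxima/minima of D (border pixels excluded)."""
    mask = np.zeros_like(D, dtype=bool)
    H, W = D.shape
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            patch = D[i - 1:i + 2, j - 1:j + 2]
            center = D[i, j]
            neighbors = np.delete(patch.ravel(), 4)  # the 8 surrounding values
            if center > neighbors.max() or center < neighbors.min():
                mask[i, j] = True
    return mask

D = np.array([[0, 1, 0, 0],
              [1, 5, 1, 0],
              [0, 1, 0, 0],
              [0, 0, 0, 0]], dtype=float)
print(np.argwhere(local_extrema(D)))  # [[1 1]] -- the value 5 is a 3x3 maximum
```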
After the DoG algorithm, the set of initial key points of image $I_i$ is obtained and denoted as
\[
P^i = \{P^i_1, P^i_2, \ldots, P^i_{S_i}\},
\]
where $S_i$ is the number of key points of image $I_i$.
Since there are too many initial key points, a filtering algorithm is used to remove sparse points and retain densely distributed ones. An example is illustrated in Fig. 3, and the filtering algorithm is introduced as follows.
The filtering operator is denoted as $h$:
\[
h: P^i = \{P^i_1, P^i_2, \ldots, P^i_{S_i}\} \to Q^i = \{Q^i_1, Q^i_2, \ldots, Q^i_{T_i}\}, \tag{3}
\]
where $T_i$ is the number of key points of image $I_i$ after filtering.
Each initial key point is judged by a Boolean function as in formula (4):
\[
b_{P^i_j} =
\begin{cases}
1, & l(P^i_j) \ge L, \\
0, & l(P^i_j) < L,
\end{cases} \tag{4}
\]
where $b = 1$ means the point is retained and $b = 0$ means it is removed; $l$ is a statistic function that counts the number of key points around
Fig. 1. The top two pictures show the similarity between salient region detection and the region of interest obtained with difference-of-Gaussian; the bottom two pictures show that there is a significant difference. The first column shows the original images, the middle column the corresponding salient regions, and the last column the region of interest with difference-of-Gaussian (the key points are labeled with red points). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
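The density-based filter of formula (4) can be sketched as below. Since the exact definitions of the statistic $l$ and the threshold $L$ are not given in this excerpt, the sketch assumes $l$ counts the key points within a Euclidean radius of each point; both the radius `r` and the threshold `L_thresh` are hypothetical parameters.

```python
import numpy as np

def filter_keypoints(points, r=10.0, L_thresh=3):
    """Retain a point (b = 1) only when at least L_thresh other key
    points lie within radius r of it; otherwise remove it (b = 0)."""
    pts = np.asarray(points, dtype=float)
    kept = []
    for j in range(len(pts)):
        d = np.linalg.norm(pts - pts[j], axis=1)
        l_value = np.count_nonzero(d <= r) - 1  # exclude the point itself
        if l_value >= L_thresh:                 # b = 1: retain
            kept.append(points[j])
    return kept

dense = [(0, 0), (1, 0), (0, 1), (1, 1)]  # densely clustered points
sparse = [(50, 50)]                       # an isolated (sparse) point
Q = filter_keypoints(dense + sparse, r=5.0, L_thresh=3)
print(Q)  # [(0, 0), (1, 0), (0, 1), (1, 1)] -- the sparse point is removed
```

The effect matches the description above: densely distributed points survive the filter, while isolated points are discarded.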
J. Zhang et al. / J. Vis. Commun. Image R. 26 (2015) 37–49