Large-Scale E-Commerce Image Retrieval with
Top-Weighted Convolutional Neural Networks
Shichao Zhao¹, Youjiang Xu¹, Yahong Han¹,²
¹School of Computer Science and Technology, Tianjin University, Tianjin, China
²Tianjin Key Lab. of Cognitive Computing & Application, Tianjin University, Tianjin, China
{zhaoshichao, yjxu, yahong}@tju.edu.cn
ABSTRACT
Several recent studies have shown that image features produced by Convolutional Neural Networks (CNNs) provide state-of-the-art performance for image classification and retrieval. Moreover, some researchers have found that
the features extracted from the deep convolutional layers of
CNNs perform better than those from the fully-connected layers. Features extracted from the convolutional layers have
a natural interpretation: descriptors of local image regions
correspond well to the receptive fields of the particular fea-
tures. In order to obtain both representative and discrimina-
tive descriptors for large-scale e-commerce image retrieval,
we present a new feature extraction framework. First, we propose the Top-Weight method to automatically detect the regions of interest in e-commerce images. With the
estimated weights, we then aggregate local deep features to produce a high-quality global representation for e-commerce image retrieval. We conducted experiments on the e-
commerce dataset ALISC [1] released by Alibaba Group.
Experimental results show that our method outperforms
other deep learning based methods.
Keywords
Image Features, CNNs, Top-Weight
1. INTRODUCTION
With the rapid progress of digital technology, the number of digital images is growing explosively. This trend makes
image retrieval an important and challenging research topic
nowadays. In particular, the task of e-commerce image retrieval has bright prospects and great commercial value.
For much of the past decade, bag-of-features methods were
considered to be the state-of-the-art [9], especially when
built on top of locally invariant features like SIFT [8]. In
recent years, deep convolutional neural networks have at-
tracted much attention in visual recognition, largely due to
their good performance. It has been discovered that the ac-
tivations of CNNs pretrained on a large dataset, such as Im-
ageNet [4], can be employed as a generic image representation that adapts to many visual problems and delivers impressive performance. Initially, researchers utilized the fully-connected layers of deep networks as global image representations [3]. With the evolution of deep representations, research attention has shifted from the fully-connected layers to the deep convolutional layers of CNNs, from which local convolutional descriptors are extracted. How to aggregate a set of local descriptors into a global one has been studied extensively [5]; the best-known aggregation approaches used with SIFT are VLAD [7] and Fisher Vectors [10]. However, owing to the differences between deep convolutional features and hand-crafted features like dense SIFT, it has been shown that the preliminary embedding step is unnecessary for deep convolutional features, because of their higher discriminative ability and different distribution properties. Hence, for deep convolutional features, usually only the aggregation step is performed.
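To make this aggregation-only pipeline concrete, the following sketch (in Python with NumPy; the random array is a hypothetical stand-in for activations of a pretrained CNN) treats a convolutional feature map of shape C x H x W as H*W local C-dimensional descriptors and sum-pools them into a single global descriptor:

import numpy as np

# Hypothetical stand-in for a deep convolutional feature map of a
# pretrained CNN; shape is (channels, height, width).
C, H, W = 512, 13, 13
fmap = np.random.rand(C, H, W).astype(np.float32)

# View the map as H*W local descriptors, each C-dimensional.
descriptors = fmap.reshape(C, H * W).T        # shape: (H*W, C)

# Aggregation without any embedding step: sum-pool and L2-normalize.
global_desc = descriptors.sum(axis=0)
global_desc /= np.linalg.norm(global_desc)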
Moreover, SPoC [2] proposed a weighting method named center prior, based on the prior knowledge that the object of interest lies in the center of the image, which achieves good performance on benchmark datasets. However, this strong assumption often becomes a constraint that hurts performance in many applications. This is especially true for e-commerce images: unlike scene images, their target objects are crucial for retrieval. Thus, the crux of the problem is to locate the target object accurately.
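For reference, the center prior of SPoC [2] weights spatial positions with a 2-D Gaussian peaked at the image center before sum-pooling. The sketch below is a minimal illustration; the sigma value and the L2 normalization are our assumptions rather than the exact choices of [2]:

import numpy as np

def center_prior(H, W, sigma):
    # 2-D Gaussian weight peaking at the center of the feature map.
    ys, xs = np.mgrid[0:H, 0:W]
    d2 = (ys - (H - 1) / 2.0) ** 2 + (xs - (W - 1) / 2.0) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))   # shape: (H, W)

C, H, W = 512, 13, 13
fmap = np.random.rand(C, H, W).astype(np.float32)  # stand-in activations

# Weight each spatial position before sum-pooling across locations.
w = center_prior(H, W, sigma=H / 3.0)              # sigma: assumption
desc = (fmap * w[None, :, :]).sum(axis=(1, 2))
desc /= np.linalg.norm(desc)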
Motivated by the above discussions, we put forward a
more effective feature representation for e-commerce image
retrieval. In this framework, we propose a new concept named Top-Weight, which average-pools the top convolutional layer across channels. Figure 1 illustrates the flowchart of extracting the Top-Weight; as can be seen, the Top-Weight exhibits high correlation with the target areas, so it yields more discriminative image features. We first calculate the Top-Weight from
the top convolutional layer. Then we multiply the calculated
Top-Weight by the features extracted from various convolutional layers, ranging from the shallow to the deep layers of the CNN. Our goal is to capture both low-level and high-level
information. Finally, we aggregate a set of local convolu-
tional descriptors into a final feature representation for im-
age retrieval. We report experimental results on an e-commerce dataset sampled from ALISC; the results demonstrate the effectiveness of our method compared with other CNN-based methods.
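The following is a minimal sketch of the proposed pipeline, assuming sum-pooling as the final aggregation, nearest-neighbor resizing to align the Top-Weight with layers of different spatial resolutions, and concatenation of the per-layer descriptors; these choices, the layer shapes, and the normalizations are illustrative assumptions, and the random arrays stand in for activations of a pretrained CNN:

import numpy as np

def top_weight(top_fmap):
    # Average the top convolutional layer across channels, yielding
    # one weight per spatial position; normalization is an assumption.
    w = top_fmap.mean(axis=0)                  # shape: (H, W)
    return w / (w.sum() + 1e-12)

def resize_nn(w, H, W):
    # Nearest-neighbor resize of the weight map so it matches a layer
    # with a different spatial resolution (illustrative assumption).
    ys = np.arange(H) * w.shape[0] // H
    xs = np.arange(W) * w.shape[1] // W
    return w[np.ix_(ys, xs)]

def weighted_descriptor(fmap, w):
    # Multiply local descriptors by the Top-Weight, sum-pool over
    # spatial positions, and L2-normalize.
    C, H, W = fmap.shape
    wm = resize_nn(w, H, W)
    d = (fmap * wm[None, :, :]).sum(axis=(1, 2))
    return d / np.linalg.norm(d)

# Stand-in feature maps from a shallow, a middle, and the top layer.
shallow = np.random.rand(256, 28, 28).astype(np.float32)
middle  = np.random.rand(512, 14, 14).astype(np.float32)
top     = np.random.rand(512,  7,  7).astype(np.float32)

w = top_weight(top)
# Concatenating the per-layer descriptors captures both low-level and
# high-level information (concatenation is our assumption here).
final = np.concatenate([weighted_descriptor(f, w)
                        for f in (shallow, middle, top)])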