Mining Tags from Flickr User Comments Using a Hybrid Ranking Model


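"Mining Tags from Flickr User Comments Using a Hybrid Ranking Model" examines how, in the Web 2.0 era, user-generated content has become the main source of information for many popular websites such as Flickr. On Flickr, users share photos and easily browse those of others, and the tagging system is an important means of photo management. However, many photos have few or no tags, because only the uploader can add tags to a photo. When users browse photos that interest them, they may express their own views of a photo through comments, so recommending new tags or enriching the existing tag set based on user comments becomes critical.

The paper focuses on using natural language processing (NLP) techniques to mine potential tags from Flickr user comments. The authors propose a hybrid ranking model that generates candidate tags and improves the existing tag system. The method first applies NLP techniques to analyze user comments and extract keywords and phrases that may relate to the subject or content of the photo. It then ranks these candidate tags by combining different ranking algorithms (such as PageRank and TF-IDF) to identify the most relevant and useful tags. Building a hybrid ranking model typically involves the following steps:

1. Data preprocessing: collect user comment data and clean the text, removing irrelevant punctuation, stop words, and similar noise.
2. Lexical analysis: apply part-of-speech tagging and word segmentation to identify potentially meaningful words and phrases.
3. Feature extraction: build feature vectors from word frequency, contextual relatedness, and phrase co-occurrence.
4. Ranking strategy: combine several ranking algorithms, such as link-analysis-based PageRank and document-frequency-based TF-IDF, to produce an overall ranking of the candidate tags.
5. Feedback mechanism: possibly incorporate user feedback to keep improving the performance and accuracy of the recommender.

The contribution of the study is an effective way to turn the implicit information in user comments into usable tags, which improves the searchability and discoverability of photos. The approach helps address the problem of photos lacking tags and also improves the user experience by making it easier for users to find photos that match their interests. The work offers a new perspective on content management and information retrieval for social media platforms, especially platforms with high user engagement and rich content such as Flickr. Analyzing user comments in depth can yield more complete and more accurate metadata, which in turn improves overall system performance and user satisfaction. This is relevant not only to photo-sharing platforms like Flickr but also to other social networks that depend on user-generated content.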
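The preprocessing and TF-IDF ranking steps listed in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the stop-word list, the function names `tokenize` and `rank_candidates`, the smoothed IDF formula, and the sample comments are all assumptions made for illustration, and each photo's comments are treated as one document.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; a real system would use a fuller one).
STOPWORDS = {"a", "an", "the", "this", "is", "of", "and", "in", "i", "it"}


def tokenize(comment):
    """Lowercase a comment and keep alphabetic tokens that are not stop words."""
    return [w for w in re.findall(r"[a-z]+", comment.lower()) if w not in STOPWORDS]


def rank_candidates(comments_by_photo, photo_id, top_k=5):
    """Rank candidate tags for one photo by TF-IDF, one document per photo."""
    # Term counts per photo, pooled over all of that photo's comments.
    docs = {pid: Counter(tok for c in comments for tok in tokenize(c))
            for pid, comments in comments_by_photo.items()}
    # Document frequency: in how many photos' comments each word appears.
    doc_freq = Counter()
    for counts in docs.values():
        doc_freq.update(counts.keys())
    term_counts = docs[photo_id]
    total = sum(term_counts.values())
    n_docs = len(docs)
    scores = {}
    for word, count in term_counts.items():
        # Smoothed IDF (hypothetical choice) so that no word scores zero.
        idf = math.log((1 + n_docs) / (1 + doc_freq[word])) + 1.0
        scores[word] = (count / total) * idf
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


# Hypothetical sample data.
comments = {
    "p1": ["Beautiful sunset over the beach", "Love the sunset colors", "Great beach shot"],
    "p2": ["Great portrait", "Lovely portrait lighting"],
}
for word, score in rank_candidates(comments, "p1", top_k=3):
    print(f"{word}: {score:.3f}")
```

Words that recur in one photo's comments but are rare elsewhere (here "beach" and "sunset") rise to the top, while generic praise such as "great" that appears under many photos is pushed down.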

Mining Tags from Flickr User Comments Using a Hybrid Ranking Model

Jingxuan Li, Haijun Zhang and Bin Luo

School of Computer Science and Technology,

HIT Shenzhen Graduate School

Shenzhen, China

[jing910307, aarhzhang, hitluobin]@gmail.com

Yan Li

School of Computer Engineering,

Shenzhen Polytechnic

Shenzhen, China

liyan@szpt.edu.cn

Abstract—In the Web 2.0 era, user-generated content has become the main source of information for many popular websites such as Flickr. In Flickr, each user can share his/her photos and easily browse those of others. The tagging system is an important approach to photo management in Flickr: users can browse photos by clicking their attached tags. However, many photos have very few or even no tags, because only the uploader can add tags to a photo. Meanwhile, when a user browses a photo he/she is interested in, he/she may leave comments to express his/her independent viewpoint on the photo. Therefore, it is critical to recommend new tags or enrich the existing tag set based on user comments. Relying on Natural Language Processing (NLP) techniques, this paper introduces a word-based method for generating candidate tags extracted from user comments. In the phase of sorting and recommending tags, we propose an algorithm that jointly models the location information of candidate tags, statistical information, and semantic similarity. Extensive experimental results demonstrate the effectiveness of our method.

Keywords-tag recommendation; user comment; Flickr
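As a rough illustration of what jointly modeling location, statistical information, and semantic similarity could look like, the sketch below scores each candidate by a weighted sum of three signals. This excerpt does not give the paper's actual formulas, so everything here is an assumption: the function name `hybrid_rank`, the weights, the definitions of the position and frequency signals, and the toy similarity table that stands in for a real semantic measure.

```python
from collections import Counter


def hybrid_rank(tokens, existing_tags, similarity, weights=(0.3, 0.4, 0.3)):
    """Rank distinct tokens by a weighted sum of three signals, each in [0, 1].

    All three definitions are hypothetical stand-ins:
      position:  earlier first occurrence in the comment stream scores higher
      frequency: count relative to the most frequent token
      semantic:  best similarity to any existing tag, via a caller-supplied function
    """
    if not tokens:
        return []
    w_pos, w_freq, w_sem = weights
    counts = Counter(tokens)
    max_count = max(counts.values())
    first_seen = {}
    for i, tok in enumerate(tokens):
        first_seen.setdefault(tok, i)  # keep only the first index of each token
    scores = {}
    for tok, count in counts.items():
        pos = 1.0 - first_seen[tok] / len(tokens)
        freq = count / max_count
        sem = max((similarity(tok, tag) for tag in existing_tags), default=0.0)
        scores[tok] = w_pos * pos + w_freq * freq + w_sem * sem
    # Descending score, ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy similarity table (hypothetical values, not derived from any real resource).
TOY_SIM = {("ocean", "sea"): 0.9, ("beach", "sea"): 0.6}


def toy_similarity(word, tag):
    return TOY_SIM.get((word, tag), 0.0)


ranked = hybrid_rank(["sunset", "beach", "sunset", "nice", "ocean"], ["sea"], toy_similarity)
for tok, score in ranked:
    print(f"{tok}: {score:.2f}")
```

In this toy run "ocean" outranks "nice" even though it appears later and equally often, because it is close to the existing tag "sea"; changing the weights shifts the balance between the three signals.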

I. INTRODUCTION

Photo-sharing platforms have undoubtedly become the first choice for photographers to store and display their favorite digital photos taken by camera or smartphone. Many photo-sharing websites and applications make it easy for people to access online photos and to share opinions with others by leaving comments after browsing. Flickr, a representative image hosting website, was created by Ludicorp in 2004 and acquired by Yahoo in 2005. The Verge reported in March 2013 that Flickr had a total of 87 million registered members and more than 3.5 million new photos uploaded daily¹. Previous work on the usage of user comments mostly focuses on opinion mining, sentiment analysis, and semantic polarity analysis. In this paper, we consider the problem of mining tags from Flickr user comments using a hybrid ranking model.

Recommendation in traditional e-commerce systems is mainly explored between users and items. In a social tagging system, however, recommendation usually involves users, tags, and uploaded resources (e.g., photos and videos). Marinho and Schmidt-Thieme applied collaborative filtering (CF) to the tag recommendation problem and quantitatively evaluated its performance in comparison with other tag recommenders [1]. Plurality, an interactive tagging system that couples the collective intelligence of existing tag-based resources with a personalized context and feedback-sensitive interface, was presented in [2]. Oliveira et al. introduced a novel text-based tag suggestion system, Tess [3]. Hotho et al. [4] argued that enhanced search facilities are vital for emergent semantics within folksonomy-based systems. In [5], a framework called PAPERE was presented for integrating the Web 2.0 paradigm, especially social annotation, with user modeling, showing that both methodologies can benefit from proper integration.

¹ http://en.wikipedia.org/wiki/Flickr.

Automatic image annotation aims to have the computer automatically tag unlabeled images with meaningful semantic keywords that reflect their visual content. There is a large body of literature on image annotation. Existing algorithms can be roughly divided into three categories: classification-based methods, probabilistic-modeling-based methods, and graph-learning-based methods. Monay and Gatica-Perez presented three alternative algorithms to learn a Probabilistic Latent Semantic Analysis (PLSA) model for annotated images [6]. Wang et al. proposed a method for automatic image annotation that extends the typical cross-media relevance model [7]. A graph learning framework including image-based graph learning and word-based graph learning was developed for image annotation [8].

Research on comment mining for tag recommendation has been limited so far. Previous work mostly concerns opinion mining, sentiment analysis, and semantic polarity analysis. Li et al. presented a framework for news recommendation in social media that incorporates information from the entire discussion thread [9]. Momeni et al. analyzed user-generated comments on media objects from different museums and libraries to shed light on the characteristics of useful comments and to identify the key features of comments for inferring usefulness [10]. Moreover, Sureka et al. developed a rule-based system to automatically identify comment spammers in YouTube forums by mining the comment activity logs of users [11].

Unlike previous work, our work treats a tag as a new form of keyword in user-generated content. The rationale is that Flickr is, in a more general context, one kind of social media, and the form of social media of particular interest here is user-generated content. In user-generated content, a user who uploads a photo tags it only from his/her own viewpoint or interest. However, an attractive photo usually draws many other viewers to express their independent viewpoints through comments. According to our observation, user comments in Flickr usually belong to five categories: (1) description of photo content, (2) description of photo shooting skill, (3) compliment, (4) recommendation for photo group or photo albums, and (5)
