Mining Tags from Flickr User Comments Using a Hybrid Ranking Model
Jingxuan Li, Haijun Zhang and Bin Luo
School of Computer Science and Technology,
HIT Shenzhen Graduate School
Shenzhen, China
[jing910307, aarhzhang, hitluobin]@gmail.com
Yan Li
School of Computer Engineering,
Shenzhen Polytechnic
Shenzhen, China
liyan@szpt.edu.cn
Abstract—In the Web 2.0 era, user-generated content has
become the main source of information on many popular
websites such as Flickr. In Flickr, each user can share
his/her photos and browse others’ easily. The tagging system
is an important approach to photo management in Flickr:
users can browse photos by clicking their attached tags.
However, many photos have very few or even no tags,
because only the uploader can add tags to a photo.
Meanwhile, a user who browses a photo of interest may
leave comments expressing his/her independent viewpoint
on the photo. Therefore, it is critical
to recommend new tags or enrich the existing tag set based
on user comments. Relying on Natural Language Processing
(NLP) techniques, this paper introduces a word-based
method for generating candidate tags extracted from user
comments. In the tag ranking and recommendation phase,
we propose an algorithm that jointly models the location
information of candidate tags, statistical information, and
semantic similarity. Extensive experimental results
demonstrate the effectiveness of our method.
Keywords-tag recommendation; user comment; Flickr
I. INTRODUCTION
Photo-sharing platforms have undoubtedly become
the first choice for photographers to store and display their
favorite digital photos taken by camera or smartphone.
Many photo-sharing websites and applications make it
easy for people to access online photos and share opinions
with others by leaving comments after browsing. Flickr, a
representative of image hosting websites, was created by
Ludicorp in 2004 and acquired by Yahoo in 2005. The
Verge reported in March 2013 that Flickr had a total of 87
million registered members and more than 3.5 million new
photos uploaded daily (see http://en.wikipedia.org/wiki/Flickr).
Previous work on the usage of user
comments mostly focuses on opinion mining, sentiment
analysis and semantic polarity analysis. In this paper, we
consider the problem of mining tags from Flickr user
comments using a hybrid ranking model.
Recommendation in traditional e-commerce systems
mainly concerns users and items. In a social tagging
system, however, recommendation usually involves users, tags,
and uploaded resources (e.g., photos and videos). Marinho
and Schmidt-Thieme applied collaborative filtering (CF) to
the tag recommendation problem and quantitatively evaluated
its performance against other tag recommenders
[1]. Plurality, an interactive tagging system that couples
the collective intelligence of existing tag-based resources
with a personalized context and feedback-sensitive
interface, was presented [2]. Oliveira et al. introduced a
novel text-based tag suggestion system, Tess [3]. Hotho et
al. [4] argued that enhanced search facilities are vital for
emergent semantics within folksonomy-based systems.
In [5], a framework called PAPERE was
presented for integrating the Web 2.0 paradigm,
especially social annotation, with user modeling, showing
that both methodologies can benefit from proper
integration.
Automatic image annotation aims to have the computer
automatically tag unlabeled images with meaningful
semantic keywords that reflect their visual content. There is a
large body of literature on image annotation.
Existing algorithms can be roughly divided into three
categories: classification based methods, probabilistic
modeling based methods, and graph learning based
methods. Monay and Gatica-Perez presented three
alternative algorithms to learn a Probabilistic Latent
Semantic Analysis (PLSA) model for annotated images [6].
Wang et al. proposed a method for automatic image
annotation that extends the typical cross-media
relevance model [7]. A graph learning framework
comprising image-based graph learning and word-based
graph learning was developed for image annotation
[8].
Research on comment mining for tag recommendation
is limited so far. Previous work mostly concerns
opinion mining, sentiment analysis, and semantic polarity
analysis. Li et al. presented a framework for news
recommendation in social media by incorporating
information from the entire discussion thread [9]. Momeni
et al. conducted an analysis of user-generated comments
on media objects of different museums and libraries to
shed some light on the characteristics of useful comments
and to identify the important key features of comments for
inferring usefulness [10]. Moreover, Sureka et al.
developed a rule-based system to automatically identify
comment spammers in YouTube forums by mining
comment activity logs of users [11].
Unlike previous work, our work treats
a tag as a new form of keyword in user-generated
content. The rationale is that Flickr is
one kind of social media in a more general context, and one
form of social media of particular interest here is
user-generated content. In such content, a user who
uploads a photo tags it only from his/her own viewpoint or
interest. However, an attractive photo
usually draws many other viewers to express their
independent viewpoints through comments. According to
our observation, user comments in Flickr usually belong to
five categories: (1) description of photo content, (2)
description of photo shooting skill, (3) compliment, (4)
recommendation for photo group or photo albums, and (5)