
Improving contextual advertising
2.2.4 Category pages
In Wikipedia, each article can belong to more than one category, e.g., the article about the
“iPhone” belongs to two categories: “Apple Inc. mobile phones” and “Digital audio players.”
Moreover, these categories can be further categorized by associating them with one or more
parent categories. As shown in Fig. 2, the category about “Mammals” belongs to two parent
categories: “Vertebrates” and “Tetrapods.” Thus, the category structure does not form a simple
tree-structured taxonomy, but a directed acyclic graph (see the right side of Fig. 2), where
multiple categorization schemes coexist simultaneously.
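
The multi-parent structure described above can be illustrated with a minimal Python sketch. It uses
only the two examples named in the text; the names PARENTS, ancestors, and is_tree_like are
hypothetical helpers introduced for illustration and are not part of any Wikipedia API or of the
authors' system.

```python
from collections import deque

# Toy category graph built only from the examples in the text.
# Keys are articles or categories; values are their direct parent categories.
# (Hypothetical data structure, assumed for illustration.)
PARENTS = {
    "iPhone": ["Apple Inc. mobile phones", "Digital audio players"],
    "Mammals": ["Vertebrates", "Tetrapods"],
}


def ancestors(node, parents=PARENTS):
    """Return every category reachable from `node` by following parent links."""
    seen = set()
    queue = deque(parents.get(node, []))
    while queue:
        category = queue.popleft()
        if category in seen:
            continue  # already reached through another parent path
        seen.add(category)
        queue.extend(parents.get(category, []))
    return seen


def is_tree_like(parents=PARENTS):
    """A tree allows at most one parent per node; a DAG does not need to."""
    return all(len(direct) <= 1 for direct in parents.values())


if __name__ == "__main__":
    print(sorted(ancestors("Mammals")))
    print(is_tree_like())
```

Because both "iPhone" and "Mammals" have two direct parents, is_tree_like() returns False for this
toy graph, which is the sense in which the category structure is a directed acyclic graph rather
than a tree.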
All the above linkage and categorization information forms a huge thesaurus that captures
the semantic relationships between concepts (similar to an ontology graph). In
this paper, our major motivation is to utilize this informative graph to improve
contextual semantic matching between pages and ads.
3 Problem statement
3.1 Problem definition
Without loss of generality, the task of contextual advertising is defined as the selection of the
most relevant ads to a given page. Let $p$ be a targeted page used to match candidate ads. Let
$A = \{a_j\}_{j=1}^{N_a}$ be the candidate ad database, which contains $N_a$ ads. Let $N$
be the number of expected ads to be embedded into a page, which is generally given by the
publisher. Let $\mathrm{sim}(p, a_j)$ be the similarity metric, which is used to compute the
relevance between the page $p$ and the ad $a_j$. Then, the above expectation about selecting the
$N$ most relevant ads for a given page $p$ from the candidate ad database $A$ can be formulated
as follows, where $x_j$ indicates whether the ad $a_j$ is selected ($x_j = 1$) or not ($x_j = 0$):

$$\max_{x} f(x) = \sum_{j=1}^{N_a} x_j \, \mathrm{sim}(p, a_j) \quad \text{s.t.} \quad \sum_{j=1}^{N_a} x_j = N, \quad x_j \in \{0, 1\} \tag{1}$$

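
As a minimal sketch of Eq. (1), assume the similarity scores sim(p, a_j) have already been computed
for one page. Because the objective is a sum of independent per-ad terms under a single cardinality
constraint, it is maximized by picking the N ads with the largest scores. The names select_ads,
objective, and sim_scores, as well as the example scores, are hypothetical and chosen only for
illustration; the sketch does not say how the similarity itself is computed.

```python
import heapq


def select_ads(sim_scores, n):
    """Return the 0/1 indicator vector x of Eq. (1) for precomputed scores.

    sim_scores[j] plays the role of sim(p, a_j); n plays the role of N.
    """
    if not 0 <= n <= len(sim_scores):
        raise ValueError("n must be between 0 and the number of candidate ads")
    # Indices of the n largest similarity scores.
    top = heapq.nlargest(n, range(len(sim_scores)), key=sim_scores.__getitem__)
    x = [0] * len(sim_scores)
    for j in top:
        x[j] = 1
    return x


def objective(x, sim_scores):
    """Value of f(x): the summed similarity of the selected ads."""
    return sum(x_j * s for x_j, s in zip(x, sim_scores))


if __name__ == "__main__":
    scores = [0.2, 0.9, 0.5, 0.7]  # invented scores for four candidate ads
    x = select_ads(scores, 2)
    print(x, objective(x, scores))
```

With the invented scores above and N = 2, the second and fourth ads are selected. The selection step
is cheap once the scores exist, so the cost of the whole procedure is dominated by evaluating
sim(p, a_j) for each candidate ad.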
Based on Eq. (1), we conclude that $\mathrm{sim}(p, a_j)$ is an essential metric, whose accuracy
directly determines the accuracy of selected ads to their pages. In other words, the most
essential problem in practical contextual advertising is how to accurately and efficiently judge
the relevance of an ad to a given Web page. More specifically, a good similarity metric that
is able to accurately measure the relevance between a page $p$ and an ad $a_j$ [i.e.,
$\mathrm{sim}(p, a_j)$] should satisfy the following two requirements: (1) good accuracy on
relevance judgment between pages and ads, i.e., the more relevant the page $p$ is to the ad $a_j$,
the greater the value of $\mathrm{sim}(p, a_j)$ should be; and (2) good efficiency, i.e., it should
be as efficient as possible when computing the value of $\mathrm{sim}(p, a_j)$, to decrease the
time spent on contextual advertising.
However, in practical contextual advertising, it is difficult to balance accuracy and
efficiency. In general, different relevance judgments may result in different contextual
advertising techniques. In the following subsections, we will briefly introduce the three main
approaches to contextual advertising and discuss their advantages and disadvantages.
3.2 Keyword matching
The well-known keyword matching approach estimates ad relevance based on the co-occurrence
of the same keywords between pages and ads. It has been widely applied in