11:10 I. F. Ilyas et al.
whenever p
i
≤ ´p
i
for every i. We elaborate on the impact of function monotonicity on
top-k processing in Section 6.1.
In more complex applications, a ranking function might need to be expressed as a
numeric expression to be optimized. In this setting, the monotonicity restriction of the
ranking function is relaxed to allow for more generic functions. Numerical optimization
tools as well as indexes are used to overcome the processing challenges imposed by such
ranking functions.
Another group of applications address ranking objects without specifying a ranking
function. In some environments, such as data exploration or decision making, it might
not be important to rank objects based on a specific ranking function. Instead, objects
with high quality based on different data attributes need to be reported for further
analysis. These objects could possibly be among the top-k objects of some unspecified
ranking function. The set of objects that are not dominated by any other objects, based
on some given attributes, are usually referred to as the skyline.
We classify top-k processing techniques based on the restrictions they impose on the
underlying ranking function as follows:
—Monotone ranking function. Most of the current top-k processing techniques assume
monotone ranking functions since they fit in many practical scenarios, and have
appealing properties allowing for efficient top-k processing. One example is Fagin
et al. [2001]. We discuss the properties of monotone ranking functions in Section 6.1.
—Generic ranking function. A few recent techniques, for example, Zhang et al. [2006],
address top-k queries in the context of constrained function optimization. The ranking
function in this case is allowed to take a generic form. We discuss the details of these
techniques in Section 6.2.
—No ranking function. Many techniques have been proposed to answer skyline-related
queries, for example, B
¨
orzs
¨
onyi et al. [2001] and Yuan et al. [2005]. Covering current
skyline literature in detail is beyond the scope of this survey. We believe it worth a
dedicated survey by itself. However, we briefly show the connection between skyline
and top-k queries in Section 6.3.
2.6. Impact of Design Dimensions on Top-k Processing Techniques
Figure 4 shows the properties of a sample of different top-k processing techniques that
we describe in this survey. The applicable categories under each taxonomy dimension
are marked for each technique. For example, TA [Fagin et al. 2001] is an exact method
that assumes top-k selection query model, and operates on certain data, exploiting
both sorted and random access methods. TA integrates with database systems at the
application level, and supports monotone ranking functions.
Our taxonomy encapsulates different perspectives to understand the processing re-
quirements of current top-k processing techniques. The taxonomy dimensions, dis-
cussed in the previous sections, can be viewed as design dimensions that impact the
capabilities and the assumptions of the underlying top-k algorithms. In the following,
we give some examples of the impact of each design dimension on the underlying top-k
processing techniques:
—Impact of query model. The query model significantly affects the solution space of the
top-k algorithms. For example, the top-k join query model (Definition 2.2) imposes
tight integration with the query engine and physical join operators to efficiently
navigate the Cartesian space of join results.
—Impact of data access. Available access methods affect how different algorithms com-
pute bounds on object scores and hence affect the termination condition. For example,
ACM Computing Surveys, Vol. 40, No. 4, Article 11, Publication date: October 2008.