Programming G. Manacher
Techniques Editor
Expected Time Bounds
for Selection
Robert W. Floyd and Ronald L. Rivest
Stanford University
A
new selection algorithm is presented which is
shown to be
very efficient on the average, both theo-
retically and practically.
The number of
comparisons
used to select the ith smallest of n numbers
is
n q- min(i,n--i) q- o(n).
A
lower bound
within 9
percent
of
the above formula is also derived.
Key Words and Phrases: selection, computational
complexity, medians, tournaments, quantiles
CR Categories: 5.30, 5.39
1. Introduction
In this paper we present new bounds (upper and
lower) on the expected time required for selection. The
selection problem
can be succinctly stated as follows:
given a set X of n distinct numbers and an integer i,
1
< i < n, determine the ith smallest element of X
with as few comparisons as possible. The ith smallest
element, denoted by i0 X, is that element which is
larger than exactly i -- 1 other elements, so that 1 0 X
is the smallest, and n 0 X the largest, element in X.
Copyright O 1975, Association for Computing Machinery, Inc.
General permission to republish, but not for profit, all or part
of this material is granted provided that ACM's copyright notice
is given and that reference is made to the publication, to its date
of issue, and to the fact that reprinting privileges were granted
by permission of the Association for Computing Machinery.
This paper gives the theoretical background of Algorithm 489,
"The Algorithm
SELECT--for
finding the ith smallest of n ele-
ments," appearing on p. 173 of this issue.
This work was supported by the National Science Foundation
under grants GJ-992 and GJ-33170X. Authors' addresses: R.W.
Floyd, Stanford Computer Science Department, Stanford Uni-
versity, Stanford, CA 94305; R.L. Rivest, NE 43-807, Project MAC,
545 Technology Square, Cambridge, MA 02139.
We use the notations O(n) and o(n) in the following way:
f(n) < g(n) + O(n) means (3k > 0)(Vn)f(n) -- g(n) < kn, and
f(n) < g(n) + o(n) means lim~ ((f(n) --
g(n))/n) = O.
Let
f(i,n)
denote the expected number of com-
parisons required to select i 0 X. (We assume through-
out that all possible input orderings of the set X are
equally likely.) Since a selection algorithm must de-
termine, for every t C X, t ~ i0Xwhether t < i0X
or i0X < t, we have asatriviallowerbound
f(i,n)
> n- 1, for 1 < i < n. (1)
The best previously published selection algorithm is
FIND,
by C.A.R. Hoare [3]. Knuth [4] has determined
the average number of comparisons used by
FIND,
thus proving that
f(i,n) <
2((n Jr- 1)H~ -- (n -k- 3 --
i)H,~_i+l
(2)
-- (i -k- 2)Hi q- n -q- 3),
where
H,, = ~ j-1. (3)
l<j<n
This yields as special cases I
f(l,n) < 2n -4-
o(n),
(4)
and
f( [-n/2~ ,n) <_
211(1 + ln(2)) -4- o(n)
< 3.39/7 -k-
o(n).
(5)
No bounds better than (1) or (2) have previously been
published.
In Section 2 we present our new selection algorithm,
SELECT,
and derive by an analysis of its efficiency the
upper bound
f(i,n) < n + min(i,n -- i) + O(n '~
ln~'(n)). (6)
A small modification to
SELECT'is
then made, yielding
the slightly improved bound
f(i,n). <_ n +
min
(i,n
- i) + O(n½). (7)
An implementation of
SELECT
is given in Section 3
with timing results for both
SELECT
and
FIND.
The authors believe that
SELECT
is asymptotically
optimal in the sense that the function
sup f(
L~(n
-- 1)__1 + 1, 11)
F(~)
lim
(~Tef
.... n (8)
0<~<1
is bounded below by the analogue of the right-hand
side of (7), so that
F(a) >__ 1 -k min (¢,, 1 -- a), for 0_< ~_< l. (9)
A lower bound just a little better than 1 q- .75 min
(a, I -- a) is derived in Section 4, within 9 percent of
our conjecture and the performance of
SELECT.
165
Communications March 1975
of Volume 18
the ACM Number 3