REGULAR AND ALMOST UNIVERSAL HASHING 5
3. REGULARITY
Though we can require families of hash functions to have desirable properties such as uniformity
or universality, we also want individual hash functions to have reasonably good properties. For
example, what if a family contains the hash function h(x) = c for some constant c? This particular
hash function is certainly not desirable! In fact, it is the worst possible hash function for a hash table.
Yet we can find many such hash functions in a family that is otherwise strongly universal. Indeed,
Dietzfelbinger [9] proposed a strongly universal family made of the hash functions
h
A,B
(x) =
Ax + B mod 2
K
÷ 2
n−1
with integers A, B ∈ [0, 2
K
). It is strongly universal over the domain of integers x ∈ [0, 2
n
).
However, one out of 2
K
hash functions has A = 0. That is, if you pick a hash function at random,
the probability that you have a constant function (h
0,B
(x) = B ÷ 2
L−1
) is 1/2
K
. Though this
probability might be vanishingly small, many of the other hash functions have also poor distributions
of hash values. For example, if one picks A = 2
K−1
, then any two hash values (h
A,B
(x) and
h
A,B
(x
0
)) may only differ by one bit, at most. Letting A be odd also does not solve the problem:
e.g., A = 1, B = 0 gives the hash function x ÷ 2
n−1
which is either 1 or 0.
Such weak hash functions are a security risk [5, 6]. Thus, we require as much as possible that
hash functions be regular [24, 25].
Definition 1
A hash function h : X → Y is regular if for every y ∈ Y , we have that |{x ∈ X : (h(x) = y)}| ≤
d|X|/|Y |e. Further, a family H of hash functions is regular if every h ∈ H is regular.
We stress that this regularity property applies to individual hash functions.
‡
However, we can still
give a probabilistic interpretation to regularity: if we pick any two values x
1
and x
2
at random, the
probability that they collide h(x
1
) = h(x
2
) should be minimal (|Y |/|X|) if h is regular.
As an example, consider the case where X = Y = {0, 1}. There are only two regular hash
functions h : X → Y . The first one is the identity function (h
I
(0) = 0, h
I
(1) = 1) and the second
one is the negation function (h
N
(0) = 1, h
N
(1) = 0). The family {h
I
, h
N
} is uniform and universal:
the collision probability between distinct values is zero.
More generally, whenever X = Y , a function h : X → Y is regular if and only if it is a
permutation. This observation suffices to show that it is not possible to have strong universality and
regularity in general. Indeed, suppose that X = Y , then all hash functions h must be permutations.
Meanwhile, strong universality means that given that we know the hash value y of the element x (i.e.,
h(x) = y), we still known nothing about the hash value of x
0
for x
0
6= x. But if h is a permutation, we
know that the hash values differ (h(x
0
) 6= h(x))—contradicting strong universality. More formally,
if h is a permutation, we have that h(x) 6= h(x
0
) for x 6= x
0
which implies that P (h(x) = h(x
0
)) = 0
whereas P (h(x) = h(x
0
)) = 1/|Y | is required by strong universality. Thus, while we can have both
universality and regularity, we cannot have both strong universality and regularity.
The next two lemmas state that regularity is preserved under composition and concatenation.
Lemma 4
(Composition) Assume that |Y | divides |X| and |Z| divides |X|. Let f : X → Y and g : Y → Z be
regular hash functions then f ◦ g : X → Z is also regular.
Lemma 5
(Concatenation) Let f : X
1
→ Y
1
and g : X
2
→ Y
2
be regular hash functions then the function
h : X
1
× X
2
→ Y
1
× Y
2
defined by h(x
1
, x
2
) = (f(x
1
), g(x
2
)) is also regular.
‡
In contrast, Fleischmann et al. [26] used the term -almost regular to indicate that a family is almost uniform:
P (h(x) = y) ≤ for all x and y given that h is picked in H.