124 • D. Comer
Recall that in a binary search tree the
branch taken at a node depends on the
outcome of a comparison of the query key
and the key stored at the node. If the query
is less than the stored key, the left branch
is taken; if it is greater, the right branch is
followed. Figure 2 shows part of such a tree
used to store employee numbers, and the
path taken for the query "15."
Now consider Figure 3 which shows a
modified search tree with two keys stored
in each node. Searching proceeds by choos-
ing one of three paths at each node. In the
figure, the query, 15, is less than 42 so the
leftmost would be taken at the root. For
those queries between 42 and 81 the center
path would be selected, while the rightmost
path would be followed for queries greater
than 81. The decision procedure is repeated
at each node until an exact match occurs
(success) or a leaf is encountered (failure).
In general, each node in a B-tree of order
d contains at most 2d keys and 2d + 1
pointers, as shown in Figure 4. Actually,
the number of keys may vary from node to
node, but each must have at least d keys
and d + 1 pointers. As a result, each node
is at least 1/~ full. In the usual implementa-
tion a node forms one record of the index
f'fle, has a fixed length capable of accom-
modating 2d keys and 2d pointers, and con-
tains additional information telling how
many keys correctly reside in the node.
/ \ / \
FIGURE 2. Part of a binary search tree for employee
numbers The path taken for query "15" is darkened.
/
I N\
/
!~2 I,I ~3, I
1
1
/
1
1 1
\
FIGURE 3. A search tree with 2 keys and 3 branches
per node. The path taken for query "15" is darkened.
Ugually, large, multikey nodes cannot be
kept in main memory and require an access
to secondary storage each time they are to
be inspected. Later, we will see how, under
our cost criterion, maintaining more than
one key per node lowers the cost of find,
insert, and delete operations.
Balancing
The beauty of B-trees lies in the methods
for inserting and deleting records that al-
ways leave the tree balanced. As in the case
of binary search trees, random insertions of
records into a file can leave a tree unbal-
anced. While an unbalanced tree, like the
one shown in Figure 5a has some long paths
and some short ones, a balanced tree, like
the one shown in Figure 5b, has all leaves
at the same depth. Intuitively, B-trees have
a shape as shown in Figure 6. The longest
path in a B-tree of n keys contains at most
about logdn nodes, d being the order of the
B-tree. A find operation may visit n nodes
FIGURE 4. A node in a B-tree of order d with 2d
keys and 2d + 1 pointers.
1
I,, , -, \1
I/t',
--1
I/i,, \1
I/ \ -..I
--....
(b)
FIGURE 5. (a) An unbalanced tree with many long
paths, and (b) a balanced tree with all paths to
leaves exactly the same length.
Computing Surveys, Vol 11, No 2, June 1979