the R-tree divides the search space by minimum bounding
rectangles with allowance of the overlap between them.
Among these structures, the R-tree is the most efficient for
dynamic updates. With allowing overlap between minimum
bounding rectangles, the R-tree is kept balanced even after
dynamic updates. Therefore, the R-tree is regarded as one
of the most suitable index structures for the online database
applications.
The method of the region split has great influence on
the performance of index structures. Both the SS-tree and
the SR-tree are derived from the R-tree, and thus their
structure is similar to the R-tree. However, the SS-tree and
the SR-tree have better performance than the R-tree with
respect to the nearest-neighbor search. This performance
improvement comes from the modification of the region
split. The tree construction algorithm is common to the
SS-tree and the SR-tree. It is originally designed for the
SS-tree; the SR-tree borrows it from the SS-tree. This
algorithm splits regions based on the variance of coordi-
nates. According to the performance evaluation, it is shown
that this algorithm generates regions with shorter diameters
than does the algorithm of the R
*
-tree [1, 2].
While the region shape of the SS-tree is a bounding
sphere, that of the SR-tree is the intersection of a bounding
sphere and a bounding rectangle (Fig. 2). The SR-tree
employs bounding rectangles in addition to bounding
spheres because this enables regions to be smaller. Bound-
ing rectangles and bounding spheres are complementary to
each other in high-dimensional space. Bounding rectangles
reduce the volume of regions while bounding spheres re-
duce the diameter of regions. When employing both, the
SR-tree outperforms the SS-tree. Therefore, this paper em-
ploys the SR-tree for a multidimensional indexing method.
2.3. Nearest-neighbor search algorithms for
multidimensional index structures
The nearest-neighbor search finds such a point that is
closest to a given query point. This search is widely used in
multimedia applications, such as the similarity retrieval of
images based on feature vectors. Two types of search algo-
rithms have been proposed for multidimensional index
structures: the depth-first algorithm [7] and the breadth-first
algorithm [8]. The fundamental flow is common to both
algorithms. The search starts from the root node and visits
regions in ascending order of the distance from the query
point. On visiting each region, the candidate of the nearest
neighbor is determined by choosing the nearest point en-
countered so far. The search continues as long as there
remains such a region that is not visited so far and that is
closer than the candidate. The search terminates when no
region remains to be visited. The final candidate is the
answer of the query.
Both algorithms require a priority queue to keep the
nodes of the tree structure in ascending order of the distance
from the query point. On visiting a node, the children of the
node are pushed into the priority queue and then the first
node in the queue is chosen to be visited next. The differ-
ence between the depth-first and the breadth-first algorithm
resides in the method of maintaining the priority queue and
consequently makes a difference in the order of tree tra-
versal: one is depth-first and the other is breadth-first.
The depth-first algorithm allocates a priority queue
at each internal node and visits its child nodes recursively
in ascending order of the distance from the query point, that
is, a priority queue is allocated independently at each inter-
nal node and nodes are visited recursively along tree
branches. In this case, the second child node of an internal
node is visited only after the first node and its descendants
have been visited. Therefore, the tree is traversed in depth-
first fashion. On the other hand, the breadth-first algorithm
uses only one priority queue for the entire tree. On visiting
a node, its children are pushed into the single priority queue.
In this case, the search is not recursive. The closest node
among all of the nodes encountered so far is chosen for the
next visit.
Each of these algorithms has pros and cons. Neither
of them is completely superior to the other. With respect to
the number of nodes to be visited, the breadth-first algo-
rithm is proven to be optimal [9], that is, it visits the
minimum number of nodes. However, the priority queue
used by the breadth-first algorithm is more likely to be
longer than the one used by the depth-first algorithm. Since
the depth-first algorithm allocates the priority queue for
each visited internal node, the length of the priority queue
is never longer than the number of children of an internal
node. On the other hand, the priority queue used by the
breadth-first algorithm can be longer than the number of
children in an internal node; in the worst case, it is possible
that every node of the tree is pushed into the priority queue.
A longer priority queue might require more computational
cost. Therefore, with respect to the computational cost of
the priority queue, the depth-first algorithm is more advan-
Fig. 2. Structure of the SR-tree.
33