On fast Construction of SAH-based Bounding Volume Hierarchies
Ingo Wald
12
1
SCI Institute, University of Utah
2
Intel Corp, Santa Clara, CA
Figure 1: We present a method that enables fast, per-frame and from-scratch re-builds of a bounding volume hierarchy, thus completely removing
a BVH-based ray tracer’s reliance on updating or re-fitting. On a dual-2.6GHz Clovertown system (8 cores total), our method renders the
exploding dragon model (252K triangles) at around 13–21 frames per second (1024x1024 pixels) including animating the triangles, per-frame
rebuilds, shading, shadows, and display. The build itself takes less than 20ms, and is nearly agnostic to the distribution of the triangles; thus, the
variation in frame rate (21 fps for the initial, smooth frame, and 13 fps for the timestep corresponding to the four th image) is due only to varying
traversal cost, without any deterioration in BVH quality at all (i.e., when starting with the last frame, the frame rate actually increases).
ABSTRACT
With ray traversal performance reaching the point where real-time
ray tracing becomes practical, ray tracing research is now shifting
away from faster traversal, and towards the question what has to
be done to use it in truly interactive applications such as games.
Such applications are problematic because when geometry changes
every frame, the ray tracer’s internal index data structures are no
longer valid. Fully rebuilding all data structures every frame is the
most general approach to handling changing geometry, but was long
considered impractical except for grid-based grid based ray tracers,
trivial scenes, or reduced quality of the index structure. In this pa-
per, we investigate how some of the fast, approximate construction
techniques that have recently been proposed for kd-trees can also
be applied to bounding volume hierarchies (BVHs). We argue that
these work even better for BVHs than they do for kd-trees, and
demonstrate that when using those techniques, BVHs can be rebuilt
up to 10× faster than competing kd-tree based techniques.
1 INTRODUCTION
While ray tracing has always been the method of choice for most
offline rendering systems, it was long considered impractical for
interactive applications, like games. With ever more capable hard-
ware
1
and continuously improved algorithms, however, it is now
possible to trace severeal million rays per second on a desktop
PC. At this performance ray tracing slowly starts to get interesting
even for what eventually is the dominating application for graphics
technology—games. Games, however, are highly dynamic in na-
ture. Thus, the ray tracer has to either rebuild or update its internal
data structures every frame, and achieving interactive performance
requires to look at both traversal performance and rebuild/update
performance at the same time.
Though we refer the reader to a recent survey for a more com-
plete discussion on ray tracing animated scenes [20], the generally
accepted opinion today seems to be that among the three dominant
ray tracing data structures—grids, kd-trees, and BVHs—grids are
easiest to rebuild but less efficient to travers e, while kd-trees and
BVHs are more efficient, but harder to build (with the most effi-
1
A dual-3 GHz Clovertown PC already reaches 19 2 GFLOPs peak (2
CPUs× 4 cores × 4-wide SIMD/cycle× MADD× 3 GHz=192GFLOPs)
cient techniques requiring many seconds even for moderately com-
plex scenes [19]).
For bounding volume hierarchies, this has led to various ef-
forts to avoid full rebuilding by relying on refitting the BVH [11,
18]–potentially coupled with selective restructuring [22] or in-
frequent/asynchronous rebuilding [8, 11]. Fast re-building from
scratch has received far less attention. W
¨
achter et al [16] have pro-
posed a fast spatial-median build that was originally proposed for
their s-kd-tree like “bounding interval hierarchy” (BIH), but which
applies to traditional BVHs, too. W
¨
achter’s BIH-style build—once
applied to BVHs—allows for fast building of BVHs, but leads to
somewhat reduced traversal performance because it no longer em-
ploys a surface area heuristics (SAH) for determining how to best
build the hierarchy, and instead always splits at the spatial median.
For kd-trees, the set of techniques for fast tree construc-
tion is much richer, including hierarchical ray transformation
schemes [17], “fuzzy” kd-trees and motion decomposition [3], and,
in particular, fast “scan”-techniques that still employ a (somewhat
simplified) surface area heuristic, but achieve significantly high
build performance by slightly approximating a full SAH build. In
addition, Hunt et al. [6] have shown that when some coarse scene
graph information is available, these fast scan-based techniques can
be accelerated by up to another order of magnitude if scanning,
build-from-hierarchy, and lazy construction are combined.
For BVHs, such fast from-scratch build techniques have not yet
been investigated. This is surprising, since BVHs have some nice
properties that make them, in fact, more amenable to these tech-
niques than kd-trees: First, BVHs have fewer nodes than a kd-tree,
so fewer operations have to be performed; second, all BVH build al-
gorithms to date use only the triangle centroids, anyway, so there is
no need for handling cases where triangles “overlap” a split plane
(in a BVH, a triangle will always be on exactly one side); third,
since each triangle is referenced exactly once the total number of
nodes in a BVH is bounded by 2N −1 (where N is the number of
triangles), so the build can be performed f ully in place without any
additional management of node memory
2
.
While being able to expect that binning techniques to be at least
2
Note that in general this significantly over-estimates actual memory
consumption (at 4 triangles per leaf, it overestimates node memory by 4×),
but even then a BVH consumes less memory than a kd-tree [4].