b1
b2
b3
b4
b5
b6
b8
b7
reachable labels: L2
solve attempts: S2
reachable labels: L1
solve attempts: S1
Fig. 3: An example showing how to estimate the bug-detecting
potential of a seed. In this example, the seed follows the path
b1->b2->b3->b4. Basic block b5 and b7 are unexplored
and they can reach L
1
and L
2
UBSan labels, respectively. They
have been attempted by constraint solving for S
1
and S
2
times.
The final score for this seed is
e
−0.05S
1
×L
1
+e
−0.05S
2
×L
2
2
.
metric to quantify the amount of vulnerabilities in a chunk of
code. SAVIOR fulfills them as follows.
To meet R1, SAVIOR approximates the newly explorable
code regions based on a combination of static and dynamic
analysis. During compilation, SAVIOR statically computes
the set of reachable basic blocks from each branch. At run-
time, SAVIOR identifies the unexplored branches on the
execution path of a seed and calculates the basic blocks that
are reachable from those branches. We deem that these blocks
become explorable code regions once the concolic executor
runs that seed.
To meet R2, SAVIOR utilizes UBSan [21] to annotate three
types of potential bugs (as shown in Table I) in the program
under testing. It then calculates the UBSan labels in each
code region as the quantitative metric for R2. As UBSan’s
conservative instrumentation may generate dummy labels,
SAVIOR incorporates a static filter to safely remove useless
labels. We discuss the details of this method in Section III-B1.
The above two solutions together ensure a sound analysis
for identifying potential bugs. First, our static reachability
analysis, as described in Section III-B1, is built upon a
sound algorithm. It over-approximates all the code regions that
may be reached from a branch. Moreover, UBSan adopts a
conservative design, which counts all the operations that may
lead to the undefined behavior issues listed in Table I [21, 35].
Facilitated by the two aspects of soundness, we can avoid
mistakenly underrating the bug-detecting potential of a seed.
Following the two solutions, SAVIOR computes the impor-
tance score for each seed as follows. Given a seed with n unex-
plored branches {e
1
, e
2
, . . . , e
n
}, SAVIOR calculates the UB-
San labels in the code that are reachable from these branches,
respectively denoted as {L
1
, L
2
, . . . , L
n
}. Also note that, in
the course of testing, SAVIOR has made {S
1
, S
2
, . . . , S
n
}
attempts to solve those branches. With these pieces of infor-
mation, SAVIOR evaluates the importance score of this seed
with a weighted average
1
n
×
P
n
i=1
e
−0.05S
i
×L
i
. L
i
represents
the potential of the i
th
unexplored branch. We penalize L
i
with
e
−0.05S
i
to monotonically decrease its weight as the attempts
to solve this branch grow. The rationale is that more failed
\xfb\xfb\xf4\xf1
\xxx\xxx\xxx\xxx
\xfb\xf4\xf1\xf1
section->size
Overflow Condition:
section->size + 1 < section->size
\xfb\xfb\xf4\xf1
\xff \xff \xff \xff
\xfb\xf4\xf1\xf1
section->size
solve
section->size+1 > 0xffffffff
Fig. 4: Solving the integer overflow in Figure 2. This shows
the case in a 32-bit system, but it applies to 64-bit as well.
attempts (usually from multiple paths) indicate a low success
possibility on resolving the branch. Hence, we decrease its
potential so that SAVIOR can gradually de-prioritize hard-
to-solve branches. Lastly, SAVIOR takes the average score
of each candidate branches in order to maximize the bug
detection gain per unit of time. To better understand this
scoring method, we show an example and explain the score
calculation in Figure 3.
This scoring method is to ensure that SAVIOR always
prioritizes seeds leading to more unverified bugs, while in
the long run it would not trap into those with hard-to-solve
branch conditions. First, it conservatively assesses a given
seed by the results of sound reachability and bug labeling
analysis. A seed which leads to more unexplored branches
where more unverified bugs can be reached from will earn a
higher score. Second, it takes into account runtime information
to continuously improve the precision of the assessment. This
online refinement is important because statically SAVIOR
may hardly know whether a branch condition is satisfiable
or not. Utilizing the history of constraint solving attempts,
SAVIOR can decide whether a seemingly high-score branch
is worth more resources in the future. As shown by our evalua-
tion in Section V, this scoring scheme significantly accelerates
the detection of UBSan violations, which empirically supports
the effectiveness of our design.
Referring to our motivating example in Figure 1, the
function packet_handler1 has few UBSan labels while
pcap_handler2 contains hundreds of labels. Hence, the
seed following Figure 1b has a lower score compared to the
other seed which runs the path in Figure 1c. This guides
SAVIOR to prioritize the latter seed, which can significantly
expedite the exploration of vulnerable code.
Bug-guided verification: This technique also ensures a sound
vulnerability detection on the explored paths that reach the
vulnerable sites. Given a seed from fuzz testing, SAVIOR
executes it and extracts the label of each vulnerability along
the execution path. After that, SAVIOR verifies the predicates
implanted in each label by checking the satisfiability under the
current path condition — if the predicate is satisfiable then its
corresponding vulnerability is valid. This enables SAVIOR
to generate a proof of either vulnerability or non-existence
along a specific program path. Note that in concolic execution,
many new states with new branch constraints will be created.