Category         Trigger API                  Extended function calls
Android RPC      Context.startService()       u.onCreate(), ∀u extends Service
                 Context.startActivity()      u.onCreate(), ∀u extends Activity
                 Context.sendBroadcast()      u.onReceive(), ∀u extends BroadcastReceiver
                 AlarmManager.setRepeating()  all three of the above
                 ... and 4 more
GUI Callbacks    setOnClickListener()         u.onClick(), ∀u extends OnClickListener
                 ... and 180 more [10]
Multi-threading  Thread.start()               u.run(), ∀u extends Thread
                 AsyncTask.execute()          u.doInBackground(), ∀u extends AsyncTask
                 ... and 14 more
Table I: Trigger APIs and extended function calls.
Service class.
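The mapping in Table I can be read as a rule for adding call edges. The
following is a minimal sketch, not AppAudit's implementation: it assumes
the call graph is a dictionary from caller to callees and that the class
hierarchy is given as a dictionary from base class to application
subclasses. The names TRIGGERS, extend_call_graph and all class names in
the usage example are hypothetical, and only a few rows of Table I are
encoded. For simplicity the extra edges start at the caller of the
trigger API and no dummy functions are modeled.

```python
from collections import defaultdict

# A few rows of Table I: trigger API -> list of (base class, callback).
TRIGGERS = {
    "Context.startService": [("Service", "onCreate")],
    "Context.startActivity": [("Activity", "onCreate")],
    "Context.sendBroadcast": [("BroadcastReceiver", "onReceive")],
    "Thread.start": [("Thread", "run")],
}
# AlarmManager.setRepeating() may trigger all three RPC callbacks.
TRIGGERS["AlarmManager.setRepeating"] = (
    TRIGGERS["Context.startService"]
    + TRIGGERS["Context.startActivity"]
    + TRIGGERS["Context.sendBroadcast"]
)


def extend_call_graph(call_graph, subclasses):
    """Return a copy of call_graph with over-approximated callback edges.

    call_graph: dict mapping a caller to the set of functions it calls.
    subclasses: dict mapping a base class to the app classes extending it.
    Every caller of a trigger API gains an edge to the callback of every
    application class that extends the corresponding base class.
    """
    extended = defaultdict(set)
    for caller, callees in call_graph.items():
        extended[caller] |= set(callees)
        for callee in callees:
            for base, method in TRIGGERS.get(callee, []):
                for cls in subclasses.get(base, ()):
                    extended[caller].add(f"{cls}.{method}")
    return dict(extended)


# Usage with a hypothetical two-service application.
graph = {"MainActivity.onClick": {"Context.startService"}}
hierarchy = {"Service": ["UploadService", "SyncService"]}
print(sorted(extend_call_graph(graph, hierarchy)["MainActivity.onClick"]))
```

Because the rule connects a trigger call to every subclass rather than
to the one actually started, the result is an over-approximation.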
B. API Usage Analysis
Checking whether a given function is suspicious is equivalent
to finding a path from the function to a source API and a
path from the function to a sink API. We first build a standard
call graph from the program bytecode and then extend it with
dummy functions and extra calling relationships according to
the above-mentioned cases. To speed up construction, we omit
Android library functions except for source, sink and trigger
APIs, so that the analysis focuses on application functions
and avoids the Android runtime library. After the extended
call graph is constructed, we perform a breadth-first search
to mark all suspicious functions. For example, with the
extended call graph in Figure 3, the static API analysis
reveals four suspicious functions (BR1, f1, f7 and f3), while
a conventional call graph reveals only f3.
Overall, the extended call graph is an over-approximated
call graph that contains calling relationships that will not
occur in real execution. Consequently, our static API analysis
may mark “good” functions as suspicious in exchange for
analysis performance. While previous work [9] employs more
complicated analyses to obtain a better heuristic at the cost
of performance, AppAudit takes the opposite direction and
relies on dynamic analysis to prune false positives.
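The marking step can be sketched as two backward breadth-first searches,
one from the source APIs and one from the sink APIs, followed by an
intersection. This is one plausible way to organize the search, not
necessarily the one AppAudit uses. The graph below is a hypothetical
example (it is not Figure 3), and the function names reaches and
suspicious_functions are assumptions made for illustration.

```python
from collections import defaultdict, deque


def reaches(call_graph, targets):
    """Return all nodes with a call path to some target (targets included)."""
    # Reverse the edges so the search can walk from callee to caller.
    reverse = defaultdict(set)
    for caller, callees in call_graph.items():
        for callee in callees:
            reverse[callee].add(caller)
    seen = set(targets)
    queue = deque(targets)
    while queue:
        node = queue.popleft()
        for caller in reverse[node]:
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return seen


def suspicious_functions(call_graph, sources, sinks):
    """Functions that can reach both a source API and a sink API."""
    api_nodes = set(sources) | set(sinks)
    both = reaches(call_graph, sources) & reaches(call_graph, sinks)
    return both - api_nodes


# Hypothetical extended call graph: f1 reaches the source only through
# the extra edge to W.run that a Thread.start() call would add.
graph = {
    "f1": {"f2", "Thread.start", "W.run"},
    "W.run": {"getDeviceId"},
    "f2": {"sendTextMessage"},
    "f3": {"getDeviceId", "sendTextMessage"},
    "f4": {"log"},
}
print(sorted(suspicious_functions(graph, {"getDeviceId"}, {"sendTextMessage"})))
```

Without the extra f1 to W.run edge, only f3 would be marked, which
mirrors the difference between the conventional and the extended call
graph described above.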
IV. APPROXIMATED EXECUTION
The static API analysis is over-approximating, which
could result in false positives. We use a dynamic analysis to
confirm actual data leaks and prune false positives.
The approximated executor is a dynamic analysis that
executes the bytecode instructions of a suspicious function
and reports whether sensitive data could be leaked during
the execution. Its execution context consists of a typical
register set, a program counter (pc) and a call stack. It
relies on a novel object model to represent application
memory objects. The executor has three working modes, as
shown in Figure 4. It starts in “execution (exec)” mode, where
it interprets bytecode and performs the corresponding
operations. Source APIs can generate sensitive data objects,
which we mark as “tainted”. Tainted objects propagate with
the execution and taint any object that is derived from them.
Whenever the executor encounters a sink API, it switches to
“check” mode to check the parameters passed to the sink API.
If tainted objects are found, the executor reports the leak
and terminates (“end” final state). Otherwise, it reverts to
the normal execution mode. When a bytecode instruction cannot
be executed due to unknown operands (e.g., a conditional jump
instruction with an unknown condition), the executor switches
to “approximation (approx)” mode and applies approximations to
continue the execution. If the approximations fail, commonly
due to too many unknowns or insufficient execution context,
the executor terminates the execution of the current function
and starts executing one of its caller functions (“leap” final
state). The caller function is expected to provide a more
concrete execution context for analyzing the incomplete
execution.
Figure 4: AppAudit approximated executor state machine.
(States: exec, check, approx, leap, end. Transition labels:
calling sink functions, unknown branching, tainted data leaked,
no taint, insufficient context, resume, known.)
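The mode switching described above can be illustrated with a toy
interpreter. This is a sketch under strong assumptions, not AppAudit's
executor: the four-instruction language, the register-level taint set,
the fall-through rule used as the "approximation" and the approx_budget
threshold are all hypothetical placeholders. A result of None means the
toy program ran to completion without a reported leak, a case the text
above does not name.

```python
from enum import Enum


class Mode(Enum):
    EXEC = "exec"
    CHECK = "check"
    APPROX = "approx"
    LEAP = "leap"
    END = "end"


UNKNOWN = object()  # stands for a value missing from the calling context


def execute(program, params=None, approx_budget=2, max_steps=1000):
    """Run a toy program; return (final mode or None, list of visited modes)."""
    regs = dict(params or {})  # registers not listed here are unknown
    tainted = set()            # names of registers holding tainted data
    trace = [Mode.EXEC]
    approximations = 0
    pc = 0
    for _ in range(max_steps):  # guard against non-terminating loops
        if pc >= len(program):
            break
        op, *args = program[pc]
        if op == "source":      # a source API writes tainted data to dst
            (dst,) = args
            regs[dst] = "<sensitive>"
            tainted.add(dst)
        elif op == "move":      # taint follows the copied value
            dst, src = args
            regs[dst] = regs.get(src, UNKNOWN)
            if src in tainted:
                tainted.add(dst)
            else:
                tainted.discard(dst)
        elif op == "sink":      # check the argument passed to a sink API
            (src,) = args
            trace.append(Mode.CHECK)
            if src in tainted:
                trace.append(Mode.END)
                return Mode.END, trace
            trace.append(Mode.EXEC)
        elif op == "branch_if":
            cond, target = args
            value = regs.get(cond, UNKNOWN)
            if value is UNKNOWN:
                trace.append(Mode.APPROX)
                approximations += 1
                if approximations > approx_budget:
                    trace.append(Mode.LEAP)  # too many unknowns: give up
                    return Mode.LEAP, trace
                trace.append(Mode.EXEC)      # placeholder: fall through
            elif value:
                pc = target
                continue
        pc += 1
    return None, trace


leaky = [("source", "r0"), ("move", "r1", "r0"), ("sink", "r1")]
final, trace = execute(leaky)
print(final, [m.value for m in trace])
```

In the sketch a "leap" simply returns; a fuller version would restart
execution in a caller of the current function, as the text describes.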
A. Object and Taint Representation
The executor starts at the function entry without its
calling context (the values of parameters and
global variables). We design an object model to represent