delivery of notifications, and develop tools for a single-
threaded implementation. Section 3 discusses the issues
that arise in a distributed implementation.
At any point in an execution, the set of timestamps at
which future messages can occur is constrained by the
current set of unprocessed events (messages and notifi-
cation requests), and by the graph structure. Messages
in a timely dataflow system flow only along edges, and
their timestamps are modified by ingress, egress, and
feedback vertices. Since events cannot send messages
backwards in time, we can use this structure to compute
lower bounds on the timestamps of messages an event
can cause. By applying this computation to the set of
unprocessed events, we can identify the vertex notifica-
tions that may be correctly delivered.
Each event has a timestamp and a location (either a
vertex or edge), and we refer to these as a pointstamp:
Pointstamp : (t ∈ Timestamp, l ∈ Edge ∪ Vertex).
The SENDBY and NOTIFYAT methods generate new
events: for v.SENDBY(e, m, t) the pointstamp of m is
(t, e) and for v.NOTIFYAT(t) the pointstamp of the
notification is (t, v).
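To make these definitions concrete, the following Python sketch (our
rendering, not Naiad's implementation) models timestamps, the partial
order on them, and pointstamps, assuming the timestamp structure
described earlier in the paper: an epoch paired with one loop counter
per enclosing loop context.

from dataclasses import dataclass
from typing import Tuple, Union

@dataclass(frozen=True)
class Timestamp:
    epoch: int                      # input epoch
    counters: Tuple[int, ...] = ()  # one counter per enclosing loop

def ts_leq(a: Timestamp, b: Timestamp) -> bool:
    # Order on timestamps in the same loop context: epochs compare
    # numerically, loop counters lexicographically.
    return a.epoch <= b.epoch and a.counters <= b.counters

Location = Union[str, int]          # an Edge or Vertex identifier

@dataclass(frozen=True)
class Pointstamp:
    t: Timestamp                    # the event's timestamp
    location: Location              # the edge or vertex it occupies

# v.SENDBY(e, m, t) yields an event with pointstamp Pointstamp(t, e);
# v.NOTIFYAT(t) yields one with pointstamp Pointstamp(t, v).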
The structural constraints on timely dataflow graphs
induce an order on pointstamps. We say a pointstamp
(t₁, l₁) could-result-in (t₂, l₂) if and only if there exists
a path ψ = ⟨l₁, ..., l₂⟩ in the dataflow graph such that
the timestamp ψ(t₁) that results from adjusting t₁ ac-
cording to each ingress, egress, or feedback vertex oc-
curring on that path satisfies ψ(t₁) ≤ t₂. Each path can
be summarized by the loop coordinates that its vertices
remove, add, and increment; the resulting path summary
between l₁ and l₂ is a function that transforms a time-
stamp at l₁ to a timestamp at l₂. The structure of timely
dataflow graphs ensures that, for any locations l₁ and l₂
connected by two paths with different summaries, one of
the path summaries always yields adjusted timestamps
earlier than the other. For each pair l₁ and l₂, we find the
minimal path summary over all paths from l₁ to l₂ us-
ing a straightforward graph propagation algorithm, and
record it as Ψ[l₁, l₂]. To efficiently evaluate the could-
result-in relation for two pointstamps (t₁, l₁) and (t₂, l₂),
we test whether Ψ[l₁, l₂](t₁) ≤ t₂.
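The sketch below continues the Python rendering: the three adjustment
functions follow the ingress, egress, and feedback behavior just
described, and Ψ is modeled as a lookup table mapping a location pair
to a timestamp transformer. Representing summaries as functions is our
simplification, and building the table by graph propagation is elided.

from typing import Callable, Dict, Tuple

def ingress(t: Timestamp) -> Timestamp:
    return Timestamp(t.epoch, t.counters + (0,))         # enter a loop

def egress(t: Timestamp) -> Timestamp:
    return Timestamp(t.epoch, t.counters[:-1])           # leave a loop

def feedback(t: Timestamp) -> Timestamp:
    c = t.counters
    return Timestamp(t.epoch, c[:-1] + (c[-1] + 1,))     # next iteration

# Psi[(l1, l2)] holds the minimal path summary from l1 to l2;
# Psi[(l, l)] is the identity function for every location l.
Psi: Dict[Tuple[Location, Location], Callable[[Timestamp], Timestamp]] = {}

def could_result_in(p: Pointstamp, q: Pointstamp) -> bool:
    # (t1, l1) could-result-in (t2, l2) iff Psi[l1, l2](t1) <= t2.
    key = (p.location, q.location)
    if key not in Psi:
        return False                 # no path between the locations
    return ts_leq(Psi[key](p.t), q.t)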
We now consider how a single-threaded scheduler de-
livers events in a timely dataflow implementation. The
scheduler maintains a set of active pointstamps, which
are those that correspond to at least one unprocessed
event. For each active pointstamp the scheduler main-
tains two counts: an occurrence count of how many
outstanding events bear the pointstamp, and a precur-
sor count of how many active pointstamps precede it in
the could-result-in order. As vertices generate and retire
events, the occurrence counts are updated as follows:
Operation           Update
v.SENDBY(e, m, t)   OC[(t, e)] ← OC[(t, e)] + 1
v.ONRECV(e, m, t)   OC[(t, e)] ← OC[(t, e)] − 1
v.NOTIFYAT(t)       OC[(t, v)] ← OC[(t, v)] + 1
v.ONNOTIFY(t)       OC[(t, v)] ← OC[(t, v)] − 1
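In code, the table amounts to four adjustments of a counter keyed by
pointstamp. The helper names below are ours; the comments record when
each update is applied, as the next paragraph specifies.

from collections import Counter

OC = Counter()                       # occurrence count per pointstamp

def record_sendby(e, t):             # at the start of v.SENDBY(e, m, t)
    OC[Pointstamp(t, e)] += 1

def record_recv_complete(e, t):      # as v.ONRECV(e, m, t) completes
    OC[Pointstamp(t, e)] -= 1

def record_notifyat(v, t):           # at the start of v.NOTIFYAT(t)
    OC[Pointstamp(t, v)] += 1

def record_notify_complete(v, t):    # as v.ONNOTIFY(t) completes
    OC[Pointstamp(t, v)] -= 1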
The scheduler applies updates at the start of calls to
SENDBY and NOTIFYAT, and as calls to ONRECV and
ONNOTIFY complete. When a pointstamp p becomes
active, the scheduler initializes its precursor count to the
number of existing active pointstamps that could-result-
in p. At the same time, the scheduler increments the
precursor count of any pointstamp that p could-result-
in. A pointstamp p leaves the active set when its occur-
rence count drops to zero, at which point the scheduler
decrements the precursor count for any pointstamp that
p could-result-in. When an active pointstamp p’s pre-
cursor count is zero, there is no other pointstamp in the
active set that could-result-in p, and we say that p is in
the frontier of active pointstamps. The scheduler may
deliver any notification in the frontier.
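Gathering this bookkeeping into one place, the sketch below is our
reconstruction of the described scheduler: it maintains occurrence and
precursor counts over the active set and exposes the frontier. For
clarity it rescans the active set on every change; Naiad instead makes
each could-result-in test a single precomputed Ψ lookup.

from collections import defaultdict

class Scheduler:
    def __init__(self, cri=could_result_in):
        self.cri = cri                # could-result-in test on pointstamps
        self.occ = defaultdict(int)   # occurrence counts
        self.prec = {}                # precursor counts; keys = active set

    def add_event(self, p: Pointstamp):
        # Applied at the start of SENDBY / NOTIFYAT.
        if self.occ[p] == 0:
            # p becomes active: count active pointstamps preceding p,
            # and bump the count of everything p could-result-in.
            count = sum(1 for q in self.prec if self.cri(q, p))
            for q in self.prec:
                if self.cri(p, q):
                    self.prec[q] += 1
            self.prec[p] = count
        self.occ[p] += 1

    def retire_event(self, p: Pointstamp):
        # Applied as ONRECV / ONNOTIFY completes.
        self.occ[p] -= 1
        if self.occ[p] == 0:
            # p leaves the active set; release its successors.
            del self.occ[p]
            del self.prec[p]
            for q in self.prec:
                if self.cri(p, q):
                    self.prec[q] -= 1

    def frontier(self):
        # Active pointstamps that no other active pointstamp
        # could-result-in; any notification here may be delivered.
        return [p for p, n in self.prec.items() if n == 0]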
When a computation begins, the system initializes an
active pointstamp at the location of each input vertex,
timestamped with the first epoch, with an occurrence
count of one and a precursor count of zero. When an
epoch e is marked complete, the input vertex adds a new
active pointstamp for epoch e + 1, then removes the
pointstamp for e, permitting downstream notifications
to be delivered for epoch e. When the input vertex is
closed, it removes any active pointstamps at its location,
allowing all events downstream of the input to eventu-
ally drain from the computation.
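In the same sketch, the input-vertex lifecycle becomes the three
helpers below. Numbering epochs from 0 is our assumption; the essential
point is the ordering in close_epoch, which adds the pointstamp for
e + 1 before removing the one for e, so that later epochs stay
represented in the active set.

def open_input(sched: Scheduler, input_vertex: Location):
    # Computation start: one active pointstamp at the input's location,
    # timestamped with the first epoch (occurrence count one; precursor
    # count zero, since nothing precedes an input).
    sched.add_event(Pointstamp(Timestamp(epoch=0), input_vertex))

def close_epoch(sched: Scheduler, input_vertex: Location, e: int):
    # Marking epoch e complete: add e + 1 first, then remove e, which
    # permits downstream notifications for epoch e to be delivered.
    sched.add_event(Pointstamp(Timestamp(epoch=e + 1), input_vertex))
    sched.retire_event(Pointstamp(Timestamp(epoch=e), input_vertex))

def close_input(sched: Scheduler, input_vertex: Location, e: int):
    # Closing the input: remove its remaining pointstamp, letting all
    # downstream events eventually drain from the computation.
    sched.retire_event(Pointstamp(Timestamp(epoch=e), input_vertex))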
2.4 Discussion
Although the timestamps in timely dataflow are
more complicated than traditional integer-valued time-
stamps [22, 38], the vertex programming model supports
many advanced use cases that motivate other systems.
The requirement that a vertex explicitly request notifi-
cations (rather than passively receive notifications for all
times) allows a programmer to make performance trade-
offs by choosing when to use coordination. For exam-
ple, the monotonic aggregation operators in Bloom^L [13]
may continually revise their output without coordina-
tion; in Naiad a vertex can achieve this by sending out-
puts from ONRECV. Such an implementation can im-
prove performance inside a loop by allowing fast un-
coordinated iteration, at the possible expense of send-
ing multiple messages before the output reaches its final
value. On the other hand an implementation that sends
only once, in ONNOTIFY, may be more useful at the
boundary of a sub-computation that will be composed
with other processing, since the guarantee that only a
single value will be produced simplifies the downstream