libtorque: Portable Multithreaded Continuations
for Scalable Event-Driven Programs
Nick Black and Richard Vuduc
Georgia Institute of Technology
nickblack@linux.com, richie@cc.gatech.edu
Abstract
Since before even 4.4BSD or POSIX.1
1
, programming via events and continuations has marked best
UNIX practice for handling multiplexed I/O
2
. The idiom’s programmability was proved sapient astride a
decade’s architectures and operating systems, as libevent [19], libev, Java NIO and others achieved ubiquity
among network applications. Academic [32] and proprietary systems have responded to multiproces-
sor economics with threaded callback engines, but such I/O cores remain rare in open source: Firefox uses
multiple functionally-decomposed event loops, nginx multiple processes with static load distribution, and
Apache’s mpm-worker variant a thread per connection. Economic employment of even COTS (Consumer
Off-The-Shelf) hardware already requires concurrent workloads, and manycore’s march into the data cen-
ter still more: dynamic parallelism as rule rather than exception. The community agrees that UNIX net-
work programming must change [6] [13], but consensus of direction remains elusive [14] [29]. We present
our open source, portable libtorque library, justify the principles from which it was derived, extend previous
threaded cores through aggressively exploiting details of memories, processors, and their interconnections
(as detected at runtime), and imply a new state-of-the-art in architecturally-adaptive, high-performance
systems programming. Built with scalability (in both the large and small), low latency, and faithfulness to
UNIX idiom as guiding lights, libtorque subsumes the functionality of existing I/O frameworks (for which
we provide compatibility wrappers) despite superior performance across most loads and apparatus.
1 Intro
It’s a rare and rather lucky program which spends
most of its wall time calculating. Whether an inter-
active application, a desktop widget, or a network
server, relative eternities are made up of waiting for,
shuffling, and signaling the presence of data. Spin-
ning on event readiness implies ineffective use of
processor and power, motivating event registration
and data readiness notification schemes (of which
blocking I/O—with or without a timeout—can be
thought the uniplex special case. Each thread has
a single channel, bound to a single source). This
concept forms the essential omphalos of POSIX’s sys-
tem interfaces: Pthread condition variables, asyn-
chronous I/O, and humble read(2) can all be inter-
preted as some mapping between threads and event
sources, with the goal always of sleeping as much as
local service requirements allow. Every non-trivial
program will use them at least once.
Given our definition of blocking I/O, spinless
employ of multiple event sources requires either
multiple threads or multiplexed, non-blocking (pos-
sibly asynchronous, i.e. kernel-demultiplexed) I/O.
The former solution, at the cost of O(n) threads and
O(n) context switches for n event sources, retains
the simplicity and streamline (thanks to the absence
of any multiplexing interfaces) of blocking. With
the advent of Linux’s NPTL and FreeBSD’s libthr,
this method has seen a resurgence, and even argu-
ment a posteriori that its performance might exceed
that of multiplexed I/O [31]. That this is possible—
that O(n) time and space costs are negated by multi-
plexing overhead, as evidenced in [28]—shows how
much room for improving non-blocking I/O solu-
tions exists on modern machines. We enumerate
and address five possible overheads: memory ef-
fects, multiplex setup (“enplexing”?), synchroniza-
tion within the system call, walking of the event ta-
ble, and copying the results to userspace. We il-
lustrate several ways I/O-intensive applications can
(and libtorque does) make effective use of system ar-
chitecture properties. We show that a tradeoff ex-
ists between dynamic balance under arbitrary load
1
select(2) first showed up in 4.2BSD, poll(2) in SVR4.
2
As canonicalized within the books of W. Richard Stevens [24].
1