Optimizing TLS for High-Bandwidth Applications
in FreeBSD
Randall Stewart
Netflix Inc.
100 Winchester Circle
Los Gatos, CA 95032
USA
Email: rrs@netflix.com
John-Mark Gurney
Consultant
Oakland, CA
USA
Email: jmg@freebsd.org
Scott Long
Netflix Inc.
100 Winchester Circle
Los Gatos, CA 95032
USA
Email: scottl@netflix.com
Abstract—Transport Layer Security (TLS) is becoming in-
creasingly desirable and necessary in the modern Internet.
Unfortunately, it also imposes heavy CPU penalties on applications,
for both the client and the server. In this paper, we examine the
server-side cost, in both CPU computation and data-movement
overhead, of enabling TLS
on Netflix’s OpenConnect Appliance (OCA [1]) network. We then
explore enhancements to FreeBSD to reduce the costs that TLS
adds when serving high volumes of video traffic. Finally, we
describe recent changes and future improvements to FreeBSD’s
OpenCrypto Framework that can be used to further improve
performance.
I. INTRODUCTION
Transport Layer Security [2] (TLS) is becoming an opera-
tional requirement in today’s unfriendly Internet. It provides
both encryption and authentication to any application that
enables it, but as with many improvements, it comes at a
high cost in additional CPU cycles. Until recently,
Netflix had not enabled TLS on its OpenConnect Appliances
(OCA).
An OCA is a FreeBSD-based appliance that serves movies
and television programming to Netflix subscribers. Confiden-
tial customer data like payment information, account authenti-
cation, and search queries are exchanged via an encrypted TLS
session between the client and the various application servers
that make up the Netflix infrastructure. The actual audio and
video content session is not encrypted. At first glance, this
might seem like a glaring oversight, but the audio and video
objects are already protected by Digital Rights Management
(DRM) that is pre-encoded into the objects before they are
distributed to the OCA network for serving. Adding
TLS encryption to these objects was previously not considered
a high-priority requirement.
Evolving market forces as well as the changing landscape of
the Internet [3] have caused us to re-evaluate our view on TLS.
The computational cost of TLS serving is high, so with this
in mind, Netflix launched a small pilot project to explore what
impact enabling TLS would have on its products. We also
started to examine recent innovations in FreeBSD for ways
that we might be able to reduce the costs of TLS.
Fig. 1. Classic Web Serving
II. THE IDEA
The Netflix OpenConnect Appliance is a server-class com-
puter based on a 64-bit Intel Xeon CPU and running FreeBSD
10.1 and Nginx 1.5. Each server is designed to hold between
10TB and 120TB of multimedia objects, and can accommodate
anywhere from 10,000 to 40,000 simultaneous long-lived TCP
sessions with customer client systems. The servers are also
designed to deliver between 10Gbps and 40Gbps of continuous
bandwidth utilization. Communication with the client is over
HTTP, making the system essentially a large
static-content web server.
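For reference, the traditional malloc(3), read(2), write(2) serving path that is described next (Fig. 1) can be sketched in C roughly as follows. This is a minimal illustration and not the code used on the OCA or in Nginx: the helper name serve_copy and its buf_size parameter are hypothetical, and out_fd stands in for the connected client socket.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the paper): copy everything readable
 * from in_fd (the object on disk) to out_fd (the client socket) through
 * a user-space buffer. Returns the number of bytes written, or -1 on error.
 */
static ssize_t serve_copy(int in_fd, int out_fd, size_t buf_size)
{
    ssize_t total = 0;
    char *buf = malloc(buf_size);        /* local buffer for object data */

    if (buf == NULL)
        return -1;

    for (;;) {
        /* Copy 1: kernel copies file data into the user buffer. */
        ssize_t n = read(in_fd, buf, buf_size);
        if (n == 0)
            break;                       /* end of object */
        if (n < 0) {
            if (errno == EINTR)
                continue;                /* interrupted, retry the read */
            total = -1;
            break;
        }

        /* Copy 2: kernel copies the user buffer into the socket buffer. */
        ssize_t off = 0;
        while (off < n) {
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;            /* interrupted, retry the write */
                total = -1;
                goto done;
            }
            off += w;                    /* handle short writes */
        }
        total += n;
    }

done:
    free(buf);
    return total;
}
```

Every byte served this way crosses the user/kernel boundary twice, once in read(2) and once in write(2), and both copies are performed by the CPU.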
A traditional web server will receive a client request for
an object stored on a local disk, allocate a local buffer for
the object data via the malloc(3) library call, then issue a
read(2) system call to retrieve and copy the contents of the
object into the buffer, and finally issue a write(2) system call
to copy the buffer contents into a socket buffer which is then
transmitted to the client. This process usually involves two
or more data copies handled directly by the CPU as well
as some associated consumption of CPU cache and memory
bandwidth. This simple data-flow model (see Fig. 1) works
well and is easily maintainable for low-bandwidth needs, but
is taxing on the CPU for high-bandwidth applications. Early
tests in the OCA development process showed that the server