Thrift: Scalable Cross-Language Services Implementation
Mark Slee, Aditya Agarwal and Marc Kwiatkowski
Facebook, 156 University Ave, Palo Alto, CA
{mcslee,aditya,marc}@facebook.com
Abstract
Thrift is a software library and set of code-generation tools devel-
oped at Facebook to expedite development and implementation of
efficient and scalable backend services. Its primary goal is to en-
able efficient and reliable communication across programming lan-
guages by abstracting the portions of each language that tend to
require the most customization into a common library that is imple-
mented in each language. Specifically, Thrift allows developers to
define datatypes and service interfaces in a single language-neutral
file and generate all the necessary code to build RPC clients and
servers.
This paper details the motivations and design choices we made
in Thrift, as well as some of the more interesting implementation
details. It is not intended as research, but rather as an exposition of
what we did and why.
1. Introduction
As Facebook’s traffic and network structure have scaled, the resource
demands of many operations on the site (e.g., search, ad selection
and delivery, event logging) have presented technical requirements
drastically outside the scope of the LAMP framework.
In our implementation of these services, various programming lan-
guages have been selected to optimize for the right combination of
performance, ease and speed of development, availability of exist-
ing libraries, etc. By and large, Facebook’s engineering culture has
tended towards choosing the best tools and implementations avail-
able over standardizing on any one programming language and be-
grudgingly accepting its inherent limitations.
Given this design choice, we were presented with the challenge
of building a transparent, high-performance bridge across many
programming languages. We found that most available solutions
were either too limited, did not offer sufficient datatype freedom,
or suffered from subpar performance.¹
The solution that we have implemented combines a language-
neutral software stack implemented across numerous programming
languages and an associated code generation engine that trans-
forms a simple interface and data definition language into client
and server remote procedure call libraries. Choosing static code
generation over a dynamic system allows us to create validated
code that can be run without the need for any advanced introspective
run-time type checking. Thrift is also designed to be as simple as
possible for the developer, who can typically define all the neces-
sary data structures and interfaces for a complex service in a single
short file.
Surprised that a robust open solution to these relatively common
problems did not yet exist, we committed early on to making the
Thrift implementation open source.
¹ See Appendix A for a discussion of alternative systems.
In evaluating the challenges of cross-language interaction in a net-
worked environment, some key components were identified:
Types. A common type system must exist across programming lan-
guages without requiring that the application developer use custom
Thrift datatypes or write their own serialization code. That is, a C++
programmer should be able to transparently exchange a strongly
typed STL map for a dynamic Python dictionary. Neither program-
mer should be forced to write any code below the application layer
to achieve this. Section 2 details the Thrift type system.
Transport. Each language must have a common interface to bidirec-
tional raw data transport. The specifics of how a given transport is
implemented should not matter to the service developer. The same
application code should be able to run against TCP stream sockets,
raw data in memory, or files on disk. Section 3 details the Thrift
Transport layer.
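As a rough illustration of this requirement, the following Python sketch defines a minimal transport interface with an in-memory and a file-backed implementation. The names Transport, MemoryTransport, FileTransport and round_trip are hypothetical and chosen for this example only; they are not the Thrift API, which Section 3 describes.

```python
import os
from abc import ABC, abstractmethod


class Transport(ABC):
    """Hypothetical minimal transport interface: raw bytes in, raw bytes out."""

    @abstractmethod
    def write(self, data: bytes) -> None: ...

    @abstractmethod
    def read(self, n: int) -> bytes: ...


class MemoryTransport(Transport):
    """Keeps written bytes in memory; reads consume them from the front."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def write(self, data: bytes) -> None:
        self._buf += data

    def read(self, n: int) -> bytes:
        out = bytes(self._buf[:n])
        del self._buf[:n]
        return out


class FileTransport(Transport):
    """Backs the same interface with a file on disk."""

    def __init__(self, path: str) -> None:
        self._file = open(path, "w+b")
        self._read_pos = 0  # reads resume where the previous read stopped

    def write(self, data: bytes) -> None:
        self._file.seek(0, os.SEEK_END)  # always append
        self._file.write(data)

    def read(self, n: int) -> bytes:
        self._file.seek(self._read_pos)
        out = self._file.read(n)
        self._read_pos += len(out)
        return out

    def close(self) -> None:
        self._file.close()


def round_trip(transport: Transport, message: bytes) -> bytes:
    """Application code: it never asks which transport it was handed."""
    transport.write(message)
    return transport.read(len(message))
```

The point of the sketch is that round_trip depends only on the abstract interface, so swapping memory for disk (or, by extension, a socket) requires no change to application code.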
Protocol. Datatypes must have some way of using the Transport
layer to encode and decode themselves. Again, the application
developer need not be concerned by this layer. Whether the service
uses an XML or binary protocol is immaterial to the application
code. All that matters is that the data can be read and written
in a consistent, deterministic manner. Section 4 details the Thrift
Protocol layer.
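The separation can be sketched in Python as two interchangeable encoders sitting on top of the same byte stream. Everything here is an assumption made for illustration: the class and method names are hypothetical, the wire formats are invented, and a line-oriented JSON encoding stands in for XML to keep the example short.

```python
import io
import json
import struct


class BinaryProtocol:
    """Hypothetical binary encoding: big-endian i32, length-prefixed UTF-8 string."""

    def __init__(self, transport: io.BytesIO) -> None:
        self.trans = transport

    def write_i32(self, value: int) -> None:
        self.trans.write(struct.pack(">i", value))

    def read_i32(self) -> int:
        return struct.unpack(">i", self.trans.read(4))[0]

    def write_string(self, value: str) -> None:
        raw = value.encode("utf-8")
        self.write_i32(len(raw))
        self.trans.write(raw)

    def read_string(self) -> str:
        return self.trans.read(self.read_i32()).decode("utf-8")


class TextProtocol:
    """Hypothetical text encoding: one JSON value per line (stand-in for XML)."""

    def __init__(self, transport: io.BytesIO) -> None:
        self.trans = transport

    def _write(self, value) -> None:
        self.trans.write(json.dumps(value).encode("utf-8") + b"\n")

    def _read(self):
        return json.loads(self.trans.readline())

    # Both value kinds share one encoding in this toy text format.
    write_i32 = write_string = _write
    read_i32 = read_string = _read


def save_user(proto, uid: int, name: str) -> None:
    """Application code: identical for either protocol."""
    proto.write_i32(uid)
    proto.write_string(name)


def load_user(proto):
    return proto.read_i32(), proto.read_string()
```

Only the reads and writes must occur in the same order on both sides; which protocol class is passed to save_user and load_user is invisible to them.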
Versioning. For robust services, the involved datatypes must pro-
vide a mechanism for versioning themselves. Specifically, it should
be possible to add or remove fields in an object or alter the argu-
ment list of a function without any interruption in service (or, worse
yet, nasty segmentation faults). Section 5 details Thrift’s versioning
system.
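One way to meet this requirement, sketched below in Python, is to tag every encoded field with a numeric identifier so that a reader can skip fields it does not recognize and fall back to defaults for fields that are absent. The wire layout (16-bit id, 32-bit length, payload, zero id as a stop marker) and the function names are assumptions made for this example, not a description of Thrift's actual encoding; Section 5 covers the real mechanism.

```python
import io
import struct


def write_fields(out: io.BytesIO, fields: dict) -> None:
    """Write each field as (id: i16, length: i32, payload), then id 0 as a stop marker."""
    for field_id, payload in fields.items():
        out.write(struct.pack(">hi", field_id, len(payload)))
        out.write(payload)
    out.write(struct.pack(">h", 0))


def read_fields(inp: io.BytesIO, known: dict) -> dict:
    """Read fields, keeping known ids and skipping unknown ones.

    `known` maps field id -> (name, default_bytes) for the reader's schema.
    """
    # Start from defaults so fields missing from the stream are still present.
    result = {name: default for name, default in known.values()}
    while True:
        (field_id,) = struct.unpack(">h", inp.read(2))
        if field_id == 0:
            return result
        (length,) = struct.unpack(">i", inp.read(4))
        payload = inp.read(length)
        if field_id in known:
            result[known[field_id][0]] = payload
        # Unknown id: the payload was already consumed, so the stream stays aligned.
```

With this layout, a reader built against an older definition simply ignores a newly added field, and a reader built against a newer definition sees its default when talking to an older writer; neither case interrupts service.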
Processors. Finally, we generate code capable of processing data
streams to accomplish remote procedure calls. Section 6 details the
generated code and TProcessor paradigm.
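The idea of a processor can be sketched in Python as an object that reads one encoded call from an input stream, dispatches it to a user-written handler, and writes the reply to an output stream. The Processor and CalculatorHandler classes and the JSON-per-line wire format below are hypothetical simplifications for illustration; they are not the generated code or the TProcessor interface described in Section 6.

```python
import io
import json


class Processor:
    """Hypothetical processor: decode a call from `inp`, invoke the handler,
    and encode the reply to `out`."""

    def __init__(self, handler) -> None:
        self._handler = handler

    def process(self, inp: io.BytesIO, out: io.BytesIO) -> bool:
        line = inp.readline()
        if not line:
            return False  # no more calls on this stream
        call = json.loads(line)
        name = call["method"]
        # Refuse private names so only the handler's public methods are callable.
        method = None if name.startswith("_") else getattr(self._handler, name, None)
        if method is None:
            reply = {"error": "unknown method " + name}
        else:
            reply = {"result": method(*call["args"])}
        out.write(json.dumps(reply).encode("utf-8") + b"\n")
        return True


class CalculatorHandler:
    """Application logic: knows nothing about streams or encodings."""

    def add(self, a: int, b: int) -> int:
        return a + b
```

A server loop would then amount to calling process repeatedly on each connection's input and output streams until it returns False.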
Section 7 discusses implementation details, and Section 8 describes
our conclusions.
2. Types
The goal of the Thrift type system is to enable programmers to
develop using completely natively defined types, no matter what
programming language they use. By design, the Thrift type system
does not introduce any special dynamic types or wrapper objects.
It also does not require that the developer write any code for object
serialization or transport. The Thrift IDL (Interface Definition Lan-
guage) file is logically a way for developers to annotate their data
structures with the minimal amount of extra information necessary
to tell a code generator how to safely transport the objects across
languages.
2.1 Base Types
The type system rests upon a few base types. In considering which
types to support, we aimed for clarity and simplicity over abun-