by evaluating queue policies, parsing and analyzing the SQL
text, creating and optimizing distributed execution plan.
The coordinator distributes this plan to workers, starts exe-
cution of tasks and then begins to enumerate splits, which are
opaque handles to an addressable chunk of data in an external
storage system. Splits are assigned to the tasks responsible for
reading this data.
Worker nodes running these tasks process these splits by
fetching data from external systems, or process intermediate
results produced by other workers. Workers use co-operative
multi-tasking to process tasks from many queries concurrently.
Execution is pipelined as much as possible, and data flows
between tasks as it becomes available. For certain query
shapes, Presto is capable of returning results before all the
data is processed. Intermediate data and state is stored in-
memory whenever possible. When shuffling data between
nodes, buffering is tuned for minimal latency.
Presto is designed to be extensible; and provides a versa-
tile plugin interface. Plugins can provide custom data types,
functions, access control implementations, event consumers,
queuing policies, and configuration properties. More impor-
tantly, plugins also provide connectors, which enable Presto to
communicate with external data stores through the Connector
API, which is composed of four parts: the Metadata API, Data
Location API, Data Source API, and Data Sink API. These
APIs are designed to allow performant implementations of
connectors within the environment of a physically distributed
execution engine. Developers have contributed over a dozen
connectors to the main Presto repository, and we are aware of
several proprietary connectors.
IV. SYSTEM DESIGN
In this section we describe some of the key design decisions
and features of the Presto engine. We describe the SQL dialect
that Presto supports, then follow the query lifecycle all the way
from client to distributed execution. We also describe some
of the resource management mechanisms that enable multi-
tenancy in Presto. Finally, we briefly discuss fault tolerance.
A. SQL Dialect
Presto closely follows the ANSI SQL specification [2]. While
the engine does not implement every feature described, im-
plemented features conform to the specification as far as
possible. We have made a few carefully chosen extensions to
the language to improve usability. For example, it is difficult
to operate on complex data types, such as maps and arrays,
in ANSI SQL. To simplify operating on these common data
types, Presto syntax supports anonymous functions (lambda
expressions) and built-in higher-order functions (e.g., trans-
form, filter, reduce).
B. Client Interfaces, Parsing, and Planning
1) Client Interfaces: The Presto coordinator primarily ex-
poses a RESTful HTTP interface to clients, and ships with
a first-class command line interface. Presto also ships with a
JDBC client, which enables compatibility with a wide variety
of BI tools, including Tableau and Microstrategy.
2) Parsing: Presto uses an ANTLR-based parser to convert
SQL statements into a syntax tree. The analyzer uses this
tree to determine types and coercions, resolve functions and
scopes, and extracts logical components, such as subqueries,
aggregations, and window functions.
3) Logical Planning: The logical planner uses the syntax
tree and analysis information to generate an intermediate
representation (IR) encoded in the form of a tree of plan nodes.
Each node represents a physical or logical operation, and the
children of a plan node are its inputs. The planner produces
nodes that are purely logical, i.e. they do not contain any
information about how the plan should be executed. Consider
a simple query:
SELECT
orders.orderkey, SUM(tax)
FROM orders
LEFT JOIN lineitem
ON orders.orderkey = lineitem.orderkey
WHERE discount = 0
GROUP BY orders.orderkey
The logical plan for this query is outlined in Figure 2.
Aggregate [SUM(tax)]
LeftJoin [ON orderkey]
Scan [orders]
Filter [discount=0]
Scan [lineitem]
Fig. 2. Logical Plan
C. Query Optimization
The plan optimizer transforms the logical plan into a more
physical structure that represents an efficient execution strategy
for the query. The process works by evaluating a set of
transformation rules greedily until a fixed point is reached.
Each rule has a pattern that can match a sub-tree of the
query plan and determines whether the transformation should
be applied. The result is a logically equivalent sub-plan that
replaces the target of the match. Presto contains several rules,
including well-known optimizations such as predicate and
limit pushdown, column pruning, and decorrelation.
We are in the process of enhancing the optimizer to perform
a more comprehensive exploration of the search space using
a cost-based evaluation of plans based on the techniques
introduced by the Cascades framework [13]. However, Presto
already supports two cost-based optimizations that take table
and column statistics into account - join strategy selection and
join re-ordering. We will discuss only a few features of the
optimizer; a detailed treatment is out of the scope of this paper.
1) Data Layouts: The optimizer can take advantage of
the physical layout of the data when it is provided by the
connector Data Layout API. Connectors report locations and
other data properties such as partitioning, sorting, grouping,