Orca%
Database%System%
%
%
%
!
!
!
Parser! Catalog! Executor!
DXL!Query! DXL!MD! DXL!Plan!
Query2DXL! DXL2Plan!
Query!
Results!
MD!Provider!
Figure 2: Interaction of Orca with database system
Search'
Property'Enforcement'
Memory'Manager'
Concurrency'
Control'
GPOS%
OS%
Orca%
Operators'
Transforma9ons'
Card.'Es9ma9on'
Cost'Model'
Op*mizer%Tools%
Job'Scheduler'
File'I/O'
Memo%
DXL'Query'
DXL'Plan'
MD'Cache'
Excep9on'
Handling'
Figure 3: Orca architecture
for communication, such as input queries, output plans and
metadata. Overlaid on DXL is a simple communication pro-
tocol to send the initial query structure and retrieve the
optimized plan. A major benefit of DXL is packaging Orca
as a stand-alone product.
Figure 2 shows the interaction between Orca and an ex-
ternal database system. The input to Orca is a DXL query.
The output of Orca is a DXL plan. During optimization,
the database system can be queried for metadata (e.g., ta-
ble definitions). Orca abstracts metadata access details by
allowing database system to register a metadata provider
(MD Provider) that is responsible for serializing metadata
into DXL before being sent to Orca. Metadata can also be
consumed from regular files containing metadata objects se-
rialized in DXL format.
The database system needs to include translators that
consume/emit data in DXL format. Query2DXL transla-
tor converts a query parse tree into a DXL query, while
DXL2Plan translator converts a DXL plan into an executable
plan. The implementation of such translators is done com-
pletely outside Orca, which allows multiple systems to use
Orca by providing the appropriate translators.
The architecture of Orca is highly extensible; all compo-
nents can be replaced individually and configured separately.
Figure 3 shows the different components of Orca. We briefly
describe these components as follows.
Memo. The space of plan alternatives generated by the
optimizer is encoded in a compact in-memory data struc-
ture called the Memo [13]. The Memo structure consists of
a set of containers called groups, where each group contains
logically equivalent expressions. Memo groups capture the
different sub-goals of a query (e.g., a filter on a table, or a
join of two tables). Group members, called group expres-
sions, achieve the group goal in different logical ways (e.g.,
different join orders). Each group expression is an operator
that has other groups as its children. This recursive struc-
ture of the Memo allows compact encoding of a huge space
of possible plans as we illustrate in Section 4.1.
Search and Job Scheduler. Orca uses a search mecha-
nism to navigate through the space of possible plan alter-
natives and identify the plan with the least estimated cost.
The search mechanism is enabled by a specialized Job Sched-
uler that creates dependent or parallel work units to perform
query optimization in three main steps: exploration, where
equivalent logical expressions are generated, implementation
where physical plans are generated, and optimization, where
required physical properties (e.g., sort order) are enforced
and plan alternatives are costed. We discuss the details of
optimization jobs scheduling in Section 4.2.
Transformations. [13] Plan alternatives are generated
by applying transformation rules that can produce either
equivalent logical expressions (e.g., InnerJoin(A,B) → In-
nerJoin(B,A)), or physical implementations of existing ex-
pressions (e.g., Join(A,B) → HashJoin(A,B)). The results of
applying transformation rules are copied-in to the Memo,
which may result in creating new groups and/or adding new
group expressions to existing groups. Each transformation
rule is a self-contained component that can be explicitly ac-
tivated/deactivated in Orca configurations.
Property Enforcement. Orca includes an extensible
framework for describing query requirements and plan char-
acteristics based on formal property specifications. Prop-
erties have different types including logical properties (e.g.,
output columns), physical properties (e.g., sort order and
data distribution), and scalar properties (e.g., columns used
in join conditions). During query optimization, each oper-
ator may request specific properties from its children. An
optimized child plan may either satisfy the required proper-
ties on its own (e.g., an IndexScan plan delivers sorted data),
or an enforcer (e.g., a Sort operator) needs to be plugged in
the plan to deliver the required property. The framework
allows each operator to control enforcers placement based
on child plans’ properties and operator’s local behavior. We
describe this framework in more detail in Section 4.1.
Metadata Cache. Since metadata (e.g., table definitions)
changes infrequently, shipping it with every query incurs an
overhead. Orca caches metadata on the optimizer side and
only retrieves pieces of it from the catalog if something is
unavailable in the cache, or has changed since the last time
it was loaded in the cache. Metadata cache also abstracts
the database system details from the optimizer, which is
particularly useful during testing and debugging.
GPOS. In order to interact with operating systems with
possibly different APIs, Orca uses an OS abstraction layer
called GPOS. The GPOS layer provides Orca with an exten-
sive infrastructure including a memory manager, primitives
for concurrency control, exception handling, file I/O and
synchronized data structures.
4. QUERY OPTIMIZATION
We describe Orca’s optimization workflow in Section 4.1.
We then show how the optimization process can be con-
ducted in parallel in Section 4.2.