Glow: Graph Lowering Compiler Techniques for
Neural Networks
Nadav Rotem, Jordan Fix, Saleem Abdulrasool, Summer Deng, Roman Dzhabarov,
James Hegeman, Roman Levenstein, Bert Maher, Satish Nadathur, Jakob Olesen,
Jongsoo Park, Artem Rakhov, Misha Smelyanskiy
Facebook
Abstract
This paper presents the design of Glow, a machine learning compiler for heterogeneous hardware. It is a pragmatic approach to compilation that enables the generation of highly optimized code for multiple targets. Glow lowers the traditional neural network dataflow graph into a two-phase strongly-typed intermediate representation. The high-level intermediate representation allows the optimizer to perform domain-specific optimizations. The lower-level instruction-based address-only intermediate representation allows the compiler to perform memory-related optimizations, such as instruction scheduling, static memory allocation, and copy elimination. At the lowest level, the optimizer performs machine-specific code generation to take advantage of specialized hardware features. Glow features a lowering phase that enables the compiler to support a large number of input operators as well as a large number of hardware targets by eliminating the need to implement all operators on all targets. The lowering phase is designed to reduce the input space and allow new hardware backends to focus on a small number of linear algebra primitives.
1 Introduction
The end of power savings due to Moore's Law, combined with the increased demand for compute power driven by machine learning, has led to a wave of innovation in computer architecture. Hennessy and Patterson [1] present five principles that guide the design of machine-learning domain-specific architectures (DSAs): dedicated local memories, large numbers of arithmetic units, simple forms of parallelism, reduced bitwidths, and domain-specific programming models. Compilers need to perform advanced whole-graph optimizations in order to execute neural networks efficiently on DSAs. In this paper we describe some of these techniques.
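As a concrete illustration of the kind of whole-graph optimization alluded to above, consider fusing a convolution with the activation that follows it, so a backend can execute both in one pass over the data. The sketch below is hypothetical and greatly simplified: it models the graph as a linear sequence of operator names rather than a true dataflow graph, and the node names are illustrative, not Glow's actual API.

```python
# Toy whole-graph optimization: fuse each (Conv -> Relu) pair into a
# single ConvRelu operator. A real compiler would pattern-match over a
# dataflow graph; a flat operator list keeps the idea visible.

def fuse_conv_relu(graph):
    """Return a new operator list with Conv followed by Relu fused."""
    out = []
    i = 0
    while i < len(graph):
        if graph[i] == "Conv" and i + 1 < len(graph) and graph[i + 1] == "Relu":
            out.append("ConvRelu")  # fused node replaces the pair
            i += 2
        else:
            out.append(graph[i])
            i += 1
    return out

print(fuse_conv_relu(["Conv", "Relu", "MaxPool", "Conv", "Relu"]))
# -> ['ConvRelu', 'MaxPool', 'ConvRelu']
```

Fusion of this kind is only possible because the compiler sees the whole graph at once, rather than visiting and executing one node at a time.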
Traditional machine learning frameworks iterate over the nodes in the graph and execute them one by one. Unfortunately, this node-visitor method of execution is inefficient, even on traditional processors. As a result, machine learning frameworks have started to hand over the graph to compilers [2] that execute code more efficiently. Given the increasing importance of neural networks, the need for energy efficiency in data centers and on mobile devices, and the design principles of domain-specific architectures, we believe that the machine learning frameworks of the future will focus on providing attractive programming models on top of a layer that integrates compilers for many different targets.
In the Glow project, we focus on the lower parts of the software stack. We work to provide PyTorch [3] and other frameworks with a low-level graph and a code generator for neural networks. The name Glow is an abbreviation for Graph-Lowering, which is the main technique that the compiler uses for generating efficient code. The Glow low-level graph will not replace the machine learning high-level graph, in the same way that the low-level intermediate representation in compilers does not replace the abstract syntax tree. We aim to provide a useful compiler toolkit that will allow hardware developers to focus on implementing efficient acceleration hardware, which may differ widely in capabilities, and to use Glow to automate compilation tasks such as instruction selection, memory allocation, and graph scheduling. The full compiler toolkit is open-source and publicly available.¹
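To make the graph-lowering idea concrete, the sketch below rewrites a single high-level node into the linear algebra primitives a backend would actually have to implement, in the spirit of lowering a fully-connected layer into a matrix multiplication followed by a broadcasted bias addition. Glow itself is written in C++; the node kinds and the `lower` function here are illustrative assumptions, not Glow's real API.

```python
# Hypothetical sketch of graph lowering: a high-level FullyConnected
# node is rewritten into primitive nodes (MatMul and BatchedAdd), so a
# new backend only needs to implement the small primitive set.

class Node:
    def __init__(self, kind, inputs):
        self.kind = kind      # operator name, e.g. "MatMul"
        self.inputs = inputs  # list of producer Nodes

def lower(node):
    """Rewrite a high-level node into primitives; pass primitives through."""
    if node.kind == "FullyConnected":
        data, weights, bias = node.inputs
        matmul = Node("MatMul", [data, weights])
        return Node("BatchedAdd", [matmul, bias])
    return node  # already a primitive

fc = Node("FullyConnected",
          [Node("Input", []), Node("Weights", []), Node("Bias", [])])
lowered = lower(fc)
print(lowered.kind)            # -> BatchedAdd
print(lowered.inputs[0].kind)  # -> MatMul
```

After lowering, the backend never sees `FullyConnected` at all; it only needs to generate code for the handful of primitives that every high-level operator decomposes into.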
2 Related Work
2.1 Relationship to Neural Network Frameworks
Frameworks such as PyTorch [3], Caffe [4], and TensorFlow [5] have found success by providing a useful
¹ http://github.com/pytorch/glow
arXiv:1805.00907v2 [cs.PL] 4 May 2018