The Lua compiler uses no intermediate representation. It emits instructions for the
virtual machine “on the fly” as it parses a program. Nevertheless, it does perform some
optimizations. For instance, it delays the generation of code for base expressions like
variables and constants. When it parses such expressions, it generates no code; instead,
it uses a simple structure to represent them. Therefore, it is very easy to check whether
an operand for a given instruction is a constant or a local variable and use those values
directly in the instruction, thus avoiding unnecessary and costly moves (see Section 3).
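The delayed code generation described above can be illustrated with a minimal sketch in C. Every name in it (ExpDesc, Emitter, AddInstr, parse_constant, parse_local, emit_add) is hypothetical and chosen only for illustration; it is not the compiler's actual data structure. It assumes a single three-operand ADD instruction whose operands may each name either a register or a constant.

```c
#include <assert.h>

/* All names below are hypothetical; this is not Lua's actual code. */

/* What the parser knows about an expression it has not yet compiled. */
typedef enum { EXP_CONSTANT, EXP_LOCAL, EXP_TEMP } ExpKind;

typedef struct {
  ExpKind kind;
  int info;  /* constant index, local's register, or temporary register */
} ExpDesc;

/* One ADD instruction: dest = B + C, each operand a register or constant. */
typedef struct {
  int dest;
  int b_is_const, b;
  int c_is_const, c;
} AddInstr;

#define MAX_CODE 16

typedef struct {
  AddInstr code[MAX_CODE];
  int count;     /* number of instructions emitted so far */
  int free_reg;  /* next free temporary register */
} Emitter;

/* Parsing a base expression only records a description; it emits nothing. */
static ExpDesc parse_constant(int k_index) {
  ExpDesc e;
  e.kind = EXP_CONSTANT;
  e.info = k_index;
  return e;
}

static ExpDesc parse_local(int reg) {
  ExpDesc e;
  e.kind = EXP_LOCAL;
  e.info = reg;
  return e;
}

/* Code is generated only here, and the operands are used directly:
   no instruction is spent moving a constant or a local into a temporary. */
static ExpDesc emit_add(Emitter *em, ExpDesc lhs, ExpDesc rhs) {
  AddInstr *i;
  ExpDesc result;
  assert(em->count < MAX_CODE);
  i = &em->code[em->count++];
  i->dest = em->free_reg++;
  i->b_is_const = (lhs.kind == EXP_CONSTANT);
  i->b = lhs.info;
  i->c_is_const = (rhs.kind == EXP_CONSTANT);
  i->c = rhs.info;
  result.kind = EXP_TEMP;
  result.info = i->dest;
  return result;
}
```

In this sketch, compiling an expression such as x + 1 (with x a local variable) calls parse_local and parse_constant without emitting anything, and then emit_add produces a single instruction that refers to both operands in place.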
To be portable across many different C compilers and platforms, Lua cannot use
several tricks commonly used by interpreters, such as direct threaded code [8]. Instead,
it uses a standard while–switch dispatch loop. Also, in places the C code seems
unduly complicated, but the complication is there to ensure portability. The portability of
Lua’s implementation has increased steadily over the years, as Lua has been compiled
under many different C compilers on many different platforms (including several 64-bit
platforms and some 16-bit platforms).
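A while–switch dispatch loop of the kind mentioned above can be sketched as follows. The three opcodes, the Instr layout, and the function run are assumptions made for illustration and do not reproduce Lua's instruction set; the point is only that the loop needs nothing beyond standard C, unlike direct threaded code, which typically relies on compiler extensions.

```c
#include <assert.h>

/* Hypothetical toy instruction set; not Lua's actual opcodes. */
typedef enum { OP_LOADK, OP_ADD, OP_RETURN } OpCode;

typedef struct {
  OpCode op;
  int a, b, c;  /* operands; meaning depends on the opcode */
} Instr;

#define NUM_REGS 8

/* Executes code until OP_RETURN and returns the value of register A. */
static double run(const Instr *code, const double *constants) {
  double reg[NUM_REGS] = {0};
  const Instr *pc = code;
  while (1) {
    const Instr i = *pc++;  /* fetch, then advance the program counter */
    assert(i.a >= 0 && i.a < NUM_REGS);
    switch (i.op) {         /* decode and dispatch in portable C */
      case OP_LOADK:        /* R(A) = K(B) */
        reg[i.a] = constants[i.b];
        break;
      case OP_ADD:          /* R(A) = R(B) + R(C) */
        reg[i.a] = reg[i.b] + reg[i.c];
        break;
      case OP_RETURN:       /* return R(A) */
        return reg[i.a];
      default:
        assert(0 && "unknown opcode");
        return 0.0;
    }
  }
}
```

Every instruction passes through the same switch, which costs an extra jump per instruction compared with threaded code but compiles under any ANSI C compiler.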
We consider that we have achieved our design and implementation goals. Lua is a
very portable language: it runs on any machine with an ANSI C compiler, from embedded
systems to mainframes. Lua is really lightweight: for instance, on Linux its stand-alone
interpreter, complete with all standard libraries, takes less than 150 Kbytes; the core is
less than 100 Kbytes. Lua is efficient: independent benchmarks [2, 4] show Lua as one of
the fastest language implementations in the realm of scripting languages (i.e., interpreted
and dynamically-typed languages). We also consider Lua a simple language, being
syntactically similar to Pascal and semantically similar to Scheme, but this is subjective.
3. The Representation of Values
Lua is a dynamically-typed language: types are attached to values rather than to
variables. Lua has eight basic types: nil, boolean, number, string, table, function, userdata,
and thread. Nil is a marker type having only one value, also called nil. Boolean values
are the usual true and false. Numbers are double-precision floating-point numbers,
corresponding to the type double in C, but it is easy to compile Lua using float or
long instead. (Several game consoles and smaller machines lack hardware support for
double.) Strings are arrays of bytes with an explicit size, and so can contain arbitrary
binary data, including embedded zeros. Tables are associative arrays, which can be indexed
by any value (except nil) and can hold any value. Functions are either Lua functions or
C functions written according to a protocol for interfacing with the Lua virtual machine.
Userdata are essentially pointers to user memory blocks, and come in two flavors: heavy,
whose blocks are allocated by Lua and are subject to garbage collection, and light, whose
blocks are allocated and freed by the user. Finally, threads represent coroutines. Values of
all types are first-class values: we can store them in global variables, local variables and
table fields, pass them as arguments to functions, return them from functions, etc.
Lua represents values as tagged unions, that is, as pairs (t, v), where t is an integer
tag identifying the type of the value v, which is a union of C types implementing Lua
types. Nil has a single value. Booleans and numbers are implemented as ‘unboxed’
values: v represents values of those types directly in the union. This implies that the union
must have enough space for a double. Strings, tables, functions, threads, and userdata
values are implemented by reference: v contains pointers to structures that implement
those values. Those structures share a common head, which keeps information needed
for garbage collection. The rest of the structure is specific to each type.
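As a rough illustration of the (t, v) pairs and the common head just described, the following C sketch uses hypothetical names (TypeTag, GCHead, StringObj, Payload, Value) and covers only four of the eight types. It is a simplified model under those assumptions, not the declarations used in the Lua sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; a simplified model, not Lua's actual declarations. */
typedef enum { T_NIL, T_BOOLEAN, T_NUMBER, T_STRING } TypeTag;

/* Common head shared by every structure handled by the collector. */
typedef struct GCHead {
  struct GCHead *next;   /* link to the next collectable object */
  unsigned char marked;  /* mark bit for the collector */
} GCHead;

/* A collectable string: the common head first, then type-specific fields. */
typedef struct {
  GCHead head;
  size_t len;            /* explicit size, so embedded zeros are allowed */
  const char *bytes;
} StringObj;

/* The union v: unboxed booleans and numbers, or a reference. */
typedef union {
  int b;
  double n;
  GCHead *gc;
} Payload;

/* The tagged union (t, v). */
typedef struct {
  TypeTag t;
  Payload v;
} Value;

static Value make_nil(void) {
  Value val;
  val.t = T_NIL;
  val.v.gc = NULL;
  return val;
}

static Value make_number(double n) {
  Value val;
  val.t = T_NUMBER;
  val.v.n = n;           /* stored directly in the union */
  return val;
}

static Value make_string(StringObj *s) {
  Value val;
  val.t = T_STRING;
  val.v.gc = &s->head;   /* stored by reference, through the common head */
  return val;
}

/* In this sketch only strings are collectable. */
static int is_collectable(const Value *val) {
  return val->t == T_STRING;
}

static double number_of(const Value *val) {
  assert(val->t == T_NUMBER);
  return val->v.n;
}

/* Recover the full string structure from the common head. */
static StringObj *string_of(const Value *val) {
  assert(val->t == T_STRING);
  return (StringObj *)val->v.gc;  /* valid: head is the first member */
}
```

Because the head is the first member of every collectable structure, a collector written against this model could traverse objects of different types through GCHead pointers alone and consult the tag only when it needs the type-specific part.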
Figure 1 shows a glimpse of the actual implementation of Lua values. TObject is
the main structure in this implementation: it represents the tagged unions (t, v) described