Loop Recognition in C++/Java/Go/Scala
Robert Hundt
Google
1600 Amphitheatre Parkway
Mountain View, CA, 94043
rhundt@google.com
Abstract—In this experience report we encode a well specified,
compact benchmark in four programming languages, namely
C++, Java, Go, and Scala. The implementations each use the
languages’ idiomatic container classes, looping constructs, and
memory/object allocation schemes. It does not attempt to exploit
specific language and run-time features to achieve maximum
performance. This approach allows an almost fair comparison
of language features, code complexity, compilers and compile
time, binary sizes, run-times, and memory footprint.
While the benchmark itself is simple and compact, it em-
ploys many language features, in particular, higher-level data
structures (lists, maps, lists and arrays of sets and lists), a few
algorithms (union/find, dfs / deep recursion, and loop recognition
based on Tarjan), iterations over collection types, some object
oriented features, and interesting memory allocation patterns.
We do not explore any aspects of multi-threading, or higher level
type mechanisms, which vary greatly between the languages.
The benchmark points to very large differences in all examined
dimensions of the language implementations. After publication of
the benchmark internally at Google, several engineers produced
highly optimized versions of the benchmark. We describe many
of the performed optimizations, which were mostly targeting run-
time performance and code complexity. While this effort is an
anecdotal comparison only, the benchmark, and the subsequent
tuning efforts, are indicative of typical performance pain points
in the respective languages.
I. INTRODUCTION
Disagreements about the utility of programming languages
are as old as programming itself. Today, these “language wars”
become increasingly heated, and less meaningful, as more
people are working with more languages on more platforms
in settings of greater variety, e.g., from mobile to datacenters.
In this paper, we contribute to the discussion by imple-
menting a well defined algorithm in four different languages,
C++, Java, Go, and Scala. In all implementations, we use the
default, idiomatic data structures in each language, as well as
default type systems, memory allocation schemes, and default
iteration constructs. All four implementations stay very close
to the formal specification of the algorithm and do not attempt
any form of language specific optimization or adaption.
The benchmark itself is simple and compact. Each imple-
mentation contains some scaffold code, needed to construct
test cases allowing to benchmark the algorithm, and the
implementation of the algorithm itself.
The algorithm employs many language features, in partic-
ular, higher-level data structures (lists, maps, lists and arrays
of sets and lists), a few algorithms (union/find, dfs / deep
recursion, and loop recognition based on Tarjan), iterations
over collection types, some object oriented features, and
interesting memory allocation patterns. We do not explore any
aspects of multi-threading, or higher level type mechanisms,
which vary greatly between the languages. We also do not
perform heavy numerical computation, as this omission allows
amplification of core characteristics of the language implemen-
tations, specifically, memory utilization patterns.
We believe that this approach highlights features and charac-
teristics of the languages and allows an almost fair comparison
along the dimensions of source code complexity, compilers
and default libraries, compile time, binary sizes, run-times, and
memory footprint. The differences along these dimensions are
surprisingly large.
After publication of the benchmark internally at Google,
several engineers produced highly optimized versions of the
benchmark. We describe many of the performed optimizations,
which were mostly targeting run-time performance and code
complexity. While this evaluation is an anecdotal comparison
only, the benchmark itself, as well as the subsequent tuning
efforts, point to typical performance pain points in the respec-
tive languages.
The rest of this paper is organized as follows. We briefly
introduce the four languages in section II. We introduce the
algorithm and provide instructions on how to find, build, and
run it, in section III. We highlight core language properties
in section IV, as they are needed to understand the imple-
mentation and the performance properties. We describe the
benchmark and methodology in section V, which also contains
the performance evaluation. We discuss subsequent language
specific tuning efforts in section VI, before we conclude.
II. THE CONTENDERS
We describe the four languages by providing links to the
the corresponding wikipedia entries and cite the respective
first paragraphs from wikipedia. Readers familiar with the
languages can skip to the next section.
C++ [7] is a statically typed, free-form, multi-paradigm,
compiled, general-purpose programming language. It is re-
garded as a ”middle-level” language, as it comprises a com-
bination of both high-level and low-level language features.
It was developed by Bjarne Stroustrup starting in 1979 at
Bell Labs as an enhancement to the C language and originally
named C with Classes. It was renamed C++ in 1983.
Java [9] is a programming language originally developed
by James Gosling at Sun Microsystems (which is now a sub-