runtime library manages and maps to kernel-level threads
in an M-to-N way. A goroutine can be created by simply
adding the keyword go before a function call.
To make goroutines easy to create, Go also supports creating
a new goroutine using an anonymous function, a function
definition that has no identifier, or “name”. All local variables
declared before an anonymous function are accessible to the
anonymous function, and are potentially shared between
a parent goroutine and a child goroutine created using the
anonymous function, which can cause data races (Section 6).
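As an illustration of the paragraph above, the following minimal sketch creates a child goroutine from an anonymous function that captures two local variables of its parent. The function name greet and the done channel are hypothetical names chosen for this example; they do not come from the studied applications.

```go
package main

import "fmt"

// greet starts a child goroutine from an anonymous function.
// The closure captures the local variables name and msg, so both are
// shared between the parent goroutine and the child goroutine.
func greet(name string) string {
	var msg string
	done := make(chan struct{})

	go func() {
		msg = "hello, " + name // write to a variable shared with the parent
		close(done)            // signal that the write has completed
	}()

	// Without this receive, the read of msg below would race with the
	// write in the child goroutine.
	<-done
	return msg
}

func main() {
	fmt.Println(greet("gopher")) // prints "hello, gopher"
}
```

Removing the receive on done leaves the program compilable but introduces exactly the kind of parent/child sharing hazard described above.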
2.2 Synchronization with Shared Memory
Go supports traditional shared memory accesses across
goroutines. It supports various traditional synchronization
primitives like lock/unlock (Mutex), read/write lock
(RWMutex), condition variable (Cond), and atomic read/write
(atomic). Go’s implementation of RWMutex is different from
pthread_rwlock_t in C. Write lock requests in Go have a
higher privilege than read lock requests.
As a new primitive introduced by Go, Once is designed
to guarantee that a function is executed only once. It has a
Do method that takes a function f as its argument. When
Once.Do(f) is invoked many times, f is executed only on the
first invocation. Once is widely used to ensure that a shared
variable is initialized only once by multiple goroutines.
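The behavior of Once.Do can be sketched as follows. This is a minimal example that assumes the standard sync package; the helper initCount is a hypothetical name introduced only for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// initCount lets several goroutines call Once.Do with the same
// initialization function and reports how many times it actually ran.
func initCount(callers int) int {
	var once sync.Once
	var wg sync.WaitGroup
	count := 0

	for i := 0; i < callers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Only the first call to Do executes the function;
			// all other callers wait for it to finish and then return.
			once.Do(func() { count++ })
		}()
	}

	wg.Wait()
	return count
}

func main() {
	fmt.Println(initCount(10)) // prints 1
}
```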
Similar to pthread_join in C, Go uses WaitGroup to allow
multiple goroutines to finish their shared variable accesses
before a waiting goroutine proceeds. Goroutines are added to
a WaitGroup by calling Add. Goroutines in a WaitGroup use
Done to notify their completion, and a goroutine calls Wait
to wait for the completion notification of all goroutines in
a WaitGroup. Misusing WaitGroup can cause both blocking
bugs (Section 5) and non-blocking bugs (Section 6).
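A typical Add/Done/Wait sequence might look like the sketch below. It assumes the standard sync.WaitGroup; the function parallelSum and its chunked input are hypothetical and serve only to show the pattern.

```go
package main

import (
	"fmt"
	"sync"
)

// parallelSum sums each chunk in its own goroutine and waits for all of
// them with a WaitGroup before combining the partial results.
func parallelSum(chunks [][]int) int {
	partial := make([]int, len(chunks))
	var wg sync.WaitGroup

	// Register every worker before starting any of them, so that Wait
	// cannot return before all workers have been counted.
	wg.Add(len(chunks))
	for i, chunk := range chunks {
		go func(i int, chunk []int) {
			defer wg.Done() // notify completion
			for _, v := range chunk {
				partial[i] += v // each goroutine writes only its own slot
			}
		}(i, chunk)
	}

	wg.Wait() // blocks until every worker has called Done

	total := 0
	for _, p := range partial {
		total += p
	}
	return total
}

func main() {
	fmt.Println(parallelSum([][]int{{1, 2}, {3, 4}, {5}})) // prints 15
}
```

Calling Add before the goroutines start, rather than inside them, is a deliberate choice in this sketch: it keeps Wait from returning before a late worker has been registered.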
2.3 Synchronization with Message Passing
Channel (chan) is a new concurrency primitive introduced
by Go to send data and states across goroutines and to build
more complex functionalities [3, 50]. Go supports two types
of channels: buffered and unbuffered. Sending data to (or
receiving data from) an unbuffered channel will block a
goroutine until another goroutine receives data from (or sends
data to) the channel. Sending to a buffered channel will block
only when the buffer is full. There are several underlying
rules in using channels, and violating them can create
concurrency bugs. For example, a channel can only be used
after initialization, and sending data to (or receiving data
from) a nil channel will block a goroutine forever. Sending
data to a closed channel or closing an already closed channel
triggers a runtime panic.
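The channel rules above can be exercised with a small sketch. All three function names (roundTrip, fillBuffer, sendOnClosedPanics) are hypothetical helpers written for this illustration.

```go
package main

import "fmt"

// roundTrip sends v over an unbuffered channel. The send in the child
// goroutine blocks until the parent executes the matching receive.
func roundTrip(v int) int {
	ch := make(chan int) // unbuffered
	go func() { ch <- v }()
	return <-ch
}

// fillBuffer performs capacity sends on a buffered channel without any
// receiver. None of them block, because the buffer never overflows.
func fillBuffer(capacity int) int {
	ch := make(chan int, capacity)
	for i := 0; i < capacity; i++ {
		ch <- i
	}
	return len(ch) // number of buffered elements
}

// sendOnClosedPanics reports whether sending to a closed channel
// triggers a runtime panic, recovering from it if it does.
func sendOnClosedPanics() (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	ch := make(chan int, 1)
	close(ch)
	ch <- 1 // send on closed channel
	return false
}

func main() {
	fmt.Println(roundTrip(42))        // prints 42
	fmt.Println(fillBuffer(3))        // prints 3
	fmt.Println(sendOnClosedPanics()) // prints true
}
```

One more send in fillBuffer after the loop would block forever, since no goroutine ever receives from the channel.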
The select statement allows a goroutine to wait on multiple
channel operations. A select will block until one of its
cases can make progress, unless it has a default branch that
it can execute instead. When more than one case in a select
is valid, Go will randomly choose one to execute. This
randomness can cause concurrency bugs, as will be discussed
in Section 6.
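Both behaviors of select, the non-blocking default branch and the random choice among ready cases, can be sketched as below. The helpers tryReceive and countChoices are hypothetical names used only in this example.

```go
package main

import "fmt"

// tryReceive uses a default branch so that select never blocks: it
// reports false when no value is ready on ch.
func tryReceive(ch <-chan int) (int, bool) {
	select {
	case v := <-ch:
		return v, true
	default:
		return 0, false
	}
}

// countChoices makes both cases ready in every round and counts which
// one select picks. The split between fromA and fromB varies from run
// to run, because select chooses randomly among ready cases.
func countChoices(rounds int) (fromA, fromB int) {
	a := make(chan int, 1)
	b := make(chan int, 1)
	for i := 0; i < rounds; i++ {
		a <- i
		b <- i
		select {
		case <-a:
			fromA++
			<-b // drain the other channel for the next round
		case <-b:
			fromB++
			<-a
		}
	}
	return fromA, fromB
}

func main() {
	_, ok := tryReceive(make(chan int))
	fmt.Println(ok) // prints false: nothing is ready, default runs

	fromA, fromB := countChoices(1000)
	fmt.Println(fromA+fromB == 1000) // prints true; the split itself is random
}
```

Code whose correctness depends on which ready case runs first is therefore fragile, which is the hazard the text refers to.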
Go introduces several new semantics to ease the interaction
across multiple goroutines. For example, to assist the
programming model of serving a user request by spawning
a set of goroutines that work together, Go introduces
context to carry request-specific data or metadata across
goroutines. As another example, Pipe is designed to stream
data between a Reader and a Writer. Both context and Pipe
are new forms of passing messages, and misusing them
can create new types of concurrency bugs (Section 5).
Application   Stars   Commits   Contributors   LOC     Dev History
Docker        48975   35149     1767           786K    4.2 Years
Kubernetes    36581   65684     1679           2297K   3.9 Years
etcd          18417   14101     436            441K    4.9 Years
CockroachDB   13461   29485     197            520K    4.2 Years
gRPC*         5594    2528      148            53K     3.3 Years
BoltDB        8530    816       98             9K      4.4 Years

Table 1. Information of selected applications. The number of stars,
commits, and contributors on GitHub, total source lines of code, and
development history on GitHub. *: the gRPC version that is written in Go.
2.4 Go Applications
Recent years have seen a quick increase in popularity and
adoption of the Go language. Go was the 9th most popular
language on GitHub in 2017 [18]. As of the time of writing,
there are 187K GitHub repositories written in Go.
In this study, we selected six representative, real-world
software applications written in Go, including two container
systems (Docker and Kubernetes), one key-value store system
(etcd), two databases (CockroachDB and BoltDB), and one RPC
library (gRPC-go¹) (Table 1). These applications are
open-source projects that have gained wide usage in datacenter
environments. For example, Docker and Kubernetes are the
top 2 most popular applications written in Go on GitHub,
with 48.9K and 36.5K stars (etcd is the 10th, and the rest are
ranked in the top 100). Our selected applications all have at
least three years of development history and are currently
actively maintained by developers. All our selected applications
are of medium to large size, with lines of code ranging from 9
thousand to more than 2 million. Among the six applications,
Kubernetes and gRPC are projects originally developed by
Google.
3 Go Concurrency Usage Patterns
Before studying Go concurrency bugs, it is important to first
understand what real-world Go concurrent programs are like.
This section presents our static and dynamic analysis results
of goroutine usages and Go concurrency primitive usages in
our selected six applications.
¹ We will use gRPC to represent the gRPC version that is written in Go in
the rest of the paper, unless otherwise specified.