GPU Virtualization on VMware’s Hosted I/O Architecture
Micah Dowty, Jeremy Sugerman
VMware, Inc.
3401 Hillview Ave, Palo Alto, CA 94304
micah@vmware.com, yoel@vmware.com
Abstract
Modern graphics co-processors (GPUs) can produce
high fidelity images several orders of magnitude faster
than general purpose CPUs, and this performance expec-
tation is rapidly becoming ubiquitous in personal com-
puters. Despite this, GPU virtualization is a nascent field
of research. This paper introduces a taxonomy of strate-
gies for GPU virtualization and describes in detail the
specific GPU virtualization architecture developed for
VMware’s hosted products (VMware Workstation and
VMware Fusion).
We analyze the performance of our GPU virtualiza-
tion with a combination of applications and microbench-
marks. We also compare against software rendering, the
GPU virtualization in Parallels Desktop 3.0, and the na-
tive GPU. We find that taking advantage of hardware
acceleration significantly closes the gap between pure
emulation and native, but that different implementations
and host graphics stacks show distinct variation. The mi-
crobenchmarks show that our architecture amplifies the
overheads of the traditional graphics API bottlenecks:
draw calls, buffer downloads, and batch sizes.
Our virtual GPU architecture runs modern graphics-
intensive games and applications at interactive frame
rates while preserving virtual machine portability. The
applications we tested achieve from 86% to 12% of na-
tive rates and 43 to 18 frames per second with VMware
Fusion 2.0.
1 Introduction
Over the past decade, virtual machines (VMs) have be-
come increasingly popular as a technology for multi-
plexing both desktop and server commodity x86 com-
puters. Over that time, several critical challenges in
CPU virtualization were solved and there are now both
software and hardware techniques for virtualizing CPUs
with very low overheads [1]. I/O virtualization, how-
ever, is still very much an open problem and a wide
variety of strategies are used. Graphics co-processors
(GPUs) in particular present a challenging mixture of
broad complexity, high performance, rapid change, and
limited documentation.
Modern high-end GPUs have more transistors, draw
more power, and offer at least an order of magnitude
more computational performance than CPUs. At the
same time, GPU acceleration has extended beyond en-
tertainment (e.g., games and video) into the basic win-
dowing systems of recent operating systems and is start-
ing to be applied to non-graphical high-performance ap-
plications including protein folding, financial modeling,
and medical image processing. The rise in applications
that exploit, or even assume, GPU acceleration makes
it increasingly important to expose the physical graph-
ics hardware in virtualized environments. Additionally,
virtual desktop infrastructure (VDI) initiatives have led
many enterprises to try to simplify their desktop man-
agement by delivering VMs to their users. Graphics vir-
tualization is extremely important to a user whose pri-
mary desktop runs inside a VM.
GPUs pose a unique challenge in the field of virtu-
alization. Machine virtualization multiplexes physical
hardware by presenting each VM with a virtual device
and combining the VMs' respective operations in the
hypervisor platform in a way that utilizes the native
hardware while preserving the illusion that each guest
has a complete, stand-alone device. Graphics processors
are extremely complicated devices. In addition, unlike
the vendors of CPUs, chipsets, and popular storage and
network controllers, GPU designers are highly secretive
about the specifications of their hardware. Finally, GPU architectures
change dramatically across generations and their gener-
ational cycle is short compared to CPUs and other de-
vices. Thus, it is nearly intractable to provide a virtual
device corresponding to a real modern GPU. Even start-
ing with a complete implementation, updating it for each
new GPU generation would be prohibitively laborious.
Therefore, rather than modeling a complete modern
GPU, our primary approach is paravirtualization: we
present the guest with an idealized software-only GPU
and supply our own custom graphics driver for
interfacing with the guest operating system.
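To make the paravirtual model concrete, the sketch below shows a toy
command FIFO shared between a guest graphics driver and a host-side
device model. The names and command set (vgpu_fifo,
VGPU_DRAW_PRIMITIVES, and so on) are illustrative assumptions for
exposition, not VMware's actual interface; a real device model would
translate each command into calls on the host graphics stack rather
than logging it.

/* Minimal sketch of a paravirtual GPU command FIFO.
 * Hypothetical names; not VMware's actual interface. */
#include <stdint.h>
#include <stdio.h>

enum vgpu_cmd { VGPU_DEFINE_SURFACE, VGPU_DRAW_PRIMITIVES, VGPU_PRESENT };

struct vgpu_fifo {             /* ring shared by guest driver and host */
    uint32_t head, tail;
    uint32_t data[256];
};

/* Guest side: the paravirtual driver enqueues an idealized command. */
static void guest_emit(struct vgpu_fifo *f, uint32_t cmd, uint32_t arg)
{
    f->data[f->tail % 256] = cmd;
    f->data[(f->tail + 1) % 256] = arg;
    f->tail += 2;
}

/* Host side: the device model drains queued commands and would map
 * each one onto the host graphics API (stubbed out with printf). */
static void host_drain(struct vgpu_fifo *f)
{
    while (f->head != f->tail) {
        uint32_t cmd = f->data[f->head % 256];
        uint32_t arg = f->data[(f->head + 1) % 256];
        f->head += 2;
        switch (cmd) {
        case VGPU_DEFINE_SURFACE:
            printf("host: create surface %u\n", (unsigned)arg); break;
        case VGPU_DRAW_PRIMITIVES:
            printf("host: draw %u primitives\n", (unsigned)arg); break;
        case VGPU_PRESENT:
            printf("host: present frame %u\n", (unsigned)arg); break;
        }
    }
}

int main(void)
{
    struct vgpu_fifo fifo = {0};
    guest_emit(&fifo, VGPU_DEFINE_SURFACE, 1);
    guest_emit(&fifo, VGPU_DRAW_PRIMITIVES, 1024);
    guest_emit(&fifo, VGPU_PRESENT, 1);
    host_drain(&fifo);
    return 0;
}

Because the guest only enqueues commands and the host drains them
later, an interface of this style is naturally asynchronous, which
fits the largely asynchronous graphics programming model discussed
below.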
The main technical contributions of this paper are (1)
a taxonomy of GPU virtualization strategies—both emu-
lated and passthrough-based, (2) an overview of the vir-
tual graphics stack in VMware’s hosted architecture, and
(3) an evaluation and comparison of VMware Fusion’s
3D acceleration with other approaches. We find that a
hosted model [2] is a good fit for handling complicated,
rapidly changing GPUs while the largely asynchronous