Immersive Displays with Multiple Perspective Views
Providing multiple people with personal perspective views
in augmented or virtual environments has traditionally been
achieved through head-worn displays [7], where each user
receives their own view on a personal display. In practice,
the obstruction of the face by such displays and their
narrow field of view can lead to unnatural face-to-face
interactions. Our solution avoids this difficulty.
Alternatively, projection-based systems rely on either time-
or space-multiplexing the projected images to show multiple
independent views. Agrawala et al. [2] demonstrated a time-
multiplexed approach with synchronized shutter glasses that
enables two stereoscopic user perspectives on an interactive
tabletop. While conceptually simple, this approach tends to
suffer from low image brightness and sometimes perceptible
flicker, since each eye receives only a small slice of the
available light in each frame.
In contrast, Bimber et al.'s Virtual Showcase [5] uses multi-
plane beam combiners to enable up to four independent
perspectives on a spatially multiplexed tabletop: each view
appears at a different location of the display and is optically
combined with four mirrors. Similarly, IllusionHole [21]
uses spatial multiplexing to provide multiple participants
with correct perspective views around the tabletop, but it
greatly reduces the visible display area for each user to
ensure minimal image overlap. Our work also employs a
spatial multiplexing approach to support multiple views.
However, we exploit the fact that in a face-to-face
arrangement, most of the surfaces each person sees lie on
their partner's body or in their partner's background. These
surfaces are well suited for projecting that person's
individual view, because they are hard for the partner to see.
Depth Perception and Object Presence
The ability of our system to provide two users with a sense
of spatial presence in a room-scale augmented reality hinges
on the human ability to perceive depth from a perspective
view without binocular cues. Conventional measures of
person- and object-presence have mostly been defined for
virtual environment displays that surround and isolate the
user [31]. Stevens and colleagues found that users can
experience a measurable sense of object-presence with
projection-augmented models [33]. However, their
questionnaire-based study examined only planar projections
and not perspective views.
The relative importance of various depth cues in the
perception of virtual objects [9] is an important consideration
for our system, since we do not offer stereoscopic vision.
Sollenberger and Milgram [32] showed a large improvement
of head-coupled stereo displays over static, non-head-coupled,
non-stereo displays, while Arthur et al.'s fish-tank VR
experiments [1] showed that users greatly preferred head
coupling without stereo to stereo without head coupling. How
such results apply to perspective SAR configurations remains
unclear. The most closely related work in this space is the
depth perception study of MirageTable [3], which showed
substantial accuracy in users' estimates of depth on a SAR
tabletop with a head-coupled stereo view. Also related is a
pilot experiment reported by Broecker et al. [6], who
investigated a variety of cues affecting depth perception in
view-dependent near-field SAR but found no statistically
significant results.
While focusing on projected tabletops, Hancock et al. [11]
evaluated people's ability to judge object orientation under
different projection conditions. Their work highlights the
importance of correct perspective for judging objects'
spatial presence, especially in multi-user scenarios. There is
also a long line of related research in cognitive psychology
on the relative importance of different cues for depth
perception (e.g., [9, 10, 30]). A complete review of this work
is beyond the scope of this paper, but the three-volume book
by Howard [16] offers the definitive summary of knowledge
on the topic.
SYSTEM DESCRIPTION
We describe our prototype dyadic SAR system, including its
hardware configuration, scene modeling, dynamic projection
mapping, and support for multiple simultaneous views. The
technical contributions of our work include a particular
configuration of projector-camera (pro-cam) units to support
dyadic interaction, a graphics pipeline that blends views from
two simultaneous perspectives while supporting projection
onto dynamic depth maps (e.g., people), and the design of
interactions and experiences that showcase these capabilities.
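To make the per-point view assignment that such blending requires concrete, the sketch below assigns each reconstructed surface point to one of the two users using a simple field-of-view cone test, following the face-to-face observation above that surfaces behind or on one person are visible mainly to the other. The function name, the cone approximation, and the fov_deg parameter are illustrative assumptions introduced here for exposition, not a description of the actual pipeline, which is detailed later in this section.

import numpy as np

def assign_view(points, head_a, gaze_a, head_b, gaze_b, fov_deg=60.0):
    """Decide which user's perspective view each surface point should receive.

    A point is a candidate for a user's view when it lies inside a simple
    cone approximating that user's field of view; if both (or neither) users
    see it, it goes to whoever views it more centrally. Returns +1 per point
    for user A and -1 for user B. Illustrative heuristic only.
    """
    cos_fov = np.cos(np.radians(fov_deg))

    def centrality(head, gaze):
        d = points - head                               # head -> surface point
        d = d / np.linalg.norm(d, axis=1, keepdims=True)
        g = gaze / np.linalg.norm(gaze)
        return d @ g                                    # cosine of angle off gaze

    ca, cb = centrality(head_a, gaze_a), centrality(head_b, gaze_b)
    in_a, in_b = ca > cos_fov, cb > cos_fov
    # Points seen by only one user go to that user; ties break by centrality.
    return np.where(in_a & ~in_b, 1,
           np.where(in_b & ~in_a, -1,
           np.where(ca >= cb, 1, -1)))

# Two users roughly 2.5 m apart, facing each other along the z axis.
pts = np.array([[0.0, 1.5, 3.2],     # wall behind user B: visible only to A
                [0.0, 1.5, -0.7]])   # wall behind user A: visible only to B
print(assign_view(pts,
                  head_a=np.array([0.0, 1.7, 0.0]), gaze_a=np.array([0.0, 0.0, 1.0]),
                  head_b=np.array([0.0, 1.7, 2.5]), gaze_b=np.array([0.0, 0.0, -1.0])))
# -> [ 1 -1 ]

A full implementation would evaluate a rule of this kind per projector pixel against the live depth maps and tracked head positions; the pipeline we actually use is described below.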
Hardware Configuration
Our prototype dyadic SAR system employs three HD video
projectors (BenQ W1080ST), each paired with a Kinect for
Windows v2 sensor. Their mounting was chosen to both
display and sense around two users who are approximately
facing each other in a large room. Two of the projector-camera
pairs are mounted on the ceiling, about two feet above the
head of each of the two users. These pairs are oriented so that
they approximately face one another, covering the opposite
walls and part of the floor (Figure 1). Roughly speaking, each
user's view is rendered by the projector above them. The
precise surface geometry necessary for dynamic projection
mapping of a user's view is provided by the Kinect paired
with the projector above them, while body tracking of that
user is supported by the opposite-facing Kinect camera.
Although the cameras are mounted significantly higher than in
most applications, Kinect body tracking works well in this
configuration. This symmetric arrangement of projectors and
cameras follows the symmetric nature of dyadic interaction.
The third projector-camera pair is mounted on the ceiling,
facing downward, to cover the region between the areas
covered by the first two pairs.
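The roles just described can be summarized in a small configuration sketch; the class and field names below are illustrative assumptions for exposition, not the system's actual data structures.

from dataclasses import dataclass
from typing import Optional

@dataclass
class ProCamUnit:
    """One ceiling-mounted projector + Kinect v2 pair (illustrative model)."""
    name: str
    renders_view_of: Optional[str]      # whose perspective this projector renders
    depth_geometry_for: Optional[str]   # whose view uses this Kinect's depth map
    body_tracks: Optional[str]          # which user this Kinect skeleton-tracks

# Roles as described above: each user's view is projected and depth-mapped
# from overhead, while their body is tracked by the opposite-facing Kinect.
units = [
    ProCamUnit("above_user_A", renders_view_of="A",
               depth_geometry_for="A", body_tracks="B"),
    ProCamUnit("above_user_B", renders_view_of="B",
               depth_geometry_for="B", body_tracks="A"),
    ProCamUnit("center_downward", renders_view_of=None,    # covers the region
               depth_geometry_for=None, body_tracks=None), # between the two
]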
Our current implementation is primarily hosted on a single
PC, which drives all three projectors. Because the current
Kinect for Windows v2 SDK supports only one camera per PC,
we use three additional PCs, which send the Kinect depth
images and other image-processing results (e.g., body
tracking, optical flow) to the main PC over the network. All depth