ISO/IEC 23009-1:2014(E)
© ISO/IEC 2014 – All rights reserved
DASH is based on a hierarchical data model aligned with the presentation in Figure 3. A DASH Media
Presentation is described by a Media Presentation Description document. This describes the sequence of
Periods (see 5.3.2) in time that make up the Media Presentation. A Period typically represents a media
content period during which a consistent set of encoded versions of the media content is available, i.e. the set
of available bitrates, languages, captions, subtitles, etc. does not change during a Period.
Within a Period, material is arranged into Adaptation Sets (see 5.3.3). An Adaptation Set represents a set of
interchangeable encoded versions of one or several media content components (see 5.3.4). For example,
there may be one Adaptation Set for the main video component and a separate one for the main audio
component. If there is other material available, for example captions or audio descriptions, then these may
each have a separate Adaptation Set. Material may also be provided in multiplexed form, in which case
interchangeable versions of the multiplex may be described as a single Adaptation Set, for example an
Adaptation Set containing both the main audio and main video for a Period. Each of the multiplexed
components may be described individually by a media content component description.
An Adaptation Set contains a set of Representations (see 5.3.5). A Representation describes a deliverable
encoded version of one or several media content components. A Representation includes one or more media
streams (one for each media content component in the multiplex). Any single Representation within an
Adaptation Set is sufficient to render the contained media content components. By collecting different
Representations in one Adaptation Set, the Media Presentation author expresses that the Representations
represent perceptually equivalent content. Typically, this means that clients may switch dynamically from
Representation to Representation within an Adaptation Set in order to adapt to network conditions or other
factors. Switching refers to the presentation of decoded data up to a certain time t, and presentation of
decoded data of another Representation from time t onwards. If Representations are included in one
Adaptation Set, and the client switches properly, the Media Presentation is expected to be perceived as
seamless across the switch. Clients may ignore Representations that rely on codecs or other rendering
technologies they do not support or that are otherwise unsuitable.
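
The hierarchy described above (Media Presentation, Period, Adaptation Set, Representation) and the idea of switching between Representations can be sketched as follows. This is a minimal illustration, not the MPD schema: the class and field names (`rep_id`, `bandwidth_bps`, `codec`, `content_type`, `start_s`) and the selection rule in `select_representation` are assumptions made for the example. The rule shown, picking the highest bitrate that fits the measured throughput among supported codecs, is only one possible client policy.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Representation:
    rep_id: str          # hypothetical identifier
    bandwidth_bps: int   # hypothetical: bitrate needed to stream this version
    codec: str           # hypothetical: codec label used for the support check


@dataclass
class AdaptationSet:
    content_type: str    # e.g. "video" or "audio"
    representations: List[Representation] = field(default_factory=list)


@dataclass
class Period:
    start_s: float
    adaptation_sets: List[AdaptationSet] = field(default_factory=list)


@dataclass
class MediaPresentation:
    periods: List[Period] = field(default_factory=list)


def select_representation(
    aset: AdaptationSet, throughput_bps: int, supported_codecs: Set[str]
) -> Optional[Representation]:
    """Pick one Representation from an Adaptation Set.

    Representations with unsupported codecs are ignored. Among the rest,
    the highest bitrate not exceeding the throughput is chosen; if none
    fits, the lowest-bitrate usable Representation is returned.
    """
    usable = [r for r in aset.representations if r.codec in supported_codecs]
    if not usable:
        return None
    affordable = [r for r in usable if r.bandwidth_bps <= throughput_bps]
    if affordable:
        return max(affordable, key=lambda r: r.bandwidth_bps)
    return min(usable, key=lambda r: r.bandwidth_bps)
```

A client following this sketch would call `select_representation` again whenever its throughput estimate changes, and switch at a suitable time t if the result differs from the Representation currently being played.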
Within a Representation, the content may be divided in time into Segments (see 5.3.9 and 6) for proper
accessibility and delivery. In order to access a Segment, a URL is provided for each Segment. Consequently,
a Segment is the largest unit of data that can be retrieved with a single HTTP request.
NOTE This is not strictly true, since the MPD may also include a byte range with the URL, meaning that the
Segment is contained in the provided byte range of some larger resource. An intelligent client could in principle
construct a single request for multiple Segments, but this would not be the typical case.
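
The two addressing cases mentioned here, a Segment that is a whole resource and a Segment that occupies a byte range of a larger resource, can be illustrated with a small helper that prepares the HTTP request. The function name `build_segment_request` and the example URL are hypothetical; only the `Range: bytes=first-last` header form (inclusive offsets) is standard HTTP.

```python
from typing import Dict, Optional, Tuple


def build_segment_request(
    url: str, byte_range: Optional[Tuple[int, int]] = None
) -> Tuple[str, Dict[str, str]]:
    """Return the URL and headers for fetching one Segment.

    byte_range is an inclusive (first_byte, last_byte) pair, or None when
    the Segment is the entire resource at the URL.
    """
    headers: Dict[str, str] = {}
    if byte_range is not None:
        first, last = byte_range
        if first < 0 or last < first:
            raise ValueError("invalid byte range")
        # HTTP byte ranges are inclusive on both ends.
        headers["Range"] = f"bytes={first}-{last}"
    return url, headers
```

For example, `build_segment_request("http://example.com/rep1.mp4", (1000, 1999))` yields a request for the 1000 bytes starting at offset 1000 of the larger resource.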
DASH defines different timelines. One of the key features in DASH is that encoded versions of different media
content components share a common timeline. The presentation time of each access unit within the media
content is mapped to the global common presentation timeline for synchronization of different media
components and to enable seamless switching of different coded versions of the same media components.
This timeline is referred to as the Media Presentation timeline. The Media Segments themselves contain accurate
Media Presentation timing information enabling synchronization of components and seamless switching.
A second timeline is used to signal to clients the availability time of Segments at the specified HTTP-URLs.
These times are referred to as Segment availability times and are provided in wall-clock time. Clients
typically compare the wall-clock time to Segment availability times before accessing the Segments at the
specified HTTP-URLs in order to avoid erroneous HTTP request responses. For static Media Presentations,
the availability times of all Segments are identical. For dynamic Media Presentations, the availability times of
Segments depend on the position of the Segment in the Media Presentation timeline, i.e. the Segments become
available over time. Whereas static Media Presentations are suitable for offering On-Demand content, dynamic
Media Presentations are mostly suitable for offering live services.
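
The difference between static and dynamic availability can be sketched as a wall-clock check performed before a request is issued. The sketch makes simplifying assumptions that are not taken from the text above: Segments have a constant duration, are indexed from 0, and in the dynamic case a Segment becomes available once its last media sample has been produced, i.e. at the availability start plus the end time of the Segment. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def segment_availability_start(
    availability_start: datetime,
    index: int,
    segment_duration_s: float,
    dynamic: bool,
) -> datetime:
    """Wall-clock time from which the Segment with the given index is available."""
    if not dynamic:
        # Static presentation: every Segment shares the same availability time.
        return availability_start
    # Dynamic presentation (assumed rule): available once fully produced.
    return availability_start + timedelta(seconds=(index + 1) * segment_duration_s)


def is_available(
    now: datetime,
    availability_start: datetime,
    index: int,
    segment_duration_s: float,
    dynamic: bool,
) -> bool:
    """Compare the current wall-clock time to the Segment availability time."""
    start = segment_availability_start(
        availability_start, index, segment_duration_s, dynamic
    )
    return now >= start
```

A client would skip or delay any request for which `is_available` returns false, which avoids the erroneous HTTP responses mentioned above.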
Segments are assigned a duration, which is the duration of the media contained in the Segment when
presented at normal speed. Typically, all Segments in a Representation have the same or roughly similar
duration. However, Segment duration may differ from Representation to Representation. A DASH presentation
can be constructed with relatively short Segments (for example a few seconds), or longer Segments, including a
single Segment for the whole Representation.
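
As a rough illustration of how a nominal Segment duration divides a Representation, the following sketch computes Segment boundaries in milliseconds. It assumes a constant nominal duration with a possibly shorter final Segment; the function name and the millisecond unit are choices made for the example, not part of the specification.

```python
from typing import List, Tuple


def segment_boundaries(total_ms: int, segment_ms: int) -> List[Tuple[int, int]]:
    """Split a Representation of total_ms into (start, end) Segment intervals.

    All Segments have the nominal duration except possibly the last one,
    which is truncated to the end of the Representation.
    """
    if total_ms <= 0 or segment_ms <= 0:
        raise ValueError("durations must be positive")
    count = -(-total_ms // segment_ms)  # ceiling division on integers
    return [
        (i * segment_ms, min((i + 1) * segment_ms, total_ms))
        for i in range(count)
    ]
```

Setting `segment_ms` equal to `total_ms` gives the single-Segment case, while a value of a few thousand milliseconds gives the short-Segment layout typical of live content.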
Short Segments are usually required in the case of live content, where there are restrictions on end-to-end
latency. The duration of a Segment is typically a lower bound on the end-to-end latency. DASH does not