DirectX Video Acceleration: WMV 8, WMV 9, and VC-1 Decoding Specification

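DirectX Video Acceleration (DXVA) for Windows Media Video is Microsoft's hardware acceleration solution for Windows Media Video (WMV) 8, WMV 9, and SMPTE 421M (VC-1), implemented as a set of extensions to DXVA. It is intended to make video decoding more efficient, reduce the load on the CPU, and improve playback quality, particularly for high-definition content. The document was published by Microsoft in December 2007 and last updated in August 2010, and it applies to applications and development work that use DirectX Video Acceleration.

DXVA is part of Microsoft's DirectX technology. It lets the graphics processing unit (GPU) take over part of the video decoding work, which reduces the load on the central processing unit (CPU) and benefits multimedia applications such as video players. The formats covered here include the SMPTE VC-1 standard, which is used on HD DVD and by online streaming services.

The document specifies the DXVA conventions and extensions for decoding WMV 8, WMV 9, and VC-1 video. Microsoft notes that, because it must respond to changing market conditions, the information should not be interpreted as a commitment on its part, and that it cannot guarantee the accuracy of the information after the date of publication. Microsoft makes no warranties, express or implied, about the information in the document. Users must comply with all applicable copyright laws, and no part of the document may be reproduced, stored, or transmitted without permission.

In practice, developers use the DXVA interfaces and drivers to get efficient, smooth playback on hardware that supports DXVA. Doing so requires a good understanding of the GPU's capabilities and correct handling of the decoding process for each video format. For users, the result is smoother playback with lower system resource use, especially on older or less capable computers.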

DirectX Video Acceleration for Windows Media Video Decoding 16

3.2.4 Inverse-Scan Method

The bPicScanFixed and bPicScanMethod members of the DXVA_PictureParameters structure are used as follows:

• If bConfigBitStreamRaw is 0, indicating host-based bitstream parsing, bPicScanFixed is not used for WMV 8 or WMV 9. The value is always set to 1, and accelerators shall ignore the value. If bConfigBitStreamRaw is 1, indicating off-host raw bitstream parsing, bPicScanFixed is used as specified in section 3.2.20.5 of this specification.

• If bConfigBitStreamRaw is 0, bPicScanMethod is not used for WMV 8 or WMV 9. It is always set to a fixed value, as described in section 4.0. Accelerators shall ignore the value. If bConfigBitStreamRaw is 1, bPicScanMethod is used as specified in section 3.2.20.5.

3.2.5 Flags Conveyed in bBidirectionalAveragingMode

The bBidirectionalAveragingMode member of the DXVA_PictureParameters structure contains five flags for WMV 8 or WMV 9 decoding, defined as follows:

• iWMV9 = (bBidirectionalAveragingMode >> 7) & 1
• i9IRU = (bBidirectionalAveragingMode >> 6) & 1
• iOHIT = (bBidirectionalAveragingMode >> 5) & 1
• iINSO = (bBidirectionalAveragingMode >> 4) & 1
• iWMVA = (bBidirectionalAveragingMode >> 3) & 1

The other bits in bBidirectionalAveragingMode shall equal 0.

The uses of iWMV9 and iWMVA are described in various places in this specification. Essentially, iWMV9 equal to 1 indicates WMV 9 processing, as opposed to WMV 8 processing, and iWMVA equal to 1 indicates WMV 9 Advanced profile, as opposed to WMV 9 Simple or Main profile.

The accelerator should not need the value of the i9IRU flag, because the flag is 0 for WMV 8 (when iWMV9 = 0), while for WMV 9 (iWMV9 = 1) this flag equals the value of bConfigIntraResidUnsigned in the configuration parameters.

The accelerator should not need the value of the iOHIT flag, because its value equals the value of bConfigResidDiffAccelerator in the configuration parameters structure. Note that bConfigResidDiffHost and bConfigResidDiffAccelerator cannot both equal 1 for WMV 8 or WMV 9 decoding.

The iINSO flag is used to invoke the WMV 9 intensity scaling and offset functionality, described in section 3.2.16 of this specification.
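The flag layout above can be unpacked as in the following minimal sketch. The type `WmvPictureFlags` and the function `wmv_decode_flags` are hypothetical names introduced only for illustration, and the field is assumed to be an 8-bit value passed as `uint8_t`. The function also reports whether the remaining three low bits are zero, as this section requires.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative container for the five flags; not a DXVA type. */
typedef struct {
    bool wmv9; /* iWMV9: WMV 9 processing rather than WMV 8 */
    bool iru;  /* i9IRU: mirrors bConfigIntraResidUnsigned for WMV 9 */
    bool ohit; /* iOHIT: mirrors bConfigResidDiffAccelerator */
    bool inso; /* iINSO: intensity scaling and offset in use */
    bool wmva; /* iWMVA: WMV 9 Advanced profile */
} WmvPictureFlags;

/* Unpacks bits 7..3 of bBidirectionalAveragingMode into *out.
 * Returns false when any of bits 2..0 is set, since the
 * specification says the other bits shall equal 0. */
static bool wmv_decode_flags(uint8_t mode, WmvPictureFlags *out)
{
    out->wmv9 = (mode >> 7) & 1;
    out->iru  = (mode >> 6) & 1;
    out->ohit = (mode >> 5) & 1;
    out->inso = (mode >> 4) & 1;
    out->wmva = (mode >> 3) & 1;
    return (mode & 0x07) == 0;
}
```

For example, the value 0x98 (binary 1001 1000) decodes to iWMV9 = 1, iINSO = 1, and iWMVA = 1, with i9IRU and iOHIT both 0.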

3.2.6 Picture Width and Height

The width and height of the picture are specified in the wPicWidthInMBminus1 and wPicHeightInMBminus1 members of the DXVA_PictureParameters structure. Two variables, FrameWidthInLumaSamples and FrameHeightInLumaSamples, are computed from these values as follows:

• If iWMVA equals 1:

    ◦ FrameWidthInLumaSamples = wPicWidthInMBminus1 + 1.
    ◦ FrameHeightInLumaSamples = wPicHeightInMBminus1 + 1.

    An intermediate value HeightDivisor is derived as follows:

    ◦ If bPicStructure is 11b (frame), HeightDivisor = 1.
    ◦ Otherwise, HeightDivisor = 2.

    These values are interpreted as follows:

    ◦ wPicWidthInMBminus1 + 1 gives the width of the cropped luma array for the picture, in units of luma samples.
    ◦ FrameWidthInLumaSamples gives the width of the cropped luma array for the frame, in units of luma samples.
    ◦ (wPicHeightInMBminus1 + 1) / HeightDivisor gives the height of the cropped luma array for the picture, in units of luma samples.
    ◦ FrameHeightInLumaSamples gives the height of the cropped luma array for the frame, in units of luma samples.

    The value of FrameWidthInLumaSamples shall be an integer multiple of 2.

    For video coded as progressive scan (that is, when bPicExtrapolation in the picture parameters data structure is not 2), the value of FrameHeightInLumaSamples shall be an integer multiple of 2.

    For video coded as interlaced scan (that is, when bPicExtrapolation is 2, regardless of whether it is coded as field-structured pictures or frame-structured pictures), the value of FrameHeightInLumaSamples shall be an integer multiple of 4.

• If iWMVA equals 0:

    ◦ FrameWidthInLumaSamples = (wPicWidthInMBminus1 + 1) * 16.
    ◦ FrameHeightInLumaSamples = (wPicHeightInMBminus1 + 1) * 16.

    These values are interpreted as follows:

    ◦ wPicWidthInMBminus1 + 1 gives the width of the cropped luma array for the picture, in units of macroblocks.
    ◦ FrameWidthInLumaSamples gives the width of the cropped luma array for the frame, in units of luma samples.
    ◦ wPicHeightInMBminus1 + 1 gives the height of the cropped luma array for the picture, in units of macroblocks.
    ◦ FrameHeightInLumaSamples gives the height of the cropped luma array for the frame, in units of luma samples.
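The two cases above can be sketched as a single helper. The names `FrameSize` and `wmv_frame_size` are hypothetical, and the 16-bit parameter types are an assumption about how the two members are stored. For the Advanced profile the helper also checks the divisibility constraints stated in this section and returns false when they are violated.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative result type; not a DXVA structure. */
typedef struct {
    uint32_t frame_width;  /* FrameWidthInLumaSamples */
    uint32_t frame_height; /* FrameHeightInLumaSamples */
} FrameSize;

/* Derives the frame size in luma samples from the picture parameters.
 * Returns false if an Advanced profile size breaks the constraints:
 * width a multiple of 2, height a multiple of 2 (progressive) or of 4
 * (interlaced, signalled by bPicExtrapolation == 2). */
static bool wmv_frame_size(uint16_t wPicWidthInMBminus1,
                           uint16_t wPicHeightInMBminus1,
                           bool iWMVA, uint8_t bPicExtrapolation,
                           FrameSize *out)
{
    uint32_t w = (uint32_t)wPicWidthInMBminus1 + 1;
    uint32_t h = (uint32_t)wPicHeightInMBminus1 + 1;

    if (!iWMVA) {
        /* Simple/Main profile and WMV 8: values are in macroblocks. */
        out->frame_width = w * 16;
        out->frame_height = h * 16;
        return true;
    }

    /* Advanced profile: values are already in luma samples. */
    out->frame_width = w;
    out->frame_height = h;
    uint32_t height_multiple = (bPicExtrapolation == 2) ? 4 : 2;
    return (w % 2 == 0) && (h % height_multiple == 0);
}
```

With iWMVA equal to 0, the pair (19, 14) yields a 320 x 240 frame; with iWMVA equal to 1, the pair (1919, 1079) yields 1920 x 1080, which satisfies both the progressive and the interlaced height constraint.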

Note: When decoding video that is coded using WMV 9 Simple or Main profile, the values of FrameWidthInLumaSamples and FrameHeightInLumaSamples must always be integer multiples of 16. This is not the case for video that is coded using WMV 9 Advanced profile. In that profile, the smaller size of the cropping rectangle becomes part of the decoding process. Thus, while the VC1_A, VC1_B, VC1_C, or VC1_D restricted profile is needed for decoding Advanced profile bitstreams, the other (older) restricted profiles are sufficient to decode WMV 9 Simple or Main profile.

Simple and Main profiles can still be used to decode pictures that are not an integer multiple of 16 in width or height. However, that particular aspect of the picture size information is not required for the basic decoding process.

Regardless of the profile in use, the media type passed in the decoder's IPin::Connect method must specify an integer multiple of 16 for the width and height in the bmiHeader member of the VIDEOINFOHEADER or VIDEOINFOHEADER2 format structure. (This structure is given by the pbFormat member of the AM_MEDIA_TYPE structure in the IPin::Connect method.) That data structure specifies the dimensions of the destination surface (in DXVA 1) and also specifies the contents of the DD_CREATEMOCOMPDATA structure that is passed to the driver in the DdMoCompCreate function.

For the same reasons, when decoding an interlaced sequence coded using the Advanced profile (that is, when the INTERLACE syntax element defined in subclause 6.1.9 of the VC-1 specification equals 1), the height given in bmiHeader must be an integer multiple of 32. In this case, multiples of 32 are required because an interlaced frame can be encoded as a pair of field pictures, and each field picture must be an integer multiple of 16 in height.
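One way a decoder might arrive at dimensions that satisfy these rules is to round the frame size up to the required multiple. This is only a sketch: the specification states the constraint on bmiHeader but does not prescribe how the value is chosen, so the rounding-up policy is an assumption, and `SurfaceSize`, `align_up`, and `wmv_surface_size` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative result type; not a DirectShow or DXVA structure. */
typedef struct {
    uint32_t width;
    uint32_t height;
} SurfaceSize;

/* Rounds v up to the next multiple of a (a must be nonzero). */
static uint32_t align_up(uint32_t v, uint32_t a)
{
    return (v + a - 1) / a * a;
}

/* Assumed policy: width is rounded up to a multiple of 16; height is
 * rounded up to a multiple of 32 for an interlaced Advanced profile
 * sequence (INTERLACE == 1) and to a multiple of 16 otherwise. */
static SurfaceSize wmv_surface_size(uint32_t frame_width,
                                    uint32_t frame_height,
                                    bool advanced_interlaced)
{
    SurfaceSize s;
    s.width = align_up(frame_width, 16);
    s.height = align_up(frame_height, advanced_interlaced ? 32 : 16);
    return s;
}
```

Under this policy a 1920 x 1072 frame keeps a height of 1072 when progressive but needs 1088 when interlaced, because each of the two fields must span a whole number of macroblock rows.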

However, when decoding a Simple or Main profile bitstream, the software decoder can specify smaller dimensions in the rcSource rectangle of the VIDEOINFOHEADER or VIDEOINFOHEADER2 format structure. Setting smaller dimensions in rcSource enables cropping of the decoded picture to a smaller size. The same method was used for H.263: the decoding process operates on a picture that spans an integer number of macroblocks, and a cropping rectangle is used outside of the decoding process to trim the output picture. (It is somewhat more complicated in MPEG-4 part 2 decoding.)

In addition, the following variables shall be computed:

• FrameWidthInMBs = (FrameWidthInLumaSamples + 15) / 16
• FrameHeightInMBs = (FrameHeightInLumaSamples + 15) / 16
• If bPicStructure is 11b (frame), PicHeightInMBs = FrameHeightInMBs; otherwise, PicHeightInMBs = (FrameHeightInLumaSamples + 31) / 32.

Note: For the WMV 9 Advanced profile (when iWMVA is 1), the coded video may contain a mixture of progressive frames, interlaced frames, and interlaced fields. If FrameHeightInMBs is an odd number, the total number of macroblocks in a pair of coded fields will not equal the number of macroblocks in a coded frame.
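The macroblock counts above use integer division that rounds up, which the following sketch makes explicit. `MbCounts` and `wmv_mb_counts` are hypothetical names, and bPicStructure is assumed to be passed as its 2-bit value, so 11b is 3.

```c
#include <stdint.h>

/* Illustrative result type; not a DXVA structure. */
typedef struct {
    uint32_t frame_width_in_mbs;  /* FrameWidthInMBs */
    uint32_t frame_height_in_mbs; /* FrameHeightInMBs */
    uint32_t pic_height_in_mbs;   /* PicHeightInMBs */
} MbCounts;

/* Computes the macroblock counts for a frame of the given size in luma
 * samples. For a field picture (bPicStructure != 3) the picture holds
 * every other row, so its height is rounded up in units of 32 rows. */
static MbCounts wmv_mb_counts(uint32_t frame_width, uint32_t frame_height,
                              uint8_t bPicStructure)
{
    MbCounts c;
    c.frame_width_in_mbs = (frame_width + 15) / 16;
    c.frame_height_in_mbs = (frame_height + 15) / 16;
    c.pic_height_in_mbs = (bPicStructure == 3)
                              ? c.frame_height_in_mbs
                              : (frame_height + 31) / 32;
    return c;
}
```

A frame height of 1072 illustrates the note: FrameHeightInMBs is 67, but each field has PicHeightInMBs equal to 34, so a field pair covers 68 macroblock rows rather than 67.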

3.2.7 Lack of Backward Prediction in WMV 8

In WMV 8, backward prediction is not used, and the bPicBackwardPrediction member of the DXVA_PictureParameters structure is always 0.

3.2.8 Backward Prediction in WMV 9

In WMV 9, backward prediction can be used, and the bPicBackwardPrediction member of the DXVA_PictureParameters structure may equal 1.

3.2.9 Motion Compensation Padding

A process for padding the boundaries of the luma and chroma arrays of reference pictures for WMV 8 and WMV 9 is specified as follows. If a decoded reference picture is later changed as a result of reference-picture modification, the padding process must be repeated using the modified values. For WMV 9 Advanced profile, a pair of reference fields is treated as a frame for the padding process.

1. For padding luma arrays, set M = 16, FW = FrameWidthInLumaSamples, and FH = FrameHeightInLumaSamples. For padding chroma arrays, set M = 8, FW = (FrameWidthInLumaSamples + 1) / 2, and FH = (FrameHeightInLumaSamples + 1) / 2.

2. Horizontal padding is applied as follows for vertical positions i = 0 to FH − 1.

    • For j = 1 to 2 * M, samples at virtual positions (x = −j, y = i) are created by setting each of these samples to the value of the sample at position (x = 0, y = i).

    • When FW % M is not zero, samples at virtual positions (x = FW + j, y = i) for j = 0 to M − 1 − (FW % M) are created by setting each of these samples to the value of the sample at position (x = FW − 1, y = i).

    • For j = 0 to 2 * M − 1, samples at virtual positions (x = FrameWidthInMBs * M + j, y = i) are created by setting each of these samples to the value of the sample at position (x = FrameWidthInMBs * M − 1, y = i).

3. If the reference frame was coded with bPicExtrapolation equal to 1 (progressive-scan extrapolation), the following applies for j = −2 * M to (FrameWidthInMBs + 2) * M − 1.

    Note: This case can occur only when the reference frame was coded with bPicStructure equal to 11b (frame).

    • For i = 1 to 4 * M, samples at virtual positions (x = j, y = −i) are created by setting each of these samples to the value of the sample at position (x = j, y = 0).

    • When FH % M is not zero, samples at virtual positions (x = j, y = FH + i) for i = 0 to M − 1 − (FH % M) are created by setting each of these samples to the value of the sample at position (x = j, y = FH − 1).

    • For i = 0 to 4 * M − 1, samples at virtual positions (x = j, y = FrameHeightInMBs * M + i) are created by setting each of these samples to the value of the sample at position (x = j, y = FrameHeightInMBs * M − 1).

4. Otherwise, if bPicExtrapolation equals 2, the following applies for j = −2 * M to (FrameWidthInMBs + 2) * M − 1.

    Note: This case can occur only with WMV 9 Advanced profile. It can occur when the reference frame was coded with bPicStructure equal to 11b (frame), or when the reference frame was coded as two pictures with bPicStructure equal to 01b or 10b.

    • For i = 1 to 2 * M, samples at virtual positions (x = j, y = −2 * i) are created by setting each of these samples to the value of the sample at position (x = j, y = 0).

    • For i = 1 to 2 * M, samples at virtual positions (x = j, y = −2 * i + 1) are created by setting each of these samples to the value of the sample at position (x = j, y = 1).

    • When FH % M is not zero, samples at virtual positions (x = j, y = FH + 2 * i) for i = 0 to (M / 2) − 1 − ((FH % M) / 2) are created by setting each of these samples to the value of the sample at position (x = j, y = FH − 2).

    • When FH % M is not zero, samples at virtual positions (x = j, y = FH + 2 * i + 1) for i = 0 to (M / 2) − 1 − ((FH % M) / 2) are created by setting each of these samples to the value of the sample at position (x = j, y = FH − 1).

    • For i = 0 to 2 * M − 1, samples at virtual positions (x = j, y = FrameHeightInMBs * M + 2 * i) are created by setting each of these samples to the value of the sample at position (x = j, y = FrameHeightInMBs * M − 2).

    • For i = 0 to 2 * M − 1, samples at virtual positions (x = j, y = FrameHeightInMBs * M + 2 * i + 1) are created by setting each of these samples to the value of the sample at position (x = j, y = FrameHeightInMBs * M − 1).
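The padding steps can be sketched in software as follows. This is an illustration of the resulting sample values under stated assumptions, not a description of how an accelerator implements them: the `Plane` type, its buffer layout (each plane allocated with 2 * M spare columns on each side and 4 * M spare rows above and below), and all function names are hypothetical. The sketch also folds steps 3 and 4 into one loop by using a row step of 1 for progressive and 2 for interlaced extrapolation, which is a restructuring chosen here for brevity.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* One sample plane of a reference frame, stored with room for padding.
 * Illustrative type; not part of DXVA. */
typedef struct {
    uint8_t *buf;  /* (mbh + 8) * m rows of (mbw + 4) * m samples */
    int m;         /* M: 16 for luma, 8 for chroma */
    int fw, fh;    /* FW, FH: cropped plane size in samples */
    int mbw, mbh;  /* FrameWidthInMBs, FrameHeightInMBs */
} Plane;

static int plane_stride(const Plane *p) { return (p->mbw + 4) * p->m; }

/* Address of sample (x, y), valid for x in [-2M, mbw*M + 2M) and
 * y in [-4M, mbh*M + 4M). */
static uint8_t *at(const Plane *p, int x, int y)
{
    return p->buf + (size_t)(y + 4 * p->m) * (size_t)plane_stride(p)
                  + (x + 2 * p->m);
}

/* Allocates a zeroed plane; returns 0 on allocation failure. */
static int plane_alloc(Plane *p, int m, int fw, int fh, int mbw, int mbh)
{
    p->m = m; p->fw = fw; p->fh = fh; p->mbw = mbw; p->mbh = mbh;
    p->buf = calloc((size_t)(mbh + 8) * (size_t)m, (size_t)plane_stride(p));
    return p->buf != NULL;
}

/* Copies one full row, including its horizontal padding. */
static void copy_row(Plane *p, int dst, int src)
{
    memcpy(at(p, -2 * p->m, dst), at(p, -2 * p->m, src),
           (size_t)plane_stride(p));
}

/* Pads the plane; interlaced != 0 corresponds to bPicExtrapolation == 2. */
static void pad_plane(Plane *p, int interlaced)
{
    const int M = p->m, FW = p->fw, FH = p->fh;
    const int WA = p->mbw * M, HA = p->mbh * M; /* macroblock-aligned size */
    const int s = interlaced ? 2 : 1;           /* step 2 keeps field parity */

    /* Step 2: horizontal padding of the rows that hold decoded samples. */
    for (int i = 0; i < FH; i++) {
        memset(at(p, -2 * M, i), *at(p, 0, i), (size_t)(2 * M));
        if (FW % M != 0)
            memset(at(p, FW, i), *at(p, FW - 1, i), (size_t)(M - FW % M));
        memset(at(p, WA, i), *at(p, WA - 1, i), (size_t)(2 * M));
    }

    /* Steps 3 and 4: vertical padding, once per row parity k. */
    for (int k = 0; k < s; k++) {
        for (int i = 1; i <= 4 * M / s; i++)      /* rows above the frame */
            copy_row(p, -s * i + k, k);
        if (FH % M != 0)                          /* up to the MB boundary */
            for (int i = 0; i <= M / s - 1 - (FH % M) / s; i++)
                copy_row(p, FH + s * i + k, FH - s + k);
        for (int i = 0; i < 4 * M / s; i++)       /* rows below the frame */
            copy_row(p, HA + s * i + k, HA - s + k);
    }
}
```

Because whole rows are copied after the horizontal pass, the corner regions receive the value of the nearest corner sample of the same row parity, which matches applying steps 3 and 4 over the full range of j.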

Note: This padding process may not be necessary in accelerators that can operate with memory address clipping of the reference picture texture surface. The padding process is defined here to provide a clear description of the necessary values in the decoded picture, not as a prescription of how to obtain the results.

For MPEG-4 part 2, motion vector (MV) range clipping by the accelerator is always necessary when the padding method is used for extrapolation in the accelerator, because MPEG-4 part 2 has no predefined limit to MV range. By comparison, the WMV 8 encoder limits the MV range such that 32 samples of luma padding are sufficient for the decoding process. Therefore, if the accelerator uses 32 samples of luma padding and a corresponding 16 samples of chroma padding, it should not strictly be necessary to clip motion vector values. However, no such encoder limitation is specified for WMV 9, so in this case the accelerator will always need to use MV range clipping when the padding method is used for extrapolation.

Furthermore, when WMV 9 Simple or Main profile is used (that is, when iWMV9 is 1 and iWMVA is 0), MV range clipping will be necessary even when the accelerator can operate with memory address clipping. The reason is that these profiles include special clipping behavior. (See section 3.2.14.4.)

Note: The method described here pads reference frames by at least 32 luma samples and 16 chroma samples horizontally, and at least 64 luma samples and 32 chroma samples vertically. In fact, an accelerator can obtain equivalent results for progressive-scan video sequences by padding reference frames by only 17 or 18 luma samples (8 or 9 chroma samples), and for interlaced-scan sequences by padding reference frames by 34 or 36 luma samples (16 or 18 chroma samples). In this case, the choice between the alternate pairs of numbers (17 or 18; and 34 or 36) depends on whether fractional remainders are set to zero when integer offsets are clipped. However, the larger amount of padding is given for convenience of specification, and accelerators are responsible for ensuring that their results are functionally equivalent to the specification.

Note: Interlaced video sequences may contain individual progressive-scan pictures, and those progressive-scan pictures may be used as references for decoding field pictures or field-mode macroblocks of interlaced-scan pictures. For this reason, the padding method specified here pads progressive frames by the same amount that it pads interlaced frames.

Note: The extrapolation padding of the reference frame is performed based on the values of bPicStructure and bPicExtrapolation for the frame being referenced and not based on these parameters for the picture being decoded.
