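Multi-channel Object-Based Spatial Parameter Compression Approach for 3D Audio

This research paper proposes a multi-channel object-based spatial parameter compression approach for 3D audio. It aims to improve the spatial precision of 3D audio while lowering the bit rate of the spatial parameters. By combining spatial direction filtering with clustering of spatial side information, the authors develop a new multi-channel object-based spatial parameter compression algorithm (MOSPCA). The algorithm compresses the spatial parameters of a single sound source, taken from the different frequency bands within a frame, into one common representation, which yields efficient data compression.

In recent years 3D audio has been widely adopted in entertainment, gaming and virtual reality. Its key feature is highly realistic sound field reproduction, which lets the listener perceive the direction, distance and spaciousness of sound. Achieving this spatial precision requires a large amount of data to describe the position and motion of every sound object, so the data volume grows considerably and becomes a challenge for transmission and storage. The paper's contribution is a new method, MOSPCA (Multi-channel Object-Based Spatial Parameters Compression Approach), that addresses this problem. Its core idea is to combine spatial parameter compression with an object-oriented approach: the spatial characteristics of each sound source are analyzed and processed across frequency bands so that the spatial information can be compressed effectively.

First, the paper uses spatial direction filtering. This technique identifies and extracts the dominant propagation directions of the sound signal, so that less information has to be coded for non-dominant directions and redundancy is reduced. As a result, less data is needed to represent the direction of each sound object in the 3D audio scene.

Second, the paper introduces a clustering strategy for spatial side information. The strategy merges the spatial parameters that belong to the same sound source in different frequency bands of a frame, unifying similar information through clustering to compress the data further. This removes the need to code each frequency band separately and lowers the total bit rate while preserving localization accuracy.

The advantage of MOSPCA is that it keeps 3D audio quality high while substantially reducing transmission and storage requirements. This matters for real-time communication, streaming services and other applications that need efficient data processing. MOSPCA is also not tied to a particular audio format or system. It can be applied in a variety of multi-channel 3D audio environments.

"Multi-channel Object-Based Spatial Parameter Compression Approach for 3D Audio" shows that the coding of 3D audio spatial parameters can be optimized with a new compression algorithm without sacrificing sound quality, and it offers new ideas and tools for the future development of 3D audio. The work has theoretical and practical significance for advancing 3D audio, especially where bandwidth and storage are limited.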

sound pressure. Both of these values can be derived from a B-format microphone signal (W, X, Y, Z). A B-format microphone has four channels: one omnidirectional microphone and three figure-of-eight microphones arranged orthogonally. The omnidirectional microphone signal is denoted W, and the three figure-of-eight microphone signals are denoted X, Y and Z.

In DirAC, a spatial microphone recording signal is analyzed in the frequency domain to derive the sound field information, including both localization (azimuth and elevation) and diffuseness information. This information is the side information for a downmix audio channel. DirAC estimates the azimuth θ and elevation φ of a sound source by analyzing the intensity relation between the axes of the 3D Cartesian coordinate system. This estimation is carried out based on the signal's time-frequency representation.

$$\theta(n, f) = \arctan\left(\frac{I_y(n, f)}{I_x(n, f)}\right) \tag{1}$$

$$\varphi(n, f) = \arctan\left(\frac{I_z(n, f)}{\sqrt{I_x^2(n, f) + I_y^2(n, f)}}\right) \tag{2}$$

where n and f are the time and frequency indices, respectively. The intensity along each axis is denoted by I_x(n, f), I_y(n, f) and I_z(n, f). The diffuseness information is denoted by ψ(n, f).
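
As a minimal sketch of Eqs. (1) and (2), the function below estimates the direction for a single time-frequency bin from complex B-format STFT coefficients. The function name `dirac_direction`, the intensity formula (real part of the conjugated W coefficient times each figure-of-eight coefficient) and the sign convention are assumptions made for illustration. The excerpt does not give them, and scale factors depend on the B-format normalization. `atan2` replaces the plain arctangent of a ratio so that the quadrant is resolved and a zero denominator is handled.

```python
import math

def dirac_direction(w, x, y, z):
    """Estimate azimuth and elevation (degrees) for one time-frequency bin.

    w, x, y, z are complex STFT coefficients of the B-format channels.
    The intensity formula is an assumed convention, not taken from the paper.
    """
    # Per-axis intensity components I_x, I_y, I_z (up to a constant factor).
    ix = (w.conjugate() * x).real
    iy = (w.conjugate() * y).real
    iz = (w.conjugate() * z).real
    # Eq. (1): azimuth from the ratio I_y / I_x, quadrant-aware.
    azimuth = math.degrees(math.atan2(iy, ix))
    # Eq. (2): elevation from I_z over the horizontal intensity magnitude.
    elevation = math.degrees(math.atan2(iz, math.hypot(ix, iy)))
    return azimuth, elevation
```

In practice the function would be applied to every (n, f) bin of the STFT, typically after some temporal averaging of the intensity components.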

During synthesis, the directional sound source in the original sound field is rendered by panning the sound source to the location specified by the DirAC directional cues, while the diffuseness cues are used to reproduce a surround image with no perceptual localization features.

2.2 3D Audio Spatial Localization Quantization Method

Different from SLQP [3], the paper's authors propose a new compression method in [5], which extracts the distance and direction information of sound sources as side information to enhance the spatial quality of multichannel 3D audio. A 3D audio spatial localization quantization method is designed to describe the azimuth, elevation and distance parameters, as shown in Fig. 2.

An azimuth precision of 2° (3° for low precision) is used for the front area, and the azimuth precision is gradually decreased to 5° (7° for low precision) in the rear area. A 5° elevation resolution (10° for low precision) is used, and the distance r of the sound source is quantized to the radius of one of a set of spheres (10 cm, 20 cm, 20 cm, 40 cm, 50 cm, 75 cm, 100 cm, 130 cm, 160 cm, 320 cm).
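
The elevation and distance quantization described above can be sketched as follows. This is only an illustration under stated assumptions: the helper names, the uniform elevation grid and the nearest-radius rule are hypothetical, and the radii are copied as printed in the text. Azimuth is left out because the excerpt does not say how the precision tapers from the front to the rear.

```python
# Sphere radii in cm, copied as printed in the text (the value 20 appears twice).
RADII_CM = [10, 20, 20, 40, 50, 75, 100, 130, 160, 320]

def quantize_elevation(elevation_deg, step_deg=5.0):
    """Snap an elevation angle to the nearest multiple of step_deg (assumed uniform grid)."""
    return step_deg * round(elevation_deg / step_deg)

def quantize_distance(distance_cm, radii=RADII_CM):
    """Snap a distance to the nearest sphere radius (assumed nearest-neighbor rule)."""
    return min(radii, key=lambda r: abs(r - distance_cm))
```

Passing `step_deg=10.0` to `quantize_elevation` gives the low precision variant.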

The resulting numbers of spatial quantization points are 16740 and 6330 for the high and low precision designs, respectively. Hence, about 14 kbps/object and 12.6 kbps/object are required for the high precision and low precision designs, respectively.
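
A quick check of the index size behind these figures: one index into a set of N points needs log2(N) bits, which is about 14.0 bits for 16740 points and about 12.6 bits for 6330 points. The quoted rates are numerically equal to these values, which would be consistent with roughly 1000 indices per second per object. That update rate is an assumption, since the excerpt does not state it. The helper name `bits_per_index` is hypothetical.

```python
import math

def bits_per_index(num_points):
    """Bits needed to index one of num_points quantization points."""
    return math.log2(num_points)

high = bits_per_index(16740)  # high precision design
low = bits_per_index(6330)    # low precision design
print(f"high precision: {high:.2f} bits, low precision: {low:.2f} bits")
```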

356 C. Yang et al.

