TITLE PAGE PROVIDED BY ISO
CD 11172-3
CODING OF MOVING PICTURES AND ASSOCIATED AUDIO
FOR DIGITAL STORAGE MEDIA AT UP TO ABOUT 1.5 MBIT/s
Part 3 AUDIO
CONTENTS
FOREWORD
INTRODUCTION
1. GENERAL NORMATIVE ELEMENTS
1.1 Scope
1.2 Organization of the Document
1.3 Normative References
2. TECHNICAL NORMATIVE ELEMENTS
2.1 Definitions
2.2 Symbols and Abbreviations
2.3 Method of Describing Bitstream Syntax
2.4 Requirements
2.4.1 Coding Structure and Parameters
2.4.2 Specification of the Coded Audio Bitstream Syntax
2.4.3 Semantics for the Audio Bitstream Syntax
2.4.4 The Audio Decoding Process
2.4.5 Compliance Requirements
3-Annex A (normative) Diagrams
3-Annex B (normative) Tables
3-Annex C (informative) The Encoding Process
3-Annex D (informative) Psychoacoustic Models
3-Annex E (informative) Bit Sensitivity to Errors
3-Annex F (informative) Error Concealment
3-Annex G (informative) Joint Stereo Coding
FOREWORD
This standard is a committee draft that was submitted for approval to ISO/IEC JTC1/SC29 on 22 November 1991.
It was prepared by SC29/WG11, also known as MPEG (Moving Picture Experts Group). MPEG was formed in
1988 to establish a standard for the coded representation of moving pictures and associated audio stored on digital
storage media.
This standard is published in four parts. Part 1 - systems - specifies the system coding layer of the standard. It
defines a multiplexed structure for combining audio and video data and means of representing the timing
information needed to replay synchronized sequences in real-time. Part 2 - video - specifies the coded
representation of video data and the decoding process required to reconstruct pictures. Part 3 - audio - specifies
the coded representation of audio data. Part 4 - conformance testing - is still in preparation. It will specify the
procedures for determining the characteristics of coded bit streams and for testing compliance with the
requirements stated in Parts 1, 2 and 3.
In Part 1 of this standard all annexes are informative and contain no normative requirements.
In Part 2 of this standard 2-Annex A, 2-Annex B and 2-Annex C contain normative requirements and are an
integral part of this standard. 2-Annex D and 2-Annex E are informative and contain no normative requirements.
In Part 3 of this standard 3-Annex A and 3-Annex B contain normative requirements and are an integral part of this
standard. All other annexes are informative and contain no normative requirements.
INTRODUCTION
To aid in the understanding of the specification of the stored compressed bitstream and its decoding, a sequence of
encoding, storage and decoding is described.
Encoding
The encoder processes the digital audio signal and produces the compressed bitstream for storage. The encoder
algorithm is not standardized, and may use various means for encoding such as estimation of the auditory masking
threshold, quantization, and scaling. However, the encoder output must be such that a decoder conforming to the
specifications of clause 2.4 will produce audio suitable for the intended application.
Figure I-1 Sketch of a basic encoder (blocks: mapping, psychoacoustic model, quantizer and coding, frame packing)
Input audio samples are fed into the encoder. The mapping creates a filtered and subsampled representation of the
input audio stream. The mapped samples may be called either subband samples (as in Layer I, see below) or
transformed subband samples (as in Layer III). A psychoacoustic model creates a set of data to control the
quantizer and coding. These data are different depending on the actual coder implementation. One possibility is to
use an estimation of the masking threshold to do this quantizer control. The quantizer and coding block creates a
set of coding symbols from the mapped input samples. Again, this block can depend on the encoding system. The
block 'frame packing' assembles the actual bitstream from the output data of the other blocks, and adds other
information (e.g. error correction) if necessary.
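As a rough illustration of the data flow in Figure I-1, the following Python sketch wires the four blocks together. The function names, the signatures, and the choice to pass the blocks as callables are assumptions made for this illustration only. The standard does not define an encoder algorithm, and the toy stand-ins below are not a real filterbank, psychoacoustic model, or bitstream format.

```python
from typing import Callable, List, Sequence


def encode_frame(
    pcm: Sequence[float],
    mapping: Callable[[Sequence[float]], List[float]],
    psychoacoustic_model: Callable[[Sequence[float]], List[float]],
    quantize_and_code: Callable[[List[float], List[float]], List[int]],
    pack_frame: Callable[[List[int]], bytes],
) -> bytes:
    """Pass one block of input samples through the four blocks of Figure I-1."""
    mapped = mapping(pcm)                         # filtered, subsampled representation
    control = psychoacoustic_model(pcm)           # data that controls the quantizer
    symbols = quantize_and_code(mapped, control)  # coding symbols
    return pack_frame(symbols)                    # assemble the bitstream


# Toy stand-ins (hypothetical, for demonstration only).
def toy_mapping(pcm):
    return list(pcm[::2])                         # crude subsampling, not a filterbank


def toy_model(pcm):
    peak = max((abs(x) for x in pcm), default=0.0)
    return [peak / 8 or 1.0]                      # a single step size for the block


def toy_quantizer(mapped, control):
    step = control[0]
    return [round(x / step) & 0xFF for x in mapped]


def toy_packer(symbols):
    return bytes(symbols)


frame = encode_frame([0.0, 0.5, -0.25, 1.0],
                     toy_mapping, toy_model, toy_quantizer, toy_packer)
```

The point of the sketch is only the ordering: the psychoacoustic model sees the input in parallel with the mapping, and its output steers the quantizer rather than entering the bitstream directly.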
Layers
Depending on the application, different layers of the coding system with increasing encoder complexity and
performance can be used. An ISO MPEG Audio Layer N decoder is able to decode bitstream data which has been
encoded in Layer N and all layers below N.
Layer I:
This layer contains the basic mapping of the digital audio input into 32 subbands, fixed segmentation to format the
data into blocks, a psychoacoustic model to determine the adaptive bit allocation, and quantization using block
companding and formatting.
Layer II:
This layer provides additional coding of bit allocation, scalefactors and samples. Different framing is used.
Layer III:
This layer introduces increased frequency resolution based on a hybrid filterbank. It adds a different (nonuniform)
quantizer, adaptive segmentation and entropy coding of the quantized values.
Joint Stereo coding can be added as an additional feature to any of the layers.
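The compatibility rule stated at the start of this subsection (a Layer N decoder decodes Layer N and all layers below N) can be written as a one-line check. The function name `can_decode` and the integer layer numbering 1 to 3 are assumptions of this sketch, not identifiers from the standard.

```python
def can_decode(decoder_layer: int, stream_layer: int) -> bool:
    """Return True if a Layer `decoder_layer` decoder can decode a Layer `stream_layer` bitstream."""
    for layer in (decoder_layer, stream_layer):
        if layer not in (1, 2, 3):
            raise ValueError(f"unknown layer: {layer}")
    return stream_layer <= decoder_layer


# Layers each decoder supports, derived from the rule above.
supported = {n: [m for m in (1, 2, 3) if can_decode(n, m)] for n in (1, 2, 3)}
```

Here `supported[3]` lists all three layers, while `supported[1]` contains Layer I only.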
Storage
Various streams of encoded video, encoded audio, synchronization data, systems data and auxiliary data may be
stored together on a storage medium. Editing of the audio will be easier if the edit point is constrained to coincide
with an addressable point.
Access to storage may involve remote access over a communication system. Access is assumed to be controlled by
a functional unit other than the audio decoder itself. This control unit accepts user commands, reads and interprets
data base structure information, reads the stored information from the media, demultiplexes non-audio information
and passes the stored audio bitstream to the audio decoder at the required rate.
Decoding
The decoder, subject to the application-dependent parameters of clause 2.4.1, accepts the compressed audio
bitstream in the syntax defined in clause 2.4.2, decodes the data elements according to clause 2.4.3, and uses the
information to produce digital audio output according to clause 2.4.4.
Figure I-2 Sketch of the basic structure of the decoder (blocks: frame unpacking, reconstruction, inverse mapping)
Bitstream data is fed into the decoder. The bitstream unpacking and decoding block performs error detection if an
error check was applied in the encoder (see clause 2.4.2.4). The bitstream data is unpacked to recover the various pieces
of information. The reconstruction block reconstructs the quantized version of the set of mapped samples. The
inverse mapping transforms these mapped samples back into uniform PCM.
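The optional error detection step can be pictured as recomputing a cyclic redundancy code over the protected bits and comparing it with the value carried in the frame. The sketch below is a generic bitwise CRC-16. The generator polynomial 0x8005 (x^16 + x^15 + x^2 + 1) and the initial value 0xFFFF are assumptions brought in from outside this excerpt, and which bits of a frame are protected is defined in the syntax clauses, not here. The function names are hypothetical.

```python
def crc16_update(crc: int, data: bytes, poly: int = 0x8005) -> int:
    """Bitwise, MSB-first CRC-16 over `data`, continuing from register state `crc`."""
    for byte in data:
        crc ^= byte << 8                  # feed the next byte into the top of the register
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ poly) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def check_frame(protected: bytes, stored_crc: int, init: int = 0xFFFF) -> bool:
    """Return True if the CRC recomputed over `protected` matches `stored_crc`."""
    return crc16_update(init, protected) == stored_crc


payload = bytes([0x12, 0x34, 0x56, 0x78])      # stand-in for the protected bits
stored = crc16_update(0xFFFF, payload)          # what an encoder would have written
corrupted = bytes([0x12, 0x34, 0x57, 0x78])     # same data with one bit flipped
print(check_frame(payload, stored), check_frame(corrupted, stored))
```

A decoder that sees a mismatch knows only that the protected bits are damaged. What it does next is an application matter (the informative annex on error concealment addresses this).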
1. GENERAL NORMATIVE ELEMENTS
1.1 Scope
This standard specifies the coded representation of high quality audio for storage media and the method for
decoding of high quality audio signals. The input of the encoder and the output of the decoder are compatible with
existing PCM standards such as standard Compact Disc and Digital Audio Tape.
This standard is intended for application to digital storage media providing a total continuous transfer rate of about
1.5 Mbit/s for both audio and video bitstreams, such as CD, DAT and magnetic hard disc. The storage media
may either be connected directly to the decoder, or via other means such as communication lines and the ISO
11172 multiplex stream defined in Part 1 of this standard. This standard is intended for sampling rates of 32 kHz,
44.1 kHz, and 48 kHz.
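To put the figures in the scope clause side by side, the short calculation below compares the bitrate of uncompressed PCM at the three sampling rates with the total transfer rate of about 1.5 Mbit/s. The sample width of 16 bits and the two channels are assumptions (Compact Disc style PCM); the clause itself does not fix them.

```python
SAMPLING_RATES_HZ = (32_000, 44_100, 48_000)  # rates named in clause 1.1
TOTAL_BUDGET_BPS = 1_500_000                  # "about 1.5 Mbit/s", audio and video together
BITS_PER_SAMPLE = 16                          # assumption: CD-style 16-bit PCM
CHANNELS = 2                                  # assumption: stereo


def pcm_bitrate(rate_hz: int, bits: int = BITS_PER_SAMPLE, channels: int = CHANNELS) -> int:
    """Bitrate of uncompressed PCM in bit/s."""
    return rate_hz * bits * channels


for rate in SAMPLING_RATES_HZ:
    bps = pcm_bitrate(rate)
    share = bps / TOTAL_BUDGET_BPS
    print(f"{rate / 1000:g} kHz: {bps} bit/s, {share:.0%} of the total budget")
```

Under these assumptions uncompressed stereo PCM alone would take between roughly two thirds and slightly more than all of the total rate, which is why the audio has to be compressed if video is to share the same medium.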
1.2 References
The following standards contain provisions which, through reference in this text, constitute provisions of this
International Standard. At the time of publication, the editions indicated were valid. All standards are subject to
revision, and parties to agreements based on this International Standard are encouraged to investigate the
possibility of applying the most recent editions of the standards indicated below. Members of IEC and ISO
maintain registers of currently valid International Standards.
Recommendations and reports of the CCIR, 1990
XVIIth Plenary Assembly, Düsseldorf, 1990
Volume XI - Part 1
Broadcasting Service (Television)
Rec. 601-1 "Encoding parameters of digital television for studios".
Volume X
Rec. 953 "Encoding parameters of digital audio".
IEEE Draft Standard "Specification for the implementation of 8x8 inverse discrete cosine transform".
P1180/D2, July 18, 1990
2. TECHNICAL NORMATIVE ELEMENTS
2.1 Definitions
For the purposes of this International Standard, the following definitions apply.
AC coefficient: Any DCT coefficient for which the frequency in one or both dimensions is non-zero.
access unit: In the case of compressed audio an access unit is an audio access unit. In the case of compressed video an access
unit is the coded representation of a picture.
Adaptive segmentation: A subdivision of the digital representation of an audio signal in variable segments of time.
adaptive bit allocation: The assignment of bits to subbands in a time and frequency varying fashion according to a
psychoacoustic model.
adaptive noise allocation: The assignment of coding noise to frequency bands in a time and frequency varying fashion
according to a psychoacoustic model.
Alias: Mirrored signal component resulting from sub-Nyquist sampling.
Analysis filterbank: Filterbank in the encoder that transforms a broadband PCM audio signal into a set of subsampled subband
samples.
Audio Access Unit: An Audio Access Unit is defined as the smallest part of the encoded bitstream which can be decoded by
itself, where decoded means "fully reconstructed sound".
audio buffer: A buffer in the system target decoder for storage of compressed audio data.
backward motion vector: A motion vector that is used for motion compensation from a reference picture at a later time in
display order.
Bark: Unit of critical band rate.
bidirectionally predictive-coded picture; B-picture: A picture that is coded using motion compensated prediction from a past
and/or future reference picture.
bitrate: The rate at which the compressed bitstream is delivered from the storage medium to the input of a decoder.
Block companding: Normalizing of the digital representation of an audio signal within a certain time period.
block: An 8-row by 8-column orthogonal block of pels.
Bound: The lowest subband in which intensity stereo coding is used.
byte aligned: A bit in a coded bitstream is byte-aligned if its position is a multiple of 8 bits from the first bit in the stream.
channel: A digital medium that stores or transports an ISO 11172 stream.
chrominance (component): A matrix, block or sample of pels representing one of the two colour difference signals related to
the primary colours in the manner defined in CCIR Rec 601. The symbols used for the colour difference signals are Cr and Cb.
coded audio bitstream: A coded representation of an audio signal as specified in this International Standard.
coded video bitstream: A coded representation of a series of one or more pictures as specified in this International Standard.
coded order: The order in which the pictures are stored and decoded. This order is not necessarily the same as the display
order.
coded representation: A data element as represented in its encoded form.
coding parameters: The set of user-definable parameters that characterise a coded video bitstream. Bitstreams are
characterised by coding parameters. Decoders are characterised by the bitstreams that they are capable of decoding.
component: A matrix, block or sample of pel data from one of the three matrices (luminance and two chrominance) that make
up a picture.
compression: Reduction in the number of bits used to represent an item of data.
constant bitrate coded video: A compressed video bitstream with a constant average bitrate.
constant bitrate: Operation where the bitrate is constant from start to finish of the compressed bitstream.
Constrained Parameters: In the case of the video specification, the values of the set of coding parameters defined in Part 2
Clause 2.4.4.4.
constrained system parameter stream (CSPS): An ISO 11172 multiplexed stream for which the constraints defined in Part 1
Clause 2.4.6 apply.
CRC: Cyclic redundancy code.
Critical Band Rate: Psychoacoustic measure in the spectral domain which corresponds to the frequency selectivity of the
human ear.
Critical Band: Part of the spectral domain which corresponds to a width of one Bark.
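Three of the definitions above (byte aligned, block companding, and critical band rate) lend themselves to small numeric illustrations. The sketch below is not taken from the standard. The function names are hypothetical, normalizing by the block's peak value is just one simple way to realize block companding, and the Bark conversion uses a commonly quoted closed-form approximation rather than any table from this document.

```python
import math


def is_byte_aligned(bit_position: int) -> bool:
    """True if the bit lies a whole number of bytes after the first bit (position 0)."""
    return bit_position % 8 == 0


def block_compand(block):
    """Normalize one block of samples by its peak magnitude.

    Returns (scale, normalized_samples); multiplying by scale restores the block.
    """
    scale = max((abs(x) for x in block), default=0.0)
    if scale == 0.0:
        return 0.0, [0.0] * len(block)
    return scale, [x / scale for x in block]


def hz_to_bark(frequency_hz: float) -> float:
    """Approximate critical band rate in Bark for a frequency in Hz (assumed formula)."""
    return (13.0 * math.atan(0.00076 * frequency_hz)
            + 3.5 * math.atan((frequency_hz / 7500.0) ** 2))


scale, normalized = block_compand([0.5, -0.25, 0.125])
print(is_byte_aligned(16), scale, normalized, round(hz_to_bark(1000.0), 2))
```

In this picture a critical band is simply a frequency interval over which `hz_to_bark` increases by one.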