ETSI TS 102 563 V1.1.1 (2007-02)
Technical Specification

Digital Audio Broadcasting (DAB);
Transport of Advanced Audio Coding (AAC) audio

European Broadcasting Union / Union Européenne de Radio-Télévision (EBU·UER)
ETSI

Reference: DTS/JTC-DAB-49
Keywords: audio, broadcasting, coding, DAB, digital

ETSI
650 Route des Lucioles
F-06921 Sophia Antipolis Cedex - FRANCE
Tel.: +33 4 92 94 42 00 Fax: +33 4 93 65 47 16
Siret N° 348 623 562 00017 - NAF 742 C
Association à but non lucratif enregistrée à la Sous-Préfecture de Grasse (06) N° 7803/88

Important notice

Individual copies of the present document can be downloaded from: http://www.etsi.org

The present document may be made available in more than one electronic version or in print. In any case of existing or perceived difference in contents between such versions, the reference version is the Portable Document Format (PDF). In case of dispute, the reference shall be the printing on ETSI printers of the PDF version kept on a specific network drive within ETSI Secretariat.

Users of the present document should be aware that the document may be subject to revision or change of status. Information on the current status of this and other ETSI documents is available at http://portal.etsi.org/tb/status/status.asp

If you find errors in the present document, please send your comment to one of the following services: http://portal.etsi.org/chaircor/ETSI_support.asp

Copyright Notification

No part may be reproduced except as authorized by written permission. The copyright and the foregoing restriction extend to reproduction in all media.

© European Telecommunications Standards Institute 2007.
© European Broadcasting Union 2007.
All rights reserved.

DECT™, PLUGTESTS™ and UMTS™ are Trade Marks of ETSI registered for the benefit of its Members. TIPHON™ and the TIPHON logo are Trade Marks currently being registered by ETSI for the benefit of its Members. 3GPP™ is a Trade Mark of ETSI registered for the benefit of its Members and of the 3GPP Organizational Partners.

Contents

Intellectual Property Rights
Foreword
1 Scope
2 References
3 Definitions, abbreviations and arithmetic operators
3.1 Definitions
3.2 Abbreviations
3.3 Arithmetic operators
4 Introduction
5 Audio
5.1 HE AAC v2 audio coding
5.2 Audio super framing syntax
5.3 MPEG Surround
5.3.1 Overview
5.3.2 Requirements for MPEG Surround encoders and decoders
5.4 Programme Associated Data (PAD)
5.4.1 PAD insertion
5.4.2 Coding of F-PAD and X-PAD
5.4.3 PAD extraction
6 Transport error coding and interleaving
6.1 RS coding
6.2 Formation of the coding array
6.3 Formation of the parity array
6.4 Formation of the output array
6.5 Order of data transmission
7 Signalling
7.1 FIC signalling
7.2 Audio parameter signalling
8 Re-configuration
Annex A (normative): Error concealment
A.1 AAC error concealment
A.1.1 Interpolation of one corrupt AU
A.1.2 Fade-out and fade-in
A.2 SBR error concealment
A.3 Parametric stereo error concealment
Annex B (informative): Implementation tips for PAD insertion
Annex C (informative): Synchronizing to the audio super frame structure
Annex D (informative): Processing a super frame
Annex E (informative): Bit-rate available for audio
Annex F (informative): Bibliography
History

Intellectual Property Rights

IPRs essential or potentially essential to the present document may have been declared to ETSI. The information pertaining to these
essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found in ETSI SR 000 314: “Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in respect of ETSI standards”, which is available from the ETSI Secretariat. Latest updates are available on the ETSI Web server (http://webapp.etsi.org/IPR/home.asp).

Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web server) which are, or may be, or may become, essential to the present document.

Foreword

This Technical Specification (TS) has been produced by Joint Technical Committee (JTC) Broadcast of the European Broadcasting Union (EBU), Comité Européen de Normalisation ELECtrotechnique (CENELEC) and the European Telecommunications Standards Institute (ETSI).

NOTE 1: The EBU/ETSI JTC Broadcast was established in 1990 to co-ordinate the drafting of standards in the specific field of broadcasting and related fields. In 1995 the JTC Broadcast became a tripartite body by also including CENELEC, which is responsible for the standardization of radio and television receivers, in the Memorandum of Understanding. The EBU is a professional association of broadcasting organizations whose work includes the co-ordination of its members' activities in the technical, legal, programme-making and programme-exchange domains. The EBU has active members in about 60 countries in the European broadcasting area; its headquarters is in Geneva.

European Broadcasting Union
CH-1218 GRAND SACONNEX (Geneva)
Switzerland
Tel: +41 22 717 21 11
Fax: +41 22 717 24 81

The Eureka Project 147 was established in 1987,
with funding from the European Commission, to develop a system for the broadcasting of audio and data to fixed, portable or mobile receivers. Their work resulted in the publication of European Standard, EN 300 401 [1], for DAB (see note 2) which now has worldwide acceptance. The members of the Eureka Project 147 are drawn from broadcasting organizations and telecommunication providers together with companies from the professional and consumer electronics industry.

NOTE 2: DAB is a registered trademark owned by one of the Eureka Project 147 partners.

1 Scope

The present document defines the method to code and transmit audio services using the HE AAC v2 [2] audio coder for Eureka-147 Digital Audio Broadcasting (DAB) (EN 300 401 [1]) and details the necessary mandatory requirements for decoders. The permitted audio modes and the data protection and encapsulation are detailed. This audio coding scheme permits the full use of the PAD channel for carrying dynamic labels and user applications.

2 References

The following documents contain provisions which, through reference in this text, constitute provisions of the present document. References are either
specific (identified by date of publication and/or edition number or version number) or non-specific. For a specific reference, subsequent revisions do not apply. For a non-specific reference, the latest version applies.

Referenced documents which are not found to be publicly available in the expected location might be found at http://docbox.etsi.org/Reference.

NOTE: While any hyperlinks included in this clause were valid at the time of publication ETSI cannot guarantee their long term validity.

[1] ETSI EN 300 401: “Radio Broadcasting Systems; Digital Audio Broadcasting (DAB) to mobile, portable and fixed receivers”.
[2] ISO/IEC 14496-3: “Information technology - Coding of audio-visual objects - Part 3: Audio”.

3 Definitions, abbreviations and arithmetic operators

3.1 Definitions

For the purposes of the present document, the terms and definitions given in EN 300 401 [1] and the following apply:

access unit: access unit contains the audio samples for 20 ms, 30 ms, 40 ms or 60 ms of audio depending on the sampling rate of the AAC core, respectively 48 kHz, 32 kHz, 24 kHz or 16 kHz
audio super frame: audio super frame contains a number of AUs which together contain the encoded audio for 120 ms
subchannel_index: subchannel_index is derived from the size of the sub-channel carrying the audio service and defines the number of Reed-Solomon code words in each audio super frame

3.2 Abbreviations

For the purposes of the present document, the abbreviations given in EN 300 401 [1] and the following apply:

AAC Advanced Audio Coding
AU Access Unit
DAC Digital Analogue Converter
DMB Digital Multimedia Broadcasting
DVB Digital Video Broadcasting
HE AAC High Efficiency AAC
MPS MPEG Surround
PS Parametric Stereo
RS Reed-Solomon
SBR Spectral Band Replication

3.3 Arithmetic operators

+ addition
- subtraction
× multiplication
÷ division
m DIV p denotes the quotient part of the division of m by p (m and p are positive integers)
m MOD p denotes the remainder of the division of m by p (m and p are positive integers)
$\sum_{i=p}^{q} f(i)$ denotes the sum: f(p) + f(p + 1) + f(p + 2) ... + f(q)
$\prod_{i=p}^{q} f(i)$ denotes the product: f(p) × f(p + 1) × f(p + 2) ... × f(q)

4 Introduction

The DAB system standard [1] defines the way that audio (programme) services are carried when using MPEG Layer II. The present document defines the way that audio (programme) services are carried when using MPEG
4 HE AAC v2.

For Layer II audio, two sampling rates are permitted, 48 kHz and 24 kHz. Each audio frame contains samples for 24 ms or 48 ms respectively and each contains the same number of bytes. The audio frames are carried in one or two DAB logical frames respectively.

For AAC, two transforms are specified. For DAB, only the 960 transform is permitted with sampling rates of 48 kHz, 32 kHz, 24 kHz and 16 kHz. Each AU (audio frame) contains samples for 20 ms, 30 ms, 40 ms or 60 ms respectively.

In order to provide a similar architectural model to Layer II audio, and simple synchronization, AUs are built into audio super frames of 120 ms which are then carried in five DAB logical frames. In order to provide additional error control, Reed-Solomon coding and virtual interleaving are applied. The overall scheme is shown in figure 1.

[Figure 1: Conceptual diagram of the outer coder and interleaver. Block labels: HE AAC v2 audio coder; Audio super framing; Reed-Solomon coder and virtual interleaver; DAB main service channel multiplexer; Scope of present document]

5 Audio

5.1 HE AAC v2 audio coding

For generic audio coding, a subset of the MPEG-4 High Efficiency Advanced Audio Coding v2 (HE AAC v2) profile chosen to best suit the DAB system environment is used. The HE AAC v2 Profile, Level 2 according to [2] shall apply with the following additional restrictions for the DAB system:

Sampling rates: permitted output sampling rates of the HE AAC v2 decoder are 32 kHz and
48 kHz, i.e. when SBR is enabled the AAC core shall be operated at 16 kHz or 24 kHz, respectively. If SBR is disabled then the AAC core shall be operated at 32 kHz or 48 kHz respectively.

Transform length: the number of samples per channel per AU is 960. This is required to harmonize HE AAC AU lengths to allow the combination of an integer number of AUs to build an audio super frame of 120 ms duration.

Audio bit rates are restricted to fit within a maximum sub-channel size of 192 kbps (approximately 175 kbps for audio, assuming no PAD).

Audio super framing: AUs are composed into audio super frames, which always correspond to 120 ms in time. The AUs in the audio super frames are encoded together such that each audio super frame is of constant length, i.e. bit exchange between AUs is only possible within an audio super frame. The number of AUs per super frame is: two (16 kHz AAC core sampling rate with SBR enabled), three (24 kHz AAC core sampling rate with SBR enabled), four (32 kHz AAC core sampling rate) or six (48 kHz AAC core sampling rate). Each audio super frame is carried in five consecutive logical DAB frames (see clause 7) which enables simple synchronization and management of reconfigurations.

The size of the audio super frame is defined by the size of the MSC sub-channel (see [1], clause 6.2.1) which carries the audio super frame. Sub-channels are multiples of 8 kbps in size. The size of the audio super frame in bytes is given by the expressions below:

subchannel_index = MSC sub-channel size (kbps) ÷ 8
audio_super_frame_size (bytes) = subchannel_index × 110

The first byte of the audio super frame is byte 0 and the last byte is byte (audio_super_frame_size - 1).

NOTE: The subchannel_index parameter may take the values 1 to 24 due to the restriction limiting the maximum sub-channel size to 192 kbps.

5.2 Audio super framing syntax

Table 1: Syntax of he_aac_super_frame()

Syntax                                      No. of bits    Note
he_aac_super_frame(subchannel_index)
    he_aac_super_frame_header()                            determines num_aus
    for (n = 0; n [...]

[...]

[Plausibility-check conditions on L_E, L_Q, t_E and t_Q: the expressions are not recoverable from the source, apart from the final item]
- all elements of t_Q are not among the elements of t_E

If the plausibility check fails, the AU error flag is set and the error concealment outlined above is applied.

[Figure A.1: SBR error concealment overview (flowchart)]

A.3 Parametric stereo error concealment

Parametric stereo error concealment is based on the fact that the stereo image is quasi stationary. The concealment strategy keeps the Parametric Stereo settings from the last valid AU until a new set of Parametric Stereo settings can be decoded from a valid AU.
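
The hold-last-valid strategy described for parametric stereo can be sketched as follows. This is a minimal illustration, not part of the specification: the names PsSettings, PsConcealer and update are hypothetical, the iid and icc fields merely stand in for whatever Parametric Stereo parameters a decoder keeps, and returning None before the first valid AU is an assumption (the text does not say what a decoder should do in that case).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class PsSettings:
    """Hypothetical container for one AU's decoded Parametric Stereo settings."""
    iid: Tuple[int, ...]  # illustrative field, not a normative layout
    icc: Tuple[int, ...]  # illustrative field, not a normative layout


class PsConcealer:
    """Keeps the settings of the last valid AU and reuses them for corrupt AUs."""

    def __init__(self) -> None:
        self._last_valid: Optional[PsSettings] = None

    def update(self, decoded: Optional[PsSettings], au_valid: bool) -> Optional[PsSettings]:
        # Only settings decoded from a valid AU replace the stored ones.
        if au_valid and decoded is not None:
            self._last_valid = decoded
        # For a corrupt AU this returns the settings of the last valid AU.
        # Assumption: None is returned if no valid AU has been seen yet.
        return self._last_valid


concealer = PsConcealer()
first = PsSettings(iid=(1, 2), icc=(3, 4))
applied = concealer.update(first, au_valid=True)    # valid AU: new settings used
applied = concealer.update(None, au_valid=False)    # corrupt AU: previous settings held
```

Because the stereo image is assumed to be quasi stationary, no interpolation or fading is attempted in this sketch; the stored settings are simply reused until a valid AU delivers new ones.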
Annex B (informative): Implementation tips for PAD insertion

To reduce the data rate needed for PAD, the following optimizations on encoder side should be made:

- If no PAD is inserted (neither F-PAD, nor X-PAD information is available), then no data_stream_element() encapsulating the PAD data should be added to the access unit. A value of 0 for the F-PAD field is equivalent to “no F-PAD information is available”.
- For low audio bit rates, it is important to take into account that additional (X-)PAD data will reduce the bit rate for the audio and thus the audio quality. For low audio bit rates, even the bit rate for dynamic labels might become significant.
- It is not just the average (X-)PAD data rate that should be taken into account, but especially for low audio bit rates, also the “burstiness” of (X-)PAD data insertion should be taken into consideration.
- If the exact timing for X-PAD insertion is not important (it is not important for most multimedia information such as dynamic labels) and low audio bit rates are used, then the following should be considered:
  - The bit rate used for PAD depends on the total number of used bytes (including encapsulation and F-PAD) per audio super frame. It might sometimes be more efficient to use one single big X-PAD field in one audio super frame than to use multiple smaller X-PAD fields in the same audio super frame;
  - Some mult