INTERNATIONAL STANDARD ISO/IEC 23003-2:2010
TECHNICAL CORRIGENDUM 1
Published 2012-09-01

INTERNATIONAL ORGANIZATION FOR STANDARDIZATION / ORGANISATION INTERNATIONALE DE NORMALISATION
INTERNATIONAL ELECTROTECHNICAL COMMISSION / COMMISSION ÉLECTROTECHNIQUE INTERNATIONALE

Information technology - MPEG audio technologies - Part 2: Spatial Audio Object Coding (SAOC)

TECHNICAL CORRIGENDUM 1

Technologies de l'information - Technologies audio MPEG - Partie 2: Codage d'objet audio spatial (SAOC)

RECTIFICATIF TECHNIQUE 1

Technical Corrigendum 1 to ISO/IEC 23003-2:2010 was prepared by Joint Technical Committee ISO/IEC
JTC 1, Information technology, Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information.

ICS 35.040
Ref. No. ISO/IEC 23003-2:2010/Cor.1:2012(E)
© ISO/IEC 2012 - All rights reserved
Published in Switzerland

In Clause 2 “Normative references”, add:

ISO/IEC 23000-12, Information technology - Multimedia application format (MPEG-A) - Part 12: Interactive music application format

In all tables, replace:

“reserved”

with:

“N/A”

In 5.1 Introduction, replace:

The number of objects that can be handled is in priniciple
not limited.

with:

The number of objects that can be handled is in principle not limited.

In 5.5.2 Baseline Profile, replace:

Note that ISO/IEC 23000-12 (Information technology - Multimedia application format (MPEG-A) - Part 12: Interactive music spplication format) defines several brands that refer to
the SAOC Baseline Profile.

with:

Note that ISO/IEC 23000-12 defines several brands that refer to the SAOC Baseline Profile.

In 6.1 Payloads for SAOC, replace:

Table 8 - Syntax of ResidualConfig()

Syntax                                          No. of bits   Mnemonic
ResidualConfig()
  bsResidualSamplingFrequencyIndex;             4             uimsbf
  bsResidualFramesPerSAOCFrame;                 2             uimsbf
  bsNumGroupsFGO;                               2             uimsbf
  for ( i=0; i<bsNumGroupsFGO + 1; i++ )
    bsResidualPresent[i];                       1             uimsbf
    if ( bsResidualPresent[i] )

with:

Table 8 - Syntax of ResidualConfig()

Syntax                                          No. of bits   Mnemonic
ResidualConfig()
  bsResidualSamplingFrequencyIndex;             4             uimsbf
  bsResidualFramesPerSAOCFrame;                 2             uimsbf
  bsNumEAO;                                     2             uimsbf
  for ( i=0; i<bsNumEAO + 1; i++ )
    bsResidualPresent[i];                       1             uimsbf
    if ( bsResidualPresent[i] )

In 6.1 Payloads for SAOC, replace:

Table 15 - Syntax of PresetConfig()

Syntax                                          No. of bits   Mnemonic
PresetConfig()
  bsNumPresets;                                 4             uimsbf
  for ( i=0; i<bsNumPresets+1; i++ )
    bsNumBytePresetLabel[i];                    8             uimsbf
    for ( j=0; j<bsNumBytePresetLabel[i]; j++ )
      bsPresetLabel[i][j];                      8             bslbf
    bsPresetMatrix;                             1             uimsbf
    if (bsPresetMatrix)
      PresetMatrixData();
    else

with:

Table 15 - Syntax of PresetConfig()

Syntax                                          No. of bits   Mnemonic
PresetConfig()
  bsNumPresets;                                 4             uimsbf
  for ( i=0; i<bsNumPresets+1; i++ )
    bsNumBytePresetLabel[i];                    8             uimsbf
    for ( j=0; j<bsNumBytePresetLabel[i]; j++ )
      bsPresetLabel[i][j];                      8             bslbf
    bsPresetMatrix[i];                          1             uimsbf
    if (bsPresetMatrix[i])
      PresetMatrixData();
    else

In 6.1 Payloads for SAOC, replace:

Table 21 - Syntax of SAOCFramingInfo()

Syntax                                          No. of bits      Mnemonic
SAOCFramingInfo()
  bsFramingType;                                1                uimsbf
  If ( bsLowDelayMode == 0 )
    bsNumParamSets;                             3                uimsbf
  else
    bsNumParamSets;                             1                uimsbf
  if (bsFramingType)
    for (ps=0; ps<numParamSets; ps++)                                        Note 1
      bsParamSlot[ps];                          nBitsParamSlot   uimsbf      Note 2

Note 1: numParamSets is defined by numParamSets = bsNumParamSets + 1.
Note 2: nBitsParamSlot is defined according to nBitsParamSlot = ceil(log2(numSlots).

with:

Table 21 - Syntax of SAOCFramingInfo()

Syntax                                          No. of bits      Mnemonic
SAOCFramingInfo()
  bsFramingType;                                1                uimsbf
  If ( bsLowDelayMode == 0 )
    bsNumParamSets;                             3                uimsbf
  else
    bsNumParamSets;                             1                uimsbf
  for (ps=0; ps<numParamSets; ps++)                                          Note 1
    if (bsFramingType)
      bsParamSlot[ps];                          nBitsParamSlot   uimsbf      Note 2
    else
      bsParamSlot[ps] = ceil(numSlots*(ps+1)/numParamSets)-1;                Note 1, 3

Note 1: numParamSets is defined by numParamSets = bsNumParamSets + 1.
Note 2: nBitsParamSlot is defined according to nBitsParamSlot = ceil(log2(numSlots).
Note 3: numSlots is defined by numSlots = bsFrameLength + 1.

In 6.1 Payloads for SAOC, replace:

bsDcuParam    Defines the parameter value for the DCU algorithm according to Table 41.

with:

bsDcuParam    Defines the
parameter value for the DCU algorithm according to Table 39.

In 6.1 Payloads for SAOC, replace:

Table 42 - numQuantSteps

XXX (dataType)     numQuantStepsXXXCoarse   numQuantStepsXXXFine
DCLD, DMG, PDG     15                       31
IOC                4                        8
OLD                8                        16
NRG                32                       64

with:

Table 42 - numQuantSteps

XXX (dataType)     numQuantStepsXXXCoarse   numQuantStepsXXXFine
DCLD, DMG, PDG     15                       31
IOC                4                        8
OLD                8                        16
NRG                32                       64

In 6.1 Payloads for SAOC, replace:

PresetUserDataContainer()    Syntactic element that contains preset rendering data in the user-defined preset representation format and
has a length of exactly bsPresetUserDataLen bytes.

with:

PresetUserDataContainer()    Syntactic element that contains preset rendering data in the user-defined preset representation format and has a length of exactly bsPresetUserDataLen bytes.

All bitstream variables which are not explicitly described
here are defined in ISO/IEC 23003-1:2007.

In 6.1 Payloads for SAOC, add:

bsResidualFramesPerSAOCFrame    Indicates the number of residual frames per SAOC frame, ranging from one to four according to Table 56 defined in ISO/IEC 23003-1:2007.

In 6.1 Payloads for SAOC, add:

SAOCDiffHuffData()    Syntactic element that contains one or two temporally subsequent parameter subsets of a given parameter in the SAOC frame, where the quantized values are coded using a combination of differential coding and Huffman coding.

In Clause 7 SAOC processing, omit the time/band indices for all signals and parameters.

In 7.1.2 Dequantization of the SAOC parameters, replace:

Table 47 - OLD parameter quantization table

idx        0            1           2           3           4           5           6           7
OLD[idx]   10^{-15.00}  10^{-4.50}  10^{-4.00}  10^{-3.50}  10^{-3.00}  10^{-2.50}  10^{-2.20}  10^{-1.90}

idx        8            9           10          11          12          13          14          15
OLD[idx]   10^{-1.60}   10^{-1.30}  10^{-1.00}  10^{-0.80}  10^{-0.60}  10^{-0.40}  10^{-0.20}  1

with:

Table 47 - OLD parameter quantization table

idx        0            1           2           3           4           5           6           7
OLD[idx]   10^{-15.0}   10^{-4.5}   10^{-4.0}   10^{-3.5}   10^{-3.0}   10^{-2.5}   10^{-2.2}   10^{-1.9}

idx        8            9           10          11          12          13          14          15
OLD[idx]   10^{-1.6}    10^{-1.3}   10^{-1.0}   10^{-0.8}   10^{-0.6}   10^{-0.4}   10^{-0.2}   1

In 7.1.2 Dequantization of the SAOC parameters, replace:

while (ps=0; ps<numParamSet; ps++)
  switch (bsXXXdataMode[pi][ps])
    case 0: /* default */
      for (pb=0; pb<numBands, pb++)
        switch (XXX)
          case OLD, NRG, IOC, DCLD, DMG, PDG:
            idxXXX[pi][ps][pb] = 0;
            break;
      break;

with:

while (ps=0; ps<numParamSet; ps++)
  switch (bsXXXdataMode[pi][ps])
    case 0: /* default */
      for (pb=0; pb<numBands, pb++)
        switch (XXX)
          case NRG, DCLD, DMG, PDG:
            idxXXX[pi][ps][pb] = 0;
            break;
          case OLD:
            idxXXX[pi][ps][pb] = 15;
            break;
          case IOC:
            idxXXX[pi][ps][pb] = 5;
            break;
      break;

In 7.2.3 Unquantized interface for the MPS parameters, replace:

For an efficient practical implementation and to prevent a loss
in precision, the parameter interface to the MPS decoder may alternatively be established in a direct, unquantized way. Rather than writing an actual MPS bitstream, the relevant parameters may be passed directly to the MPS decoder.

with:

For an efficient practical implementation and to prevent a loss in precision, the parameter interface to the MPS decoder may alternatively be established in a direct, unquantized way. The required range of all relevant parameters is determined by the minimal and maximal values of the corresponding dequantization scheme. Rather than writing an actual MPS bitstream, the relevant parameters may be passed directly using binary32 (single) floating point format (IEEE 754-2008) to the MPS decoder.

In 7.4 Post(processing) downmix compensation, replace and move the corresponding text to “7.5 Signals and parameters”:

If the post(processed) downmix $\mathbf{X}^{n,k}_{\text{post(processed)}}$ is used, the following modification should be taken prior to SAOC decoding/transcoding:

$$\mathbf{X}^{n,k} = \mathbf{W}^{n,k}_{PDG}\,\mathbf{X}^{n,k}_{\text{post(processed)}},$$

where $\mathbf{X}^{n,k}$ represents the input signal to the SAOC decoder/transcoder.

The matrix $\mathbf{W}^{n,k}_{PDG}$ is defined for every time-slot $n$ and every hybrid subband $k$. Its elements are obtained from the transmitted PDG parameters which are defined for a given parameter time-slot $l$ and a given processing band $m$. The mapping to the hybrid domain is done according to Table A.31, ISO/IEC 23003-1:2007. If post(processed) downmix compensation is applied (bsPdgFlag = 1), the matrix $\mathbf{W}^{l,m}_{PDG}$ is defined as:

$$\mathbf{W}^{l,m}_{PDG} = PDG^{l,m}_{0}, \quad \text{for mono downmix,}$$

$$\mathbf{W}^{l,m}_{PDG} = \begin{pmatrix} PDG^{l,m}_{0} & 0 \\ 0 & PDG^{l,m}_{1} \end{pmatrix}, \quad \text{for stereo downmix,}$$

where $PDG^{l,m}_{j} = \mathbf{D}_{PDG}(j, l, m)$.

with:

If the post(processed) downmix $\mathbf{X}_{\text{post(processed)}}$ compensation is
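applied (bsPdgFlag = 1), the following modification should be taken prior to the SAOC decoding/transcoding:

$$\mathbf{X} = \mathbf{W}_{PDG}\,\mathbf{X}_{\text{post(processed)}}.$$

The matrix $\mathbf{W}_{PDG}$ is obtained from the transmitted PDG parameters as

$$\mathbf{W}_{PDG} = \begin{pmatrix} PDG_{0} & 0 \\ 0 & PDG_{1} \end{pmatrix}, \quad \text{for stereo downmix,}$$

$$\mathbf{W}_{PDG} = \begin{pmatrix} PDG_{0} & 0 \\ 0 & 0 \end{pmatrix}, \quad \text{for mono downmix.}$$

Here, the dequantized post(processed) downmix gains are obtained according to 7.1.2 as $PDG_{j} = \mathbf{D}_{PDG}(j, l, m)$.

In 7.5.2 Signals and parameters, replace:

$$\mathbf{X} = \mathbf{x}^{n,k} = \begin{pmatrix} l_0 \\ r_0 \end{pmatrix}, \quad \text{for stereo downmix,}$$

$$\mathbf{X} = \mathbf{x}^{n,k} = \begin{pmatrix} d_0 \\ 0 \end{pmatrix}, \quad \text{for monoo downmix.}$$

with:

$$\mathbf{X} = \begin{pmatrix} l_0 \\ r_0 \end{pmatrix}, \quad \text{for stereo downmix,}$$

$$\mathbf{X} = \begin{pmatrix} d_0 \\ 0 \end{pmatrix}, \quad \text{for mono downmix.}$$

In 7.5 Signals and parameters, add:

Output covariance

The output covariance matrix $\mathbf{F}$ with elements $f_{i,j}$ is given as

$$\mathbf{F} = \mathbf{A}\,\mathbf{E}\,\mathbf{A}^{*}, \quad \text{for binaural rendering,}$$

$$\mathbf{F} = \mathbf{M}_{ren}\,\mathbf{E}\,\mathbf{M}_{ren}^{*}, \quad \text{otherwise.}$$

In 7.5 Signals and parameters, add:

Input covariance

The input covariance is given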
as $v = \mathbf{D}\,\mathbf{E}\,\mathbf{D}^{*} + \varepsilon^{2}$.

In 7.6 SAOC transcoding/decoding modes, use the following structure:

7.6 SAOC transcoding/decoding modes
7.6.1 Overview
7.6.2 Decorrelated signal
7.6.3 Transcoding modes
7.6.3.1 Introduction
7.6.3.2 Mono downmix (“x-1-5”) processing mode
7.6.3.2.1 Introduction
7.6.3.2.2 SAOC downmix preprocessor unit
7.6.3.2.3 SAOC parameter processing unit
7.6.3.3 Stereo downmix (“x-2-5”) processing mode
7.6.3.3.1 Introduction
7.6.3.3.2 SAOC downmix preprocessor unit
7.6.3.3.3 SAOC parameter processing unit
7.6.4 Decoding modes
7.6.4.1 Introduction
7.6.4.2 Mono to binaural “x-1-b” processing
mode
7.6.4.3 Mono to stereo “x-1-2” processing mode
7.6.4.4 Mono to mono “x-1-1” processing mode
7.6.4.5 Stereo to binaural “x-2-b” processing mode
7.6.4.6 Stereo to stereo “x-2-2” processing mode
7.6.4.7 Stereo to mono “x-2-1” processing mode

In 7.6.2 Mono downmix (“x-1-5”) processing mode, replace:

Estimation of power and cross power terms

Incorporating index $h$ denoting the OTT element, the power and cross power terms can be estimated by:

$$p^{2}_{h,0} = \sum_{i=0}^{N-1}\sum_{j=0}^{N-1} w^{h}_{0,i}\, w^{h}_{0,j}\, e_{i,j},$$

$$p^{2}_{h,1} = \sum_{i=0}^{N-1}\sum_{j=0}^{N-1} w^{h}_{1,i}\, w^{h}_{1,j}\, e_{i,j},$$

$$R_{h} = \sum_{i=0}^{N-1}\sum_{j=0}^{N-1} w^{h}_{0,i}\, w^{h}_{1,j}\, e_{i,j}.$$

Derivation of the MPS parameters

Finally, the corresponding CLD and ICC parameters are derived as:

$$CLD^{l,m}_{h} = 10\log_{10}\left(\max\left(\frac{p^{2}_{h,0}}{p^{2}_{h,1}},\ \varepsilon^{2}\right)\right),$$

$$ICC^{l,m}_{h} = \frac{\max\left(R_{h},\ \varepsilon^{2}\right)}{\max\left(p_{h,0},\ \varepsilon^{2}\right)\,\max\left(p_{h,1},\ \varepsilon^{2}\right)},$$

with:

The index $h$ refers to the OTT$_h$ element.

$$CLD_{h} = 10\log_{10}\left(\frac{r^{h}_{0,0}}{r^{h}_{1,1}}\right),$$

$$ICC_{h} = \frac{r^{h}_{0,1}}{\sqrt{r^{h}_{0,0}\, r^{h}_{1,1}}}.$$

The terms $r^{h}_{i,j}$ can be estimated as

$$r^{h}_{i,j} = \max\left(\sum_{n=0}^{N-1}\sum_{m=0}^{N-1} w^{h}_{i,n}\, w^{h}_{j,m}\, e_{n,m},\ \varepsilon^{2}\right).$$

In 7.6.2 Mono downmix (“x-1-5”) processing mode, replace:

$$ADG^{l,m} = 10\log_{10}\left(\frac{f^{l,m}}{\max\left(v^{l,m},\ \varepsilon^{2}\right)}\right).$$

The scalar $f^{l,m}$ is computed as

$$f^{l,m} = \sum_{i=0}^{N-1} f^{l,m}_{i,i}.$$

with:

$$ADG = 10\log_{10}\left(\frac{\mathrm{trace}(\mathbf{F})}{\max\left(v,\ \varepsilon^{2}\right)}\right).$$

In 7.6.2 Mono downmix (“x-1-5”) processing mode, replace:

The following subclauses give a description of the SAOC transcoding mode for the mono downmix case. The object parameters (OLD, IOC, DMG, DCLD) from the SAOC bitstream are transcoded into spatial parameters (CLD, ICC, CPC, ADG) for the MPS bitstream according to the rendering information. The downmix is not modified.

with:

The following subclauses describe the processing steps dedicated to the transformation of SAOC parameters (OLD, IOC, DMG) into MPS data (CLD, ICC, ADG) according to the rendering
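information for the mono downmix case, see Figure 13 (left). The downmix signal is not modified.

In 7.6.2 Mono downmix (“x-1-5”) processing mode, replace:

The respective contribution of each object to the two outputs of OTT element 0 is obtained by summation of the corresponding elements in $\mathbf{M}^{l,m}_{ren}$. This summation gives a sub-rendering matrix $\mathbf{W}^{l,m}_{0}$ of OTT element 0:

$$\mathbf{W}^{l,m}_{0} = \begin{pmatrix} w^{0}_{0,0} & \cdots & w^{0}_{0,N-1} \\ w^{0}_{1,0} & \cdots & w^{0}_{1,N-1} \end{pmatrix} = \begin{pmatrix} m^{l,m}_{0,Lf} + m^{l,m}_{0,Rf} + m^{l,m}_{0,C} + m^{l,m}_{0,Lfe} & \cdots & m^{l,m}_{N-1,Lf} + m^{l,m}_{N-1,Rf} + m^{l,m}_{N-1,C} + m^{l,m}_{N-1,Lfe} \\ m^{l,m}_{0,Ls} + m^{l,m}_{0,Rs} & \cdots & m^{l,m}_{N-1,Ls} + m^{l,m}_{N-1,Rs} \end{pmatrix}.$$

The CLDs and ICCs of the subsequent OTT boxes ($CLD^{l,m}_{h}$, $ICC^{l,m}_{h}$, $h = 0, \ldots, 4$) are calculated using the sub-rendering matrices defined as:

$$\mathbf{W}^{l,m}_{1} = \begin{pmatrix} w^{1}_{0,0} & \cdots & w^{1}_{0,N-1} \\ w^{1}_{1,0} & \cdots & w^{1}_{1,N-1} \end{pmatrix} = \begin{pmatrix} m^{l,m}_{0,Lf} + m^{l,m}_{0,Rf} & \cdots & m^{l,m}_{N-1,Lf} + m^{l,m}_{N-1,Rf} \\ m^{l,m}_{0,C} + m^{l,m}_{0,Lfe} & \cdots & m^{l,m}_{N-1,C} + m^{l,m}_{N-1,Lfe} \end{pmatrix},$$

with:

The respective contribution of each object to the two outputs of OTT$_h$ element is obtained by summation of the corresponding elements in the rendering matrix $\mathbf{M}_{ren}$. The subsequent sub-rendering matrices $\mathbf{W}_{h}$ with elements $w^{h}_{i,j}$ are defined as

$$\mathbf{W}^{l,m}_{0} = \begin{pmatrix} w^{0}_{0,0} & \cdots & w^{0}_{0,N-1} \\ w^{0}_{1,0} & \cdots & w^{0}_{1,N-1} \end{pmatrix} = \begin{pmatrix} \left(m^{l,m}_{0,Lf}\right)^{2} + \left(m^{l,m}_{0,Rf}\right)^{2} + \left(m^{l,m}_{0,C}\right)^{2} + \left(m^{l,m}_{0,Lfe}\right)^{2} & \cdots & \left(m^{l,m}_{N-1,Lf}\right)^{2} + \left(m^{l,m}_{N-1,Rf}\right)^{2} + \left(m^{l,m}_{N-1,C}\right)^{2} + \left(m^{l,m}_{N-1,Lfe}\right)^{2} \\ \left(m^{l,m}_{0,Ls}\right)^{2} + \left(m^{l,m}_{0,Rs}\right)^{2} & \cdots & \left(m^{l,m}_{N-1,Ls}\right)^{2} + \left(m^{l,m}_{N-1,Rs}\right)^{2} \end{pmatrix}.$$

$$\mathbf{W}^{l,m}_{1} = \begin{pmatrix} w^{1}_{0,0} & \cdots & w^{1}_{0,N-1} \\ w^{1}_{1,0} & \cdots & w^{1}_{1,N-1} \end{pmatrix} = \begin{pmatrix} \left(m^{l,m}_{0,Lf}\right)^{2} + \left(m^{l,m}_{0,Rf}\right)^{2} & \cdots & \left(m^{l,m}_{N-1,Lf}\right)^{2} + \left(m^{l,m}_{N-1,Rf}\right)^{2} \\ \left(m^{l,m}_{0,C}\right)^{2} + \left(m^{l,m}_{0,Lfe}\right)^{2} & \cdots & \left(m^{l,m}_{N-1,C}\right)^{2} + \left(m^{l,m}_{N-1,Lfe}\right)^{2} \end{pmatrix},$$

In 7.6.2.2 Sub-rendering matrices for each OTT element, remove:

Additional information is provided by the rendering matrix $\mathbf{M}^{l,m}_{ren}$ with elements $m^{l,m}_{i,j}$, yielding the mapping of all audio input channels $i$ to the desired output channels $j$. The rendering matrix $\mathbf{M}^{l,m}_{ren}$ for the 5.1 output configuration is given by:

$$\mathbf{M}^{l,m}_{ren} = \begin{pmatrix} m^{l,m}_{0,Lf} & \cdots & m^{l,m}_{N-1,Lf} \\ m^{l,m}_{0,Rf} & \cdots & m^{l,m}_{N-1,Rf} \\ m^{l,m}_{0,C} & \cdots & m^{l,m}_{N-1,C} \\ m^{l,m}_{0,Lfe} & \cdots & m^{l,m}_{N-1,Lfe} \\ m^{l,m}_{0,Ls} & \cdots & m^{l,m}_{N-1,Ls} \\ m^{l,m}_{0,Rs} & \cdots & m^{l,m}_{N-1,Rs} \end{pmatrix}$$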