International Telecommunication Union

ITU-T G.718
TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU

Corrigendum 3 (01/2011)

SERIES G: TRANSMISSION SYSTEMS AND MEDIA, DIGITAL SYSTEMS AND NETWORKS
Digital terminal equipments - Coding of voice and audio signals

Frame error robust narrow-band and wideband embedded variable bit-rate coding of speech and audio from 8-32 kbit/s

Corrigendum 3: Corrections to text and C-code

Recommendation ITU-T G.718 (2008) - Corrigendum 3

ITU-T G-SERIES RECOMMENDATIONS
TRANSMISSION SYSTEMS AND MEDIA, DIGITAL SYSTEMS AND NETWORKS

INTERNATIONAL TELEPHONE CONNECTIONS AND CIRCUITS  G.100-G.199
GENERAL CHARACTERISTICS COMMON TO ALL ANALOGUE CARRIER-TRANSMISSION SYSTEMS  G.200-G.299
INDIVIDUAL CHARACTERISTICS OF INTERNATIONAL CARRIER TELEPHONE SYSTEMS ON METALLIC LINES  G.300-G.399
GENERAL CHARACTERISTICS OF INTERNATIONAL CARRIER TELEPHONE SYSTEMS ON RADIO-RELAY OR SATELLITE LINKS AND INTERCONNECTION WITH METALLIC LINES  G.400-G.449
COORDINATION OF RADIOTELEPHONY AND LINE TELEPHONY  G.450-G.499
TRANSMISSION MEDIA AND OPTICAL SYSTEMS CHARACTERISTICS  G.600-G.699
DIGITAL TERMINAL EQUIPMENTS  G.700-G.799
  General  G.700-G.709
  Coding of voice and audio signals  G.710-G.729
  Principal characteristics of primary multiplex equipment  G.730-G.739
  Principal characteristics of second order multiplex equipment  G.740-G.749
  Principal characteristics of higher order multiplex equipment  G.750-G.759
  Principal characteristics of transcoder and digital multiplication equipment  G.760-G.769
  Operations, administration and maintenance features of transmission equipment  G.770-G.779
  Principal characteristics of multiplexing equipment for the synchronous digital hierarchy  G.780-G.789
  Other terminal equipment  G.790-G.799
DIGITAL NETWORKS  G.800-G.899
DIGITAL SECTIONS AND DIGITAL LINE SYSTEM  G.900-G.999
MULTIMEDIA QUALITY OF SERVICE AND PERFORMANCE - GENERIC AND USER-RELATED ASPECTS  G.1000-G.1999
TRANSMISSION MEDIA CHARACTERISTICS  G.6000-G.6999
DATA OVER TRANSPORT - GENERIC ASPECTS  G.7000-G.7999
PACKET OVER TRANSPORT ASPECTS  G.8000-G.8999
ACCESS NETWORKS  G.9000-G.9999

For further details, please refer to the list of
ITU-T Recommendations.

Rec. ITU-T G.718 (2008)/Cor.3 (01/2011)

Recommendation ITU-T G.718

Frame error robust narrow-band and wideband embedded variable bit-rate coding of speech and audio from 8-32 kbit/s

Corrigendum 3

Corrections to text and C-code

Summary

Recommendation ITU-T G.718 describes a narrow-band (NB) and wideband (WB) embedded variable bit-rate coding algorithm for speech and audio operating in the range from 8 to 32 kbit/s which is designed to be robust to frame erasures.

This codec provides state-of-the-art NB speech quality over the lower bit rates and state-of-the-art WB speech quality over the complete range of bit rates. In addition, the ITU-T G.718 codec is designed to be highly robust to frame erasures, thereby enhancing the speech quality when used in IP transport applications on fixed, wireless and mobile networks. Despite its embedded nature, the codec also performs well with both NB and WB generic audio signals.

This codec has an embedded scalable structure, enabling maximum flexibility in the transport of voice packets through IP networks of today and in future media-aware networks. In addition, the embedded structure of ITU-T G.718 will easily allow the
codec to be extended to provide a super-wideband and stereo capability through additional layers which are currently under development. The bitstream may be truncated at the decoder side or by any component of the communication system to instantaneously adjust the bit rate to the desired value without the need for out-of-band signalling.

The encoder produces an embedded bitstream structured in five layers corresponding to the five available bit rates: 8, 12, 16, 24 and 32 kbit/s.

The ITU-T G.718 encoder can accept WB sampled signals at 16 kHz, or NB signals sampled at either 16 or 8 kHz. Similarly, the decoder output can be 16 kHz WB, in addition to 16 or 8 kHz NB. Input signals sampled at 16 kHz, but with bandwidth limited to NB, are detected by the encoder. The output of the ITU-T G.718 codec is capable of operating with a bandwidth of 300-3400 Hz at 8 and 12 kbit/s and 50-7000 Hz from
8 to 32 kbit/s.

The high quality codec core represents a significant performance improvement, providing 8 kbit/s wideband clean speech quality equivalent to the ITU-T G.722.2 codec at 12.65 kbit/s whilst the 8 kbit/s narrow-band codec operating mode provides clean speech quality equivalent to the ITU-T G.729E codec at 11.8 kbit/s.

The codec operates on 20-ms frames and has a maximum algorithmic delay of 42.875 ms for wideband input and wideband output signals. The maximum algorithmic delay for narrow-band input and narrow-band output signals is 43.875 ms. The codec may also be employed in a low-delay mode when the encoder and decoder maximum bit rates are set to 8 kbit/s or 12 kbit/s. In this case, the maximum algorithmic delay is reduced by 10 ms.

The codec also incorporates an alternate coding mode, with a minimum bit rate of 12.65 kbit/s, which is bitstream interoperable with Recommendation ITU-T G.722.2, 3GPP AMR-WB and 3GPP2 VMR-WB mobile WB speech coding standards. This option replaces layer 1 and layer 2, and layers 3-5 are similar to the default option with the exception that in layer 3 fewer bits are used to compensate for the extra bits of the 12.65 kbit/s core. The
decoder is further able to decode all other ITU-T G.722.2 operating modes. Furthermore, a new annex to this Recommendation is under development that will efficiently enable bitstream interoperability with the 3GPP2 EVRC-WB codec.

This Recommendation also includes discontinuous transmission mode (DTX) and comfort noise generation (CNG) algorithms that enable bandwidth savings during inactive periods. An integrated noise reduction algorithm can be used provided that the communication session is limited to 12 kbit/s.

The underlying algorithm is based on
a two-stage coding structure: the lower two layers are based on code-excited linear prediction (CELP) coding of the band (50-6400 Hz) where the core layer takes advantage of signal classification to use optimized coding modes for each frame. The higher layers encode the weighted error signal from the lower layers using overlap-add modified discrete cosine transform (MDCT) coding. Several technologies are used to encode the MDCT coefficients to maximize performance for both speech and music.

Corrigendum 1 (11/2008) corrects a number of minor problems that have been identified in
the fixed-point ANSI C source code of the base text of this Recommendation.

Amendment 1 (03/2009) introduces some additional minor corrections to the fixed-point ANSI C source code and to the text of the Recommendation. It also adds a verification of the default value of the unused layer 5 bit, and a procedure that erases layer 5 if the bit does not have the default value. Amendment 1 also introduces the new Annex A, which defines an alternative implementation of the ITU-T G.718 algorithm using floating-point arithmetic, intended for DSP hardware optimized for floating-point operations. The accompanying floating-point ANSI C source code is fully interoperable with the fixed-point code.

While Corrigendum 2 (08/2009) includes further corrections to address minor problems found in both the fixed and floating-point implementations, its main benefit
is in the streamlining of the fixed-point implementation, which reduces the complexity of the codec from 69 to 57 WMOPS whilst remaining bit-exact with the original code on both steps of the characterization test. This 17% complexity reduction is significant and makes ITU-T G.718 more attractive to implement.

Amendment 2 (03/2010) corrects minor defects identified in the Recommendation ITU-T G.718 main body (text and ANSI C source code) and introduces a new Annex B on a scalable superwideband (50-14000 Hz) extension for ITU-T G.718 operating from 36 to 48 kbit/s.

Corrigendum 3 (01/2011) includes corrections to address minor problems found in both the fixed and floating-point implementations. It also corrects some small inaccuracies found in the Recommendation ITU-T G.718 text and revises the test vectors package to improve the coverage. Corrigendum 3 includes an enhancement designed
to improve the performance of music with pitch frequencies above 381 Hz (16000 / 42) coded at 16, 24 and 32 kbit/s. The enhancement is a decoder-only option, hence it does not affect the ITU-T G.718 bitstream.

This Recommendation contains an electronic attachment with the ANSI C source code, which is an integral part of this Recommendation. Amendment 2 introduces Version 1.5 of the ANSI C code. Test vectors for the superwideband extension of Annex B are also available online in the ITU-T test signal database at http://itu.int/net/ITU-T/sigdb/speaudio/Gseries.htm#G.718.

For consistency, the C
source code of the complete distribution of Corrigendum 3 (01/2011) has been denoted as "Software Release 1.6".

History

Edition  Recommendation               Approval    Study Group
1.0      ITU-T G.718                  2008-06-13  16
1.1      ITU-T G.718 (2008) Cor.1     2008-11-13  16
1.2      ITU-T G.718 (2008) Amend.1   2009-03-16  16
1.3      ITU-T G.718 (2008) Cor.2     2009-08-29  16
1.4      ITU-T G.718 (2008) Amend.2   2010-03-29  16
1.5      ITU-T G.718 (2008) Cor.3     2011-01-13  16

FOREWORD

The International Telecommunication Union (ITU) is the United Nations specialized agency in the field of telecommunications, information and communication technologies (ICTs). The ITU Telecommunication Standardization Sector (ITU-T) is a permanent organ of ITU. ITU-T is responsible for studying technical, operating and tariff questions and issuing Recommendations on them with a view to standardizing telecommunications on a worldwide basis.

The World Telecommunication Standardization Assembly (WTSA), which meets every four years, establishes the topics for study by the ITU-T study groups which, in turn, produce Recommendations on these topics.

The approval of ITU-T Recommendations is covered by the procedure laid down in WTSA Resolution 1.

In some areas of information technology which fall within ITU-T's purview, the necessary standards are prepared on a collaborative basis with ISO and IEC.

NOTE

In this Recommendation, the expression "Administration" is used for conciseness to indicate both a telecommunication administration and a recognized operating agency.

Compliance with this Recommendation is voluntary. However, the Recommendation may contain certain mandatory provisions (to ensure, e.g., interoperability or applicability) and compliance with the Recommendation is achieved when all of these mandatory provisions are met. The words "shall" or some other obligatory language such as "must" and the negative equivalents are used to express requirements. The use of such words does not suggest that compliance with the Recommendation is required of any party.

INTELLECTUAL PROPERTY RIGHTS

ITU draws attention to the possibility that the practice or implementation of this Recommendation may involve the use of a claimed Intellectual Property Right. ITU takes no position concerning the evidence, validity or applicability of claimed Intellectual Property Rights, whether asserted by ITU members or others outside of the Recommendation development process.

As of the date of approval of this Recommendation, ITU had received notice of intellectual property, protected by patents, which may be required to implement this Recommendation. However, implementers are cautioned that this may not represent the latest
information and are therefore strongly urged to consult the TSB patent database at http://www.itu.int/ITU-T/ipr/.

© ITU 2011

All rights reserved. No part of this publication may be reproduced, by any means whatsoever, without the prior written permission of ITU.

Table of Contents

5.5 ITU-T G.722.2-interoperable option
6.13 ITU-T G.722.2-interoperable option
7.1 Core layer decoding (layer 1)
7.13 Decoding in ITU-T G.722.2-interoperable option
8 Description of the transmitted parameter indices
  8.1 Bit allocation for the default option
9.2 Organization of the simulation software
Annex A - Reference floating-point implementation for ITU-T G.718
  A.5 ANSI C-code
Annex B - Superwideband scalable extension for ITU-T G.718
Bibliography
Electronic attachment: ANSI C source code and test vectors

Recommendation ITU-T G.718

Frame error robust narrow-band and wideband embedded variable bit-rate coding of speech and audio from 8-32 kbit/s¹

Corrigendum 3

Corrections to text and C-code

Modifications introduced by this corrigendum are shown in revision marks. Unchanged text is replaced by ellipsis (...). Some parts of unchanged text (clause numbers, etc.) may be kept to indicate the correct insertion points.

5.5 ITU-T G.722.2-interoperable option

To satisfy the objective of interoperability with other standards, ITU-T G.718 is equipped with an option to allow it to interoperate
with ITU-T G.722.2 at 12.65 kbit/s. When invoked, the option allows ITU-T G.722.2 mode 2 (12.65 kbit/s) to replace layer 1 and layer 2. Note that this feature also makes the codec interoperable with the 12.65 kbit/s mode 2 of [29] and mode 3 of [24], [25]. The decoder is further able to decode all ITU-T G.722.2/AMR-WB coding modes.

6.1.5 Frame energy

The spectral analysis module also calculates several energy-related parameters. For example, an average energy in critical bands is computed as

$$E_{CB}(i) = \frac{1}{(L_{FFT}/2)^2 \, M_{CB}(i)} \sum_{k=0}^{M_{CB}(i)-1} \left( X_R^2(k+j_i) + X_I^2(k+j_i) \right), \quad i = 0,\ldots,19 \qquad (11)$$

where $X_R(k)$ and $X_I(k)$ are, respectively, the real and imaginary parts of the k-th frequency bin and $j_i$ is the index of the first bin in the i-th critical band given by $j_i$ = {1, 3, 5, 7, 9, 11, 13, 16, 19, 22, 26, 30, 35, 41, 47, 55, 64, 75, 89, 107}. Furthermore, energy per frequency bin, $E_{BIN}(k)$, is calculated as

$$E_{BIN}(k) = \frac{1}{(L_{FFT}/2)^2} \left( X_R^2(k) + X_I^2(k) \right), \quad k = 0,\ldots,127$$

$$E_{BIN}(k) = X_R^2(k) + X_I^2(k), \quad k = 0,\ldots,127 \qquad (12)$$

_______
¹ This Recommendation contains an electronic attachment with the ANSI C source code, which is an integral part of this Recommendation.

6.4.3 Levinson-Durbin algorithm

The modified autocorrelation function, $r(k)$, is used to obtain the LP filter coefficients $a_k$, k = 1,...,16, by solving the set of equations:

$$\sum_{k=1}^{16} a_k \, r(|i-k|) = -r(i), \quad i = 1,\ldots,16 \qquad (50)$$

The set of equations (50) is solved using the Levinson-Durbin algorithm [15]. This algorithm uses the following recursion:

$$\begin{array}{l}
E(0) = r(0) \\
\textrm{for } i = 1 \textrm{ to } 16 \\
\quad k_i = -\left[ r(i) + \sum_{j=1}^{i-1} a_j^{(i-1)} \, r(i-j) \right] / E(i-1) \\
\quad a_i^{(i)} = k_i \\
\quad \textrm{for } j = 1 \textrm{ to } i-1 \\
\qquad a_j^{(i)} = a_j^{(i-1)} + k_i \, a_{i-j}^{(i-1)} \\
\quad E(i) = (1 - k_i^2) \, E(i-1)
\end{array} \qquad (51)$$

The final so
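As a rough illustration of recursion (51), the following C sketch runs the Levinson-Durbin steps in double precision. It is not the fixed-point or floating-point reference code from the electronic attachment: the function name `levinson_durbin`, the `LPC_ORDER` macro, the array layout (with `a[0]` fixed to 1) and the use of `double` are all assumptions made for this example.

```c
#include <assert.h>

#define LPC_ORDER 16 /* LP order used in the text: a_k, k = 1..16 (macro name is hypothetical) */

/*
 * Illustrative Levinson-Durbin recursion, following (51).
 * Solves sum_{k=1..order} a_k r(|i-k|) = -r(i), i = 1..order.
 *
 * r[0..order] : autocorrelation values, r[0] > 0
 * a[0..order] : output; a[0] is set to 1, a[1..order] are the LP coefficients
 * returns     : prediction error E(order)
 */
double levinson_durbin(const double *r, double *a, int order)
{
    double prev[LPC_ORDER + 1]; /* copy of a^(i-1) used during the update */
    double err;
    int i, j;

    assert(order >= 1 && order <= LPC_ORDER);
    assert(r[0] > 0.0);

    err = r[0]; /* E(0) = r(0) */
    a[0] = 1.0;

    for (i = 1; i <= order; i++) {
        double acc = r[i];
        double k;

        /* k_i = -[ r(i) + sum_{j=1}^{i-1} a_j^(i-1) r(i-j) ] / E(i-1) */
        for (j = 1; j < i; j++) {
            acc += a[j] * r[i - j];
        }
        k = -acc / err;

        /* a_j^(i) = a_j^(i-1) + k_i * a_{i-j}^(i-1), j = 1..i-1 */
        for (j = 1; j < i; j++) {
            prev[j] = a[j];
        }
        for (j = 1; j < i; j++) {
            a[j] = prev[j] + k * prev[i - j];
        }

        a[i] = k;               /* a_i^(i) = k_i */
        err *= (1.0 - k * k);   /* E(i) = (1 - k_i^2) E(i-1) */
    }
    return err;
}
```

For example, with r = {1, 0.5, 0.25} and order 2, the recursion gives a_1 = -0.5, a_2 = 0 and E(2) = 0.75, which satisfies the two equations of the form (50) for that order.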