International Telecommunication Union

ITU-T G.711
TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU
Amendment 2 (11/2009)

SERIES G: TRANSMISSION SYSTEMS AND MEDIA, DIGITAL SYSTEMS AND NETWORKS
Digital terminal equipments - Coding of voice and audio signals

Pulse code modulation (PCM) of voice frequencies
Amendment 2: New Appendix III - Audio quality enhancement toolbox

Recommendation ITU-T G.711 (1988) - Amendment 2

ITU-T G-SERIES RECOMMENDATIONS
TRANSMISSION SYSTEMS AND MEDIA, DIGITAL SYSTEMS AND NETWORKS

INTERNATIONAL TELEPHONE CONNECTIONS AND CIRCUITS  G.100-G.199
GENERAL CHARACTERISTICS COMMON TO ALL ANALOGUE CARRIER-TRANSMISSION SYSTEMS  G.200-G.299
INDIVIDUAL CHARACTERISTICS OF INTERNATIONAL CARRIER TELEPHONE SYSTEMS ON METALLIC LINES  G.300-G.399
GENERAL CHARACTERISTICS OF INTERNATIONAL CARRIER TELEPHONE SYSTEMS ON RADIO-RELAY OR SATELLITE LINKS AND INTERCONNECTION WITH METALLIC LINES  G.400-G.449
COORDINATION OF RADIOTELEPHONY AND LINE TELEPHONY  G.450-G.499
TRANSMISSION MEDIA AND OPTICAL SYSTEMS CHARACTERISTICS  G.600-G.699
DIGITAL TERMINAL EQUIPMENTS  G.700-G.799
    General  G.700-G.709
    Coding of voice and audio signals  G.710-G.729
    Principal characteristics of primary multiplex equipment  G.730-G.739
    Principal characteristics of second order multiplex equipment  G.740-G.749
    Principal characteristics of higher order multiplex equipment  G.750-G.759
    Principal characteristics of transcoder and digital multiplication equipment  G.760-G.769
    Operations, administration and maintenance features of transmission equipment  G.770-G.779
    Principal characteristics of multiplexing equipment for the synchronous digital hierarchy  G.780-G.789
    Other terminal equipment  G.790-G.799
DIGITAL NETWORKS  G.800-G.899
DIGITAL SECTIONS AND DIGITAL LINE SYSTEM  G.900-G.999
MULTIMEDIA QUALITY OF SERVICE AND PERFORMANCE - GENERIC AND USER-RELATED ASPECTS  G.1000-G.1999
TRANSMISSION MEDIA CHARACTERISTICS  G.6000-G.6999
DATA OVER TRANSPORT - GENERIC ASPECTS  G.7000-G.7999
PACKET OVER TRANSPORT ASPECTS  G.8000-G.8999
ACCESS NETWORKS  G.9000-G.9999

For further details, please refer to the list of ITU-T Recommendations.

Rec. ITU-T G.711 (1988)/Amd.2 (11/2009)

Summary

Appendix III to ITU-T Recommendation G.711 describes a toolbox to provide audio quality enhancements to ITU-T G.711. The toolbox comprises four tools that are algorithms initially developed in the context of ITU-T G.711.1 wideband speech and audio codec. The four tools aim at enhancing the quality of ITU-T G.711 legacy for both encoder and decoder sides.

At the encoder side is a noise shaping tool which is used in combination with a modified ITU-T G.711 encoder to perceptually shape the coding noise of the PCM encoder and produce a compatible bit stream.

At the decoder, the three tools offer an improved audio quality and/or a better robustness against packet losses. The first tool is a noise gate which is used to increase the clearness of the audio signal during quasi-silent periods. The second tool is a postfilter which reduces the PCM quantization noise of legacy ITU-T G.711. The third is a frame erasure concealment algorithm which is used to extrapolate the signal in case of erased frames.

The toolbox has been tested with a frame size of 5 ms. The overall complexity of the toolbox is about 4 WMOPS. All of these tools can be used separately or in combination.

This appendix contains an electronic attachment containing the respective ANSI-C source code.

Source

Amendment 2 to Recommendation ITU-T G.711 (1988) was agreed
on 6 November 2009 by ITU-T Study Group 16 (2009-2012).

FOREWORD

The International Telecommunication Union (ITU) is the United Nations specialized agency in the field of telecommunications, information and communication technologies (ICTs). The ITU Telecommunication Standardization Sector (ITU-T) is a permanent organ of ITU. ITU-T is responsible for studying technical, operating and tariff questions and issuing Recommendations on them with a view to standardizing telecommunications on a worldwide basis.

The World Telecommunication Standardization Assembly (WTSA), which meets every four years, establishes the topics for study by the ITU-T study groups which, in turn, produce Recommendations on these topics.

The approval of ITU-T Recommendations is covered by the procedure laid down in WTSA Resolution 1.

In some areas of information technology which fall within ITU-T's purview, the necessary standards are prepared on a collaborative basis with ISO and IEC.

NOTE

In this Recommendation, the expression "Administration" is used for conciseness to indicate both a telecommunication administration and a recognized operating agency.

Compliance with this Recommendation is voluntary. However, the Recommendation may contain certain mandatory provisions (to ensure e.g. interoperability or applicability) and compliance with the Recommendation is achieved when all of these mandatory provisions are met. The words "shall" or some other obligatory language such as "must" and the negative equivalents are used to express requirements. The use of such words does not suggest that compliance with the Recommendation is required of any party.

INTELLECTUAL PROPERTY RIGHTS

ITU draws attention to the possibility that the practice or implementation of this Recommendation may involve the use of a claimed Intellectual Property Right. ITU takes no position concerning the evidence, validity or applicability of claimed Intellectual Property Rights, whether asserted by ITU members or others outside of the Recommendation development process.

As of the date of approval of this Recommendation, ITU had received notice of intellectual property, protected by patents, which may be required to implement this Recommendation. However, implementers are cautioned that this may not represent the latest information and are therefore strongly urged to consult the TSB patent database at http://www.itu.int/ITU-T/ipr/.

© ITU 2010

All rights reserved. No part of this publication may be reproduced, by any means whatsoever, without the prior written permission of ITU.

CONTENTS

III.1 Scope
III.2 References
III.3 Definitions
III.4 Abbreviations and acronyms
III.5 Conventions
III.6 General description of the toolbox
III.7 Functional description of the toolbox for the encoder
III.8 Functional description of the toolbox for the decoder
III.9 Bit-exact description of the audio quality enhancement toolbox for ITU-T G.711
Electronic attachment: ANSI-C source code.

Recommendation ITU-T G.711
Pulse code modulation (PCM) of voice frequencies
Amendment 2
New Appendix III
Audio quality enhancement toolbox¹

III.1 Scope

This appendix contains the description of
a toolbox to provide audio quality enhancements to the legacy ITU-T G.711 codec.

This appendix is organized as follows. The references, definitions, abbreviations and acronyms, and conventions used throughout this appendix are defined in clauses III.2, III.3, III.4, and III.5, respectively. Clause III.6 gives a general outline of the four algorithms. The noise shaping (NS) is discussed in clause III.7.1. The frame erasure concealment (FERC) is presented in clause III.8.1. The noise gate (NG) and the postfilter (PF) are described in clauses III.8.2 and III.8.3, respectively. Clause III.9 describes the software that defines this toolbox in 16-32-bit fixed-point arithmetic.

III.2 References

ITU-T Recommendation G.191 (2005), Software tools for speech and audio coding standardization.
ITU-T Recommendation G.192 (1996), A common digital parallel interface for speech standardization activities.
ITU-T Recommendation G.711.1 (2008), Wideband embedded extension for G.711 pulse code modulation.

III.3 Definitions

This clause is intentionally left blank.

III.4 Abbreviations and acronyms

This appendix uses the abbreviations and acronyms listed in Table III.1.

Table III.1 - Glossary of abbreviations and acronyms

Acronym    Description
FERC       Frame Erasure Concealment
NB         Narrow-Band
NG         Noise Gate
NS         Noise Shaping
PCM        Pulse Code Modulation
PF         PostFilter
WMOPS      Weighted Millions of Operations Per Second

¹ This appendix includes an electronic attachment containing the respective ANSI-C source code.
III.5 Conventions

Time-domain signals are denoted by their symbol and a sample index between parentheses, e.g., s(n). The variable n is used as sample index.

Table III.2 lists the most relevant symbols used throughout this appendix.

Table III.2 - Glossary of most relevant symbols

Type         Name         Description
Filters      F(z)         Perceptual weighting filter
Signals      s_NB(n)      Input signal
             s_LB(n)      Pre-processed input signal
             s'_LB(n)     Perceptually weighted target signal
             d_L0(n)      Difference signal (quantization noise fed back through F(z))
             ŝ_L0(n)      Decoded signal of ITU-T G.711 bit stream, without offset c_Loff
             ŝ'_L0(n)     Decoded signal of ITU-T G.711
             ŝ_LB1(n)     Signal after decoding and FERC
             ŝ_LB(n)      Signal after postfilter
             ŝ_NB(n)      Signal after noise gate
Parameters   c_Loff       Encoder offset value
             a_i          LP coefficient of the perceptual filter
             I_L0         ITU-T G.711 compatible bit stream

III.6 General description of the toolbox

This toolbox contains four algorithms for audio quality enhancement of the legacy ITU-T G.711. The noise shaping (NS) is applied only in the encoder; the frame erasure concealment (FERC), the noise gate (NG) and the postfilter (PF) are applied only in the decoder. These algorithms have been extracted from the ITU-T G.711.1 scalable coder/decoder and can be used with the ITU-T G.711 legacy coder/decoder. The tools may be used separately or in combination.

This toolbox is implemented in fixed point using basic operators version 2.2 defined in the ITU-T G.191 software tool library. This appendix provides a
detailed description of all four algorithms.

III.6.1 Tools for ITU-T G.711 encoder

Only one tool is applied in the encoder: the noise shaping (NS) tool. Figure III.1 shows the high-level block diagram of an ITU-T G.711 encoder with the NS tool. Figure III.1 is described in detail in clause III.7.

[Figure: the input signal s_NB(n), n = 0,...,39, passes through the pre-processing filter to give s_LB(n); a noise feedback loop built from the G.711 encoder, a local G.711 decoder, the offset c_Loff and the perceptual filter F(z) (coefficients a_j) produces the G.711 bit stream I_L0.]

Figure III.1 - High-level block diagram of the noise shaping tool

III.6.2 Tools for ITU-T G.711 decoder

Figure III.2 shows a high-level block diagram of an ITU-T G.711 decoder combined with three tools: the frame erasure concealment (FERC), the noise gate (NG) and the postfilter (PF). This figure illustrates the recommended execution order of the tools when they are combined.

[Figure: G.711 bit stream I_L0 -> G.711 decoder -> FERC -> postfilter -> noise gate -> synthesized output signal.]

Figure III.2 - High-level block diagram of decoder toolbox

III.6.3 Algorithmic delay

Table III.3 gives the algorithmic delay of each tool and the algorithmic delay for the combination of the three tools at the decoder side. Note that these algorithmic delays are given for 5 ms frame size.

Table III.3 - Algorithmic delay of the toolbox (ms)

NS    NG    PF    FERC    FERC+PF+NG
0     0     2     5       5

III.6.4 Computational complexity and storage requirements

The observed worst-case complexity of the toolbox is based on the implementation with basic operators of the ITU-T software tool library STL2005 v2.2 in ITU-T G.191. The worst-case computational complexity is detailed in Table III.4; each figure is the observed worst case over μ-law and A-law. The storage requirements in 16-bit words for the four tools are given in Table III.5. Note that the RAM figures are based on the arrays, which form the dominant part, and not on singular variables. It was found that the number of such variables was insignificant when compared with the size required by arrays.

Table III.4 - Worst computational complexity of the toolbox

         NS      NG      PF      FERC    FERC+PF+NG
WMOPS    0.87    0.23    2.02    2.05    3.31

Table III.5 - Storage requirements of the toolbox

Memory type                          NS      NG      PF      FERC
Static RAM (kWords)                  0.093   0.003   0.353   0.984
Scratch RAM (kWords)                 0.107   0.012   0.529   0.314
Data ROM (kWords)                    0.088   0       0.191   0.121
Program ROM (number of basic ops)    191     37      593     728

III.6.5 Toolbox description

The description of the toolbox algorithms is made in terms of bit-exact fixed-point mathematical operations. The ANSI-C code indicated in clause III.9, which constitutes an integral part of this appendix, reflects this bit-exact, fixed-point descriptive approach. The mathematical descriptions of the encoder and decoder can be implemented in other fashions, possibly leading to a codec implementation not complying with this appendix. Therefore, the algorithm description of the ANSI-C code of clause III.9 shall take precedence over the mathematical descriptions whenever discrepancies are found.

III.7 Functional description of the toolbox for the encoder

III.7.1 Noise shaping (NS) tool

The input signal s_NB(n) is encoded using μ-law or A-law pulse code modulation (PCM) with noise feedback to perceptually shape the coding noise of the PCM encoder. The encoder with weighted noise feedback loop is shown in Figure III.1.

First, the input signal s_NB(n) is pre-processed by a high-pass filter with a cut-off frequency of 50 Hz. Then, the pre-processed signal s_LB(n) is added to a noise feedback signal and the
offset value c_Loff, and the resulting signal s'_LB(n) is fed to the legacy ITU-T G.711 encoder. Based on the obtained bit stream I_L0(n), the legacy ITU-T G.711 decoder locally decodes the signal ŝ'_L0(n), and the offset value c_Loff is removed to obtain ŝ_L0(n). An LP analysis is then performed on ŝ_L0(n) to obtain the coefficients a_i, and the perceptual filter F(z) is calculated. Then the quantization noise d_L0(n), filtered by F(z), is fed back to be added to the input signal s_LB(n).

It should be noted that for very low energy signals, the legacy ITU-T G.711 encoding, based on log-PCM, is replaced by a different encoding scheme called "dead-zone quantizer". This is described later in clause III.7.1.4.

III.7.1.1 Pre-processing high-pass filter

Same as clause 7.1 of ITU-T G.711.1.

III.7.1.2 PCM encoder based on G.711

Same as clause 7.3.1 of ITU-T G.711.1.

III.7.1.3 Perceptual filtering

Same as clause 7.3.2 of ITU-T G.711.1.

III.7.1.4 Dead-zone quantizer

Same as clause 7.3.3 of ITU-T G.711.1.

III.8 Functional description of the toolbox for the decoder

The toolbox includes three tools at the decoder side. All these tools can be used either in combination or alone. Figure III.2 shows the position of each tool in the processing chain. The algorithmic descriptions of the tools are given in the following clauses.

III.8.1 Narrow-band frame erasure