STD-ITU-T RECMN G.711 APP II-ENGL 2000

INTERNATIONAL TELECOMMUNICATION UNION

ITU-T                                    G.711
TELECOMMUNICATION                        Appendix II
STANDARDIZATION SECTOR OF ITU            (02/2000)

SERIES G: TRANSMISSION SYSTEMS AND MEDIA, DIGITAL SYSTEMS AND NETWORKS
Digital transmission systems - Terminal equipments - Coding of analogue signals by pulse code modulation

Pulse code modulation (PCM) of voice frequencies

Appendix II: A comfort noise payload definition for ITU-T G.711 use in packet-based multimedia communication systems

ITU-T Recommendation G.711 - Appendix II
(Formerly CCITT Recommendation)

ITU-T G-SERIES RECOMMENDATIONS
TRANSMISSION SYSTEMS AND MEDIA, DIGITAL SYSTEMS AND NETWORKS

INTERNATIONAL TELEPHONE CONNECTIONS AND CIRCUITS                                  G.100-G.199
INTERNATIONAL ANALOGUE CARRIER SYSTEM
GENERAL CHARACTERISTICS COMMON TO ALL ANALOGUE CARRIER-TRANSMISSION SYSTEMS       G.200-G.299
INDIVIDUAL CHARACTERISTICS OF INTERNATIONAL CARRIER TELEPHONE SYSTEMS
ON METALLIC LINES                                                                 G.300-G.399
GENERAL CHARACTERISTICS OF INTERNATIONAL CARRIER TELEPHONE SYSTEMS ON
RADIO-RELAY OR SATELLITE LINKS AND INTERCONNECTION WITH METALLIC LINES            G.400-G.449
COORDINATION OF RADIOTELEPHONY AND LINE TELEPHONY                                 G.450-G.499
TESTING EQUIPMENTS
TRANSMISSION MEDIA CHARACTERISTICS                                                G.600-G.699
DIGITAL TRANSMISSION SYSTEMS
TERMINAL EQUIPMENTS                                                               G.700-G.799
  General                                                                         G.700-G.709
  Coding of analogue signals by methods other than PCM                            G.720-G.729
  Principal characteristics of primary multiplex equipment                        G.730-G.739
  Principal characteristics of second order multiplex equipment                   G.740-G.749
  Principal characteristics of higher order multiplex equipment                   G.750-G.759
  Principal characteristics of transcoder and digital multiplication equipment    G.760-G.769
  Operations, administration and maintenance features of transmission equipment   G.770-G.779
  Principal characteristics of multiplexing equipment for the synchronous
  digital hierarchy                                                               G.780-G.789
  Other terminal equipment                                                        G.790-G.799
DIGITAL NETWORKS                                                                  G.800-G.899
DIGITAL SECTIONS AND DIGITAL LINE SYSTEM                                          G.900-G.999

For further details, please refer to ITU-T List of Recommendations.

ITU-T RECOMMENDATION G.711
PULSE CODE MODULATION (PCM) OF VOICE FREQUENCIES
APPENDIX II
A comfort noise payload definition for ITU-T G.711 use in packet-based multimedia communication systems

Summary
This appendix defines a comfort noise payload format (or bit-stream) for ITU-T G.711 use in packet-based multimedia communication systems. The use of the payload format is intended for packet-based systems with a large header overhead where the packet transmission rate
plays a significant role in the overall system bit-rate. In this situation, the use of VAD/DTX/CNG can significantly reduce the packet transmission rate and hence improve the bandwidth efficiency.

Source
Appendix II to ITU-T Recommendation G.711 was prepared by ITU-T Study Group 16 (1997-2000) and was
approved under the WTSC Resolution No. 1 procedure on 28 February 2000.

FOREWORD
ITU (International Telecommunication Union) is the United Nations Specialized Agency in the field of telecommunications. The ITU Telecommunication Standardization Sector (ITU-T) is a permanent organ of the ITU. The ITU-T is responsible for studying technical, operating and tariff questions and issuing Recommendations on them with a view to standardizing telecommunications on a worldwide basis.

The World Telecommunication Standardization Conference (WTSC), which meets every four years, establishes the topics for study by the ITU-T Study Groups which, in their turn, produce Recommendations on these topics.

The approval of Recommendations by the Members of the ITU-T is covered by the procedure laid down in WTSC Resolution No. 1.

In some areas of information technology which fall within ITU-T's purview, the necessary standards are prepared on a collaborative basis with ISO and IEC.

NOTE
In this Recommendation, the expression "Administration" is used for conciseness to indicate both a telecommunication administration and a recognized operating agency.

INTELLECTUAL PROPERTY RIGHTS
The ITU draws attention to the possibility that the practice or implementation of this Recommendation may involve the use of a claimed Intellectual Property Right. The ITU takes no position concerning the evidence, validity or applicability of claimed Intellectual Property Rights, whether asserted by ITU members or others outside of the Recommendation development process.

As of the date of approval of this Recommendation, the ITU had not received notice of intellectual property, protected by patents, which may be required to implement this Recommendation. However, implementors are cautioned that this may not represent the latest information and are therefore strongly urged to consult the TSB patent database.

© ITU 2000
All rights reserved. No part of this publication may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without permission in writing from the ITU.

Appendix II - A comfort noise payload definition for ITU-T G.711 use in packet-based multimedia communication systems
II.1 Scope
II.2 Comfort noise payload definition
    II.2.1 Noise level
    II.2.2 Reflection coefficients
    II.2.3 Payload packing
II.3 Guidelines for use
    II.3.1 Factors affecting system performance
    II.3.2 Illustration of bandwidth savings in packet-based network applications
II.4 Performance results
II.5 Example solution
    II.5.1 Algorithm description
    II.5.2 Tested configuration

Recommendation G.711
PULSE CODE MODULATION (PCM) OF VOICE FREQUENCIES
APPENDIX II
A comfort noise payload definition for ITU-T G.711 use in packet-based multimedia communication systems
(Geneva, 2000)

II.1 Scope
This appendix defines a comfort noise payload format (or bit-stream) for ITU-T G.711 use in packet-based multimedia communication systems. The payload format is generic and may also be used with other speech codecs without built-in Discontinuous Transmission (DTX) capability such as ITU-T Recommendations G.726 [1], G.727 [2], G.728 [3], and G.722 [4]. The payload format provides a minimum interoperability specification for communication of comfort noise parameters. The comfort noise analysis and synthesis as well as the Voice Activity Detection (VAD) and DTX algorithms are unspecified and left implementation-specific. However, an example solution has been tested and is described. It uses the VAD and DTX of G.729 Annex B [5] and a comfort noise generation (CNG) algorithm which is provided for information.

The use of the payload format is intended for packet-based systems with a large header overhead where the packet transmission rate plays a significant role in the overall system bit-rate. In this situation, the use of VAD/DTX/CNG can significantly reduce the packet transmission rate and hence improve the bandwidth efficiency.

II.2 Comfort noise payload definition
The comfort noise payload consists of a description of the noise level and spectral information in the form of reflection coefficients. The use of spectral information is optional and the all-pole model order is left unspecified. The encoder can determine the appropriate model order based on such considerations as quality, complexity, expected
environmental noise and signal bandwidth. The model order is not explicitly transmitted since it can be derived from the length of the payload at the receiver. For complexity or other reasons, the decoder may reduce the model order by setting higher order reflection coefficients to zero.

II.2.1 Noise level
The noise level is expressed in dBov, with values from 0 to 127 representing 0 to -127 dBov. dBov is the level relative to the overload of the system. The noise level is packed with the Most Significant Bit (MSB) first with the unused bit always set to 0 according to Figure II.1.

     0 1 2 3 4 5 6 7
    +-+-+-+-+-+-+-+-+
    |0|    Level    |
    +-+-+-+-+-+-+-+-+
       MSB

Figure II.1/G.711 - Noise level bit packing

II.2.2 Reflection coefficients
The spectral information is transmitted using reflection coefficients [6]. From the polynomial:

$$A(z) = 1 - \sum_{j=1}^{M} a_j z^{-j}$$

obtained by linear prediction analysis, the set of reflection coefficients may be obtained from the set of LPC coefficients using a backward recursion of the form:

$$k_i = a_i^{(i)}$$

$$a_j^{(i-1)} = \frac{a_j^{(i)} + k_i \, a_{i-j}^{(i)}}{1 - k_i^2}, \quad 1 \le j \le i-1$$

where i goes from M, to M - 1, down to 1 with the initial condition:

$$a_j^{(M)} = a_j, \quad 1 \le j \le M$$

Note that the above formulation results in the solution to $k_1$ given by:

$$k_1 = \frac{r_1}{r_0}$$

where $r_i$ is the ith autocorrelation coefficient of the input signal.

Each reflection coefficient can have values between -1 and 1 and is quantized uniformly using 8 bits. The quantized value is represented by the 8-bit index N, where N = 0, ..., 254, and index N = 255 is reserved for future use.
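
The noise level packing of II.2.1 and the backward recursion of II.2.2 can be sketched in Python as follows. This is a minimal illustration, not part of the appendix: the function names `pack_noise_level` and `lpc_to_reflection` are hypothetical, rounding the level to the nearest integer and clamping it to 0..127 are assumptions, and the recursion is the standard step-down form for the sign convention A(z) = 1 - sum of a_j z^-j (the one for which k_1 = r_1/r_0).

```python
def pack_noise_level(level_dbov: float) -> int:
    """Return the noise-level byte: unused top bit 0, low 7 bits = -level.

    Rounding and clamping to 0..127 are assumptions of this sketch.
    """
    magnitude = int(round(-level_dbov))
    magnitude = max(0, min(127, magnitude))
    return magnitude & 0x7F


def lpc_to_reflection(a):
    """Convert LPC coefficients [a_1..a_M] to reflection coefficients.

    Uses the step-down (backward) recursion for
    A(z) = 1 - sum_{j=1..M} a_j z^-j and returns [k_1..k_M].
    """
    cur = list(a)                      # cur[j-1] holds a_j^(i)
    order = len(cur)
    k = [0.0] * order
    for i in range(order, 0, -1):      # i = M, M-1, ..., 1
        ki = cur[i - 1]                # k_i = a_i^(i)
        if abs(ki) >= 1.0:
            raise ValueError("unstable predictor: |k_i| >= 1")
        k[i - 1] = ki
        denom = 1.0 - ki * ki
        # a_j^(i-1) = (a_j^(i) + k_i * a_(i-j)^(i)) / (1 - k_i^2)
        cur = [(cur[j - 1] + ki * cur[i - j - 1]) / denom
               for j in range(1, i)]
    return k
```

For example, `pack_noise_level(-30)` yields the byte value 30, and a second-order predictor with a = [0.65, -0.3] maps back to the reflection coefficients 0.5 and -0.3.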
Each index N is packed into a separate byte with the MSB first. The quantized value of each reflection coefficient can be obtained from its corresponding index using:

$$k_i(N_i) = \frac{258 \cdot (N_i - 127)}{32768} \quad \text{for } N_i = 0, \ldots, 254; \quad -1 < k_i < 1$$

[...] 0.06) end Th = 0.06 end

The reflection coefficients $k_m(i)$ are computed from the selected autocorrelation coefficients using the Levinson-Durbin algorithm.

II.5.1.1.4 Quantization
For Silence Insertion Descriptor (SID) frames, the energy E(i) and the reflection coefficients $k_m(i)$ are quantized and packed according to the specified payload format.

II.5.1.2 Decoder
The decoder produces comfort noise by passing a scaled white noise excitation through a linear prediction synthesis filter. The details are given in the following subclauses.

II.5.1.2.1 Parameter update
The reflection coefficients from the last received SID frame are used in the current frame. Let the energy from the last received comfort noise parameters be denoted $LE_{SID}$, where the energy has been converted from dBov to base-2 logarithm. The energy used in the current frame is given by:

$$LE(i) = LE(i-1) \cdot \alpha + LE_{SID} \cdot (1.0 - \alpha)$$

where $\alpha = 0.9$. This smoothing procedure is done to avoid any abrupt changes in signal energy in the comfort noise.

II.5.1.2.2 Excitation generation
A random number generator with a Gaussian distribution is used to produce the sequence $R_n$, which is scaled by the factor q to the correct energy according to the equation:

[...]

where L is the length of the excitation, and E(i) is the frame energy. A constant approximation for the denominator of the above equation is used to avoid the dot product operation and reduce complexity.

II.5.1.2.3 LP synthesis
The reflection coefficients are converted to linear prediction coefficients for use in the linear prediction (LP) synthesis filter according to the following recursion [6]:

$$a_i^{(i)} = k_i$$

$$a_j^{(i)} = a_j^{(i-1)} - k_i \, a_{i-j}^{(i-1)}, \quad 1 \le j \le i-1$$

being solved for i = 1, 2, ..., M and with the final set defined as:

$$a_j = a_j^{(M)}, \quad 1 \le j \le M$$

The linear prediction synthesis filter is defined as:

$$H(z) = \frac{1}{A(z)} = \frac{1}{1 - \sum_{j=1}^{M} a_j z^{-j}}$$

The scaled excitation is passed through the filter to produce the final comfort noise. The length of the excitation L is, in general, equal to the frame length. However, for the first inactive frame following an active frame, L is equal to the frame length plus the model order (M). In this case, the first M output samples from the synthesis filter are ignored.

II.5.1.3 Delay
There is no delay inherent in the comfort noise algorithm.

II.5.1.4 Complexity
The algorithm has been implemented in 16-bit fixed-point using the ITU-T Software Tool Library. The memory and resource usage at different frame sizes, operating at an 8.0 kHz sampling rate with a 10th order all-pole model, is summarized in Table II.2. The WMOPS are obtained using the operations counter within the library and represent the worst case. The ROM is the estimated size on a typical fixed-point DSP.

Table II.2/G.711 - CNG Resource Requirements for a 10th order model

    Frame Size    RAM (words)    ROM (words)    WMOPS
    5 ms          650            1300           1.1
    10 ms         690            1300           0.66
    20 ms         760            1300           0.47

II.5.2 Tested configuration
The algorithm as tested is specified in Table II.3.

    Parameter           As Tested
    Sampling Rate       8.0 kHz
    Frame Size
    Model Order
    Look-Ahead Delay

A look-ahead of 5 ms was added during testing by delaying the input to the accompanying speech codec (G.711) as in Figure II.7. The look-ahead was introduced to properly tailor the usage of the VAD of G.729 Annex B to the CNG example solution. The look-ahead delay can be avoided in practice by adding an extra hangover frame to the
G.729 Annex B VAD.

[Figure: input PCM stream timeline showing the CNG input frame, the audio coder input and the CNG look-ahead]

Figure II.7/G.711 - CNG Look-ahead during Testing

References
[1] CCITT Recommendation G.726 (1990), 40, 32, 24, 16 kbit/s adaptive differential pulse code modulation (ADPCM).
[2] CCITT Recommendation G.727 (1990), 5-, 4-, 3- and 2-bit/sample embedded adaptive differential pulse code modulation (ADPCM).
[3] CCITT Recommendation G.728 (1992), Coding of speech at 16 kbit/s using low-delay code excited linear prediction.
[4] CCITT Recommendation G.722 (1988), 7 kHz audio-coding within 64 kbit/s.
[5] ITU-T Recommendation G.729 Annex B (1996), A silence compression scheme for G.729 optimized for terminals conforming to Recommendation V.70.
[6] RABINER (L.R.), SCHAFER (R.W.): Digital processing of speech signals, Prentice-Hall, 1978.
[7] ITU-T Recommendation G.191 (1996), Software tools for speech and audio coding standardization.
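
The decoder-side steps of II.2 and II.5.1.2 (payload parsing, parameter update and LP synthesis) can be sketched in floating-point Python as follows. This is an illustrative sketch under stated assumptions, not the tested 16-bit fixed-point implementation: all function names are hypothetical, the divisor 32768 in the dequantization step is assumed from the RTP comfort noise payload specification (RFC 3389), which uses the same format, rejecting the reserved index 255 is a choice of this sketch, and excitation scaling is left out.

```python
ALPHA = 0.9  # smoothing factor from II.5.1.2.1


def parse_payload(payload: bytes):
    """Split a comfort noise payload into (level_dbov, [k_1..k_M]).

    The model order M is implied by the payload length (M = len - 1).
    """
    if not payload:
        raise ValueError("empty payload")
    level_dbov = -(payload[0] & 0x7F)        # unused top bit is ignored
    if any(n == 255 for n in payload[1:]):   # reserved index (sketch rejects it)
        raise ValueError("reserved index 255")
    # Assumed uniform dequantization: k = 258 * (N - 127) / 32768
    k = [258.0 * (n - 127) / 32768.0 for n in payload[1:]]
    return level_dbov, k


def reflection_to_lpc(k):
    """Step-up recursion; returns [a_1..a_M] for A(z) = 1 - sum a_j z^-j."""
    a = []
    for i, ki in enumerate(k, start=1):
        # a_j^(i) = a_j^(i-1) - k_i * a_(i-j)^(i-1), then a_i^(i) = k_i
        a = [a[j - 1] - ki * a[i - j - 1] for j in range(1, i)] + [ki]
    return a


def smooth_energy(prev_le, le_sid, alpha=ALPHA):
    """LE(i) = LE(i-1) * alpha + LE_SID * (1 - alpha), in the log2 domain."""
    return prev_le * alpha + le_sid * (1.0 - alpha)


def synthesize(excitation, a, state=None):
    """All-pole filter 1/A(z): y[n] = x[n] + sum_j a_j * y[n-j].

    `state` holds the previous outputs, most recent first.
    Returns (output samples, updated state).
    """
    m = len(a)
    hist = list(state) if state is not None else [0.0] * m
    out = []
    for x in excitation:
        y = x + sum(a[j] * hist[j] for j in range(m))
        out.append(y)
        if m:
            hist = [y] + hist[:-1]
    return out, hist
```

A caller would parse each SID payload, convert the reflection coefficients once per update, and feed scaled Gaussian noise through `synthesize`, carrying `state` from frame to frame. For the first inactive frame after an active one, the text above calls for M extra excitation samples whose outputs are discarded.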