International Telecommunication Union

ITU-T G.722.1
TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU
(05/2005)

SERIES G: TRANSMISSION SYSTEMS AND MEDIA, DIGITAL SYSTEMS AND NETWORKS
Digital terminal equipments - Coding of analogue signals by methods other than PCM

Low-complexity coding at 24 and 32 kbit/s for hands-free operation in systems with low frame loss

ITU-T Recommendation G.722.1

ITU-T G-SERIES RECOMMENDATIONS
TRANSMISSION SYSTEMS AND MEDIA, DIGITAL SYSTEMS AND NETWORKS

INTERNATIONAL TELEPHONE CONNECTIONS AND CIRCUITS  G.100-G.199
GENERAL CHARACTERISTICS COMMON TO ALL ANALOGUE CARRIER-TRANSMISSION SYSTEMS  G.200-G.299
INDIVIDUAL CHARACTERISTICS OF INTERNATIONAL CARRIER TELEPHONE SYSTEMS ON METALLIC LINES  G.300-G.399
GENERAL CHARACTERISTICS OF INTERNATIONAL CARRIER TELEPHONE SYSTEMS ON RADIO-RELAY OR SATELLITE LINKS AND INTERCONNECTION WITH METALLIC LINES  G.400-G.449
COORDINATION OF RADIOTELEPHONY AND LINE TELEPHONY  G.450-G.499
TRANSMISSION MEDIA CHARACTERISTICS  G.600-G.699
DIGITAL TERMINAL EQUIPMENTS  G.700-G.799
  General  G.700-G.709
  Coding of analogue signals by pulse code modulation  G.710-G.719
  Coding of analogue signals by methods other than PCM  G.720-G.729
  Principal characteristics of primary multiplex equipment  G.730-G.739
  Principal characteristics of second order multiplex equipment  G.740-G.749
  Principal characteristics of higher order multiplex equipment  G.750-G.759
  Principal characteristics of transcoder and digital multiplication equipment  G.760-G.769
  Operations, administration and maintenance features of transmission equipment  G.770-G.779
  Principal characteristics of multiplexing equipment for the synchronous digital hierarchy  G.780-G.789
  Other terminal equipment  G.790-G.799
DIGITAL NETWORKS  G.800-G.899
DIGITAL SECTIONS AND DIGITAL LINE SYSTEM  G.900-G.999
QUALITY OF SERVICE AND PERFORMANCE - GENERIC AND USER-RELATED ASPECTS  G.1000-G.1999
TRANSMISSION MEDIA CHARACTERISTICS  G.6000-G.6999
DATA OVER TRANSPORT - GENERIC ASPECTS  G.7000-G.7999
ETHERNET OVER TRANSPORT ASPECTS  G.8000-G.8999
ACCESS NETWORKS  G.9000-G.9999

For further details, please refer to the list of ITU-T Recommendations.

ITU-T Rec. G.722.1 (05/2005)

ITU-T Recommendation G.722.1

Low complexity coding at 24 and 32 kbit/s for hands-free operation in systems with low frame loss

Summary

This Recommendation describes a low complexity encoder and decoder that may be used for 7-kHz bandwidth audio signals working at 24
kbit/s or 32 kbit/s. Further, this algorithm is recommended for use in hands-free applications such as conferencing where there is a low probability of frame loss. It may be used with speech or music inputs. The bit rate may be changed at any 20-ms frame boundary.

New Annex C contains the description of a low-complexity extension mode to G.722.1, which doubles the algorithm to permit 14-kHz audio bandwidth using a 32-kHz audio sample rate, at 24, 32, and 48 kbit/s. This mode is suitable for use in video conferencing, teleconferencing, and Internet streaming applications, and uses the same 20-ms frame length, 40-ms algorithmic delay, and same algorithmic steps as the 7-kHz mode. Less than 5.5 WMOPS are required for encoding and decoding in the baseline 7-kHz mode, and less than 11 WMOPS are required for encoding and decoding in the 14-kHz mode of Annex C.

This Recommendation includes a software package which contains the encoder and decoder source code and a set of test vectors for developers. These vectors are a tool providing an indication of success in implementing this code. The fixed-point code implements both the 7-kHz mode (main body) and the 14-kHz mode (Annex C). The floating-point code implements only the 7-kHz mode.

Source

ITU-T Recommendation G.722.1 was approved on 14 May 2005 by ITU-T Study Group 16 (2005-2008) under the ITU-T Recommendation A.8 procedure.

FOREWORD

The International Telecommunication Union (ITU) is the United Nations specialized agency in the field of telecommunications. The ITU Telecommunication Standardization Sector (ITU-T) is a permanent organ of ITU. ITU-T is responsible for studying technical, operating and tariff questions and issuing Recommendations on them with a view to standardizing telecommunications
on a worldwide basis.

The World Telecommunication Standardization Assembly (WTSA), which meets every four years, establishes the topics for study by the ITU-T study groups which, in turn, produce Recommendations on these topics.

The approval of ITU-T Recommendations is covered by the procedure laid down in WTSA Resolution 1.

In some areas of information technology which fall within ITU-T's purview, the necessary standards are prepared on a collaborative basis with ISO and IEC.

NOTE

In this Recommendation, the expression "Administration" is used for conciseness to indicate both a telecommunication administration and a recognized operating agency.

Compliance with this Recommendation is voluntary. However, the Recommendation may contain certain mandatory provisions (to ensure e.g. interoperability or applicability) and compliance with the Recommendation is achieved when all of these mandatory provisions are met. The words "shall" or some other obligatory language such as "must" and the negative equivalents are used to express requirements. The use of such words does not suggest that compliance with the Recommendation is required of any party.

INTELLECTUAL PROPERTY RIGHTS

ITU draws attention to the possibility that the practice or implementation of this Recommendation may involve the use of a claimed Intellectual Property Right. ITU takes no position concerning the evidence, validity or applicability of claimed Intellectual Property Rights, whether asserted by ITU members or others outside of the Recommendation development process.

As of the date of approval of this Recommendation, ITU had received notice of intellectual property, protected by patents, which may be required to implement this Recommendation. However, implementors are cautioned that this may not represent the
latest information and are therefore strongly urged to consult the TSB patent database.

© ITU 2005

All rights reserved. No part of this publication may be reproduced, by any means whatsoever, without the prior written permission of ITU.

CONTENTS

1 Scope
2 Normative references
3 The encoder
  3.1 The Modulated Lapped Transform (MLT)
  3.2 Computing and quantizing the amplitude envelope
  3.3 Coding the amplitude envelope
  3.4 Categorization procedure
  3.5 Scalar Quantized Vector Huffman Coding (SQVH)
  3.6 Rate control
  3.7 Transmission of the MLT vector indices
  3.8 Bit stream
4 The decoder
  4.1 Decoding the amplitude envelope
  4.2 Determining categorization
  4.3 Decoding MLT coefficients
  4.4 Noise-fill
  4.5 Insufficient bits
  4.6 Frame erasure
  4.7 The Inverse MLT (IMLT)
5 C code
6 Flow chart of categorization procedure
Annex A - Packet format, capability identifiers and capability parameters
  A.1 References
  A.2 Packet structure for G.722.1 frames
  A.3 Capability Identifiers and Parameters for use with ITU-T Rec. H.245
Annex B - Floating-point implementation for G.722.1
  B.1 Introduction
  B.2 Algorithmic description
  B.3 ANSI C code
Annex C - 14 kHz mode at 24, 32, and 48 kbit/s
  C.1 Introduction
  C.2 Algorithmic description
  C.3 ANSI C code

ITU-T Recommendation G.722.1

Low-complexity coding at 24 and 32 kbit/s for hands-free operation
in systems with low frame loss¹

1 Scope

This Recommendation describes a digital wideband coder algorithm that provides an audio bandwidth of 50 Hz to 7 kHz, operating at a bit rate of 24 kbit/s or 32 kbit/s. The digital input to the coder may be 14-, 15- or 16-bit 2's complement format at a sample rate of 16 kHz (handled in the same way as in ITU-T Rec. G.722). The analogue and digital interface circuitry at the encoder input and decoder output should conform to the same specifications described in ITU-T Rec. G.722.

The algorithm is based on transform technology, using a Modulated Lapped Transform (MLT). It operates on 20-ms frames (320 samples) of audio. Because the transform window (basis function length) is 640 samples and a 50 per cent (320 samples) overlap is used between frames, the effective look-ahead buffer size is 20 ms. Hence the total algorithmic delay of 40 ms is the sum of the frame size plus look-ahead. All other delays are due to computational and network transmission delays.

The description of the coding algorithm of this Recommendation is made in terms of bit-exact, fixed-point mathematical operations. The C code indicated in clause 5, which constitutes an integral
part of this Recommendation, reflects this bit-exact, fixed-point descriptive approach, and shall take precedence over the mathematical descriptions of clauses 3 and 4 whenever discrepancies are found. The mathematical descriptions of the encoder (clause 3) and decoder (clause 4) could have been implemented in several other fashions, but the C code of clause 5 has been provided for reference purposes. Thus, to comply with this Recommendation, any implementation must produce for any input signal the same output results as the C code of clause 5. Note that to ensure that this goal is achieved,
implementations should follow the computational details, tables of constants, sequencing of variable adaptation and use given by the C code of clause 5. However, it is recognized that there are many parts of the algorithm critical to maintaining correct bit-exact operation. For these parts, implementations shall reproduce the computational details, tables of constants, sequencing of variable adaptation and use written in the C code of clause 5.

It is recognized that the C code provided is for reference, and has not been optimized (in terms of memory, complexity, etc.) for a specific implementation platform. The C code may require optimization for a particular implementation.

A non-exhaustive set of test signals is provided as part of this Recommendation, as a tool to assist implementors in verifying that their implementations of the encoder and decoder comply with this Recommendation.

In practice, purchasers of wideband equipment or software implementations will expect them to be compliant with this standard to ensure interworking capability. Implementors may choose to optimize the C code, or otherwise modify the reference C code. In such cases the implementor shall verify that his implementation produces the same resultant output for any given input as would be expected using the C code expressed in clause 5.

____________
¹ This Recommendation includes a software package that contains the encoder and decoder source code and a set of test vectors for developers.

2 Normative references

The following ITU-T Recommendations and other references contain provisions which, through reference in this text, constitute provisions of this Recommendation. At the time of publication, the editions indicated were valid. All Recommendations and other references are subject to revision; users of this Recommendation are therefore encouraged to investigate the possibility of applying the most recent edition of the Recommendations and other references listed below. A list of the currently valid ITU-T Recommendations is regularly published. The reference to a document within this Recommendation does not give it, as a stand-alone document, the status of a Recommendation.

[1] CCITT Recommendation G.722 (1988), 7-kHz audio-coding within 64 kbit/s.
[2] ITU-T Recommendation G.192 (1996), A common digital parallel interface for speech standardization activities.
[3] ISO/IEC 9899:1999, Programming languages - C.

3 The encoder

Figure 1 presents a block diagram of the encoder.

Figure 1/G.722.1 - Block diagram of the encoder

Every 20 milliseconds (320 samples) the most recent 640 time domain audio samples are fed to a Modulated Lapped Transform (MLT). Each transform produces a frame of 320 MLT coefficients, and each frame of MLT coefficients is coded independently, i.e., there is no state information left over from the previous frame. For 24-kbit/s and 32-kbit/s operation the allotment of bits per frame is 480 and 640, respectively.

The transform coefficients generated by the MLT transform are first applied to a module that computes the amplitude envelope and quantizes it; see Figure 2. The amplitude envelope is a coarse representation of the MLT spectrum. The spectrum is divided into blocks of 20 MLT coefficients called regions. Each region represents a bandwidth of 500 Hz. As the bandwidth is 7 kHz, the number_of_regions is set at fourteen. MLT coefficients representing frequencies above 7 kHz are ignored. The code bits representing the amplitude envelope are sent to the MUX (Multiplexer) for transmission to the decoder. The bits remaining after quantization and coding of the amplitude envelope are used to encode the MLT coefficients in the categorization process.

Figure 2/G.722.1 - An illustration of how the spectrum is divided into fourteen regions, each containing 20 MLT coefficients

Using the quantized amplitude envelope and the number of bits remaining in the frame after amplitude envelope encoding (and provision for four categorization control bits), the categorization procedure generates sixteen sets of categorizations (categorization 0 to categorization 15). Different categorizations require different numbers of bits to encode the same MLT coefficients. Each categorization consists of a set of fourteen category assignments, one assignment for each of the fourteen regions. A category defines a set of predetermined quantization and coding parameters for a region. Associated with each category is an expected number of bits required to encode a region. Because this coder uses variable length Huffman coding, the final number of bits used will vary depending on the particular sequence of MLT coefficients in the region.

Next, the MLT coefficients are quantized and coded differently for each one of the sixteen
computed categorizations. For each categorization the actual number of code bits required is determined.

The quantization and encoding proceed region by region. A categorization determines the category assignment for all the fourteen regions, and the category assignment together with the amplitude envelope for each region determine all of the quantization and coding parameters that will be used for all twenty MLT coefficients in the region. The MLT coefficients in a region are first normalized by the quantized amplitude envelope in the region and then scalar quantized. The resulting scalar quantization indices are combined into vector indices. The vector indices are then Huffman coded, i.e., they are coded with a variable number of bits. The most frequent vector indices require fewer bi