ETSI TS 126 290 V14.0.0 (2017-04)

Digital cellular telecommunications system (Phase 2+) (GSM);
Universal Mobile Telecommunications System (UMTS);
LTE;
Audio codec processing functions;
Extended Adaptive Multi-Rate - Wideband (AMR-WB+) codec;
Transcoding functions
(3GPP TS 26.290 version 14.0.0 Release 14)

TECHNICAL SPECIFICATION

Reference
RTS/TSGS-0426290ve00

Keywords
GSM, LTE, UMTS

ETSI
650 Route des Lucioles
F-06921 Sophia Antipolis Cedex - FRANCE
Tel.: +33 4 92 94 42 00 Fax: +33 4 93 65 47 16
Siret N° 348 623 562 00017 - NAF 742 C
Non-profit association registered at the Sous-Préfecture de Grasse (06) N° 7803/88

Important notice

The present document can be downloaded from: http://www.etsi.org/standards-search

The present document may be made available in electronic versions and/or in print. The content of any electronic and/or print versions of the present document shall not be modified without the prior written authorization of ETSI. In case of any existing or perceived difference in contents between such versions and/or in print, the only prevailing document is the print of the Portable Document Format (PDF) version kept on a specific network drive within ETSI Secretariat.

Users of the present document should be aware that the document may be subject to revision or change of status. Information on the current status of this and other ETSI documents is available at https://portal.etsi.org/TB/ETSIDeliverableStatus.aspx

If you find errors in the present document, please send your comment to one of the following services: https://portal.etsi.org/People/CommiteeSupportStaff.aspx

Copyright Notification

No part may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm except as authorized by written permission of ETSI. The content of the PDF version shall not be modified without the written authorization of ETSI. The copyright and the foregoing restriction extend to reproduction in all media.

© European Telecommunications Standards Institute 2017. All rights reserved.

DECT™, PLUGTESTS™, UMTS™ and the ETSI logo are Trade Marks of ETSI registered for the benefit of its Members. 3GPP™ and LTE are Trade Marks of ETSI registered for the benefit of its Members and of the 3GPP Organizational Partners. GSM and the GSM logo are Trade Marks registered and owned by the GSM Association.

Intellectual Property Rights

IPRs essential or potentially essential to the present document may have been declared to ETSI. The information pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found in ETSI SR 000 314: "Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in respect of ETSI standards", which is available from the ETSI Secretariat. Latest updates are available on the ETSI Web server (https://ipr.etsi.org/).

Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web server) which are, or may be, or may become, essential to the present document.

Foreword

This Technical Specification (TS) has been produced by ETSI 3rd Generation Partnership Project (3GPP).

The present document may refer to technical specifications or reports using their 3GPP identities, UMTS identities or GSM identities. These should be interpreted as being references to the corresponding ETSI deliverables.

The cross reference between GSM, UMTS, 3GPP and ETSI identities can be found under http://webapp.etsi.org/key/queryform.asp.

Modal verbs terminology

In the present document "shall", "shall not", "should", "should not", "may", "need not", "will", "will not", "can" and "cannot" are to be interpreted as described in clause 3.2 of the ETSI Drafting Rules (Verbal forms for the expression of provisions).

"must" and "must not" are NOT allowed in ETSI deliverables except when used in direct citation.
Contents

Intellectual Property Rights
Foreword
Modal verbs terminology
Foreword
1 Scope
2 References
3 Definitions and abbreviations
3.1 Definitions
3.2 Abbreviations
4 Outline description
4.1 Functional description of audio parts
4.2 Preparation of input samples
4.3 Principles of the extended adaptive multi-rate wideband codec
4.3.1 Encoding and decoding structure
4.3.2 LP analysis and synthesis in low-frequency band
4.3.3 ACELP and TCX coding
4.3.4 Coding of high-frequency band
4.3.5 Stereo coding
4.3.6 Low complexity operation
4.3.7 Frame erasure concealment
4.3.8 Bit allocation
5 Functional description of the encoder
5.1 Input signal pre-processing
5.1.1 High Pass Filtering
5.1.2 Stereo Signal Downmixing/Bandsplitting
5.2 Principle of the hybrid ACELP/TCX core encoding
5.2.1 Timing chart of the ACELP and TCX modes
5.2.2 ACELP/TCX mode combinations and mode encoding
5.2.3 ACELP/TCX closed-loop mode selection
5.2.4 ACELP/TCX open-loop mode selection
5.3 Hybrid ACELP/TCX core encoding description
5.3.1 Pre-emphasis
5.3.2 LP analysis and interpolation
5.3.2.1 Windowing and auto-correlation computation
5.3.2.2 Levinson-Durbin algorithm
5.3.2.3 LP to ISP conversion
5.3.2.4 ISP to LP conversion
5.3.2.5 Quantization of the ISP coefficient
5.3.2.6 Interpolation of the ISPs
5.3.3 Perceptual weighting
5.3.4 ACELP Excitation encoder
5.3.4.1 Open-loop pitch analysis
5.3.4.2 Impulse response computation
5.3.4.3 Target signal computation
5.3.4.4 Adaptive codebook
5.3.4.5 Algebraic codebook
5.3.4.5.1 Codebook structure
5.3.4.5.2 Pulse indexing
5.3.4.5.3 Codebook search
5.3.4.6 Quantization of the adaptive and fixed codebook gains
5.3.5 TCX Excitation encoder
5.3.5.1 TCX encoder block diagram
5.3.5.2 Computation of the target signal for transform coding
5.3.5.3 Zero-input response subtraction
5.3.5.4 Windowing of target signal
5.3.5.5 Transform
5.3.5.6 Spectrum pre-shaping
5.3.5.7 Split multi-rate lattice VQ
5.3.5.8 Spectrum de-shaping
5.3.5.9 Inverse transform
5.3.5.10 Gain optimization and quantization
5.3.5.11 Windowing for overlap-and-add
5.3.5.12 Memory update
5.3.5.13 Excitation signal computation
5.4 Mono Signal High-Band encoding (BWE)
5.5 Stereo signal encoding
5.5.1 Stereo Signal Low-Band Encoding
5.5.1.1 Principle
5.5.1.2 Signal Windowing
5.5.1.3 Pre-echo mode
5.5.1.4 Redundancy reduction
5.5.2 Stereo Signal Mid-Band Processing
5.5.2.1 Principle
5.5.2.2 Residual computation
5.5.2.3 Filter computation, smoothing and quantization
5.5.2.4 Channel energy matching
5.5.3 Stereo Signal High-Band Processing
5.6 Packetization
5.6.1 Packetization of TCX encoded parameters
5.6.1.1 Multiplexing principle for a single binary table
5.6.1.2 Multiplexing in case of multiple binary tables
5.6.2 Packetization procedure for all parameters
5.6.3 TCX gain multiplexing
5.6.4 Stereo Packetization
6 Functional description of the decoder
6.1 Mono Signal Low-Band synthesis
6.1.1 ACELP mode decoding and signal synthesis
6.1.2 TCX mode decoding and signal synthesis
6.1.3 Post-processing of Mono Low-Band signal
6.2 Mono Signal High-Band synthesis
6.3 Stereo Signal synthesis
6.3.1 Stereo signal low-band synthesis
6.3.2 Stereo Signal Mid-Band synthesis
6.3.3 Stereo Signal High-Band synthesis
6.3.4 Stereo output signal generation
6.4 Stereo to mono conversion
6.4.1 Low-Band synthesis
6.4.2 High-Band synthesis
6.5 Bad frame concealment
6.5.1 Mono
6.5.1.1 Mode decoding and extrapolation
6.5.1.2 TCX bad frame concealment
6.5.1.2.1 Spectrum de-shaping
6.5.1.2.2 Spectrum Extrapolation
6.5.1.2.3 Amplitude Extrapolation
6.5.1.2.4 Phase Extrapolation
6.5.2 Stereo
6.5.2.1 Low-band
6.5.2.2 Mid-band
6.6 Output signal generation
7 Detailed bit allocation of the Extended AMR-WB codec
8 Storage and Transport Interface formats
8.1 Available Modes and Bitrates
8.2 AMR-WB+ Transport Interface Format
8.3 AMR-WB+ File Storage Format
Annex A (informative): Change history
History

Foreword

This
Technical Specification has been produced by the 3rd Generation Partnership Project (3GPP).

This document describes the Extended Adaptive Multi-Rate Wideband (AMR-WB+) coder within the 3GPP system.

The contents of the present document are subject to continuing work within the TSG and may change following formal TSG approval. Should the TSG modify the contents of the present document, it will be re-released by the TSG with an identifying change of release date and an increase in version number as follows:

Version x.y.z

where:

x the first digit:
  1 presented to TSG for information;
  2 presented to TSG for approval;
  3 or greater indicates TSG approved document under change control.
y the second digit is incremented for all changes of substance, i.e. technical enhancements, corrections, updates, etc.
z the third digit is incremented when editorial only changes have been incorporated in the document.

1 Scope

This Telecommunication Standard (TS) describes the detailed mapping from input blocks of monophonic or stereophonic audio samples in 16 bit uniform PCM format to encoded blocks and from encoded blocks to output blocks of reconstructed monophonic or stereophonic audio samples.

The coding scheme is an extension of the AMR-WB coding scheme [3] and is referred to as extended AMR-WB or AMR-WB+ codec. It comprises all AMR-WB speech codec modes including VAD/DTX/CNG [2], [8], [10] as well as extended functionality for encoding general audio signals such as music, speech, mixed, and other signals.

In the case of discrepancy between the requirements described in the present document and the ANSI-C code computational description of these requirements contained in [4], [5], the description in [4], [5], respectively, will prevail. The ANSI-C code is not described in the present document; see [4], [5] for a description of the floating-point or, respectively, fixed-point ANSI-C code.

2 References

The following documents contain provisions which, through reference in this text, constitute provisions of the present document.

- References are either specific (identified by date of publication, edition number, version number, etc.) or non-specific.
- For a specific reference, subsequent revisions do not apply.
- For a non-specific reference, the latest version applies. In the case of a reference to a 3GPP document (including a GSM document), a non-specific reference implicitly refers to the latest version of that document in the same Release as the present document.

[1] GSM 43.050: "Digital cellular telecommunications system (Phase 2); Transmission planning aspects of the speech service in the GSM Public Land Mobile Network (PLMN) system".
[2] 3GPP TS 26.194: "AMR wideband speech codec; Voice Activity Detection (VAD)".
[3] 3GPP TS 26.190: "AMR Wideband speech codec; Transcoding functions".
[4] 3GPP TS 26.304: "ANSI-C code for the floating point Extended AMR Wideband codec".
[5] 3GPP TS 26.273: "ANSI-C code for the fixed point Extended AMR Wideband codec".
[6] M. Xie and J.-P. Adoul, "Embedded algebraic vector quantization (EAVQ) with application to wideband audio coding," IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Atlanta, GA, U.S.A, vol. 1, pp. 240-243, 1996.
[7] J.H. Conway and N.J.A. Sloane, "A fast encoding method for lattice codes and quantizers," IEEE Trans. Inform. Theory, vol. IT-29, no. 6, pp. 820-824, Nov. 1983.
[8] 3GPP TS 26.193: "AMR Wideband speech codec; Source controlled rate operation".
[9] 3GPP TS 26.244: "Transparent end-to-end packet switched streaming service (PSS); 3GPP file format (3GP)".
[10] 3GPP TS 26.192: "AMR Wideband speech codec; Comfort noise aspects".

3 Definitions and abbreviations

3.1 Definitions

For the purposes of the present document, the following terms and definitions apply.

adaptive codebook: The adaptive codebook contains excitation vectors that are adapted for every subframe. The adaptive codebook is derived from the long-term filter state. The lag value can be viewed as an index into the adaptive codebook.

algebraic codebook: A fixed codebook where algebraic code is used to populate the excitation vectors (innovation vectors). The excitation contains a small number of nonzero pulses with predefined interlaced sets of potential positions. The amplitudes and positions of the pulses of the kth excitation codevector can be derived from its index k through a rule requiring no or minimal physical storage, in contrast with stochastic codebooks whereby the path from the index to the associated codevector involves look-up tables.

anti-sparseness processing: An adaptive post-processing procedure applied to the fixed codebook vector in order to reduce perceptual artifacts from a sparse fixed codebook vector.

closed-loop pitch analysis: This is the adaptive codebook search, i.e., a process of estimating the pitch (lag) value from the weighted input speech and the long term filter state. In the closed-loop search, the lag is searched using an error minimization loop (analysis-by-synthesis). In the adaptive multi-rate wideband codec, closed-loop pitch search is performed for every subframe.

direct form coefficients: One of the formats for storing the short term filter parameters. In the adaptive multi-rate wideband codec, all filters which are used to modify speech samples use direct form coefficients.

fixed codebook: The fixed codebook contains excitation vectors for speech synthesis filters. The contents of the codebook are non-adaptive (i.e., fixed). In the adaptive multi-rate wideband codec, the fixed codebook is implemented using an algebraic codebook.

fractional lags: A set of lag values having sub-sample resolution. In the adaptive multi-rate wideband codec a sub-sample resolution of 1/4th or 1/2nd of a sample is used.

super frame: A time interval equal to 1024 samples (80 ms at a 12.8 k