ETSI TS 126 445 V14.2.0 (2018-01)

Universal Mobile Telecommunications System (UMTS);
LTE;
Codec for Enhanced Voice Services (EVS);
Detailed algorithmic description
(3GPP TS 26.445 version 14.2.0 Release 14)

TECHNICAL SPECIFICATION

Reference
RTS/TSGS-0426445vE20

Keywords
LTE, UMTS

ETSI
650 Route des Lucioles
F-06921 Sophia Antipolis Cedex - FRANCE
Tel.: +33 4 92 94 42 00 Fax: +33 4 93 65 47 16
Siret N° 348 623 562 00017 - NAF 742 C
Non-profit association registered at the Sous-Préfecture de Grasse (06) N° 7803/88

Important notice

The present document can be downloaded from:
http://www.etsi.org/standards-search

The present document may be made available in electronic versions and/or in print. The content of any electronic and/or print versions of the present document shall not be modified without the prior
written authorization of ETSI. In case of any existing or perceived difference in contents between such versions and/or in print, the only prevailing document is the print of the Portable Document Format (PDF) version kept on a specific network drive within ETSI Secretariat.

Users of the present document should be aware that the document may be subject to revision or change of status. Information on the current status of this and other ETSI documents is available at
https://portal.etsi.org/TB/ETSIDeliverableStatus.aspx

If you find errors in the present document, please send your comment to one of the following services:
https://portal.etsi.org/People/CommiteeSupportStaff.aspx

Copyright Notification

No part may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm except as authorized by written permission of ETSI. The content of the PDF version shall not be modified without the written authorization of ETSI. The copyright and the foregoing restriction extend to reproduction in all media.

© ETSI 2018. All rights reserved.

DECT™, PLUGTESTS™, UMTS™ and the ETSI logo are trademarks of ETSI registered for the benefit of its Members. 3GPP™ and LTE are trademarks of ETSI registered for the benefit of its Members and of the 3GPP Organizational Partners. oneM2M logo is protected for the benefit of its Members. GSM and the GSM logo are trademarks registered and owned by the GSM Association.

Intellectual Property Rights

Essential patents

IPRs essential or potentially essential to the present document may have been declared to ETSI. The information pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found in ETSI SR 000 314: "Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in respect of ETSI standards", which is available from the ETSI Secretariat. Latest updates are available on the ETSI Web server (https://ipr.etsi.org/).

Pursuant
to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web server) which are, or may be, or may become, essential to the present document.

Trademarks

The present document may include trademarks and/or tradenames which are asserted and/or registered by their owners. ETSI claims no ownership of these except for any which are indicated as being the property of ETSI, and conveys no right to use or reproduce any trademark and/or tradename. Mention of those trademarks in the present document does not constitute an endorsement by ETSI of products, services or organizations associated with those trademarks.

Foreword

This Technical Specification (TS) has been produced by ETSI 3rd Generation Partnership Project (3GPP).

The present document may refer to technical specifications or reports using their 3GPP identities, UMTS identities or GSM identities. These should be interpreted as being references to the corresponding ETSI deliverables.

The cross reference between GSM, UMTS, 3GPP and ETSI identities can be found under http://webapp.etsi.org/key/queryform.asp.

Modal verbs terminology

In the present document "shall", "shall not", "should", "should not", "may", "need not", "will", "will not", "can" and "cannot" are to be interpreted as described in clause 3.2 of the ETSI Drafting Rules (Verbal forms for the expression of provisions).

"must" and "must not" are NOT allowed in ETSI deliverables except when used in direct citation.

Contents

Intellectual Property Rights
Foreword
Modal verbs terminology
Foreword
1 Scope
2 References
3 Definitions, abbreviations and mathematical expressions
3.1 Definitions
3.2 Abbreviations
3.3 Mathematical Expressions
4 General description of the coder
4.1 Introduction
4.2 Input/output sampling rate
4.3 Codec delay
4.4 Coder overview
4.4.1 Encoder overview
4.4.1.1 Linear Prediction Based Operation
4.4.1.2 Frequency Domain Operation
4.4.1.3 Inactive Signal coding
4.4.1.4 Source Controlled VBR Coding
4.4.2 Decoder overview
4.4.2.1 Parametric Signal Representation Decoding (Bandwidth Extension)
4.4.2.2 Frame loss concealment
4.4.3 DTX/CNG operation
4.4.3.1 Inactive Signal coding
4.4.4 AMR-WB-interoperable option
4.4.5 Channel-Aware Mode
4.5 Organization of the rest of the Technical Standard
5 Functional description of the encoder
5.1 Common processing
5.1.1 High-pass Filtering
5.1.2 Complex low-delay filter bank analysis
5.1.2.1 Sub-band analysis
5.1.2.2 Sub-band energy estimation
5.1.3 Sample rate conversion to 12.8 kHz
5.1.3.1 Conversion of 16, 32 and 48 kHz signals to 12.8 kHz
5.1.3.2 Conversion of 8 kHz signals to 12.8 kHz
5.1.3.3 Conversion of input signals to 16, 25.6 and 32 kHz
5.1.4 Pre-emphasis
5.1.5 Spectral analysis
5.1.5.1 Windowing and DFT
5.1.5.2 Energy calculations
5.1.6 Bandwidth detection
5.1.6.1 Mean and maximum energy values per band
5.1.7 Bandwidth decision
5.1.8 Time-domain transient detection
5.1.9 Linear prediction analysis
5.1.9.1 LP analysis window
5.1.9.2 Autocorrelation computation
5.1.9.3 Adaptive lag windowing
5.1.9.4 Levinson-Durbin algorithm
5.1.9.5 Conversion of LP coefficients to LSP parameters
5.1.9.6 LSP interpolation
5.1.9.7 Conversion of LSP parameters to LP coefficients
5.1.9.8 LP analysis at 16 kHz
5.1.10 Open-loop pitch analysis
5.1.10.1 Perceptual weighting
5.1.10.2 Correlation function computation
5.1.10.3 Correlation reinforcement with past pitch values
5.1.10.4 Normalized correlation computation
5.1.10.5 Correlation reinforcement with pitch lag multiples
5.1.10.6 Initial pitch lag determination and reinforcement based on pitch coherence with other half-frames
5.1.10.7 Pitch lag determination and parameter update
5.1.10.8 Correction of very short and stable open-loop pitch estimates
5.1.10.9 Fractional open-loop pitch estimate for each subframe
5.1.11 Background noise energy estimation
5.1.11.1 First stage of noise energy update
5.1.11.2 Second stage of noise energy update
5.1.11.2.1 Basic parameters for noise energy update
5.1.11.2.2 Spectral diversity
5.1.11.2.3 Complementary non-stationarity
5.1.11.2.4 HF energy content
5.1.11.2.5 Tonal stability
5.1.11.2.6 High frequency dynamic range
5.1.11.2.7 Combined decision for background noise energy update
5.1.11.3 Energy-based parameters for noise energy update
5.1.11.3.1 Closeness to current background estimate
5.1.11.3.2 Features related to last correlation or harmonic event
5.1.11.3.3 Energy-based pause detection
5.1.11.3.4 Long-term linear prediction efficiency
5.1.11.3.5 Additional long-term parameters used for noise estimation
5.1.11.4 Decision logic for noise energy update
5.1.12 Signal activity detection
5.1.12.1 SAD1 module
5.1.12.1.1 SNR outlier filtering
5.1.12.2 SAD2 module
5.1.12.3 Combined decision of SAD1 and SAD2 modules for WB and SWB signals
5.1.12.4 Final decision of the SAD1 module for NB signals
5.1.12.5 Post-decision parameter update
5.1.12.6 SAD3 module
5.1.12.6.1 Sub-band FFT
5.1.12.6.2 Computation of signal features
5.1.12.6.3 Computation of SNR parameters
5.1.12.6.4 Decision of background music
5.1.12.6.5 Decision of background update flag
5.1.12.6.6 SAD3 Pre-decision
5.1.12.6.7 SAD3 Hangover
5.1.12.7 Final SAD decision
5.1.12.8 DTX hangover addition
5.1.13 Coding mode determination
5.1.13.1 Unvoiced signal classification
5.1.13.1.1 Voicing measure
5.1.13.1.2 Spectral tilt
5.1.13.1.3 Sudden energy increase from a low energy level
5.1.13.1.4 Total frame energy difference
5.1.13.1.5 Energy decrease after spike
5.1.13.1.6 Decision about UC mode
5.1.13.2 Stable voiced signal classification
5.1.13.3 Signal classification for FEC
5.1.13.3.1 Signal classes for FEC
5.1.13.3.2 Signal classification parameters
5.1.13.3.3 Classification procedure
5.1.13.4 Transient signal classification
5.1.13.5 Modification of coding mode in special cases
5.1.13.6 Speech/music classification
5.1.13.6.1 First stage of the speech/music classifier
5.1.13.6.2 Scaling of features in the first stage of the speech/music classifier
5.1.13.6.3 Log-probability and decision smoothing
5.1.13.6.4 State machine and final speech/music decision
5.1.13.6.5 Improvement of the classification for mixed and music content
5.1.13.6.6 Second stage of the speech/music classifier
5.1.13.6.7 Context-based improvement of the classification for stable tonal signals
5.1.13.6.8 Detection of sparse spectral content
5.1.13.6.9 Decision about AC mode
5.1.13.6.10 Decision about IC mode
5.1.14 Coder technology selection
5.1.14.1 ACELP/MDCT-based technology selection at 9.6 kbps, 16.4 and 24.4 kbps
5.1.14.1.1 Segmental SNR estimation of the MDCT-based technology
5.1.14.1.2 Segmental SNR estimation of the ACELP technology
5.1.14.1.3 Hysteresis and final decision
5.1.14.2 TCX/HQ MDCT technology selection at 13.2 and 16.4 kbps
5.1.14.3 TCX/HQ MDCT technology selection at 24.4 and 32 kbps
5.1.14.4 TD/Multi-mode FD BWE technology selection at 13.2 kbps and 32 kbps
5.2 LP-based Coding
5.2.1 Perceptual weighting
5.2.2 LP filter coding and interpolation
5.2.2.1 LSF quantization
5.2.2.1.1 LSF weighting function
5.2.2.1.2 Bit allocation
5.2.2.1.3 Predictor allocation
5.2.2.1.4 LSF quantizer structure
5.2.2.1.5 LSFQ for voiced coding mode at 16 kHz internal sampling frequency: BC-TCVQ
5.2.2.1.6 Mid-frame LSF quantizer
5.2.3 Excitation coding
5.2.3.1 Excitation coding in the GC, VC and high rate IC/UC modes
5.2.3.1.1 Computation of the LP residual signal
5.2.3.1.2 Target signal computation
5.2.3.1.3 Impulse response computation
5.2.3.1.4 Adaptive codebook
5.2.3.1.5 Algebraic codebook
5.2.3.1.6 Combined algebraic codebook
5.2.3.1.7 Gain quantization
5.2.3.2 Excitation coding in TC mode
5.2.3.2.1 Glottal pulse codebook search
5.2.3.2.2 TC frame configurations
5.2.3.3 Excitation coding in UC mode at low rates
5.2.3.3.1 Structure of the Gaussian codebook
5.2.3.3.2 Correction of the Gaussian codebook spectral tilt
5.2.3.3.3 Search of the Gaussian codebook
5.2.3.3.4 Quantization of the Gaussian codevector gain
5.2.3.3.5 Other parameters in UC mode
5.2.3.3.6 Update of filter memories
5.2.3.4 Excitation coding in IC and UC modes at 9.6 kbps
5.2.3.4.1 Algebraic codebook
5.2.3.4.2 Gaussian noise generation
5.2.3.4.3 Gain coding
5.2.3.4.4 Memory update
5.2.3.5 Excitation coding in GSC mode
5.2.3.5.1 Determining the subframe length
5.2.3.5.2 Computing time-domain excitation contribution
5.2.3.5.3 Frequency transform of residual and time-domain excitation contribution
5.2.3.5.4 Computing energy dynamics of transformed residual and quantization of noise level
5.2.3.5.6 Find and encode the cut-off frequency
5.2.3.5.7 Band energy computation and quantization
5.2.3.5.8 PVQ Bit allocation
5.2.3.5.9 Quantization of difference signal
5.2.3.5.10 Spectral dynamic and noise filling
5.2.3.5.11 Quantized gain addition, temporal and frequency contributions combination
5.2.3.5.12 Specifics for wideband 8 kbps
5.2.3.5.13 Inverse DCT
5.2.3.5.14 Remove pre-echo in case of onset detection
5.2.4 Bass post-filter gain quantization
5.2.5 Source Controlled VBR Coding
5.2.5.1 Principles of VBR Coding
5.2.5.2 EVS VBR Encoder Coding Modes and Bit-Rates
5.2.5.3 Prototype-Pitch-Period (PPP) Encoding
5.2.5.3.1 PPP Algorithm
5.2.5.3.2 Amplitude Quantization
5.2.5.3.3 Phase Quantization
5.2.5.4 Noise-Excited-Linear-Prediction (NELP) Encoding
5.2.5.5 Average Data Rate (ADR) Control for the EVS VBR mode
5.2.6 Coding of upper band for LP-based Coding Modes
5.2.6.1 Bandwidth extension in time domain
5.2.6.1.1 High band target signal generation
5.2.6.1.2 TBE LP analysis
5.2.6.1.3 Quantization of linear prediction parameters
5.2.6.1.4 Interpolation of LSF coefficients
5.2.6.1.5 Target and residual energy calculation and quantization
5.2.6.1.6 Generation of the upsampled version of the lowband excitation
5.2.6.1.7 Non-Linear Excitation Generation
5.2.6.1.8 Spectral flip of non-linear excitation in time domain
5.2.6.1.9 Down-sample using all-pass filters
5.2.6.1.10 Adaptive spectral whitening
5.2.6.1.11 Envelope modulated noise mixing
5.2.6.1.12 Spectral shaping of the noise added excitation
5.2.6.1.13 Post processing of the shaped excitation
5.2.6.1.14 Estimation of temporal gain shape parameters
5.2.6.1.15 Estimation of frame gain parameters
5.2.6.1.16 Estimation of TEC/TFA envelope parameters
5.2.6.1.17 Estimation of full-band frame energy parameters
5.2.6.2 Multi-mode FD Bandwidth Extension Coding
5.2.6.2.1 SWB/FB Multi-mode FD Bandwidth Extension
5.2.6.2.2 WB