ETSI TS 146 060 V13.0.0 (2016-01)

TECHNICAL SPECIFICATION

Digital cellular telecommunications system (Phase 2+);
Enhanced Full Rate (EFR) speech transcoding
(3GPP TS 46.060 version 13.0.0 Release 13)

GLOBAL SYSTEM FOR MOBILE COMMUNICATIONS

Reference
RTS/TSGS-0446060vd00

Keywords
GSM

ETSI
650 Route des Lucioles
F-06921 Sophia Antipolis Cedex - FRANCE
Tel.: +33 4 92 94 42 00 Fax: +33 4 93 65 47 16
Siret N° 348 623 562 00017 - NAF 742 C
Non-profit association registered at the Sous-Préfecture de Grasse (06) N° 7803/88

Important notice

The present document can be downloaded from:
http://www.etsi.org/standards-search

The present document may be made available in electronic versions and/or in print. The content of any electronic and/or print versions of the present document shall not be modified without the prior written authorization of ETSI. In case of any existing or perceived difference in contents between such versions and/or in print, the only prevailing document is the print of the Portable Document Format (PDF) version kept on a specific network drive within ETSI Secretariat.

Users of the
present document should be aware that the document may be subject to revision or change of status. Information on the current status of this and other ETSI documents is available at http://portal.etsi.org/tb/status/status.asp

If you find errors in the present document, please send your comment to one of the following services:
https://portal.etsi.org/People/CommiteeSupportStaff.aspx

Copyright Notification

No part may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm except as authorized by written permission of ETSI. The content of the PDF version shall not be modified without the written authorization of ETSI. The copyright and the foregoing restriction extend to reproduction in all media.

© European Telecommunications Standards Institute 2016. All rights reserved.

DECT™, PLUGTESTS™, UMTS™ and the ETSI logo are Trade Marks of ETSI registered for the benefit of its Members. 3GPP™ and LTE are Trade Marks of ETSI registered for the benefit of its Members and of the 3GPP Organizational Partners. GSM and the GSM logo are Trade Marks registered and owned by the GSM Association.

Intellectual Property Rights

IPRs essential or potentially essential to the present document may have been declared to ETSI. The information pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found in ETSI SR 000 314: “Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in respect of ETSI standards”, which is available from the ETSI Secretariat. Latest updates are available on the ETSI Web server (https://ipr.etsi.org/).

Pursuant to the ETSI IPR Policy, no
investigation, including IPR searches, has been carried out by ETSI. No guarantee can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web server) which are, or may be, or may become, essential to the present document.

Foreword

This Technical Specification (TS) has been produced by ETSI 3rd Generation Partnership Project (3GPP).

The present document may refer to technical specifications or reports using their 3GPP identities, UMTS identities or GSM identities. These should be interpreted as being references to the corresponding ETSI deliverables.

The cross reference between GSM, UMTS, 3GPP and ETSI identities can be found under http://webapp.etsi.org/key/queryform.asp.

Modal verbs terminology

In the present document “shall”, “shall not”, “should”, “should not”, “may”, “need not”, “will”, “will not”, “can” and “cannot” are to be interpreted as described in clause 3.2 of the ETSI Drafting Rules (Verbal forms for the expression of provisions).

“must” and “must not” are NOT allowed in ETSI deliverables except when used in direct citation.

Contents

Intellectual Property Rights
Foreword
Modal verbs terminology
Foreword
1 Scope
2 References
3 Definitions, symbols and abbreviations
  3.1 Definitions
  3.2 Symbols
  3.3 Abbreviations
4 Outline description
  4.1 Functional description of audio parts
  4.2 Preparation of speech samples
    4.2.1 PCM format conversion
  4.3 Principles of the GSM enhanced full rate speech encoder
  4.4 Principles of the GSM enhanced full rate speech decoder
  4.5 Sequence and subjective importance of encoded parameters
5 Functional description of the encoder
  5.1 Pre-processing
  5.2 Linear prediction analysis and quantization
    5.2.1 Windowing and auto-correlation computation
    5.2.2 Levinson-Durbin algorithm
    5.2.3 LP to LSP conversion
    5.2.4 LSP to LP conversion
    5.2.5 Quantization of the LSP coefficients
    5.2.6 Interpolation of the LSPs
  5.3 Open-loop pitch analysis
  5.4 Impulse response computation
  5.5 Target signal computation
  5.6 Adaptive codebook search
  5.7 Algebraic codebook structure and search
  5.8 Quantization of the fixed codebook gain
  5.9 Memory update
6 Functional description of the decoder
  6.1 Decoding and speech synthesis
  6.2 Post-processing
    6.2.1 Adaptive post-filtering
    6.2.2 Up-scaling
7 Variables, constants and tables in the C-code of the GSM EFR codec
  7.1 Description of the constants and variables used in the C code
8 Homing sequences
  8.1 Functional description
  8.2 Definitions
  8.3 Encoder homing
  8.4 Decoder homing
  8.5 Encoder home state
  8.6 Decoder home state
9 Bibliography
Annex A (informative): Change history
History

Foreword

This Technical Specification has been produced by the 3rd Generation Partnership Project (3GPP).

The present document describes the detailed mapping between input blocks of 160 speech samples in 13-bit uniform PCM format to encoded blocks of 244 bits and from encoded blocks of 244 bits to output blocks of 160 reconstructed speech samples within the digital cellular telecommunications system.

The contents of the present document are subject to continuing work within the TSG and may change following formal TSG approval. Should the TSG modify the contents of the present document, it will be re-released by the TSG with an identifying change of release date and an increase in version number as follows:

Version x.y.z

where:

x  the first digit:
   1  presented to TSG for information;
   2  presented to TSG for approval;
   3  or greater indicates TSG approved document under change control.
y  the second digit is incremented for all changes of substance, i.e. technical enhancements, corrections, updates, etc.
z  the third digit is incremented when editorial only changes have been incorporated in the document.
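The x.y.z numbering rule described in the Foreword can be illustrated with a minimal Python sketch. The names SpecVersion, parse_version and status are hypothetical helpers introduced only for this illustration; they are not part of the specification or of any 3GPP or ETSI tool, and the status strings simply paraphrase the wording above.

```python
from typing import NamedTuple


class SpecVersion(NamedTuple):
    """Version x.y.z as described in the Foreword (illustrative only)."""
    x: int  # 1: for information, 2: for approval, 3 or greater: approved
    y: int  # incremented for changes of substance
    z: int  # incremented for editorial-only changes


def parse_version(text: str) -> SpecVersion:
    """Split a string such as "13.0.0" into its three numeric fields."""
    parts = text.strip().split(".")
    if len(parts) != 3:
        raise ValueError(f"expected x.y.z, got {text!r}")
    x, y, z = (int(p) for p in parts)
    if min(x, y, z) < 0:
        raise ValueError(f"version fields must be non-negative: {text!r}")
    return SpecVersion(x, y, z)


def status(version: SpecVersion) -> str:
    """Map the first digit to the document status given in the Foreword."""
    if version.x == 1:
        return "presented to TSG for information"
    if version.x == 2:
        return "presented to TSG for approval"
    if version.x >= 3:
        return "TSG approved document under change control"
    raise ValueError(f"no status defined for first digit {version.x}")


if __name__ == "__main__":
    v = parse_version("13.0.0")  # the version of the present document
    print(v, "->", status(v))
```

Under this reading, version 13.0.0 has a first digit of 3 or greater, so it falls under change control, with no changes of substance (y = 0) and no editorial-only changes (z = 0) since the first digit was last incremented.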
1 Scope

The present document describes the detailed mapping between input blocks of 160 speech samples in 13-bit uniform PCM format to encoded blocks of 244 bits and from encoded blocks of 244 bits to output blocks of 160 reconstructed speech samples. The sampling rate is 8 000 sample/s leading to a bit rate for the encoded bit stream of 12,2 kbit/s. The coding scheme is the so-called Algebraic Code Excited Linear Prediction Coder, hereafter referred to as ACELP.

The present document also specifies the conversion between A-law or µ-law (PCS 1900) PCM and 13-bit uniform PCM. Performance requirements for the audio input and output parts are included only to the extent that they affect the transcoder performance. This part also describes the codec down to the bit level, thus enabling the verification of compliance to the part to a high degree of confidence by use of a set of digital test sequences. These test sequences are described in GSM 06.54 [7] and are available on disks.

In case of discrepancy between the requirements described in the present document and the fixed point computational description (ANSI-C code) of these requirements contained in GSM 06.53 [6], the description in GSM 06.53 [6] will prevail.

The transcoding procedure specified in the present document is applicable for the enhanced full rate speech traffic channel (TCH) in the GSM system.

In GSM 06.51 [5], a reference configuration for the speech transmission chain of the GSM enhanced full rate (EFR) system is shown. According to this reference configuration, the speech encoder takes its input as a 13-bit uniform PCM signal either from the audio part of the Mobile Station or on the network side, from the PSTN via an 8-bit/A-law or µ-law (PCS 1900) to 13-bit uniform PCM conversion. The encoded speech at the output of the speech encoder is delivered to a channel encoder unit which is specified in GSM 05.03 [3]. In the receive direction, the inverse operations take place.

2 References

The following documents contain provisions which, through reference in this text, constitute provisions of the present document. References are either specific (identified by date of publication, edition number, version number, etc.) or non-specific. For a specific reference, subsequent revisions do not apply. For a non-specific reference, the latest version applies. In the case of a reference to a 3GPP document (including a GSM document), a non-specific reference implicitly refers to the latest version of that document in the same Release as the present document.

[1] GSM 01.04: “Digital cellular telecommunications system (Phase 2+); Abbreviations and acronyms”.
[2] GSM 03.50: “Digital cellular telecommunications system (Phase 2+); Transmission planning aspects of the speech service in the GSM Public Land Mobile Network (PLMN) system”.
[3] GSM 05.03: “Digital cellular telecommunications system (Phase 2+); Channel coding”.
[4] GSM 06.32: “Digital cellular telecommunications system (Phase 2+); Voice Activity Detection (VAD)”.
[5] GSM 06.51: “Digital cellular telecommunications system (Phase 2+); Enhanced Full Rate (EFR) speech processing functions General description”.
[6] GSM 06.53: “Digital cellular telecommunications system (Phase 2+); ANSI-C code for the GSM Enhanced Full Rate (EFR) speech codec”.
[7] GSM 06.54: “Digital cellular telecommunications system (Phase 2+); Test vectors for the GSM Enhanced Full Rate (EFR) speech codec”.
[8] ITU-T Recommendation G.711 (1988): “Coding of analogue signals by pulse code modulation Pulse code modulation (PCM) of voice frequencies”.
[9] ITU-T Recommendation G.726: “40, 32, 24, 16 kbit/s adaptive differential pulse code modulation (ADPCM)”.

3 Definitions, symbols and abbreviations

3.1 Definitions

For the purposes of the present document, the following terms and definitions apply:

adaptive
codebook: adaptive codebook contains excitation vectors that are adapted for every subframe. The adaptive codebook is derived from the long term filter state. The lag value can be viewed as an index into the adaptive codebook.

adaptive postfilter: this filter is applied to the output of the short term synthesis filter to enhance the perceptual quality of the reconstructed speech. In the GSM enhanced full rate codec, the adaptive postfilter is a cascade of two filters: a formant postfilter and a tilt compensation filter.

algebraic codebook: fixed codebook where an algebraic code is used to populate the excitation vectors (innovation vectors). The excitation contains a small number of nonzero pulses with predefined interlaced sets of positions.

closed-loop pitch analysis: this is the adaptive codebook search, i.e., a process of estimating the pitch (lag) value from the weighted input speech and the long term filter state. In the closed-loop search, the lag is searched using an error minimization loop (analysis-by-synthesis). In the GSM enhanced full rate codec, closed-loop pitch search is performed for every subframe.

direct form coefficients: one of the formats for storing the short term filter parameters. In the GSM enhanced full rate codec, all filters which are used to modify speech samples use direct form coefficients.

fixed codebook: fixed codebook contains excitation vectors for speech synthesis filters. The contents of the codebook are non-adaptive (i.e., fixed). In the GSM enhanced full rate codec, the fixed codebook is implemented using an algebraic codebook.

fractional lags: set of lag values having sub-sample resolution. In the GSM enhanced full rate codec a sub-sample resolution of 1/6th of a sample is used.

frame: time interval equal to 20 ms (160 samples at an 8 kHz sampling rate).

integer lags: set of lag values having whole sample resolution.

interpolating filter: FIR filter used to produce an estimate of sub-sample resolution samples, given an input sampled with integer sample resolution.

inverse filter: this filter removes the short term correlation from the speech signal. The filter models an inverse frequency response of the vocal tract.

lag: long term filter delay. This is typically the true pitch period, or a multiple or sub-multiple of it.

Line Spectral Frequencies: (see Line Spectral Pair).

Line Spectral Pair: transformation of LPC parameters. Line Spectral Pairs are obtained by decomposing the inverse filter transfer function A(z) to a set of two transfer functions, one having even symmetry and the other having odd symmetry. The Line Spectral Pairs (also called Line Spectral Frequencies) are the roots of these polynomials on the z-unit circle.

LP analysis window: for each frame, the short term filter coefficients are computed using the high pass filtered speech samples within the analysis window. In the GSM enhanced full rate codec, the length of the analysis window is 240 samples. For each frame, two asymmetric windows are used to generate two sets of LP coefficients. No samples of the future frames are used (no lookahead).

LP coefficients: Linear Prediction (LP) coefficients (also referred to as Linear Predictive Coding (LPC) coefficients) is a generic descriptive term for describing the short term filter coefficients.

open-loop pitch search: process of estimating the near optimal lag directly from the weighted speech input. This is done to simplify the pitch analysis and confine the closed-loop pitch search to a small number of lags around the open-loop estimated lags. In the GSM enhanced full rate codec, open-loop pitch search is performed every 10 ms.

residual: output signal resulting from an inverse filtering operation.

short term synthesis filter: this filter introduces, i