ETSI TS 146 060 V14.0.0 (2017-04): Digital cellular telecommunications system (Phase 2+) (GSM); Enhanced Full Rate (EFR) speech transcoding (3GPP TS 46.060 version 14.0.0 Release 14).pdf

Resource description

ETSI TS 146 060 V14.0.0 (2017-04)

Digital cellular telecommunications system (Phase 2+) (GSM);
Enhanced Full Rate (EFR) speech transcoding
(3GPP TS 46.060 version 14.0.0 Release 14)

TECHNICAL SPECIFICATION

GLOBAL SYSTEM FOR MOBILE COMMUNICATIONS

Reference: RTS/TSGS-0446060ve00
Keywords: GSM

ETSI
650 Route des Lucioles
F-06921 Sophia Antipolis Cedex - FRANCE
Tel.: +33 4 92 94 42 00 Fax: +33 4 93 65 47 16
Siret N° 348 623 562 00017 - NAF 742 C
Non-profit association registered at the Sous-Préfecture de Grasse (06) N° 7803/88

Important notice

The present document can be downloaded from: http://www.etsi.org/standards-search

The present document may be made available in electronic versions and/or in print. The content of any electronic and/or print versions of the present document shall not be modified without the prior written authorization of ETSI. In case of any existing or perceived difference in contents between such versions and/or in print, the only prevailing document is the print of the Portable Document Format (PDF) version kept on a specific network drive within ETSI Secretariat.

Users of the present document should be aware that the document may be subject to revision or change of status. Information on the current status of this and other ETSI documents is available at https://portal.etsi.org/TB/ETSIDeliverableStatus.aspx

If you find errors in the present document, please send your comment to one of the following services: https://portal.etsi.org/People/CommiteeSupportStaff.aspx

Copyright Notification

No part may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm except as authorized by written permission of ETSI. The content of the PDF version shall not be modified without the written authorization of ETSI. The copyright and the foregoing restriction extend to reproduction in all media.

© European Telecommunications Standards Institute 2017. All rights reserved.

DECT™, PLUGTESTS™, UMTS™ and the ETSI logo are Trade Marks of ETSI registered for the benefit of its Members. 3GPP™ and LTE are Trade Marks of ETSI registered for the benefit of its Members and of the 3GPP Organizational Partners. GSM and the GSM logo are Trade Marks registered and owned by the GSM Association.

Intellectual Property Rights

IPRs essential or potentially essential to the present document may have been declared to ETSI. The information pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found in ETSI SR 000 314: "Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in respect of ETSI standards", which is available from the ETSI Secretariat. Latest updates are available on the ETSI Web server (https://ipr.etsi.org/).

Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web server) which are, or may be, or may become, essential to the present document.

Foreword

This Technical Specification (TS) has been produced by ETSI 3rd Generation Partnership Project (3GPP).

The present document may refer to technical specifications or reports using their 3GPP identities, UMTS identities or GSM identities. These should be interpreted as being references to the corresponding ETSI deliverables.

The cross reference between GSM, UMTS, 3GPP and ETSI identities can be found under http://webapp.etsi.org/key/queryform.asp.

Modal verbs terminology

In the present document "shall", "shall not", "should", "should not", "may", "need not", "will", "will not", "can" and "cannot" are to be interpreted as described in clause 3.2 of the ETSI Drafting Rules (Verbal forms for the expression of provisions).

"must" and "must not" are NOT allowed in ETSI deliverables except when used in direct citation.

Contents

Intellectual Property Rights
Foreword
Modal verbs terminology
Foreword
1 Scope
2 References
3 Definitions, symbols and abbreviations
  3.1 Definitions
  3.2 Symbols
  3.3 Abbreviations
4 Outline description
  4.1 Functional description of audio parts
  4.2 Preparation of speech samples
    4.2.1 PCM format conversion
  4.3 Principles of the GSM enhanced full rate speech encoder
  4.4 Principles of the GSM enhanced full rate speech decoder
  4.5 Sequence and subjective importance of encoded parameters
5 Functional description of the encoder
  5.1 Pre-processing
  5.2 Linear prediction analysis and quantization
    5.2.1 Windowing and auto-correlation computation
    5.2.2 Levinson-Durbin algorithm
    5.2.3 LP to LSP conversion
    5.2.4 LSP to LP conversion
    5.2.5 Quantization of the LSP coefficients
    5.2.6 Interpolation of the LSPs
  5.3 Open-loop pitch analysis
  5.4 Impulse response computation
  5.5 Target signal computation
  5.6 Adaptive codebook search
  5.7 Algebraic codebook structure and search
  5.8 Quantization of the fixed codebook gain
  5.9 Memory update
6 Functional description of the decoder
  6.1 Decoding and speech synthesis
  6.2 Post-processing
    6.2.1 Adaptive post-filtering
    6.2.2 Up-scaling
7 Variables, constants and tables in the C-code of the GSM EFR codec
  7.1 Description of the constants and variables used in the C code
8 Homing sequences
  8.1 Functional description
  8.2 Definitions
  8.3 Encoder homing
  8.4 Decoder homing
  8.5 Encoder home state
  8.6 Decoder home state
9 Bibliography
Annex A (informative): Change history
History

Foreword

This Technical Specification has been produced by the 3rd Generation Partnership Project (3GPP).

The present document describes the detailed mapping between input blocks of 160 speech samples in 13-bit uniform PCM format to encoded blocks of 244 bits and from encoded blocks of 244 bits to output blocks of 160 reconstructed speech samples within the digital cellular telecommunications system.

The contents of the present document are subject to continuing work within the TSG and may change following formal TSG approval. Should the TSG modify the contents of the present document, it will be re-released by the TSG with an identifying change of release date and an increase in version number as follows:

Version x.y.z

where:

x the first digit:
  1 presented to TSG for information;
  2 presented to TSG for approval;
  3 or greater indicates TSG approved document under change control.
y the second digit is incremented for all changes of substance, i.e. technical enhancements, corrections, updates, etc.
z the third digit is incremented when editorial only changes have been incorporated in the document.

1 Scope

The present document describes the detailed mapping between input blocks of 160 speech samples in 13-bit uniform PCM format to encoded blocks of 244 bits and from encoded blocks of 244 bits to output blocks of 160 reconstructed speech samples. The sampling rate is 8 000 sample/s leading to a bit rate for the encoded bit stream of 12,2 kbit/s. The coding scheme is the so-called Algebraic Code Excited Linear Prediction Coder, hereafter referred to as ACELP.

The present document also specifies the conversion between A-law or μ-law (PCS 1900) PCM and 13-bit uniform PCM. Performance requirements for the audio input and output parts are included only to the extent that they affect the transcoder performance. This part also describes the codec down to the bit level, thus enabling the verification of compliance to the part to a high degree of confidence by use of a set of digital test sequences. These test sequences are described in GSM 06.54 [7] and are available on disks.

In case of discrepancy between the requirements described in the present document and the fixed point computational description (ANSI-C code) of these requirements contained in GSM 06.53 [6], the description in GSM 06.53 [6] will prevail.

The transcoding procedure specified in the present document is applicable for the enhanced full rate speech traffic channel (TCH) in the GSM system.

In GSM 06.51 [5], a reference configuration for the speech transmission chain of the GSM enhanced full rate (EFR) system is shown. According to this reference configuration, the speech encoder takes its input as a 13-bit uniform PCM signal either from the audio part of the Mobile Station or, on the network side, from the PSTN via an 8-bit/A-law or μ-law (PCS 1900) to 13-bit uniform PCM conversion. The encoded speech at the output of the speech encoder is delivered to a channel encoder unit which is specified in GSM 05.03 [3]. In the receive direction, the inverse operations take place.

2 References

The following documents contain provisions which, through reference in this text, constitute provisions of the present document.

References are either specific (identified by date of publication, edition number, version number, etc.) or non-specific.
For a specific reference, subsequent revisions do not apply.
For a non-specific reference, the latest version applies. In the case of a reference to a 3GPP document (including a GSM document), a non-specific reference implicitly refers to the latest version of that document in the same Release as the present document.

[1] GSM 01.04: "Digital cellular telecommunications system (Phase 2+); Abbreviations and acronyms".
[2] GSM 03.50: "Digital cellular telecommunications system (Phase 2+); Transmission planning aspects of the speech service in the GSM Public Land Mobile Network (PLMN) system".
[3] GSM 05.03: "Digital cellular telecommunications system (Phase 2+); Channel coding".
[4] GSM 06.32: "Digital cellular telecommunications system (Phase 2+); Voice Activity Detection (VAD)".
[5] GSM 06.51: "Digital cellular telecommunications system (Phase 2+); Enhanced Full Rate (EFR) speech processing functions General description".
[6] GSM 06.53: "Digital cellular telecommunications system (Phase 2+); ANSI-C code for the GSM Enhanced Full Rate (EFR) speech codec".
[7] GSM 06.54: "Digital cellular telecommunications system (Phase 2+); Test vectors for the GSM Enhanced Full Rate (EFR) speech codec".
[8] ITU-T Recommendation G.711 (1988): "Coding of analogue signals by pulse code modulation Pulse code modulation (PCM) of voice frequencies".
[9] ITU-T Recommendation G.726: "40, 32, 24, 16 kbit/s adaptive differential pulse code modulation (ADPCM)".

3 Definitions, symbols and abbreviations

3.1 Definitions

For the purposes of the present document, the following terms and definitions apply:

adaptive codebook: adaptive codebook contains excitation vectors that are adapted for every subframe. The adaptive codebook is derived from the long term filter state. The lag value can be viewed as an index into the adaptive codebook.

adaptive postfilter: this filter is applied to the output of the short term synthesis filter to enhance the perceptual quality of the reconstructed speech. In the GSM enhanced full rate codec, the adaptive postfilter is a cascade of two filters: a formant postfilter and a tilt compensation filter.

algebraic codebook: fixed codebook where algebraic code is used to populate the excitation vectors (innovation vectors). The excitation contains a small number of nonzero pulses with predefined interlaced sets of positions.

closed-loop pitch analysis: this is the adaptive codebook search, i.e., a process of estimating the pitch (lag) value from the weighted input speech and the long term filter state. In the closed-loop search, the lag is searched using an error minimization loop (analysis-by-synthesis). In the GSM enhanced full rate codec, closed-loop pitch search is performed for every subframe.

direct form coefficients: one of the formats for storing the short term filter parameters. In the GSM enhanced full rate codec, all filters which are used to modify speech samples use direct form coefficients.

fixed codebook: fixed codebook contains excitation vectors for speech synthesis filters. The contents of the codebook are non-adaptive (i.e., fixed). In the GSM enhanced full rate codec, the fixed codebook is implemented using an algebraic codebook.

fractional lags: set of lag values having sub-sample resolution. In the GSM enhanced full rate codec a sub-sample resolution of 1/6th of a sample is used.

frame: time interval equal to 20 ms (160 samples at an 8 kHz sampling rate).

integer lags: set of lag values having whole sample resolution.

interpolating filter: FIR filter used to produce an estimate of sub-sample resolution samples, given an input sampled with integer sample resolution.

inverse filter: this filter removes the short term correlation from the speech signal. The filter models an inverse frequency response of the vocal tract.

lag: long term filter delay. This is typically the true pitch period, or a multiple or sub-multiple of it.

Line Spectral Frequencies: (see Line Spectral Pair).

Line Spectral Pair: transformation of LPC parameters. Line Spectral Pairs are obtained by decomposing the inverse filter transfer function A(z) to a set of two transfer functions, one having even symmetry and the other having odd symmetry. The Line Spectral Pairs (also called Line Spectral Frequencies) are the roots of these polynomials on the z-unit circle.

LP analysis window: for each frame, the short term filter coefficients are computed using the high pass filtered speech samples within the analysis window. In the GSM enhanced full rate codec, the length of the analysis window is 240 samples. For each frame, two asymmetric windows are used to generate two sets of LP coefficients. No samples of the future frames are used (no lookahead).

LP coefficients: Linear Prediction (LP) coefficients (also referred to as Linear Predictive Coding (LPC) coefficients) is a generic descriptive term for describing the short term filter coefficients.

open-loop pitch search: process of estimating the near optimal lag directly from the weighted speech input. This is done to simplify the pitch analysis and confine the closed-loop pitch search to a small number of lags around the open-loop estimated lags. In the GSM enhanced full rate codec, open-loop pitch search is performed every 10 ms.

residual: output signal resulting from an in
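As an illustration of the open-loop pitch search defined above, the following minimal Python sketch picks the lag that maximises a normalised correlation of the weighted speech with its own past. It is not the procedure of clause 5.3 or the ANSI-C code of GSM 06.53 [6], which are not reproduced in this excerpt and are more elaborate. The function name open_loop_lag, the use of NumPy, the floating-point arithmetic and the default lag range of 18 to 143 samples are assumptions made for the example.

```python
import numpy as np

def open_loop_lag(weighted, lag_min=18, lag_max=143):
    """Return the integer lag in [lag_min, lag_max] with the highest
    normalised correlation between the signal and its delayed copy.

    Hypothetical helper for illustration only; the default lag range is an
    assumption, not a value taken from the text above.
    """
    x = np.asarray(weighted, dtype=float)
    n = len(x)
    if n <= lag_max:
        raise ValueError("need more than lag_max samples of weighted speech")

    cur = x[lag_max:]                      # samples that have a full history
    best_lag, best_score = lag_min, -np.inf
    for lag in range(lag_min, lag_max + 1):
        past = x[lag_max - lag:n - lag]    # same segment, delayed by `lag`
        energy = float(np.dot(past, past))
        if energy <= 0.0:
            continue                       # silent history: no usable estimate
        score = float(np.dot(cur, past)) / np.sqrt(energy)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

# Usage: a sinusoid with a 40-sample period, searched over lags 18..70 so
# that only one multiple of the period lies inside the search range.
signal = np.sin(2 * np.pi * np.arange(400) / 40)
print(open_loop_lag(signal, 18, 70))
```

A closed-loop search would then examine only a small number of lags around this estimate, as the definition describes. Since the open-loop search is performed every 10 ms, it would run twice per 20 ms frame.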

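The frame and bit-rate figures quoted in the Scope and in the frame definition follow from one another by simple arithmetic. The short Python check below works through them; the constant names are chosen for the example and do not come from the specification.

```python
# Figures taken from the Scope and the "frame" definition.
SAMPLE_RATE_HZ = 8000      # 8 000 sample/s
SAMPLES_PER_FRAME = 160    # input block size
BITS_PER_SAMPLE = 13       # 13-bit uniform PCM
BITS_PER_FRAME = 244       # encoded block size

frame_ms = 1000 * SAMPLES_PER_FRAME / SAMPLE_RATE_HZ      # frame duration
frames_per_second = SAMPLE_RATE_HZ // SAMPLES_PER_FRAME   # frames per second
encoded_bps = BITS_PER_FRAME * frames_per_second          # encoded bit rate
input_bps = BITS_PER_SAMPLE * SAMPLE_RATE_HZ              # uniform PCM bit rate

print(f"frame duration: {frame_ms} ms")
print(f"encoded rate:   {encoded_bps / 1000} kbit/s")
print(f"input rate:     {input_bps / 1000} kbit/s")
```

The encoded rate works out to 12.2 kbit/s (written 12,2 kbit/s in the specification), against 104 kbit/s for the 13-bit uniform PCM input.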