ETSI TS 126 194-2016 Digital cellular telecommunications system (Phase 2+) Universal Mobile Telecommunications System (UMTS) LTE Speech codec speech processing functions Adaptive M.pdf

Uploaded by: fuellot230 | Document ID: 741835 | Upload date: 2019-01-11 | Format: PDF | Pages: 17 | Size: 154.26 KB
Document description

ETSI TS 126 194 V13.0.0 (2016-01)

TECHNICAL SPECIFICATION

Digital cellular telecommunications system (Phase 2+);
Universal Mobile Telecommunications System (UMTS);
LTE;
Speech codec speech processing functions;
Adaptive Multi-Rate - Wideband (AMR-WB) speech codec;
Voice Activity Detector (VAD)
(3GPP TS 26.194 version 13.0.0 Release 13)

Reference: RTS/TSGS-0426194vd00
Keywords: GSM, LTE, UMTS

ETSI
650 Route des Lucioles, F-06921 Sophia Antipolis Cedex - FRANCE
Tel.: +33 4 92 94 42 00  Fax: +33 4 93 65 47 16
Siret N° 348 623 562 00017 - NAF 742 C
Non-profit association (Association à but non lucratif) registered at the Sous-Préfecture de Grasse (06), N° 7803/88

Important notice

The present document can be downloaded from: http://www.etsi.org/standards-search

The present document may be made available in electronic versions and/or in print. The content of any electronic and/or print versions of the present document shall not be modified without the prior written authorization of ETSI. In case of any existing or perceived difference in contents between such versions and/or in print, the only prevailing document is the print of the Portable Document Format (PDF) version kept on a specific network drive within ETSI Secretariat.

Users of the present document should be aware that the document may be subject to revision or change of status. Information on the current status of this and other ETSI documents is available at http://portal.etsi.org/tb/status/status.asp

If you find errors in the present document, please send your comment to one of the following services: https://portal.etsi.org/People/CommiteeSupportStaff.aspx

Copyright Notification

No part may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm except as authorized by written permission of ETSI. The content of the PDF version shall not be modified without the written authorization of ETSI. The copyright and the foregoing restriction extend to reproduction in all media.

© European Telecommunications Standards Institute 2016. All rights reserved.

DECT™, PLUGTESTS™, UMTS™ and the ETSI logo are Trade Marks of ETSI registered for the benefit of its Members. 3GPP™ and LTE™ are Trade Marks of ETSI registered for the benefit of its Members and of the 3GPP Organizational Partners. GSM and the GSM logo are Trade Marks registered and owned by the GSM Association.

Intellectual Property Rights

IPRs essential or potentially essential to the present document may have been declared to ETSI. The information pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found in ETSI SR 000 314: “Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in respect of ETSI standards”, which is available from the ETSI Secretariat.

Latest updates are available on the ETSI Web server (https://ipr.etsi.org/).

Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web server) which are, or may be, or may become, essential to the present document.

Foreword

This Technical Specification (TS) has been produced by ETSI 3rd Generation Partnership Project (3GPP).

The present document may refer to technical specifications or reports using their 3GPP identities, UMTS identities or GSM identities. These should be interpreted as being references to the corresponding ETSI deliverables.

The cross reference between GSM, UMTS, 3GPP and ETSI identities can be found under http://webapp.etsi.org/key/queryform.asp.

Modal verbs terminology

In the present document “shall”, “shall not”, “should”, “should not”, “may”, “need not”, “will”, “will not”, “can” and “cannot” are to be interpreted as described in clause 3.2 of the ETSI Drafting Rules (Verbal forms for the expression of provisions).

“must” and “must not” are NOT allowed in ETSI deliverables except when used in direct citation.

Contents

Intellectual Property Rights
Foreword
Modal verbs terminology
Foreword
1 Scope
2 Normative References
3 Technical Description
3.1 Definitions, symbols and abbreviations
3.1.1 Definitions
3.1.2 Symbols
3.1.2.1 Variables
3.1.2.2 Constants
3.1.2.3 Functions
3.1.3 Abbreviations
3.2 General
3.3 Functional description
3.3.1 Filter bank and computation of sub-band levels
3.3.2 Tone detection
3.3.3 VAD decision
3.3.3.1 Hangover addition
3.3.3.2 Background noise estimation
3.3.3.3 Speech level estimation
4 Computational details
Annex A (informative): Change history
History

Foreword

This Technical Specification has been produced by the 3GPP.

This document specifies the Voice Activity Detector (VAD) to be used in the Discontinuous Transmission (DTX) as described in [3].

The contents of the present document are subject to continuing work within the TSG and may change following formal TSG approval. Should the TSG modify the contents of this TS, it will be re-released by the TSG with an identifying change of release date and an increase in version number as follows:

Version x.y.z

where:

x the first digit:
  1 presented to TSG for information;
  2 presented to TSG for approval;
  3 indicates TSG approved document under change control.
y the second digit is incremented for all changes of substance, i.e. technical enhancements, corrections, updates, etc.
z the third digit is incremented when editorial only changes have been incorporated in the specification.

1 Scope

This document specifies the Voice Activity Detector (VAD) to be used in the Discontinuous Transmission (DTX) as described in [3]. The requirements are mandatory on any VAD to be used either in User Equipment (UE) or Base Station Systems (BSSs) that utilize the AMR wideband speech codec.

2 Normative References

The following documents contain provisions which, through reference in this text, constitute provisions of the present document.

- References are either specific (identified by date of publication, edition number, version number, etc.) or non-specific.

- For a specific reference, subsequent revisions do not apply.
- For a non-specific reference, the latest version applies. In the case of a reference to a 3GPP document (including a GSM document), a non-specific reference implicitly refers to the latest version of that document in the same Release as the present document.

[1] 3GPP TS 26.173: “ANSI-C code for the Adaptive Multi-Rate Wideband speech codec”.
[2] 3GPP TS 26.190: “Speech codec speech processing functions; Adaptive Multi-Rate - Wideband (AMR-WB) speech codec; Transcoding functions”.
[3] 3GPP TS 26.193: “Speech codec speech processing functions; Adaptive Multi-Rate - Wideband (AMR-WB) speech codec; Source controlled rate operation”.
[4] ITU, The International Telecommunications Union, Blue Book, Vol. III, Telephone Transmission Quality, IXth Plenary Assembly, Melbourne, 14-25 November, 1988, Recommendation G.711, Pulse code modulation (PCM) of voice frequencies.
[5] 3GPP TR 21.905: “Vocabulary for 3GPP Specifications”.

3 Technical Description

3.1 Definitions, symbols and abbreviations

3.1.1 Definitions

For the purposes of the present document, the terms and definitions given in TR 21.905 [5] and the following apply. A term defined in the present document takes precedence over the definition of the same term, if any, in TR 21.905 [5].

frame: Time interval of 20 ms corresponding to the time segmentation of the speech transcoder.

3.1.2 Symbols

For the purposes of this TS, the following symbols apply.

3.1.2.1 Variables

bckr_est[n]     background noise estimate at the frequency band “n”
burst_count     counts length of a speech burst, used by VAD hangover addition
hang_count      hangover counter, used by VAD hangover addition
level[n]        signal level at the frequency band “n”
new_speech      pointer of the speech encoder, points to a buffer containing the last received samples of a speech frame [2]
noise_level     estimated noise level
pow_sum         input power
s(i)            samples of the input frame
snr_sum         measure between input frame and noise estimate
speech_level    estimated speech level
stat_count      stationarity counter
stat_rat        measure indicating stationarity of the input frame
tone_flag       flag indicating the presence of a tone
vad_thr         VAD threshold
VAD_flag        Boolean VAD flag
vadreg          intermediate VAD decision

3.1.2.2 Constants

ALPHA_UP1       constant for updating noise estimate (see subclause 3.3.5.2)
ALPHA_DOWN1     constant for updating noise estimate (see subclause 3.3.5.2)
ALPHA_UP2       constant for updating noise estimate (see subclause 3.3.5.2)
ALPHA_DOWN2     constant for updating noise estimate (see subclause 3.3.5.2)
ALPHA3          constant for updating noise estimate (see subclause 3.3.5.2)
ALPHA4          constant for updating average signal level (see subclause 3.3.5.2)
ALPHA5          constant for updating average signal level (see subclause 3.3.5.2)
BURST_HIGH      constant for controlling VAD hangover addition (see subclause 3.3.5.1)
BURST_P1        constant for controlling VAD hangover addition (see subclause 3.3.5.1)

BURST_SLOPE         constant for controlling VAD hangover addition (see subclause 3.3.5.1)
COEFF3              coefficient for the filter bank (see subclause 3.3.1)
COEFF5_1            coefficient for the filter bank (see subclause 3.3.1)
COEFF5_2            coefficient for the filter bank (see subclause 3.3.1)
HANG_HIGH           constant for controlling VAD hangover addition (see subclause 3.3.5.1)
HANG_LOW            constant for controlling VAD hangover addition (see subclause 3.3.5.1)
HANG_P1             constant for controlling VAD hangover addition (see subclause 3.3.5.1)
HANG_SLOPE          constant for controlling VAD hangover addition (see subclause 3.3.5.1)
FRAME_LEN           size of a speech frame, 256 samples (20 ms)
MIN_SPEECH_LEVEL1   constant for speech estimation (see subclause 3.3.5.3)
MIN_SPEECH_LEVEL2   constant for speech estimation (see subclause 3.3.5.3)
MIN_SPEECH_SNR      constant for VAD threshold adaptation (see subclause 3.3.5)
NO_P1               constant for VAD threshold adaptation (see subclause 3.3.5)
NO_SLOPE            constant for VAD threshold adaptation (see subclause 3.3.5)
NOISE_MAX           maximum value for noise estimate (see subclause 3.3.5.2)
NOISE_MIN           minimum value for noise estimate (see subclause 3.3.5.2)
POW_TONE_THR        threshold for tone detection (see subclause 3.3.5)
SP_ACTIVITY_COUNT   constant for speech estimation (see subclause 3.3.5.3)
SP_ALPHA_DOWN       constant for speech estimation (see subclause 3.3.5.3)
SP_ALPHA_UP         constant for speech estimation (see subclause 3.3.5.3)
SP_CH_MAX           constant for VAD threshold adaptation (see subclause 3.3.5)
SP_CH_MIN           constant for VAD threshold adaptation (see subclause 3.3.5)
SP_EST_COUNT        constant for speech estimation (see subclause 3.3.5.3)
SP_P1               constant for VAD threshold adaptation (see subclause 3.3.5)
SP_SLOPE            constant for VAD threshold adaptation (see subclause 3.3.5)
STAT_COUNT          threshold for stationary detection (see subclause 3.3.5.2)
STAT_THR            threshold for stationary detection (see subclause 3.3.5.2)
STAT_THR_LEVEL      threshold for stationary detection (see subclause 3.3.5.2)
THR_HIGH            constant for VAD threshold adaptation (see subclause 3.3.5)
TONE_THR            threshold for tone detection (see subclause 3.3.3)
VAD_POW_LOW         constant for controlling VAD hangover addition (see subclause 3.3.5.1)

3.1.2.3 Functions

+        Addition
-        Subtraction
*        Multiplication
/        Division
| x |    absolute value of x
AND      Boolean AND
OR       Boolean OR

\sum_{n=a}^{b} x(n) = x(a) + x(a+1) + \ldots + x(b-1) + x(b)

\mathrm{MIN}(x, y) = \begin{cases} x, & x \le y \\ y, & y < x \end{cases}

3.1.3 Abbreviations

For the purposes of the present document, the abbreviations given in TR 21.905 [5] and the following apply. An abbreviation defined in the present document takes precedence over the definition of the same abbreviation, if any, in TR 21.905 [5].

ANSI    American National Standards Institute
DTX     Discontinuous Transmission
VAD     Voice Activity Detector
CNG     Comfort Noise Generation

3.2 General

The function of the VAD algorithm is to indicate whether each 20 ms frame contains signals that should be transmitted, e.g. speech, music or information tones.

The output of the VAD algorithm is a Boolean flag (VAD_flag) indicating presence of such signals.

3.3 Functional description

The block diagram of the VAD algorithm is depicted in Figure 1. The VAD algorithm uses parameters of the speech encoder to compute the Boolean VAD flag (VAD_flag). The input frame for the VAD is sampled at 12.8 kHz and thus contains 256 samples per 20 ms frame. Samples of the input frame (s(i)) are divided into sub-bands and the level of the signal (level[n]) in each band is calculated. The inputs to the tone detection function are the normalized open-loop pitch gains, which are calculated by the open-loop pitch analysis of the speech encoder. The tone detection function computes a flag (tone_flag) which indicates the presence of a signalling tone, voiced speech, or other strongly periodic signal. The background noise level (bckr_est[n]) is estimated in each band based on the VAD decision, signal stationarity and the tone_flag. The intermediate VAD decision is calculated by comparing the input SNR (level[n]/bckr_est[n]) to an adaptive threshold. The threshold is adapted based on noise and long term speech estimates. Finally, the VAD flag is calculated by adding hangover to the intermediate VAD decision.

[Figure 1: Simplified block diagram of the VAD algorithm. Blocks: filter bank and computation of sub-band levels; tone detection; VAD decision. Signals: s(i), ol_gain, level[n], tone_flag, VAD_flag.]

3.3.1 Filter bank and computation of sub-band levels

The input signal is divided into frequency bands using a 12-band filter bank (Figure 2). Cut-off frequencies for the filter bank are shown in Table 1.

Table 1: Cut-off frequencies for the filter bank

Band number    Frequencies
1              0 - 200 Hz
2              200 - 400 Hz
3              400 - 600 Hz
4              600 - 800 Hz
5              800 - 1200 Hz
6              1200 - 1600 Hz
7              1600 - 2000 Hz
8              2000 - 2400 Hz
9              2400 - 3200 Hz
10             3200 - 4000 Hz
11             4000 - 4800 Hz
12             4800 - 6400 Hz

Input for the filter bank is a speech frame pointed to by the new_speech pointer of the speech encoder [1]. Input values for the filter bank are scaled down by one bit. This ensures safe scaling, i.e. saturation cannot occur during calculation of the filter bank.

[Figure 2: filter bank built from 5th order and 3rd order filter blocks]
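
The numbers used in this subclause can be cross-checked with a short Python sketch. It derives the sampling rate implied by a 256-sample, 20 ms frame, looks up the Table 1 band for a given frequency, and applies a one-bit down-scaling to the input samples. This is only an illustration under stated assumptions: the names SAMPLE_RATE_HZ, BAND_UPPER_HZ, band_for_frequency and scale_down_one_bit are hypothetical and do not come from the specification or the reference C code; treating each band as a half-open interval and realising "scaled down by one bit" as an arithmetic right shift are likewise assumptions.

```python
from bisect import bisect_right

FRAME_LEN = 256          # samples per frame (from the constants list)
FRAME_DURATION_MS = 20   # frame duration (from the definition of "frame")

# Sampling rate implied by the two values above: 256 samples per 20 ms.
SAMPLE_RATE_HZ = FRAME_LEN * 1000 // FRAME_DURATION_MS

# Upper cut-off frequency of bands 1..12, copied from Table 1.
BAND_UPPER_HZ = [200, 400, 600, 800, 1200, 1600,
                 2000, 2400, 3200, 4000, 4800, 6400]


def band_for_frequency(freq_hz):
    """Return the Table 1 band number (1..12) that contains freq_hz.

    Assumption: each band is treated as the half-open interval
    [lower, upper), so 200 Hz belongs to band 2, not band 1.
    """
    if not 0 <= freq_hz < BAND_UPPER_HZ[-1]:
        raise ValueError("frequency outside the 0..6400 Hz range of Table 1")
    return bisect_right(BAND_UPPER_HZ, freq_hz) + 1


def scale_down_one_bit(samples):
    """Halve each integer sample with an arithmetic right shift.

    Assumption: this is one plausible reading of "scaled down by one bit";
    the reference implementation is not reproduced here.
    """
    return [s >> 1 for s in samples]


if __name__ == "__main__":
    print("implied sampling rate:", SAMPLE_RATE_HZ, "Hz")
    print("band for 1000 Hz:", band_for_frequency(1000))
    print("scaled samples:", scale_down_one_bit([4, -4, 32767]))
```

The top band edge of 6400 Hz equals half of the implied sampling rate, which is consistent with the 12 bands covering the whole spectrum of the input frame.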
