
ETSI GSM 06.32-1994: European Digital Cellular Telecommunications System (Phase 2); Voice Activity Detection (VAD) (ETS 300 580-6, Version 4.1.0)

Released: 15 February 1994

GSM 06.32
Version: 4.0.3
Date: 21 January 1994

Source: ETSI TC-SMG
Reference: GSM 06.32
UDC: 621.396.21
Key words: European digital cellular telecommunications system, Global System for Mobile communications (GSM)

European digital cellular telecommunications system (Phase 2); Voice Activity Detection (VAD) (GSM 06.32)

ETSI
European Telecommunications Standards Institute
ETSI Secretariat
Postal address: 06921 Sophia Antipolis Cedex - FRANCE
Office address: Route des Lucioles - Sophia Antipolis - Valbonne - FRANCE
Tel.: + 33 92 94 42 00 - Fax: + 33 93 65 47 16

European Telecommunications Standards Institute 1994. All rights reserved. No part may be reproduced except as authorised by written permission. The copyright and the foregoing restriction on reproduction extend to all media in which the information may be embodied.

GSM 06.32 version 4.0.3: January 1994

Whilst every care has been taken in the preparation and publication of this document, errors in content, typographical or otherwise, may occur. If you have comments concerning its accuracy, please write to "ETSI Editing and Standards Approval Dept." at the address shown on the title page.

Contents

Foreword
0.1 Scope
0.2 Normative references
0.3 Definitions and abbreviations
1 General
2 Functional description
2.1 Overview and principles of operation
2.2 Algorithm description
2.2.1 Adaptive filtering and energy computation
2.2.2 ACF averaging
2.2.3 Predictor values computation
2.2.4 Spectral comparison
2.2.5 Periodicity detection
2.2.6 Threshold adaptation
2.2.7 VAD decision
2.2.8 VAD hangover addition
3 Computational details
3.1 Adaptive filtering and energy computation
3.2 ACF averaging
3.3 Predictor values computation
3.3.1 Schur recursion to compute reflection coefficients
3.3.2 Step-up procedure to obtain the aav1[0..8]
3.3.3 Computation of the rav1[0..8]
3.4 Spectral comparison
3.5 Periodicity detection
3.6 Threshold adaptation
3.7 VAD decision
3.8 VAD hangover addition
3.9 Periodicity updating
4 Digital test sequences
4.1 Test configuration
4.2 Test sequences
Annex 1 (informative): Simplified block filtering operation
Annex 2 (informative): Description of digital test sequences
A2.1 Test sequences
A2.2 File format description
Annex 3 (informative): VAD performance

Foreword

This space is reserved for the foreword of future versions of this document.

0.1 Scope

This technical specification specifies the voice activity detector (VAD) to be used in the Discontinuous Transmission (DTX) as described in GSM 06.31. It also specifies the test methods to be used to verify that a VAD complies with the technical specification.

The requirements are mandatory on any VAD to be used either in the GSM Mobile Stations or Base Station Systems.

0.2 Normative references

This ETS incorporates by dated and undated reference, provisions from other publications. These normative references are cited at the appropriate places in the text and the publications are listed hereafter. For dated references, subsequent amendments to or revisions of any of these publications apply to this ETS only when incorporated in it by amendment or revision. For undated references, the latest edition of the publication referred to applies.

[1] GSM 01.04: "European digital cellular telecommunication system (Phase 2); Definitions, abbreviations and acronyms".
[2] GSM 06.10: "European digital cellular telecommunication system (Phase 2); Full rate speech transcoding".
[3] GSM 06.12: "European digital cellular telecommunication system (Phase 2); Comfort noise aspect for full rate speech traffic channels".
[4] GSM 06.31: "European digital cellular telecommunication system (Phase 2); Discontinuous Transmission (DTX) for full rate speech traffic channel".

0.3 Definitions and abbreviations

Definitions and abbreviations used in this specification are listed in GSM 01.04.

1 General

The function of the VAD is to indicate whether each 20 ms frame produced by the speech encoder contains speech or not. The output is a binary flag which is used by the TX DTX handler defined in GSM 06.31.

The technical specification is organised as follows:

Section 2 describes the principles of operation of the VAD.

In section 3, the computational details necessary for the fixed point implementation of the VAD algorithm are given. This section uses the same notation as used for computational details in GSM 06.10.

The verification of the VAD is based on the use of digital test sequences. Section 4 defines the input and output signals and the test configuration, whereas the detailed description of the test sequences is contained in annex 2.

The performance of the VAD algorithm is characterised by the amount of audible speech clipping it introduces and the percentage activity it indicates. These characteristics for the VAD defined in this technical specification have been established by extensive testing under a wide range of operating conditions. The results are summarised in annex 3.

2 Functional description

The purpose of this section is to give the reader an understanding of the principles of operation of the VAD, whereas the detailed description is given in section 3. In case of discrepancy between the two descriptions, the detailed description of section 3 shall prevail.

In the following subsections of section 2, a Pascal programming type of notation has been used to describe the algorithm.

2.1 Overview and principles of operation

The function of the VAD is to distinguish between noise with speech present and noise without speech present. The biggest difficulty for detecting speech in a mobile environment is the very low speech/noise ratios which are often encountered. The accuracy of the VAD is improved by using filtering to increase the speech/noise ratio before the decision is made.

For a mobile environment, the worst speech/noise ratios are encountered in moving vehicles. It has been found that the noise is relatively stationary for quite long periods in a mobile environment. It is therefore possible to use an adaptive filter, with coefficients obtained during noise, to remove much of the vehicle noise.

The VAD is basically an energy detector. The energy of the filtered signal is compared with a threshold; speech is indicated whenever the threshold is exceeded.

The noise encountered in mobile environments may be constantly changing in level. The spectrum of the noise can also change, and varies greatly over different vehicles. Because of these changes the VAD threshold and adaptive filter coefficients must be constantly adapted. To give reliable detection the threshold must be sufficiently above the noise level to avoid noise being identified as speech, but not so far above it that low level parts of speech are identified as noise. The threshold and the adaptive filter coefficients are only updated when speech is not present. It is, of course, potentially dangerous for a VAD to update these values on the basis of its own decision. This adaptation therefore only occurs when the signal seems stationary in the frequency domain but does not have the pitch component inherent in voiced speech and information tones.

A further mechanism is used to ensure that low level noise (which is often not stationary over long periods) is not detected as speech. Here, an additional fixed threshold is used.

A VAD hangover period is used to eliminate mid-burst clipping of low level speech. Hangover is only added to speech-bursts which exceed a certain duration to avoid extending noise spikes.

2.2 Algorithm description

The block diagram of the VAD algorithm is shown in Figure 2-1. The individual blocks are described in the following sections. ACF and N are calculated in the speech encoder.

Figure 2-1. Functional block diagram of the VAD (blocks: adaptive filtering and energy computation, ACF averaging, predictor values computation, spectral comparison, periodicity detection, threshold adaptation, VAD decision, VAD hangover addition)

The global variables shown in the block diagram are described as follows:

- ACF are autocorrelation coefficients which are calculated in the speech encoder defined in GSM 06.10 (section 3.1.4, see also annex 1). The inputs to the speech encoder are 16 bit 2's complement numbers, as described in GSM 06.10, section 4.2.0.
- av0 and av1 are averaged ACF vectors.
- rav1 are autocorrelated predictor values obtained from av1.
- rvad are the autocorrelated predictor values of the adaptive filter.
- N is the long term predictor lag value which is obtained every subsegment in the speech coder defined in GSM 06.10.
- ptch indicates whether the signal has a steady periodic component.
- pvad is the energy in the current frame of the input signal after filtering.
- thvad is an adaptive threshold.
- stat indicates spectral stationarity.
- vvad indicates the VAD decision before hangover is added.
- vad is the final VAD decision with hangover included.

2.2.1 Adaptive filtering and energy computation

pvad is computed as follows:

    pvad := rvad[0]*ACF[0] + 2 * SUM(i = 1..8) rvad[i]*ACF[i]

This corresponds to performing an 8th order block filtering on the input samples to the speech encoder, after zero offset compensation and pre-emphasis. This is explained in annex 1.

2.2.2 ACF averaging

Spectral characteristics of the input signal have to be obtained using blocks that are larger than one 20 ms frame. This is done by averaging the autocorrelation values for several consecutive frames. This averaging is given by the following equations:

    av0{n}[i] := SUM(j = 0..frames-1) ACF{n-j}[i]    ; i = 0..8
    av1{n}[i] := av0{n-frames}[i]                    ; i = 0..8

Where n represents the current frame, n-1 represents the previous frame etc. The values of constants are given in table 2-1.

Table 2-1. Constants and variables for ACF averaging

2.2.3 Predictor values computation

The filter predictor values aav1 are obtained from the autocorrelation values av1 (the computational procedure is given in section 3.3), with:

    aav1[0] := -1

av1 is used in preference to av0 as av0 may contain speech.

The autocorrelated predictor values rav1 are then obtained:

    rav1[i] := SUM(k = 0..8-i) aav1[k]*aav1[k+i]    ; i = 0..8

2.2.4 Spectral comparison

The spectra represented by the autocorrelated predictor values rav1 and the averaged autocorrelation values av0 are compared using the distortion measure dm defined below. This measure is used to produce a boolean value stat every 20 ms, as given by these equations:

    dm := (rav1[0]*av0[0] + 2 * SUM(i = 1..8) rav1[i]*av0[i]) / av0[0]
    difference := |dm - lastdm|
    lastdm := dm
    stat := difference < thresh

2.2.5 Periodicity detection

    ptch := oldlagcount + veryoldlagcount >= nthresh

The following operations are done after the VAD decision and when the current LTP lag values (N[0] .. N[3]) are available; this reduces the delay of the VAD decision. (N[-1] = N[3] of previous segment.)

    lagcount := 0
    for j := 0 to 3 do
    begin
        smallag := maximum(N[j],N[j-1]) mod minimum(N[j],N[j-1])
        if minimum(smallag, minimum(N[j],N[j-1]) - smallag) < lthresh then
            increment(lagcount)
    end
    veryoldlagcount := oldlagcount
    oldlagcount := lagcount
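The predictor autocorrelation, spectral comparison and lag counting described above can be sketched in Python as follows. This is a minimal floating-point illustration under stated assumptions, not the fixed-point procedure of section 3. The function names are invented for this example, stat is assumed to be true when the difference is below its threshold, and the threshold values are left to the caller because the tables of constants are not reproduced in this text.

```python
from typing import List, Sequence, Tuple


def autocorrelate_predictor(aav1: Sequence[float]) -> List[float]:
    """rav1[i] = sum over k = 0..order-i of aav1[k] * aav1[k+i]."""
    order = len(aav1) - 1  # 8 when aav1 holds aav1[0..8]
    return [
        sum(aav1[k] * aav1[k + i] for k in range(order - i + 1))
        for i in range(order + 1)
    ]


def spectral_comparison(
    rav1: Sequence[float],
    av0: Sequence[float],
    lastdm: float,
    thresh: float,  # hypothetical value supplied by the caller
) -> Tuple[bool, float]:
    """Return (stat, dm); the caller stores dm as lastdm for the next frame."""
    cross = sum(rav1[i] * av0[i] for i in range(1, len(rav1)))
    dm = (rav1[0] * av0[0] + 2.0 * cross) / av0[0]
    # Assumption: the signal counts as stationary when dm barely moved.
    stat = abs(dm - lastdm) < thresh
    return stat, dm


def count_small_lags(
    lags: Sequence[int],   # N[0..3] of the current frame
    previous_lag: int,     # N[-1], i.e. N[3] of the previous frame
    lthresh: int,          # hypothetical value supplied by the caller
) -> int:
    """Count lag pairs that are close to integer multiples of each other."""
    lagcount = 0
    prev = previous_lag
    for lag in lags:
        larger, smaller = max(lag, prev), min(lag, prev)
        # Assumption: lags are strictly positive, so the modulo is defined.
        smallag = larger % smaller
        if min(smallag, smaller - smallag) < lthresh:
            lagcount += 1
        prev = lag
    return lagcount
```

The caller keeps lastdm, oldlagcount and veryoldlagcount between frames; the functions themselves hold no state, which makes each step easy to check in isolation.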

The values of constants and initial values are given in table 2-3.

Table 2-3. Constants and variables for periodicity detection

2.2.6 Threshold adaptation

A check is made every 20 ms to determine whether the VAD decision threshold (thvad) should be changed. This adaptation is carried out according to the flowchart shown in fig 2-2. The constants used are given in table 2-4.

Adaptation takes place in two different situations: firstly whenever ACF[0] is very low and secondly whenever there is a very high probability that speech is not present.

In the first case, the threshold is adapted if the energy of the input signal is less than pth. The threshold is set to plev without carrying out any further tests because at these very low levels the effect of the signal quantization makes it impossible to obtain reliable results from these tests.

In the second case, the decision threshold (thvad) is adapted, and the adaptive filter coefficients (rvad) are updated with the rav1 values, only when the signal is stationary and has no periodic component. In this situation there is a very high probability that speech is not present.

The stationarity is detected in the frequency domain, by calculating the spectral difference using consecutive averaged ACF values. If this spectral difference changes very little over a certain number of frames (adp), and the signal does not have a periodic component inherent in voiced speech and information tones, then adaptation occurs.

The step size by which the threshold is adapted is not constant but a proportion of the current value (determined by constants dec and inc). The adaptation begins by experimentally multiplying the threshold by a factor of (1-1/dec). If the new threshold is now higher than or equal to pvad times fac then the threshold needed to be decreased and it is left at this new lower level. If, on the other hand, the new threshold level is less than pvad times fac then the threshold either needed to be increased or kept constant. In this case it is set to pvad times fac unless this would mean multiplying it by more than a factor of (1+1/inc), in which case it is multiplied by a factor of (1+1/inc). The threshold is never allowed to be greater than pvad+margin.

Table 2-4. Constants and variables for threshold adaptation
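The adaptation rules above can be sketched in Python as follows. This is a minimal floating-point illustration, not the fixed-point flowchart of the specification. The class and function names are invented for this example, every constant value is a placeholder (the contents of table 2-4 are not reproduced in this text), and the way the frame counter is reset and capped is an assumption made to model "a certain number of frames (adp)".

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class AdaptConstants:
    # Placeholder values for illustration only; they are NOT table 2-4.
    pth: float = 100.0       # energy below which the threshold is forced
    plev: float = 500.0      # threshold value used in that low-energy case
    fac: float = 3.0         # target ratio between thvad and pvad
    adp: int = 4             # stationary frames required before adapting
    inc: float = 16.0        # limits upward steps to a factor (1 + 1/inc)
    dec: float = 32.0        # downward step is a factor (1 - 1/dec)
    margin: float = 1.0e6    # thvad may never exceed pvad + margin


def adapt_threshold(
    thvad: float,
    pvad: float,
    acf0: float,
    stat: bool,
    ptch: bool,
    adaptcount: int,
    c: AdaptConstants,
) -> Tuple[float, int, bool]:
    """Return (new thvad, new adaptcount, whether rvad should take rav1)."""
    # First case: very low input energy, no further tests are made.
    if acf0 < c.pth:
        return c.plev, adaptcount, False

    # Second case needs a stationary signal with no periodic component.
    if not stat or ptch:
        return thvad, 0, False  # assumption: the frame counter restarts

    adaptcount += 1
    if adaptcount <= c.adp:
        return thvad, adaptcount, False

    # Try the lower threshold first.
    new_thvad = thvad * (1.0 - 1.0 / c.dec)
    if new_thvad < pvad * c.fac:
        # Raise towards pvad * fac, but by no more than a factor (1 + 1/inc).
        new_thvad = min(new_thvad * (1.0 + 1.0 / c.inc), pvad * c.fac)
    # Upper bound relative to the current filtered energy.
    new_thvad = min(new_thvad, pvad + c.margin)
    return new_thvad, c.adp + 1, True  # assumption: counter is capped
```

Returning the new values instead of mutating globals keeps the sketch easy to test; a caller would copy rav1 into rvad whenever the third element of the result is true.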

Fig 2-2. Flow diagram for threshold adaptation

2.2.7 VAD decision

Prior to hangover the VAD decision condition is:

    vvad := pvad > thvad

2.2.8 VAD hangover addition

VAD hangover is only added to bursts of speech greater than or equal to burstconst blocks. The boolean variable vad indicates the decision of the VAD with hangover included. The values of the constants are given in table 2-5. The hangover algorithm is as follows:

    if vvad then increment(burstcount) else burstcount := 0
    if burstcount >= burstconst then
    begin
        hangcount := hangconst;
        burstcount := burstconst
    end
    vad := vvad or (hangcount
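The energy computation, the decision and the burst counting can be sketched in Python as follows. This is a minimal floating-point illustration with invented function and class names and placeholder constants (table 2-5 is not reproduced in this text). The listing above breaks off partway through its final assignment, so the last step of the sketch is an assumption: the flag is held while a hypothetical countdown in hangcount is non-negative, and the countdown is decremented once per frame.

```python
from dataclasses import dataclass
from typing import Sequence


def filtered_energy(rvad: Sequence[float], acf: Sequence[float]) -> float:
    """pvad = rvad[0]*ACF[0] + 2 * sum over i >= 1 of rvad[i]*ACF[i]."""
    cross = sum(r * a for r, a in zip(rvad[1:], acf[1:]))
    return rvad[0] * acf[0] + 2.0 * cross


@dataclass
class HangoverState:
    burstcount: int = 0
    hangcount: int = -1  # assumption: a negative value means no hangover


def vad_with_hangover(
    pvad: float,
    thvad: float,
    state: HangoverState,
    burstconst: int = 3,  # placeholder, not the value from table 2-5
    hangconst: int = 2,   # placeholder, not the value from table 2-5
) -> bool:
    """Return the final flag for one frame and update the state in place."""
    # Decision before hangover: speech when the threshold is exceeded.
    vvad = pvad > thvad

    if vvad:
        state.burstcount += 1
    else:
        state.burstcount = 0

    # Only bursts of at least burstconst frames arm the hangover.
    if state.burstcount >= burstconst:
        state.hangcount = hangconst
        state.burstcount = burstconst

    # ASSUMPTION: completion of the truncated final assignment.
    vad = vvad or state.hangcount >= 0
    if state.hangcount >= 0:
        state.hangcount -= 1
    return vad
```

With these placeholder constants, a three-frame burst keeps the flag raised for two further frames, while a single frame above the threshold gets no extension.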
