ETSI TS 146 032 V13.0.0 (2016-01): Digital cellular telecommunications system (Phase 2+); Full rate speech; Voice Activity Detector (VAD) for full rate speech traffic channels (3GPP TS 46.032 version 13.0.0 Release 13)

ETSI TS 146 032 V13.0.0 (2016-01)

TECHNICAL SPECIFICATION

Digital cellular telecommunications system (Phase 2+);
Full rate speech;
Voice Activity Detector (VAD) for full rate speech traffic channels
(3GPP TS 46.032 version 13.0.0 Release 13)

GLOBAL SYSTEM FOR MOBILE COMMUNICATIONS

Reference
RTS/TSGS-0446032vd00

Keywords
GSM

ETSI
650 Route des Lucioles
F-06921 Sophia Antipolis Cedex - FRANCE
Tel.: +33 4 92 94 42 00 Fax: +33 4 93 65 47 16
Siret N° 348 623 562 00017 - NAF 742 C
Association à but non lucratif enregistrée à la Sous-Préfecture de Grasse (06) N° 7803/88

Important notice

The present document can be downloaded from: http://www.etsi.org/standards-search

The present document may be made available in electronic versions and/or in print. The content of any electronic and/or print versions of the present document shall not be modified without the prior written authorization of ETSI. In case of any existing or perceived difference in contents between such versions and/or in print, the only prevailing document is the print of the Portable Document Format (PDF) version kept on a specific network drive within ETSI Secretariat.

Users of the present document should be aware that the document may be subject to revision or change of status. Information on the current status of this and other ETSI documents is available at http://portal.etsi.org/tb/status/status.asp

If you find errors in the present document, please send your comment to one of the following services: https://portal.etsi.org/People/CommiteeSupportStaff.aspx

Copyright Notification

No part may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm except as authorized by written permission of ETSI. The content of the PDF version shall not be modified without the written authorization of ETSI. The copyright and the foregoing restriction extend to reproduction in all media.

© European Telecommunications Standards Institute 2016. All rights reserved.

DECT™, PLUGTESTS™, UMTS™ and the ETSI logo are Trade Marks of ETSI registered for the benefit of its Members. 3GPP™ and LTE are Trade Marks of ETSI registered for the benefit of its Members and of the 3GPP Organizational Partners. GSM and the GSM logo are Trade Marks registered and owned by the GSM Association.

Intellectual Property Rights

IPRs essential or potentially essential to the present document may have been declared to ETSI. The information pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found in ETSI SR 000 314: “Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in respect of ETSI standards”, which is available from the ETSI Secretariat. Latest updates are available on the ETSI Web server (https://ipr.etsi.org/).

Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web server) which are, or may be, or may become, essential to the present document.

Foreword

This Technical Specification (TS) has been produced by ETSI 3rd Generation Partnership Project (3GPP).

The present document may refer to technical specifications or reports using their 3GPP identities, UMTS identities or GSM identities. These should be interpreted as being references to the corresponding ETSI deliverables.

The cross reference between GSM, UMTS, 3GPP and ETSI identities can be found under http://webapp.etsi.org/key/queryform.asp.

Modal verbs terminology

In the present document “shall”, “shall not”, “should”, “should not”, “may”, “need not”, “will”, “will not”, “can” and “cannot” are to be interpreted as described in clause 3.2 of the ETSI Drafting Rules (Verbal forms for the expression of provisions).

“must” and “must not” are NOT allowed in ETSI deliverables except when used in direct citation.

Contents

Intellectual Property Rights
Foreword
Modal verbs terminology
Foreword
1 Scope
2 References
3 Abbreviations
4 General
5 Functional description
5.1 Overview and principles of operation
5.2 Algorithm description
5.2.1 Adaptive filtering and energy computation
5.2.2 ACF averaging
5.2.3 Predictor values computation
5.2.4 Spectral comparison
5.2.5 Periodicity detection
5.2.6 Information tone detection
5.2.7 Threshold adaptation
5.2.8 VAD decision
5.2.9 VAD hangover addition
6 Computational details
6.1 Adaptive filtering and energy computation
6.2 ACF averaging
6.3 Predictor values computation
6.3.1 Schur recursion to compute reflection coefficients
6.3.2 Step-up procedure to obtain the aav1[0..8]
6.3.3 Computation of the rav1[0..8]
6.4 Spectral comparison
6.5 Periodicity detection
6.6 Threshold adaptation
6.7 VAD decision
6.8 VAD hangover addition
6.9 Periodicity updating
6.10 Tone detection
6.10.1 Windowing
6.10.2 Auto-correlation
6.10.3 Computation of the reflection coefficients
6.10.4 Filter coefficient calculation
6.10.5 Pole Frequency Test
6.10.6 Prediction gain test
7 Digital test sequences
7.1 Test configuration
7.2 Test sequences
Annex A (informative):
A.1 Simplified block filtering operation
A.2 Description of digital test sequences
A.2.1 Test sequences
A.2.2 File format description
A.3 VAD performance
A.4 Pole frequency calculation
Annex B (normative): Test sequences
Annex C (informative): Change history
History

Foreword

This Technical Specification has been produced by the 3rd Generation Partnership Project (3GPP).

The present document specifies the Voice Activity Detector (VAD) to be used in the Discontinuous Transmission (DTX) for the digital cellular telecommunications system.

Archive en_300965v080000p0.zip which accompanies the present document, contains test sequences, as described in clause A.2.

en_300965v080000p0.zip Annex B: Test sequences for the GSM Full Rate speech codec; Test sequences files *.inp, *.cod, *.vad.

The specification from which the present document has been derived was originally based on CEPT documentation, hence the presentation of the present document may not be entirely in accordance with the ETSI/PNE Rules.

The contents of the present document are subject to continuing work within the TSG and may change following formal TSG approval. Should the TSG modify the contents of the present document, it will be re-released by the TSG with an identifying change of release date and an increase in version number as follows:

Version x.y.z

where:

x the first digit:
   1 presented to TSG for information;
   2 presented to TSG for approval;
   3 or greater indicates TSG approved document under change control.
y the second digit is incremented for all changes of substance, i.e. technical enhancements, corrections, updates, etc.
z the third digit is incremented when editorial only changes have been incorporated in the document.

1 Scope

The present document specifies the Voice Activity Detector (VAD) to be used in the Discontinuous Transmission (DTX) as described in GSM 06.31. It also specifies the test methods to be used to verify that a VAD complies with the technical specification.

The requirements are mandatory on any VAD to be used either in the GSM Mobile Stations (MS)s or Base Station Systems (BSS)s.

2 References

The following documents contain provisions which, through reference in this text, constitute provisions of the present document.

- References are either specific (identified by date of publication, edition number, version number, etc.) or non-specific.
- For a specific reference, subsequent revisions do not apply.
- For a non-specific reference, the latest version applies. In the case of a reference to a 3GPP document (including a GSM document), a non-specific reference implicitly refers to the latest version of that document in the same Release as the present document.

[1] GSM 01.04: “Digital cellular telecommunications system (Phase 2+); Abbreviations and acronyms”.
[2] GSM 06.10: “Digital cellular telecommunications system (Phase 2+); Full rate speech; Transcoding”.
[3] GSM 06.12: “Digital cellular telecommunications system (Phase 2+); Full rate speech; Comfort noise aspect for full rate speech traffic channels”.
[4] GSM 06.31: “Digital cellular telecommunications system (Phase 2+); Full rate speech; Discontinuous Transmission (DTX) for full rate speech traffic channels”.

3 Abbreviations

Abbreviations used in the present document are listed in GSM 01.04 [1].

4 General

The function of the VAD is to indicate whether each 20 ms frame produced by the speech encoder contains speech or not. The output is a binary flag which is used by the TX DTX handler defined in GSM 06.31 [4].

The ETS is organized as follows. Clause 5 describes the principles of operation of the VAD. In clause 6, the computational details necessary for the fixed point implementation of the VAD algorithm are given.
This clause uses the same notation as used for computational details in GSM 06.10. The verification of the VAD is based on the use of digital test sequences. Clause 7 defines the input and output signals and the test configuration, whereas the detailed description of the test sequences is contained in clause A.2.

The performance of the VAD algorithm is characterized by the amount of audible speech clipping it introduces and the percentage activity it indicates. These characteristics for the VAD defined in the present document have been established by extensive testing under a wide range of operating conditions. The results are summarized in clause A.3.

5 Functional description

The purpose of this clause is to give the reader an understanding of the principles of operation of the VAD, whereas the detailed description is given in clause 6. In case of discrepancy between the two descriptions, the detailed description of clause 6 shall prevail. In the following subclauses of clause 5, a Pascal programming type of notation has been used to describe the algorithm.

5.1 Overview and principles of operation

The function of the VAD is to distinguish between noise with speech present and noise without speech present. The biggest difficulty for detecting speech in a mobile environment is the very low speech/noise ratios which are often encountered. The accuracy of the VAD is improved by using filtering to increase the speech/noise ratio before the decision is made.

For a mobile environment, the worst speech/noise ratios are encountered in moving vehicles. It has been found that the noise is relatively stationary for quite long periods in a mobile environment. It is therefore possible to use an adaptive filter with coefficients obtained during noise, to remove much of the vehicle noise.

The VAD is basically an energy detector. The energy of the filtered signal is compared with a threshold; speech is indicated whenever the threshold is exceeded.

The noise encountered in mobile environments may be constantly changing in level. The spectrum of the noise can also change, and varies greatly over different vehicles. Because of these changes the VAD threshold and adaptive filter coefficients must be constantly adapted. To give reliable detection the threshold must be sufficiently above the noise level to avoid noise being identified as speech but not so far above it that low level parts of speech are identified as noise. The threshold and the adaptive filter coefficients are only updated when speech is not present. It is, of course, potentially dangerous for a VAD to update these values on the basis of its own decision. This adaptation therefore only occurs when the signal seems stationary in the frequency domain but does not have the pitch component inherent in voiced speech. A tone detector is also used to prevent adaptation during information tones.

A further mechanism is used to ensure that low level noise (which is often not stationary over long periods) is not detected as speech. Here, an additional fixed threshold is used.

A VAD hangover period is used to eliminate mid-burst clipping of low level speech. Hangover is only added to speech-bursts which exceed a certain duration to avoid extending noise spikes.

5.2 Algorithm description

The block diagram of the VAD algorithm is shown in figure 2.1. The individual blocks are described in the following subclauses. ACF, N and sof are calculated in the speech encoder.
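The energy-detector and hangover principles of subclause 5.1 can be illustrated with a minimal Python sketch. It is not the algorithm of the present document: the threshold is fixed rather than adaptive, no adaptive filtering or tone detection is performed, and the class name `EnergyVadSketch` and the values of `burst_frames` and `hangover_frames` are hypothetical choices made only for illustration.

```python
class EnergyVadSketch:
    """Illustrative energy detector with hangover (all constants are assumptions)."""

    def __init__(self, threshold, burst_frames=3, hangover_frames=4):
        self.threshold = threshold              # fixed here; adaptive in a real VAD
        self.burst_frames = burst_frames        # hypothetical minimum burst length
        self.hangover_frames = hangover_frames  # hypothetical hangover length
        self._burst = 0                         # consecutive frames above threshold
        self._hang = 0                          # hangover frames still available

    def step(self, frame_energy):
        """Return the final flag for one frame, hangover included."""
        vvad = frame_energy > self.threshold    # decision before hangover
        if vvad:
            self._burst += 1
            # Only bursts of sufficient duration earn a hangover period,
            # so that isolated noise spikes are not extended.
            if self._burst >= self.burst_frames:
                self._hang = self.hangover_frames
            return True
        self._burst = 0
        if self._hang > 0:
            self._hang -= 1
            return True
        return False


vad = EnergyVadSketch(threshold=10.0)
energies = [0, 50, 0, 0, 50, 50, 50, 0, 0, 0, 0, 0, 0]
flags = [vad.step(e) for e in energies]
```

In this run the single-frame spike is flagged for one frame only, whereas the three-frame burst is followed by four hangover frames before the flag returns to false.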

[Figure 2.1: Functional block diagram of the VAD. Blocks: adaptive filtering and energy computation, ACF averaging, predictor values computation, spectral comparison, periodicity detection, tone detection, threshold adaptation, VAD decision, VAD hangover addition. Signals: ACF, N, sof, av0, av1, rav1, rvad, pvad, thvad, stat, ptch, tone, vvad, vad.]

The global variables shown in the block diagram are described as follows:

- ACF are auto-correlation coefficients which are calculated in the speech encoder defined in GSM 06.10 (subclause 3.1.4, see also clause A.1). The inputs to the speech encoder are 16 bit 2's complement numbers, as described in GSM 06.10, subclause 4.2.0;
- av0 and av1 are averaged ACF vectors;
- rav1 are autocorrelated predictor values obtained from av1;
- rvad are the autocorrelated predictor values of the adaptive filter;
- N is the long term predictor lag value which is obtained every sub-segment in the speech coder defined in GSM 06.10;
- ptch indicates whether the signal has a steady periodic component;
- sof is the offset compensated signal frame obtained in the speech coder defined in GSM 06.10;
- pvad is the energy in the current frame of the input signal after filtering;
- thvad is an adaptive threshold;
- stat indicates spectral stationarity;
- vvad indicates the VAD decision before hangover is added;
- vad is the final VAD decision with hangover included.

5.2.1 Adaptive filtering and energy computation

Pvad is computed as follows:

$$ \mathrm{Pvad} = \mathrm{rvad}_0 \, \mathrm{acf}_0 + 2 \sum_{i=1}^{8} \mathrm{rvad}_i \, \mathrm{acf}_i $$

This corresponds to performing an 8th order block filtering on the input samples to the speech encoder, after zero offset compensation and pre-emphasis. This is explained in clause A.1.

5.2.2 ACF averaging

Spectral ch
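Returning to the Pvad expression of subclause 5.2.1, the following Python sketch checks numerically that the autocorrelation form gives the same value as filtering a frame with an 8th order FIR filter and summing the squared output. It rests on assumptions that are not part of the present document: the filter taps `a` and the test frame `s` are arbitrary integers, the helper names are hypothetical, the filter start and end transients are included in the energy, and the fixed point scaling of the actual algorithm is ignored.

```python
def autocorr(x, max_lag):
    """acf[i] = sum over n of x[n] * x[n - i], for i = 0..max_lag."""
    return [sum(x[n] * x[n - i] for n in range(i, len(x)))
            for i in range(max_lag + 1)]


def pvad(rvad, acf):
    """Pvad = rvad[0]*acf[0] + 2 * sum over i = 1..8 of rvad[i]*acf[i]."""
    return rvad[0] * acf[0] + 2 * sum(rvad[i] * acf[i] for i in range(1, 9))


def filtered_energy(coeffs, x):
    """Energy of x after FIR filtering with coeffs (full convolution)."""
    out = [0] * (len(x) + len(coeffs) - 1)
    for n, xn in enumerate(x):
        for k, ak in enumerate(coeffs):
            out[n + k] += ak * xn
    return sum(v * v for v in out)


# Hypothetical 8th order filter (9 taps) and a short integer test frame.
a = [8, -5, 3, -2, 1, 1, -1, 2, -1]
s = [3, -1, 4, 1, -5, 9, 2, -6, 5, 3, -5, 8, 9, -7, 9, 3, 2, -3, 8, 4]

rvad = autocorr(a, 8)  # autocorrelated values of the filter taps
acf = autocorr(s, 8)   # autocorrelation of the frame, lags 0..8

# Integer arithmetic, so the two forms agree exactly.
assert pvad(rvad, acf) == filtered_energy(a, s)
```

The practical point of the autocorrelation form is that the frame autocorrelation is already available from the speech encoder, so the energy after filtering is obtained from nine products instead of filtering every sample of the frame.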
