ETSI TR 126 943 V14.0.0 (2017-04)

Digital cellular telecommunications system (Phase 2+) (GSM);
Universal Mobile Telecommunications System (UMTS);
LTE;
Recognition performance evaluations of codecs for Speech Enabled Services (SES)
(3GPP TR 26.943 version 14.0.0 Release 14)

TECHNICAL REPORT

Reference
RTR/TSGS-0426943ve00

Keywords
GSM, LTE, UMTS

ETSI
650 Route des Lucioles
F-06921 Sophia Antipolis Cedex - FRANCE
Tel.: +33 4 92 94 42 00 Fax: +33 4 93 65 47 16
Siret N° 348 623 562 00017 - NAF 742 C
Association à but non lucratif enregistrée à la Sous-Préfecture de Grasse (06) N° 7803/88

Important notice

The present document can be downloaded from: http://www.etsi.org/standards-search

The present document may be made available in electronic versions and/or in print. The content of any electronic and/or print versions of the present document shall not be modified without the prior written authorization of ETSI. In case of any existing or perceived difference in contents between such versions and/or in print, the only prevailing document is the print of the Portable Document Format (PDF) version kept on a specific network drive within ETSI Secretariat.

Users of the present document should be aware that the document may be subject to revision or change of status. Information on the current status of this and other ETSI documents is available at https://portal.etsi.org/TB/ETSIDeliverableStatus.aspx

If you find errors in the present document, please send your comment to one of the following services: https://portal.etsi.org/People/CommiteeSupportStaff.aspx

Copyright Notification

No part may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm
except as authorized by written permission of ETSI. The content of the PDF version shall not be modified without the written authorization of ETSI. The copyright and the foregoing restriction extend to reproduction in all media.

© European Telecommunications Standards Institute 2017.
All rights reserved.

DECT™, PLUGTESTS™, UMTS™ and the ETSI logo are Trade Marks of ETSI registered for the benefit of its Members. 3GPP™ and LTE are Trade Marks of ETSI registered for the benefit of its Members and of the 3GPP Organizational Partners. GSM and the GSM logo are Trade Marks registered and owned by the GSM Association.

Intellectual Property Rights

IPRs essential or potentially essential to the present document may have been declared to ETSI. The information pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found in ETSI SR 000 314: "Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in respect of ETSI standards", which is available from the ETSI Secretariat. Latest updates are available on the ETSI Web server (https://ipr.etsi.org/).

Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web server) which are, or may be, or may become, essential to the present document.

Foreword

This Technical Report (TR) has been produced by ETSI 3rd Generation Partnership Project (3GPP).

The present document may refer to technical specifications or reports using their 3GPP identities, UMTS identities or GSM identities. These should be interpreted as being references to the corresponding ETSI deliverables.

The cross reference between GSM, UMTS, 3GPP and ETSI identities can be found under http://webapp.etsi.org/key/queryform.asp.

Modal verbs terminology

In the present document "should", "should not", "may", "need not", "will", "will not",
"can" and "cannot" are to be interpreted as described in clause 3.2 of the ETSI Drafting Rules (Verbal forms for the expression of provisions).

"must" and "must not" are NOT allowed in ETSI deliverables except when used in direct citation.

Contents

Intellectual Property Rights
Foreword
Modal verbs terminology
Foreword
Introduction
1 Scope
2 References
3 Abbreviations
4 General
4.1 Project History
4.2 Overview of the speech recognition framework for automated voice services work item
4.3 Presentation of the following sections
5 Recommendation criteria
5.1 Overview
5.2 Scoring on individual databases
5.3 Performance metric over all databases
5.4 Comparisons between codecs
5.4.1 Low data-rate codec comparison
5.4.2 High data-rate codec comparison
5.4.2.1 8 kHz sampling rate
5.4.2.2 16 kHz sampling rate
5.5 Detailed recommendation comparisons
6 Performance evaluation method
6.1 Introduction
6.2 Recognition engines
6.2.1 Recognizer for speech codecs based proposals
6.2.2 Training and testing
6.2.3 Recognizer for DSR
6.2.4 Training and testing
6.3 Usage of VAD for frame dropping
6.4 Codec evaluations
6.4.1 Recognition experiments under error-free channel
6.5 Recognition experiments under channel errors
7 Recognition Performance Evaluation Results
Annex A: Key selection phase documents
Annex B: Change history
History

Foreword

This Technical Report has been produced by the 3rd Generation Partnership Project (3GPP).

The contents of the present document are subject to continuing work within the TSG and may change following formal TSG approval. Should the TSG modify the contents of the present document, it will be re-released by the TSG with an identifying change of release date and an increase in version number as follows:

Version x.y.z

where:

x the first digit:
  1 presented to TSG for information;
  2 presented to TSG for approval;
  3 or greater indicates TSG approved document under change control.
y the second digit is incremented for all changes of substance, i.e. technical enhancements, corrections, updates, etc.
z the third digit is incremented when editorial only changes have been incorporated in the document.

Introduction

SA4 has been working on the selection of a codec to recommend for Speech Enabled Services since October 2002 under the WID for SES [9]. The usual process of agreeing "design constraints" [10], "test and processing plan" [7] and "recommendation criteria" [8] was followed and completed before evaluating the candidates. Two candidate codecs were proposed and evaluated:

1. ETSI Standard for the DSR Extended Advanced Front-end (ES 202 212)
2. AMR and AMR-WB audio codec

The performance evaluations were conducted by two leading companies in the
area of speech recognition, IBM and Scansoft. Results from these evaluations were presented at SA4#30 in February 2004 and are summarised here. The "recommendation criteria" have been applied and SA4 recommends the DSR codec for Speech Enabled Services. SES codecs are introduced in packet switched conversational services in Technical Specifications 26.235 [...] Stage 1".

[2] 3GPP TR 22.977: "Feasibility study for speech enabled services".
[3] ETSI ES 202 050: "Distributed Speech Recognition; Advanced Front-end Feature Extraction Algorithm; Compression Algorithm".
[4] ETSI ES 202 212: "Distributed Speech Recognition; Extended Advanced Front-end Feature Extraction Algorithm; Compression Algorithm, Back-end Speech Reconstruction Algorithm".
[5] 3GPP TS 26.235: "Packet switched conversational multimedia applications; Default codecs".
[6] 3GPP TS 26.236: "Packet switched conversational multimedia applications; Transport Protocols".
[7] TD S4-030543 "Test and Processing plan for default codec evaluation for speech enabled services (SES)", SA4.
[8] TD SP-030440 "Recommendation Criteria for Default Codec for Speech Enabled Services (SES)", TSG SA.
[9] TD SP-020687 WID Codec Work to Support Speech Recognition Framework for Automated Voice Services (Rel-6), TSG SA.
[10] TD S4-030248 "Design Constraints for default codec for speech enabled services (SES)", SA4.

Note: Annex A lists all the key SA4 SES selection phase documents. Temporary Documents are attached to this specification in a separate .zip file.

3 Abbreviations

For the purposes of the present document, the following abbreviations apply:

AFE      Advanced Front-end
AMR      Adaptive Multi-Rate
AMR-NB   AMR Narrowband
AMR-WB   AMR Wideband
BLER     Block Error Rate
DSR      Distributed Speech Recognition
EDGE     Enhanced Data for GSM Evolution
ETSI     European Telecommunications Standards Institute
GSM      Global System for Mobile communications
SES      Speech Enabled Services
SNR      Signal To Noise Ratio
VAD      Voice Activity Detector
X-AFE    eXtended Advanced Front-end

4 General

4.1 Project History

Table 1
below shows the progress and timeline of the project. In particular, it covers the creation of permanent documents; identification of candidate codecs and test organisations; running of the performance evaluations by test organisations; selection at SA4; verification; and the approval of CRs and TS at SA. Key milestones are highlighted in bold.

Table 1: SES project timeline

Meeting, followed by status of progress in activities:

SA4 #23 (30 Sept - 4 Oct 2002)
- Draft WID and work plan

SA4 #24 (11-15 Nov 2002)
- Permanent documents
  o Design Constraints V1.0
  o Test & Processing Plan V0.8
  o Recommendation Criteria V0.1

Intermediate deadline on SA4 reflector 31.12.2002
- Submission of specification of additional databases as candidate for testing as part of test and processing plan.

Intermediate deadline on SA4 reflector 31.12.2002
- Any company which would possibly like to submit a candidate will indicate before 31.12.2002. Later indications will not be considered.

SA4 #25 (20-24 Jan 2003)
- List of testing organisations
- Permanent documents
  o Design Constraints V1.1
  o Test Plan & Processing Plan V1.0
  o Recommendation Criteria V0.3

SA4 #25 bis (24-28 Feb 2003)
- List of testing organisations (IBM & SpeechWorks)
- List of candidate codecs (DSR X-AFE & AMR-NB/AMR-WB)
- Permanent documents
  o Design Constraints V2.0
  o Test Plan & Processing Plan V1.3
  o Recommendation Criteria V0.3

SA4 SQ SES ad-hoc 1-2 April 2003 Basingstoke, UK
- Permanent documents
  o Test & Processing Plan V1.4
  o Recommendation Criteria V0.3

SA4 #26 (5-9 May 2003)
- Permanent documents
  o Test & Processing Plan V2.0
  o Recommendation Criteria V0.6

SA4 #27 (7-11 July 2003)
- Approval of permanent docs
  o Test & Processing Plan V2.2
  o Recommendation Criteria V2.0

ASR vendor evaluations start. Aug 2003
- ASR vendors start tests.

Deliverables from candidates: (31 October 2003)
- Fixed point complexity assessment
- Drafts of new 3GPP TSs (for new codecs), or existing specifications for information (codecs already in standards)
- Justification document of having met the Design Constraints

SA4 #29 (24-28 Nov 2003)
- Preparation for verification
- Agree verification plan by correspondence (19 Dec)
- Complete any legal agreements (NDAs) that are needed (15 Feb)
- Verification labs to obtain any databases needed (15 Feb)

Informative speech quality listening tests
- Nokia and Ericsson to supply listening test speech files to Motorola (5th Dec)
- Motorola to process listening test speech files supplied by Nokia and Ericsson (15 Jan)
- Nokia and Ericsson conduct listening tests

Completion of ASR vendor evaluations (31 Jan 2004)
- Results from ASR vendor evaluations to ETSI representative

SA4 #30 (23-27 Feb 2004) SES Selection meeting
- Results from evaluator tests available
- Make recommendation
- Prepare TSs for approval SA#23
- Prepare CRs for approval SA#23

SES Verification (1 March)
- Verification of selected codec (ST-Micro).
- Discussion of results of verification conference call March.

SA #23 (15-17 March 2004)
- TSs for information
- CRs for information

SA4 #31 (17-21 May 2004)
- Verification report

SA #24 (7-10 June 2004)
- TSs approval (TS 26.243)
- CRs approval (TS 26.235 & TS 26.236)

4.2 Overview of the speech recognition framework for automated voice services work item

The work
item covered the evaluation of candidate codecs for use in a speech recognition framework for automated voice services. The 3GPP speech recognition framework enables the use of conventional codecs (e.g. AMR) or DSR optimised codecs to distribute in the network the speech engines that process speech input or generate speech output. The aim of the work item is, through objective evaluation, to recommend a single codec for speech enabled services based on a speech recognition framework.

4.3 Presentation of the following sections

The following sections provide a summary of the Selection Phase test results, including the results of the objective performance measurements, and a record of other relevant information for the selected candidate algorithm.

- Section 5 describes the Recommendation Criteria defined for the Selection Phase
- Section 6 defines the means used to measure the performance of each of the candidates
- Section 7 summarises the recognition evaluation results

5 Recommendation criteria

5.1 Overview

The set of databases used for the evaluations is defined in the Test and Processing Plan [7]. Each of these databases contains different types of speech material covering a variety of tasks, environments and languages. Recommendation was based on a score obtained from the recognition performance measured on each of these different databases. Section 5.3 describes how the scores from all the individual databases are combined using a weighting table.

5.2 Scoring on individual databases

For each database the reference performance is measured as the word error rate obtained from the ASR vendor's system. This is the performance obtained from a state-of-the-art system from the ASR vendor assuming a transparent channel. The performance (word error rate) on a given database is also measured with the ASR vendor's system for a codec under test as described in the test and processing plan [7].

Scoring for tests performed with channel BLER was also computed in a similar way. Note that only BLERs of 1% and 3% were considered as part of the recommendation criteria [8].

5.3 Performance metric over all databases

The overall performance was determined by averaging the absolute word error rate using the weightings presented in the detailed ...
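
The scoring idea in sections 5.2 and 5.3 can be illustrated with a minimal Python sketch. It is not the evaluators' tooling: it assumes the common definition of word error rate (word-level edit distance divided by the number of reference words) and assumes that the combination over databases is a weighted arithmetic mean. The function names, database names and weights are hypothetical; the actual weighting table is not reproduced in this text.

```python
from typing import Dict, List


def word_error_rate(reference: List[str], hypothesis: List[str]) -> float:
    """Word-level edit distance divided by the reference length (assumed WER definition)."""
    if not reference:
        raise ValueError("reference transcript must not be empty")
    # prev[j] = edit distance between the reference words seen so far and hypothesis[:j]
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i] + [0] * len(hypothesis)
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(reference)


def weighted_average_wer(wer_by_db: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine per-database WERs with a weighted mean (assumed form of the overall metric)."""
    if set(wer_by_db) != set(weights):
        raise ValueError("every database needs exactly one weight")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(wer_by_db[db] * weights[db] for db in wer_by_db) / total_weight


if __name__ == "__main__":
    # Hypothetical utterance: one substitution (john -> jon) and one deletion (at).
    ref = "call john smith at home".split()
    hyp = "call jon smith home".split()
    print(word_error_rate(ref, hyp))

    # Hypothetical per-database WERs and weights, for illustration only.
    wers = {"db_a": 0.10, "db_b": 0.20}
    weights = {"db_a": 3.0, "db_b": 1.0}
    print(weighted_average_wer(wers, weights))
```

Running the same two functions once for the transparent-channel reference and once for each codec under test would give comparable overall scores, provided the same databases and weights are used in every run.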
