ETSI TS 103 281 V1.2.1 (2018-01)

Speech and multimedia Transmission Quality (STQ); Speech quality in the presence of background noise: Objective test methods for super-wideband and fullband terminals

TECHNICAL SPECIFICATION

Reference: RTS/STQ-265
Keywords: noise, quality, speech, testing, transmission

ETSI
650 Route des Lucioles, F-06921 Sophia Antipolis Cedex - FRANCE
Tel.: +33 4 92 94 42 00 Fax: +33 4 93 65 47 16
Siret N° 348 623 562 00017 - NAF 742 C
Non-profit association registered at the Sous-Préfecture de Grasse (06) N° 7803/88

Important notice
The present document can be downloaded from: http://www.etsi.org/standards-search

The present document may be made available in electronic versions and/or in print. The content of any electronic and/or print versions of the present document shall not be modified without the prior written authorization of ETSI. In case of any existing or perceived difference in contents between such versions and/or in print, the only prevailing document is the print of the Portable Document Format (PDF) version kept on a specific network drive within ETSI Secretariat.

Users of the present document should be aware that the document may be subject to revision or change of status. Information on the current status of this and other ETSI documents is available at https://portal.etsi.org/TB/ETSIDeliverableStatus.aspx

If you find errors in the present document, please send your comment to one of the following services: https://portal.etsi.org/People/CommiteeSupportStaff.aspx

Copyright Notification
No part may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm except as authorized by written permission of ETSI. The content of the PDF version shall not be modified without the written authorization of ETSI. The copyright and the foregoing restriction extend to reproduction in all media.
© ETSI 2018. All rights reserved.

DECT™, PLUGTESTS™, UMTS™ and the ETSI logo are trademarks of ETSI registered for the benefit of its Members. 3GPP™ and
LTE are trademarks of ETSI registered for the benefit of its Members and of the 3GPP Organizational Partners. oneM2M logo is protected for the benefit of its Members. GSM and the GSM logo are trademarks registered and owned by the GSM Association.

Contents

Intellectual Property Rights
Foreword
Modal verbs terminology
1 Scope
2 References
2.1 Normative references
2.2 Informative references
3 Abbreviations
4 Introduction
5 Underlying speech databases and preparations
6 Model descriptions
6.1 Introduction
6.2 Common definitions
6.3 Model A
6.3.1 Introduction
6.3.2 Pre-Processing
6.3.3 Spectral transformation
6.3.4 Non-linear loudness transformation
6.3.5 Instrumental assessment of N-MOS
6.3.5.1 Introduction
6.3.5.2 Loudness-based features
6.3.5.3 Sharpness-based feature
6.3.6 Reference optimization and asymmetry
6.3.6.1 Introduction
6.3.6.2 Reference optimization
6.3.6.3 Masking of inaudible differences
6.3.6.4 Asymmetry
6.3.7 Instrumental assessment of S-MOS
6.3.7.1 Introduction
6.3.7.2 Modulation-based features
6.3.7.3 Spectral difference features
6.3.7.4 Control parameters
6.3.7.5 Combination of features
6.3.8 Instrumental assessment of G-MOS
6.4 Model B
6.4.1 Overview
6.4.2 Operational Modes
6.4.3 Temporal Alignment
6.4.4 Voice Activity Detection (VAD) and segment classification
6.4.5 Auditory Model
6.4.5.1 Introduction
6.4.5.2 Ear Canal model
6.4.5.3 Middle Ear model
6.4.5.4 Hydro-mechanical cochlear model
6.4.5.5 Hair Cell transduction model
6.4.5.6 Outer Hair motility model
6.4.6 Feature Extraction
6.4.6.1 Introduction
6.4.6.2 Salient Formant Points (SFP) feature extraction
6.4.6.3 COSM (Cochlear Output Statistic Metric) feature extraction
6.4.7 Training and mapping
6.5 Mapping of model outputs
7 Comparison of objective and subjective results after the training process
7.1 Introduction
7.2 Results for Model A
7.3 Results for Cochlear Prediction Model (Model B)
8 Validation results
8.1 Introduction
8.2 Validation database 1 (DES-17)
8.2.1 Database description
8.2.2 Validation database 1: Results for model B
8.3 Validation database 2 (DES-20)
8.3.1 Database description
8.3.2 Validation database 2: Results for model A
8.4 Validation database 3 (DES-25)
8.4.1 Database description
8.4.2 Validation database 3: Results for model A
8.4.3 Validation database 3: Results for model B
8.5 Validation database 4 (DES-26)
8.5.1 Database description
8.5.2 Validation database 4: Results for model A
8.5.3 Validation database 4: Results for model B
8.6 Validation database 5 (DES-27)
8.6.1 Database description
8.6.2 Validation database 5: Results for model A
8.6.3 Validation database 5: Results for model B
9 Application of the models
9.1 Introduction
9.2 Speech material
9.3 Positioning of the device under test
9.4 Background noise playback
9.5 Recording and calibration procedure
9.6 Running the prediction models
Annex A (normative): Model configuration files
A.1 Introduction
A.2 Model A
A.3 Model B
Annex B (normative): Summary of Training Databases
Annex C (normative): Test vectors for model verification
Annex D (informative): Subjective testing framework
D.1 Introduction
D.2 Subjective test plan
D.2.1 Traceability
D.2.2 Speech database requirements
D.2.3 Reference Conditions
D.2.4 Test Conditions
D.2.5 Post-processing of test conditions
D.2.6 Calibration and equalization of headphones for presentation
D.2.7 Requirements on the listening laboratory
D.2.8 Experimental design
D.2.9 Training session
D.3 Set-up for acquisition of test conditions
D.3.1 Terminal positioning and HATS calibration
D.3.2 Background Noise reproduction
D.3.3 Noise and speech playback synchronization
D.3.4 Convergence sequence
D.3.5 Example of noise and speech playback sequence including convergence period
D.3.6 Recordings at the network simulator electrical reference point
D.3.7 Recordings at the MRP and terminal's primary microphone location
Annex E (normative): Speech material to be used for objective testing
History

Intellectual Property Rights

Essential patents
IPRs essential or potentially essential to the present document may have been declared to ETSI. The information pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found in ETSI SR 000 314: "Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in respect of ETSI standards", which is available from the ETSI
Secretariat. Latest updates are available on the ETSI Web server (https://ipr.etsi.org/).

Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web server) which are, or may be, or may become, essential to the present document.

Trademarks
The present document may include trademarks and/or tradenames which are asserted and/or registered by their owners. ETSI claims no ownership of these except for any which are indicated as being the property of ETSI, and conveys no right to use or reproduce any trademark and/or tradename. Mention of those trademarks in the present document does not constitute an endorsement by ETSI of products, services or organizations associated with those trademarks.

Foreword

This Technical Specification (TS) has been produced by ETSI Technical Committee Speech and multimedia Transmission Quality (STQ).

The present document is to be used in conjunction with:
- ETSI ES 202 396-1 [i.1]: "Background noise simulation technique and background noise database"; and
- ETSI TS 103 224 [i.19] series: "A sound field reproduction method for terminal testing including a background noise database".

The present document describes an objective test method for super-wideband and fullband terminals in order to provide a good prediction of the uplink speech quality in the presence of background noise of modern mobile terminals in hand-held and hands-free modes.

Modal verbs terminology

In the present document "shall", "shall not", "should", "should not", "may", "need not", "will", "will not", "can" and "cannot" are to be interpreted as described in clause 3.2 of the ETSI Drafting Rules (Verbal forms for the expression of provisions).
"must" and "must not" are NOT allowed in ETSI deliverables except when used in direct citation.

1 Scope

The present document describes testing methodologies which can be used to objectively evaluate the performance of super-wideband and fullband mobile terminals for speech communication in the presence of background noise. Background noise is a problem in almost all situations and conditions and needs to be taken into account in terminal design. The present document provides information about the testing methods applicable to objectively evaluate the speech quality of mobile terminals (including any state-of-the-art codecs) employing background noise suppression in the presence of background noise.

The present document includes:
- The method which is applicable to objectively determine the different parameters influencing the speech quality in the presence of background noise taking into account:
  - the speech quality;
  - the background noise transmission quality;
  - the overall quality.
- The model results in comparison with the underlying subjective tests used for the training of the objective model. The underlying languages are: American
English, German, Chinese (Mandarin).
- The model validation results.

The present document is to be used in conjunction with:
- ETSI ES 202 396-1 [i.1] which describes a recording and reproduction setup for realistic simulation of background noise scenarios in lab-type environments for the performance evaluation of terminals and communication systems.
- ETSI TS 103 224 [i.19] which describes a sound field reproduction method for terminal testing including a background noise database with background noise scenarios to be used in lab-type environments for the performance evaluation of terminals and communication systems.
- American English speech sentences as enclosed in the present document.

2 References

2.1 Normative references

References are either specific (identified by date of publication and/or edition number or version number) or non-specific. For specific references, only the cited version applies. For non-specific references, the latest version of the referenced document (including any amendments) applies.

Referenced documents which are not found to be publicly available in the expected location might be found at https://docbox.etsi.org/Reference/.

NOTE: While any hyperlinks included in
this clause were valid at the time of publication, ETSI cannot guarantee their long term validity.

The following referenced documents are necessary for the application of the present document.

Not applicable.

2.2 Informative references

References are either specific (identified by date of publication and/or edition number or version number) or non-specific. For specific references, only the cited version applies. For non-specific references, the latest version of the referenced document (including any amendments) applies.

NOTE: While any hyperlinks included in this clause were valid at the time of publication, ETSI cannot guarantee their long term validity.

The following referenced documents are not necessary for the application of the present document but they assist the user with regard to a particular subject area.

[i.1] ETSI ES 202 396-1: "Speech and multimedia Transmission Quality (STQ); Speech quality performance in the presence of background noise; Part 1: Background noise simulation technique and background noise database".
[i.2] ETSI EG 202 396-3: "Speech and multimedia Transmission Quality (STQ); Speech Quality performance in the presence of background noise; Part 3: Background noise transmission - Objective test methods".
[i.3] ETSI TS 103 106: "Speech and multimedia Transmission Quality (STQ); Speech quality performance in the presence of background noise: Background noise transmission for mobile terminals-objective test methods".
[i.4] ETSI TS 126 441: "Universal Mobile Telecommunications System (UMTS); LTE; Codec for Enhanced Voice Services (EVS); General overview (3GPP TS 26.441)".
[i.5] Recommendation ITU-T P.835: "Subjective test methodology for evaluating speech communication systems that include noise suppression algorithm".
[i.6] Internet Engineering Task Force, Request for Comments 6716: "Definition of the Opus Audio Codec", 09/2012.
[i.7] Recommendation ITU-T P.56: "Objective measurement of active speech level".
[i.8] Recommendation ITU-T P.1401: "Methods, metrics and procedures for statistical evaluation, qualifying and comparison of objective quality prediction models".
[i.9] Recommendation ITU-T G.160 Appendix II, Amendment 2: "Voice enhancement devices: Revised Appendix II - Objective measures for the characterization of the basic functioning of noise reduction algorithms".
[i.10] Recommendation ITU-T P.501: "Test Signals for Use in Telephonometry".
[i.11] Recommendation ITU-T P.58: "Head and Torso simulator for telephonometry".
[i.12] Recommendation ITU-T P.57: "Artificial ears".
[i.13] Recommendation ITU-T P.800: "Methods for subjective determination of transmission quality".
[i.14] ETSI TS 126 132: "Universal Mobile Telecommunications System (UMTS); LTE; Speech and video telephony terminal acoustic test specification (3GPP TS 26.132)".
[i.15] Recommendation ITU-T TD 477 (GEN/12): "Handbook of subjective test practical procedures" (temporary document) - Geneva, 18-27 January 2011.
[i.16] AH-11-029, Better Reference System for the P.835 SIG Rating Scale, Q7/12 Rapporteurs meeting, 20-21 June 2011, Geneva, Switzerland.
[i.17] 3GPP, Tdoc S4(16)0397: "DESUDAPS-1: Common subjective testing framework for training and validation of SWB and FB P.835 test predictors".
[i.18] Recommendation ITU-T P.64: "Determination of sensitivity/frequency characteristics of local telephone systems".
[i.19] ETSI TS 103 224: "Speech and multimedia Transmission Quality (STQ); A sound field reproduction method for