ETSI EG 202 396-2 V1.1.1 (2006-09)
ETSI Guide

Speech Processing, Transmission and Quality Aspects (STQ);
Speech quality performance in the presence of background noise;
Part 2: Background noise transmission - Network simulation - Subjective test database and results

Reference
DEG/STQ-00038-2

Keywords
noise, QoS, speech

ETSI
650 Route des Lucioles
F-06921 Sophia Antipolis Cedex - FRANCE
Tel.: +33 4 92 94 42 00 Fax: +33 4 93 65 47 16
Siret N° 348 623 562 00017 - NAF 742 C
Non-profit association registered at the Sous-Préfecture de Grasse (06) N° 7803/88

Important notice
Individual copies of the present document can be downloaded from: http://www.etsi.org
The present document may be made available in more than one electronic version or in print. In any case of existing or perceived difference in contents between such versions, the reference version is the
Portable Document Format (PDF). In case of dispute, the reference shall be the printing on ETSI printers of the PDF version kept on a specific network drive within ETSI Secretariat.
Users of the present document should be aware that the document may be subject to revision or change of status. Information on the current status of this and other ETSI documents is available at http://portal.etsi.org/tb/status/status.asp
If you find errors in the present document, please send your comment to one of the following services: http://portal.etsi.org/chaircor/ETSI_support.asp

Copyright Notification
No part may be reproduced except as authorized by written permission. The copyright and the foregoing restriction extend to reproduction in all media.
© European Telecommunications Standards Institute 2006. All rights reserved.
DECT™, PLUGTESTS™ and UMTS™ are Trade Marks of ETSI registered for the benefit of its Members. TIPHON™ and the TIPHON logo are Trade Marks currently being registered by ETSI for the benefit of its Members. 3GPP™ is a Trade Mark of ETSI registered for the benefit of its Members and of the 3GPP Organizational Partners.

Contents
Intellectual Property Rights
Foreword
1 Scope
2 References
3 Abbreviations
4 Performance parameters
4.1 Overview
4.2 Performance key parameters
4.2.1 Delay
4.2.1.1 Codec delay
4.2.1.2 Packetization delay
4.2.1.3 Output queuing delay
4.2.1.4 Serialization delay
4.2.1.5 Network delay
4.2.1.5.1 Network switching delay
4.2.1.5.2 Propagation delay
4.2.1.6 De-jitter delay
4.2.2 Jitter
4.2.3 Packet loss
4.3 Parameter interaction and dependences
5 Description of codecs features
5.1 Speech coding algorithm
5.1.1 Waveform codecs
5.1.2 Vocoders
5.1.3 Hybrid codecs
5.2 Bit-rate
5.2.1 Constant Bit Rate (CBR)
5.2.2 Variable Bit Rate (VBR)
5.3 Discontinuous Transmission (DTX)
5.3.1 Voice Activity Detection (VAD)
5.3.2 Comfort Noise Generator (CNG)
5.4 Packet Loss Concealment (PLC)
5.5 Perceptual enhancement
6 Wideband codecs
6.1 Overview
6.2 Description of WideBand codecs
6.2.1 ITU-T Recommendation G.722
6.2.1.1 Overview
6.2.1.2 Modes of operation
6.2.1.3 Encoder and decoder
6.2.1.4 Frame structure
6.2.2 AMR-WB
6.2.2.1 Overview
6.2.2.2 Modes of operation
6.2.2.3 Encoder and decoder
6.2.2.4 Frame structure
6.2.3 Other wideband codecs
6.2.3.1 Speex WB
6.2.3.1.1 Overview
6.2.3.1.2 Modes of operation
6.2.3.1.3 Encoder and decoder
6.2.3.1.4 Frame structure
6.2.3.2 ITU-T Recommendation G.722.1
6.2.3.2.1 Overview
6.2.3.2.2 Encoder and decoder
6.2.3.2.3 Modes of operation and frame size
6.2.3.3 ITU-T Recommendation G.729 annex J (also known as G.729 EV)
6.2.3.4 L16 (ITU-T Recommendation H.245 Annex O)
6.3 Wideband codec comparison
7 Background noise transmission simulation
7.1 General description
7.2 Speech sequences
7.3 Noisy conditions
7.4 Noisy signal processing
7.5 Network simulation
7.5.1 General description
7.5.2 Simulating network conditions
7.5.2.1 Delay and jitter emulation
7.5.2.2 Packet loss emulation
7.5.3 Network conditions
7.5.3.1 Typical values of network conditions
7.6 Network simulation database
8 Speech sample database description and subjective scores production
8.1 Database description
8.2 Subjective scores collection
8.2.1 Requirements
8.2.2 Expert selection of samples for subjective testing
8.2.3 Methodology
8.2.4 STF 294 test results
9 Application of the material produced
9.1 Contribution of terminals and networks
9.2 Development of objective methods
Annex A (informative): Detailed STF 294 subjective test results
Annex B (informative): Complementary information on the practical subjective test procedures
B.1 "Headphones"
B.2 Listening Levels
B.2.1 Mesaqin Laboratory
B.2.2 France Telecom Laboratory
B.2.3 Additional remarks
B.3 Sample Shortening / Extraction
Annex C (informative): Instructions and Questions presented to listeners during subjective tests
Annex D (informative): Bibliography
History

Intellectual Property Rights
IPRs essential or potentially essential to the present document may have been declared to ETSI. The information pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found in ETSI SR 000 314: "Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in respect of ETSI standards", which is available from the ETSI Secretariat. Latest updates are available on the ETSI Web server (http://webapp.etsi.org/IPR/home.asp).
Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web server) which are, or may
be, or may become, essential to the present document.

Foreword
This ETSI Guide (EG) has been produced by ETSI Technical Committee Speech Processing, Transmission and Quality Aspects (STQ).
The present document is a deliverable of ETSI Specialized Task Force (STF) 294 entitled: "Improving the quality of eEurope WideBand (WB) speech applications by developing a standardized performance testing and evaluation methodology for background noise transmission".
The present document is part 2 of a multi-part deliverable covering Speech Quality performance in the presence of background noise, as identified below:
Part 1: "Background noise simulation technique and background noise database";
Part 2: "Background noise transmission - Network simulation - Subjective test database and results";
Part 3: "Background noise transmission - Objective test methods".

1 Scope
The present document aims at setting up and verifying a transmission network simulation environment that uses realistic network scenarios, for laboratory use in the context of background noise transmission in WideBand (WB) audio conversational applications. Background noise is a problem in almost all situations and conditions and needs to be taken into account in both terminals and networks. The document provides information about transmission network impairments in packet-based communication networks, since these effects tend to aggravate the consequences of background noise and need to
be considered carefully.

The present document includes:
- Setup of a simulation environment (network and signal processing in terminals) providing network characteristics and traffic patterns comparable to those found in real transmission networks, based on general information on network impairment types and codec features.
- Description of an example network simulation database containing the results of applying some typical and realistic transmission network scenarios and traffic patterns in a selected variety of environments.
- Description of how to produce a speech sample database using the setup given, and of the appropriate method to collect the corresponding subjective scores, for which an example of results, obtained for the purpose of ETSI STF 294, is given.

The setup and the process mentioned above are meant to be applied to speech samples. Nevertheless, although the resulting speech database is described in the present document (see clause 8), its production is outside the scope of the present document.

The setup, network simulation database and subjective test results as described in the present document are applicable for:
- Simulation of network impairments in general (concerns the setup only).
- Evaluation of the contribution of the background noise performance of terminals and networks to the perceived overall quality.
- Development of an objective method for the quantification of background noise transmission performance (as developed in EG 202 396-3).

2 References
The following documents contain provisions which, through reference in this text, constitute provisions of the present document.
References are either specific (identified by date of publication and/or edition number or version number) or non-specific. For a specific reference, subsequent revisions do not apply. For a non-specific reference, the latest version applies.
Referenced documents which are not found to be publicly available in the expected location might be found at http://docbox.etsi.org/Reference.
[1] ITU-T Recommendation G.1010: "End-user multimedia QoS categories".
[2] ETSI TR 101 329-1: "Telecommunications and Internet Protocol Harmonization Over Networks (TIPHON) Release 3; End-to-end Quality of Service in TIPHON systems; Part 1: General aspects of Quality of Service (QoS)".
[3] ETSI TR 101 329-6: "Telecommunications and Internet Protocol Harmonization Over Networks (TIPHON) Release 3; End-to-end Quality of Service in TIPHON systems; Part 6: Actual measurements of network and terminal characteristics and performance parameters in TIPHON networks and their influence on voice quality".
[4] ITU-T Recommendation G.114: "One-way transmission time".
[5] ETSI TR 101 329-7: "Telecommunications and Internet Protocol Harmonization Over Networks (TIPHON) Release 3; End-to-end Quality of Service in TIPHON systems; Part 7: Design guide for elements of a TIPHON connection from an end-to-end speech transmission performance point of view".
[6] ITU-T Recommendation G.722: "7 kHz audio-coding within 64 kbit/s".
[7] ITU-T Recommendation G.726: "40, 32, 24, 16 kbit/s Adaptive Differential Pulse Code Modulation (ADPCM)".
[8] ETSI TS 126 171: "Digital cellular telecommunications system (Phase 2+); Universal Mobile Telecommunications System (UMTS); AMR
speech codec, wideband; General description (3GPP TS 26.171 version 6.0.0 Release 6)".
[9] ITU-T Recommendation G.722.2: "Wideband coding of speech at around 16 kbit/s using Adaptive Multi-Rate Wideband (AMR-WB)".
[10] ETSI TS 126 071: "Digital cellular telecommunications system (Phase 2+); Universal Mobile Telecommunications System (UMTS); AMR speech Codec; General description (3GPP TS 26.071 Release 6)".
[11] ETSI EN 300 726: "Digital cellular telecommunications system (Phase 2+) (GSM); Enhanced Full Rate (EFR) speech transcoding (GSM 06.60 Release 1999)".
[12] ITU-T Recommendation G.729: "Coding of
speech at 8 kbit/s using conjugate-structure algebraic-code-excited linear-prediction (CS-ACELP)".
[13] ITU-T Recommendation G.723.1: "Dual rate speech coder for multimedia communications transmitting at 5.3 and 6.3 kbit/s".
[14] ETSI TS 126 191: "Digital cellular telecommunications system (Phase 2+);
Universal Mobile Telecommunications System (UMTS); AMR speech codec, wideband; Error concealment of lost frames (3GPP TS 26.191 Release 6)".
[15] ETSI TS 126 192: "Digital cellular telecommunications system (Phase 2+); Universal Mobile Telecommunications System (UMTS); Mandatory Speech Codec speech processing functions AMR Wideband Speech Codec; Comfort noise aspects (3GPP TS 26.192 Release 6)".
[16] ETSI TS 126 193: "Digital cellular telecommunications system (Phase 2+); Universal Mobile Telecommunications System (UMTS); Speech codec speech processing functions; Adaptive Multi-Rate - Wideband (AMR-WB) speech codec; Source controlled rate operation (3GPP TS 26.193 Release 6)".
[17] ETSI TS 126 194: "Digital cellular telecommunications system (Phase 2+); Universal Mobile Telecommunications System (UMTS); Mandatory Speech Codec speech processing functions AMR Wideband speech codec; Voice Activity Detector (VAD) (3GPP TS 26.194 Release 6)".
[18] ETSI TS 126 201: "Digital cellular telecommunications system (Phase 2+); Universal Mobile Telecommunications System (UMTS); AMR speech codec, wideband; Frame structure (3GPP TS 26.201 Release 6)".
[19] ITU-T Recommendation G.722.1: "Low-complexity coding at 24 and 32 kbit/s for hands-free operation in systems with low frame loss".
[20] IETF RFC 1890: "RTP Profile for Audio and Video Conferences with Minimal Control".
[21] ITU-T Recommendation P.56: "Objective measurement of active speech level".
[22] ETSI EG 202 396-1: "Speech processing, Transmission and Quality Aspects (STQ); Speech Quality performance in the presence of background noise; Part 1: Background noise simulation technique and background noise database".
[23] ITU-T Recommendation P.341: "Transmission characteristics for wideband (150-7000 Hz) digital hands-free telephony terminals".
[24] ITU-T Recommendation G.191: "Software tools for speech and audio coding standardization".
[25] ITU-T Recommendation P.835: "Subjective test methodology for evaluating speech communication systems that include noise suppression algorithm".
[26] Mattila, V.: "Objective Measures for the Characterization of the Basic Functioning of Noise Suppression Algorithms", MESAQIN 2003, Measurement of Speech and Audio Quality in Networks, May 2003, Prague, ISBN: 80-01-02822-4, pp. 5-15.
[27] ITU-T Recommendation P.831: "Subjective performance evaluation of network echo cancellers".
[28] ITU-T Recommendation P.800: "Methods for subjective determination of transmission quality".
[29] ITU-T Recommendation H.245: "Control protocol for multimedia communication, annex O".
[30] ITU-T Recommendation P.57: "Artificial ears".
[31] ITU-T Recommendation P.58: "Head and torso simulator for telephonometry".

3 Abbreviations
For the purposes of the present document, the following abbreviations apply:
ABR Average Bit Rate
ACELP Algebraic Code Excited Linear Prediction coder
AMR-NB Adaptive Multi-Rate-NarrowBand
AMR-WB Adaptive Multi-Rate-WideBand
CBR Constant Bit Rate
CELP Code Excited Linear Prediction
CI Confidence Interval
CNG Comfort Noise Generator
DSL Digital Subscriber Line
DTMF Dual Tone Multi Frequency
DTX Discontinuous Transmission
EFR Enhanced Full Rate
FSF Free Software Found