ETSI TR 103 138 V1.4.1 (2016-09)
Speech and multimedia Transmission Quality (STQ);
Speech samples and their use for QoS testing
TECHNICAL REPORT

Reference
RTR/STQ-00215m
Keywords
QoS, quality, speech

ETSI
650 Route des Lucioles
F-06921 Sophia Antipolis
Cedex - FRANCE
Tel.: +33 4 92 94 42 00 Fax: +33 4 93 65 47 16
Siret N° 348 623 562 00017 - NAF 742 C
Non-profit association registered at the Sous-Préfecture de Grasse (06) N° 7803/88

Important notice
The present document can be downloaded from: http://www.etsi.org/standards-search

The present document may be made available in electronic versions and/or in print. The content of any electronic and/or print versions of the present document shall not be modified without the prior written authorization of ETSI. In case of any existing or perceived difference in contents between such versions and/or in print, the only prevailing document is the print of the Portable Document Format (PDF) version kept on a specific network drive within ETSI Secretariat.

Users of the present document should be aware that the document may be subject to revision or change of status. Information on the current status of this and other ETSI documents is available at https://portal.etsi.org/TB/ETSIDeliverableStatus.aspx

If you find errors in the present document, please send your comment to one of the following services: https://portal.etsi.org/People/CommiteeSupportStaff.aspx

Copyright Notification
No part may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm except as authorized by written permission of ETSI. The content of the PDF version shall not be modified without the written authorization of ETSI. The copyright and the foregoing restriction extend to reproduction in all media.

© European Telecommunications Standards Institute 2016. All rights reserved.

DECT™, PLUGTESTS™, UMTS™ and the ETSI logo are Trade Marks of ETSI registered for the benefit of its Members. 3GPP™ and LTE are Trade Marks of ETSI registered for the benefit of its Members and of the 3GPP Organizational Partners. GSM and the GSM logo are Trade Marks registered and owned by the GSM Association.

Contents
Intellectual Property Rights
Foreword
Modal verbs terminology
Introduction
1 Scope
2 References
2.1 Normative references
2.2 Informative references
3 Abbreviations
4 Devices and network access
4.1 Mobile devices
4.2 ISDN/PSTN
4.3 Test scenarios
4.3.1 General aspects
4.3.2 Narrowband telephony and narrowband test scenario
4.3.3 Wideband telephony and super-wideband test scenario
5 Speech samples
5.1 General aspects
5.2 Pre-filtering of speech signals
5.2.1 Emulation of handsets
5.2.2 Filter for narrow-band test scenarios
5.2.2.1 IRS send Filter
5.2.2.2 MSIN Filter
5.2.2.3 Recommended filters to use in narrowband mobile test scenarios
5.2.3 Filter for wideband and super-wideband telephony test scenarios
5.2.3.1 14 kHz bandpass
5.2.3.2 Recommendation ITU-T P.341
5.2.3.3 Recommended filters to use in super-wideband mobile test scenarios
5.2.4 Reference signals
5.3 Audio level
5.3.1 Nominal level
5.3.2 Level adjustment with Recommendation ITU-T P.56
5.3.3 Input level at test devices
6 Scenarios
6.1 Narrowband-Measurement Land to Mobile
6.2 Narrowband-Measurement Mobile to Land
6.3 Mobile to Mobile
6.3.1 Narrowband
6.3.2 Wideband and super-wideband
7 Synopsis
Annex A: Coefficients for the reconstruction lowpass filter
Annex B: Bibliography
Annex C: Speech Samples
C.1 Introduction
C.2 Design
C.3 Example results
C.4 Technical specification
History

Intellectual Property Rights
IPRs essential or potentially essential to the present document may have been declared to ETSI. The information pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found in ETSI SR 000 314: “Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in respect of ETSI standards”, which is available from the ETSI Secretariat. Latest updates are available on the ETSI Web server (https://ipr.etsi.org/).

Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web server) which are, or may be, or may become, essential to the present document.

Foreword
This Technical Report (TR) has been produced by ETSI
Technical Committee Speech and multimedia Transmission Quality (STQ).

Modal verbs terminology
In the present document “should”, “should not”, “may”, “need not”, “will”, “will not”, “can” and “cannot” are to be interpreted as described in clause 3.2 of the ETSI Drafting Rules (Verbal forms for the expression of provisions).

“must” and “must not” are NOT allowed in ETSI deliverables except when used in direct citation.

Introduction
Conducting drive tests in a multi-technology environment presents a challenge to all parties. The complexity and variance of the different scenarios need to be broken down into handy instructions for those who actually configure and conduct the measurements, such as Network Operators, Service Providers, Equipment Vendors and Regulatory Authorities.

1 Scope
The present document introduces and explains the use and application of speech samples to determine the objective listening quality (LQO) in narrowband (NB), wideband (WB) and super-wideband (SWB) for different scenarios such as connections between fixed networks and mobile terminals.

2 References
2.1 Normative references
Normative references are not applicable in
the present document.

2.2 Informative references
References are either specific (identified by date of publication and/or edition number or version number) or non-specific. For specific references, only the cited version applies. For non-specific references, the latest version of the reference document (including any amendments) applies.

NOTE: While any hyperlinks included in this clause were valid at the time of publication, ETSI cannot guarantee their long term validity.

The following referenced documents are not necessary for the application of the present document but they assist the user
with regard to a particular subject area.

[i.1] Recommendation ITU-T P.48: “Specification for an intermediate reference system”.
[i.2] Recommendation ITU-T P.800: “Methods for subjective determination of transmission quality”.
[i.3] Recommendation ITU-T P.830: “Subjective performance assessment of telephone-band and wideband digital codecs”.
[i.4] Recommendation ITU-T P.862: “Perceptual evaluation of speech quality (PESQ): An objective method for end-to-end speech quality assessment of narrow-band telephone networks and speech codecs”.
[i.5] Recommendation ITU-T P.862.1: “Mapping function for transforming P.862 raw result scores to MOS-LQO”.
[i.6] Recommendation ITU-T P.862.2: “Wideband extension to Recommendation P.862 for the assessment of wideband telephone networks and speech codecs”.
[i.7] Recommendation ITU-T P.862.3: “Application guide for objective quality measurement based on Recommendations
P.862, P.862.1 and P.862.2”.
[i.8] Recommendation ITU-T P.863: “Perceptual objective listening quality assessment (POLQA)”.
[i.9] Recommendation ITU-T P.863.1: “Application Guide for the Recommendation ITU-T P.863”.
[i.10] Recommendation ITU-T G.711: “Pulse code modulation (PCM) of voice frequencies”.
[i.11] Recommendation ITU-T G.191: “Software tools for speech and audio coding standardization”.
[i.12] Recommendation ITU-T P.341: “Transmission characteristics for wideband digital loudspeaking and hands-free telephony terminals”.
[i.13] Recommendation ITU-T P.56: “Objective measurement of active speech level”.
[i.14] Recommendation ITU-T P.501: “Test signals for use in telephonometry”.

3 Abbreviations
For the purposes of the present document, the following abbreviations apply:

AMR      Adaptive Multi-Rate codec
AMR-WB   Adaptive Multi-Rate codec Wide Band
ASL      Active Speech Level
EFR      Enhanced Full Rate codec
EVS      Enhanced Voice Services, speech codec
FIR      Finite Impulse Response filter
IRS      Intermediate Reference System
ISDN     Integrated Services Digital Network
LQO      Listening Quality Objective
MOS      Mean Opinion Score
MSIN     Mobile Station Input filter
NB       Narrow Band
NTP      Network Terminating Point
OVL      Overload point
PBX      Private Branch Exchange
PC       Personal Computer
PCM      Pulse Code Modulation
PSTN     Public Switched Telephone Network
SWB      Super Wide Band
VoLTE    Voice over LTE
WB       Wide Band

4 Devices and network access
4.1 Mobile devices
Only a few devices and access interfaces play a role in end-to-end mobile network testing. In end-to-end testing a test connection is established between two endpoints, which determines the access interfaces and devices.

The mobile device is not a pure access device to the mobile network. It contains complex components for speech processing and is therefore an important contributor to the overall quality measured in the test connection.

Mobile devices do not have a standardized audio interface, neither digital nor analogue. As common practice, the headset connector of the mobile device is used as the access interface for audio insertion and capturing. As a pre-condition, the measurement equipment has to match the device's headset connector in impedance and level.

It has to be noted that in this setup the mobile devices are used in headset mode. Devices apply individual audio profiles, that is, individual settings for filtering, amplification, and noise and echo treatment, depending on whether headphones are connected or the internal microphone is used. Often there is a third mode that applies when a handsfree loudspeaker set is connected. Since the audio processing in headphone mode differs from that used with the internal microphone, such a test connection emulates a user with a headphone (personal handsfree kit) connected by wire to the headphone connector.

4.2 ISDN/PSTN
ISDN or (analogue) PSTN interfaces do not belong directly to the mobile
network, but they are usually used as the defined endpoint of the test connection. The access point to the ISDN or PSTN network is not a real consumer telephone but rather an ISDN or PSTN interface module, e.g. a PC card. It provides an electrical connection to the network for audio transmission and processes all the signalling information.

The interface module or PC card is usually accessed with a digitized speech signal in PCM format, preferably 16 bit or 13 bit linear PCM sampled at 8 kHz or 16 kHz. Some interfaces expect 8 bit A-Law PCM, which can be used for ISDN but is not recommended for PSTN, since it causes an additional A-Law compression step in the test connection.

NOTE: The A-Law signal would be decompressed and fed as an analogue signal into the local loop, where the regular A-Law compression takes place at the digital NTP or the PBX.

Today, ISDN/PSTN
channels are narrowband only. Thus, a transmission to an ISDN/PSTN endpoint is always restricted to narrowband, even if the air link uses AMR-WB. The transition to narrowband is part of the gateway to the ISDN/PSTN.

4.3 Test scenarios
4.3.1 General aspects
The analogue circuits of almost all
mobile devices are able to process wideband or fullband speech. Whether a call transmits narrowband, wideband or wider-band speech depends on the wideband coding capability of the phone, the network and the call setup. The subscriber cannot control whether the phone connects in narrowband, wideband or super-wideband. The established channel determines the transmission bandwidth, which can be narrowband, wideband, super-wideband or even fullband.

4.3.2 Narrowband telephony and narrowband test scenario
Conventional narrowband or normal-band telephony traditionally uses
a pass-band from 300 Hz to 3 400 Hz. In digital transmission the technical upper limit is given by the Nyquist frequency: sampling at 8 kHz yields an upper audio transmission limit of 4 kHz; there is no technical limit at the lower boundary. Today's narrowband speech codecs such as EFR or AMR are also able to encode an audio band up
to 4 kHz. Despite that fact, in practice dedicated filtering is applied to the signal. Usually, there is a bandpass that is wider than the traditional pass-band but still limits the lower and upper range. The actual transmission characteristic depends on the phone manufacturer and the settings of the phone. There are no binding limits or characteristics.

Testing narrowband is not tied to a narrowband channel. Narrowband testing means that the listening quality is estimated as if listening through a conventional handset: the objective quality model filters the signal with such a bandpass and also compares the speech signal to an ideal narrowband reference signal. This restriction to a narrowband bandpass is applied regardless of the signal bandwidth passed through the channel.

For testing a narrowband scenario using a mobile access device there are two setups:

1) Insertion
of a signal that exceeds the traditional narrowband bandwidth, e.g. 50 Hz to 3 800 Hz or even 50 Hz to 8 000 Hz or 50 Hz to 14 000 Hz. In this case, the signal is band-limited by the device and the channel, with the device usually imposing the stronger limitation. At the receiving side, the recorded speech signal is compared to an ideal narrowband signal (at a bandwidth of 50 Hz to 3 800 Hz). In this test case the filter characteristic of the mobile device used has a significant influence on the estimated quality, since all restrictions relative to the reference bandwidth are considered as degradation. The predicted MOS describes the overall quality as determined by the particular device and the channel; the score is device dependent.

2) Insertion of a signal that emulates a traditional sending path that is close to the defined passband of 300 Hz to 3 400 Hz. For this purpose the test speech signal is filtered with a bandpass filter, e.g. IRS send or MSIN. Usually, those filters are narrower than the phone's characteristic, so the phone's band limitations no longer affect the speech signal significantly. With such a setup, the filter characteristic of the particular phone has less influence. The bandwidth of the signal at the receiving side is then largely determined by the applied pre-filtering and largely the same for all devices. The estimated score becomes less phone dependent.

The approach (1) is re