ETSI TECHNICAL REPORT ETR 305
August 1996

Source: ETSI TC-SMG
Reference: DTR/SMG-020655
ICS: 33.060.50
Key words: EFR, digital cellular telecommunications system, Global System for Mobile communications (GSM), speech

Digital cellular telecommunications system;
Performance characterization of the GSM Enhanced Full Rate (EFR) speech codec
(GSM 06.55)

ETSI
European Telecommunications Standards Institute
ETSI Secretariat
Postal address: F-06921 Sophia Antipolis CEDEX - FRANCE
Office address: 650 Route des Lucioles - Sophia Antipolis - Valbonne - FRANCE
X.400: c=fr, a=atlas, p=etsi, s=secretariat - Internet: secretariat@etsi.fr
Tel.: +33 92 94 42 00 - Fax: +33 93 65 47 16

Copyright Notification: No part may be reproduced except as authorized by written permission. The copyright and the foregoing restriction extend to reproduction in all media.
© European Telecommunications Standards Institute 1996. All rights reserved.

ETR 305 (GSM 06.55 version 5.0.0): August 1996

Whilst every care has been taken in the preparation and publication of this document, errors in content, typographical or otherwise, may occur. If you have comments concerning its accuracy, please write to “ETSI Editing and Committee Support Dept.” at the address shown on the title page.

Contents

Foreword
Introduction
1 Scope
2 References
3 Abbreviations
4 Quality under error (EP0 - EP3) and tandeming conditions (Exp Number 1 and Exp Number 5)
5 Quality under background noise conditions (Exp Number 2 and Exp Number 3)
6 Talker dependency (Exp Number 4)
7 DTX system
  7.1 Channel activity in DTX mode
    7.1.1 Test procedure
    7.1.2 Speech channel activity
    7.1.3 Level compensation
    7.1.4 Interleaving compensation
    7.1.5 Estimated mean TDMA channel activity
  7.2 DTX/CNI Informal Expert Listening tests
    7.2.1 Introduction
    7.2.2 Test environment
    7.2.3 Results
8 Performance with DTMF tones
  8.1 Introduction
  8.2 Test environment
  8.3 Results
9 Network information tones
10 Performance with special input signals
  10.1 Music signals
  10.2 Noise signals
11 Performance with different languages
12 Delay
13 Frequency response
  13.1 Introduction
  13.2 Test environment
  13.3 Results
14 Complexity
15 Summary of the results from the subjective testing
Annex A: Summary of Results (lab by lab)
  A.1 Quality under Error and tandeming conditions
  A.2 Quality under Background noise conditions
  A.3 Quality for Talker Dependency (DMOS and SD)
History

Foreword

This ETSI Technical Report (ETR) has been produced by the Special Mobile Group (SMG) Technical Committee of the European Telecommunications Standards Institute (ETSI).

ETRs are informative documents resulting from ETSI studies which are not appropriate for European Telecommunication Standard (ETS) or Interim European Telecommunication Standard (I-ETS) status. An ETR may be used to publish material which is either of an informative nature, relating to the use or the application of ETSs or I-ETSs, or which is immature and not yet suitable for formal adoption as an ETS or an I-ETS.

Introduction

The SMG2 Speech Expert Group (SEG) started its activity early in 1995 for the standardization of an Enhanced Full Rate speech codec. The Group produced a test plan for the
first phase of testing (pre-selection phase), which is described in permanent document SEG-4 (ETSI SMG2 SEG: SEG-4 (v 1.0) “A Subjective Pre-Selection Test Plan for the Enhanced Full Rate Speech Coding Algorithm”), to assess the performance of the submitted candidates. This test plan is based on the general knowledge gained from past ITU-T and ETSI codec evaluation activities (for instance, the recent GSM half rate and ITU-T 8 kbit/s exercises). At the end of this Pre-selection Phase, SMG decided to standardize the PCS1900 codec, known as the US-1 codec, and no formal characterisation testing has
been performed for the selected codec. This document therefore reports the results from the Pre-selection and Verification Phases of testing only. Consequently, the results reported here are less detailed, and their confidence intervals wider, than those obtained for the GSM half rate standardization (ETR 229 [3]), where specific and detailed characterisation testing was performed. In addition, not all laboratories followed the same pre-selection test plan, which further complicates the interpretation of the results.

The following experiments included in SEG-4 were carried out by several laboratories in the Pre-selection Phase:

- Experiment 1: Quality under error and tandeming conditions (A-law, Modified IRS)
- Experiment 2: Quality under background noise conditions (Vehicular noise, UPCM, No IRS)
- Experiment 3: Quality under background noise conditions (Background music, UPCM, No IRS)
- Experiment 4: Talker Dependency (UPCM, No IRS)
- Experiment 5: Quality under high error conditions - EP3 (A-law, Modified IRS)

A practical indirect method of performance comparison between different results was adopted, utilising the Modulated Noise Reference Unit (MNRU) (see note) as a reference degradation. The MNRU also allows normalisation of results across different laboratories carrying out the same experiment, through the conversion of MOS scores to Equivalent Q (dB). The Q (dB) values introduced in a test normally range from 0 to 50 dB. In SEG-4,
both Experiment 1 and Experiment 5 on error conditions cover this range; the other experiments do not.

NOTE: The MNRU is a device designed for producing speech-correlated noise that sounds subjectively like the quantising noise produced by log-companded PCM codecs. The device is subjectively calibrated for Mean Opinion Scores (MOS) against Q dB (where Q is the ratio of the speech power to the speech-correlated noise power). The Equivalent Q of the codecs under test can be found from the corresponding MOS on the calibration curve of the MNRU (S-shaped curve).

Only four laboratories ran tests which followed the Pre-selection Test Plan described in SEG-4 (BT/lab1, CNET/lab2, Tele Denmark/lab3, NEC/lab4). MOTOROLA/lab5 participated in the Pre-selection Phase but their experiments did not comply with SEG-4. TI/lab8 ran only one experiment from SEG-4. Results produced by COMSAT/lab6 following a NOKIA-designed test plan are part of the standardization of the codec in North America, and NOKIA/lab7 performed complementary experiments during the ETSI Pre-selection Phase.

As no further analysis has been undertaken to allow the averaging of scores across the different laboratories, results are reported in the annex on a laboratory-by-laboratory basis. For error and tandeming conditions, results are reported in terms of Equivalent Q (dB) values. For background noise conditions and talker dependency, results are reported in terms of DMOS values with either Confidence Interval (CI) or Standard Deviation (SD), as there is insufficient data available to normalise across laboratories via MNRU conditions.

The quality performance of the EFR codec is compared to High and Low references introduced in permanent documents SEG-3 (ETSI SMG2 SEG: SEG-3 “Selection Criteria for the Enhanced Full Rate Speech Coding Algorithm - Speech Quality Requirements”) and SEG-4 (ETSI SMG2 SEG: SEG-4 (v 1.0) “A Subjective Pre-Selection Test Plan for the Enhanced Full Rate Speech Coding Algorithm”, Section 7). These references were
chosen as representative of the “minimum” and “objective” performance targets respectively, and are reported in table 1.

Table 1: References per condition: High Ref., Low Ref. and G.728

A figure showing the general trend of the EFR behaviour for error conditions in a noise-free environment, compared to the high (G.728) and low (TCH-FS) references, is added to the individual laboratories' quantitative results (figure 15). The general quality performance of the EFR codec is summarised in table 15.

In the Verification Phase, the behaviour of the EFR codec under the following test conditions was tested:
- Behaviour of the DTX System;
- Performance with DTMF tones;
- Performance with network information tones;
- Performance with special input signals;
- Performance with music signals;
- Performance with noise signals;
- Performance with different languages;
- Delay of the TCH-EFR;
- Frequency response;
- Complexity.

The results of these tests are also included in this report under the respective clauses.

Furthermore, the EFR codec was checked for correct functioning for the following items:

- Test of overload point;
- SID frame encoding;
- Muting behaviour;
- Idle channel behaviour.

No artefact or malfunctioning
was detected for these items.

1 Scope

This ETR gives background information on the performance of the GSM enhanced full rate speech codec. Experimental results from the Pre-selection and Verification tests carried out during the standardization process by the SEG (Speech Expert Group) are reported to give a more detailed picture of the behaviour of the GSM enhanced full rate speech codec under different conditions of operation.

2 References

This ETR incorporates, by dated and undated reference, provisions from other publications. These references are cited at the appropriate places in the text and the publications are listed hereafter. For dated references, subsequent amendments to or revisions of any of these publications apply to this ETR only when incorporated in it by amendment or revision. For undated references, the latest edition of the publication referred to applies.

[1] GSM 03.05 (ETR 102): “Digital cellular telecommunications system; Technical performance objectives”.
[2] GSM 03.50 (ETS 300 540): “Transmission planning aspects of the speech service in the GSM Public Land Mobile Network (PLMN) system”.
[3] GSM 06.08 (ETR 229): “Digital cellular telecommunications system; Half rate speech; Performance of the GSM half rate speech codec”.
[4] GSM 06.10 (ETS 300 580-2): “Digital cellular telecommunications system; Full rate speech transcoding”.
[5] GSM 06.20 (ETS 300 581-2): “Digital cellular telecommunications system; Half rate speech transcoding”.

3 Abbreviations

A/D        Analogue to Digital
ADPCM      Adaptive Differential Pulse Code Modulation
ACR        Absolute Category Rating
BSC        Base Station Controller
BTS        Base Transceiver Station
C/I        Carrier-to-Interferer ratio
CI         Confidence Interval
CNI        Comfort Noise Insertion
CRC        Cyclic Redundancy Check
D/A        Digital to Analogue
DAT        Digital Audio Tape
DCR        Degradation Category Rating
DSP        Digital Signal Processor
DTMF       Dual Tone Multi Frequency
DTX        Discontinuous Transmission for power consumption and interference reduction
EFR        Enhanced Full Rate
ESP        Product of E (Efficiency), S (Speed) and P (Percentage of Power) of the DSP
FR         Full Rate
GBER       Average gross bit error rate
GSM        Global System for Mobile communications
HR         Half Rate
IRS        Intermediate Reference System (No IRS = rather flat)
ITU-T      International Telecommunication Union - Telecommunications Standardization Sector
MNRU       Modulated Noise Reference Unit
Mod. IRS   Modified IRS
MOPS       Million of Operations per Second
MOS        Mean Opinion Score
MS         Mobile Station
MSC        Mobile Switching Centre
PCM        Pulse Code Modulation
PSTN       Public Switched Telecommunications Network
Q          Speech-to-speech correlated noise power ratio in dB
SD         Standard Deviation
SEG        Speech Expert Group
SID        Silence Descriptor
SMG        Special Mobile Group
TCH-EFS    Traffic CHannel Enhanced Full rate Speech
TCH-FS     Traffic CHannel Full rate Speech
TCH-HS     Traffic CHannel Half rate Speech
TDMA       Time Division Multiple Access
TMOPS      True Million of Operations per Second
UPCM       Uniform or Linear PCM
VAD        Voice Activity Detector
WMOPS      Weighted Million of Operations per Second

Four different Error Patterns (EP0, EP1, EP2 and EP3) were used, where:

EP0        without channel errors
EP1        C/I = 10 dB; 5% GBER (well inside a cell)
EP2        C/I = 7 dB; 8% GBER (at a cell boundary)
EP3        C/I = 4 dB; 13% GBER (outside a cell)

4 Quality under error (EP0 - EP3) and tandeming conditions (Exp Number 1 and Exp Number 5)

A listening-only test was adopted using the Absolute Category Rating (ACR) method. The results are reported in terms of Equivalent Q (dB) values and Differential Q values (which compare the codec results to the High and Low references). For error and tandeming conditions, results are available from eight laboratories (lab1 to lab8). Tables of results on a lab-by-lab basis are shown
in the annex of the ETR (table A.1.1 to table A.1.8), with negative values indicating worse performance than the reference.

In general, across all laboratories, the EFR codec performs better than the reference TCH-FS for clear speech (EP0), for error conditions EP1 and EP2, and for tandeming under error condition EP1. For the severe error condition (EP3), the performance is worse than TCH-FS in one laboratory.

The EFR is equivalent to the reference G.728 (high reference) for clear speech in all laboratories. Under error conditions, the high reference threshold for the severe error condition (EP3) is not met
in all laboratories, while the threshold for EP1 and EP2 is met for roughly half of the laboratories.

Under tandeming, the clear condition was tested in only one laboratory, where it was compared to another standard, G.721; the results indicate that the performance of the EFR (EP0 tandem) is equivalent to that of G.721 (EP0). For tandeming under error condition EP1, equivalence with TCH-FS (EP1) without tandeming is demonstrated in all laboratories except one.

Additional results coming from one lab only can be found in table A.1.6 (effect of input levels, other error conditions, tandeming with
other standards).

The advantage of the EFR over the existing TCH-FS depends on the quality of the network: as channel errors increase, this advantage is reduced. The general trend of the EFR behaviour in error conditions is shown in figure 15.

5 Quality under background noise conditions (Exp Number 2 and Exp Number 3)

This was assessed with a listening-only test, using the Degradation Category Rating (DCR) method. The results are reported for the EFR codec, the reference G.728 and the TCH-FS codec in terms of DMOS values with Confidence Interval (CI). Six laboratories (lab1 to lab4, lab6 and lab7) performed this experiment, the first four complying with SEG-4 (see table A.2.1 and table A.2.2). For each laboratory, the differences i