INTERNATIONAL TELECOMMUNICATION UNION

ITU-T P.800 (08/96)
TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU

SERIES P: TELEPHONE TRANSMISSION QUALITY
Methods for objective and subjective assessment of quality

Methods for subjective determination of transmission quality

ITU-T Recommendation P.800
(Previously CCITT Recommendation)

ITU-T P-SERIES RECOMMENDATIONS
TELEPHONE TRANSMISSION QUALITY

P.10-P.29                  Vocabulary and effects of transmission parameters on customer opinion of transmission quality
P.30-P.39, P.300-P.399     Subscribers' lines and sets
P.40-P.49                  Transmission standards
P.50-P.59, P.500-P.599     Objective measuring apparatus
P.70-P.79                  Measurements related to speech loudness

For further details, please refer to ITU-T List of Recommendations.

ITU-T RECOMMENDATION P.800

METHODS FOR SUBJECTIVE DETERMINATION OF TRANSMISSION QUALITY

Summary

This Recommendation describes methods and procedures for conducting subjective evaluations of transmission quality. The main revision encompassed by this version of this Recommendation is the
addition of an annex describing the Comparison Category Rating (CCR) procedure. Other modifications have been made to align this Recommendation with the recent revision of Recommendation P.830.

Source

ITU-T Recommendation P.800 was revised by ITU-T Study Group 12 (1993-1996) and was approved under the WTSC Resolution No. 1 procedure on the 30th of August 1996.

Keywords

Absolute Category Rating, Comparison Category Rating, conversational test, Degradation Category Rating, listening test, subjective evaluation, subjective testing

FOREWORD

ITU (International Telecommunication Union) is the United Nations Specialized Agency in the field of telecommunications. The ITU Telecommunication Standardization Sector (ITU-T) is a permanent organ of the ITU. The ITU-T is responsible for studying technical, operating and tariff questions and issuing Recommendations on them with a view to standardizing telecommunications on a worldwide basis.

The World Telecommunication Standardization Conference (WTSC), which meets every four years, establishes the topics for study by the ITU-T Study Groups which, in their turn, produce Recommendations on these topics.

The approval of Recommendations by the Members of the
ITU-T is covered by the procedure laid down in WTSC Resolution No. 1 (Helsinki, March 1-12, 1993).

In some areas of information technology which fall within ITU-T's purview, the necessary standards are prepared on a collaborative basis with ISO and IEC.

NOTE

In this Recommendation, the expression “Administration” is used for conciseness to indicate both a telecommunication administration and a recognized operating agency.

© ITU 1996

All rights reserved. No part of this publication may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without permission in writing from the ITU.

CONTENTS

1 Scope
2 References
3 Definitions
4 Abbreviations
5 Conventions
6 Recommended methods
    6.1 Conversation-opinion tests
    6.2 Listening-opinion tests
    6.3 Interview and survey tests
    6.4 Other tests
Annex A - Conversation-opinion tests
    A.1 Test facilities
        A.1.1 Physical conditions
        A.1.2 Establishing the connection
        A.1.3 Monitoring
    A.2 Experiment design
    A.3 Conversation task
    A.4 Test procedure
        A.4.1 Eligibility of subjects
        A.4.2 Opinion scale
        A.4.3 Instructions to subjects
        A.4.4 Data collection
        A.4.5 Treatment of results
Annex B - Listening tests - Absolute Category Rating (ACR)
    B.1 Source recordings
        B.1.1 Recording environment
        B.1.2 Sending system
        B.1.3 Recording system
        B.1.4 Speech material
        B.1.5 Recording procedure
        B.1.6 Talkers
        B.1.7 Speech levels
        B.1.8 Calibration signal
    B.2 Selection of circuit conditions
        B.2.1 Speech input and listening levels
        B.2.2 Talkers
        B.2.3 Reference conditions
        B.2.4 Other conditions
    B.3 Design of experiment
    B.4 Listening test procedure
        B.4.1 Listening environment
        B.4.2 Listening system
        B.4.3 Listening level
        B.4.4 Listeners
        B.4.5 Opinion scales recommended by the ITU-T
        B.4.6 Instructions to subjects
        B.4.7 Statistical analysis and reporting of results
Annex C - Quantal-Response Detectability Tests
Annex D - Degradation Category Rating (DCR) method
    D.1 Introduction
    D.2 Degradation Category Rating (DCR) procedure
        D.2.1 Speech samples
        D.2.2 Reference conditions
        D.2.3 Stimulus presentation
        D.2.4 Test instructions
    D.3 Statistical analysis
Annex E - Comparison Category Rating (CCR) method
    E.1 Introduction
    E.2 Quality reference
    E.3 MNRU references
    E.4 Presentation to listeners
    E.5 Data analysis
Annex F - The threshold method for comparison of transmission systems with a reference system
    F.1 Introduction
    F.2 Testing procedure
    F.3 Presentation of signals
    F.4 Speech sources
    F.5 Listening environment
    F.6 Listeners
    F.7 Reliability
Bibliography

Introduction

Modern telecommunication networks provide a wide array of voice services using many transmission systems. In particular, the rapid deployment of digital technologies has led to an increased need for evaluating the transmission characteristics of new transmission equipment. In many circumstances, it is necessary to determine the subjective effects of some new transmission equipment or modification to the transmission characteristics of a telephone network. This Recommendation describes methods for obtaining subjective evaluations of transmission systems
and components. Recommendation G.113 contains useful information on the impairments that can occur. Recommendation P.11 discusses the effects that transmission impairments may have on the users of telecommunication networks and services. The methods described in this Recommendation may be used to
estimate the equipment impairment factors (eifs) or quantization distortion units (qdus) that are described in Recommendation G.113.

Recommendation P.800¹

METHODS FOR SUBJECTIVE DETERMINATION OF TRANSMISSION QUALITY

(Amended at Helsinki, 1993; revised in Geneva, 1996)

1 Scope

This Recommendation contains advice to Administrations on conducting subjective tests of transmission quality in their own laboratories. It does not, however, deal with types of tests described in detail in other ITU-T Recommendations and documentation, namely:

a) determination of Reference and Relative Equivalents - see Handbook on Telephonometry, Geneva, 1993;
b) determination of Loudness Ratings - see Recommendation P.78;
c) determination of Articulation Ratings (A.E.N. values) - see Handbook on Telephonometry, Geneva, 1993.

Neither does it deal with the various kinds of specialized tests used in the course of developing items of telephone equipment, for the purpose of diagnosing faults and shortcomings, such as Diagnostic Rhyme Tests [1] and other tests dedicated to the study of specific aspects of speech output.

This Recommendation gives the approved methods which are considered to be suitable for determining how satisfactorily given telephone connections may be expected to perform. The methods indicated here are intended to be generally applicable whatever the form of degradation factors present. Examples of degrading factors include:

- loss (often frequency dependent);
- circuit noise;
- transmission errors (random bit errors as well as erased frames that occur in systems such as mobile communications);
- environmental noise;
- sidetone;
- talker echo;
- non-linear distortion of various kinds including low bit-rate encoding;
- propagation time;
- harmful effects of voice-operated devices;
- distortions of the time scale arising from packet switching; and
- time-varying degradations of the communication channel, including those arising in loudspeaking sets.

Combinations of two or more of such factors also have to be catered for.

Further guidance for specific applications is available in Recommendations P.830 (digital speech codecs), P.84 (DCME/PCME), and P.85 (speech output devices).

2 References

The following Recommendations and other references contain provisions that, through reference in this text, constitute provisions of this Recommendation. At the time of publication, the editions indicated are valid. All Recommendations and other references are subject to revision; all users of this Recommendation are therefore encouraged to investigate the possibility of applying the most recent edition of the Recommendations listed below. A list of the currently valid ITU-T Recommendations is regularly published.

- IEC Publication 1260:1995, Electroacoustics - Octave-band and fractional-octave-band filters.
- IEC Publication 581-5:1981, High fidelity audio equipment and systems; Minimum performance requirements - Part 5: Microphones.
- IEC Publication 651:1979, Sound level meters. (Amendment 1-1993) (Corrigendum March 1994).
- ISO 266:1975, Acoustics - Preferred frequencies for measurements.
- ISO 1996-1:1982, Acoustics - Description and measurement of environmental noise - Part 1: Basic quantities and procedures.
- ISO 1996-2: Acoustics - Description and measurement of environmental noise - Part 2: Acquisition of data pertinent to land use.
- ISO 1996-3:1987, Acoustics - Description and measurement of environmental noise - Part 3: Application to noise limits.
- ITU-T Recommendation G.113 (1996), Transmission impairments.
- CCITT Recommendation G.722 (1988), 7 kHz audio-coding within 64 kbit/s.
- CCITT Recommendation G.726 (1990), 40, 32, 24 and 16 kbit/s Adaptive Differential Pulse Code Modulation (ADPCM).
- CCITT Recommendation G.728 (1992), Coding of speech at 16 kbit/s using low-delay code excited linear prediction.
- ITU-T Recommendation G.729 (1996), Coding of speech at 8 kbit/s using Conjugate-Structure Algebraic-Code-Excited Linear-Prediction (CS-ACELP).
- ITU-T Recommendation P.10 (1993), Vocabulary of terms on telephone transmission quality [...]

¹ Formerly Recommendation P.80.

[...] that auxiliary facilities such as dialling and ringing are provided; and that faithful records of the output of each test are kept. Detailed description of the method, considerations and precautions are found in Annex A.

6.2 Listening-opinion tests

Listening-opinion tests are not expected to
reach the same standard of realism as conversation tests, and the restrictions are therefore less severe in some respects; but the artificiality that has to be accepted brings with it a necessity for strict control of many things which in conversation tests are allowed to find their own equilibrium.

The recommended test method for listening-only tests is the “Absolute Category Rating” (ACR) method described in Annex B, which is in conformance with the Category Judgement method recommended for conversation tests (see Annex A), and adopted partly for the same reasons. Category ratings are applied to short groups of unrelated sentences, each of which has been passed through a number of standard processes as well as the processes under test. This method is well-established, and has been applied to analogue and digital telephone connections and to telecommunications devices, such as digital codecs. In the work leading to Recommendations G.726 (32 kbit/s ADPCM), G.728, G.729, and G.722, for example, laboratories in different countries performed subjective tests by the same method on the same physical conditions and on identical transmission systems, and the results showed a high degree of consistency.

Other methods commonly used are the Quantal-Response Detectability Method, Degradation Category Rating (DCR), Comparison Category Rating (CCR) and the Threshold Method.

Annex C describes Quantal-Response Detectability Tests, which are suitable for evaluating threshold values of certain quantities and their associated probabilities. For example, the level above which single-frequency interference has a given probability of being objectionable or detectable, or the probability that crosstalk in
a given range of levels is intelligible, can best be determined by this method.

An alternative to the Absolute Category Rating method is the Degradation Category Rating (DCR) method, which is described in detail in Annex D. The DCR method compares the system under test with a high quality fixed reference, and the degradation (from “Inaudible” to “Very annoying”) is rated on a five-point scale. This method is suitable when the impairment (especially digital impairments) is small. It may therefore be particularly useful for evaluating similar digital speech processing algorithms. Thus, the DCR method may serve as a means for system optimization once it has been shown by the methods of Annexes A and B that the worst-case connection incorporating the degradation in question is within acceptable limits.

Annex E describes a variation of the DCR procedure called the Comparison Category Rating (CCR) method. As in the DCR, the CCR method compares the system under test with a high quality fixed reference (in the CCR case on a scale from “Much Better” to “Much Worse”). This procedure may be particularly suitable for systems that improve the quality of the input speech (e.g. noise cancellation systems).

The Threshold method, also suitable for system optimization, is described in Annex F. By direct comparison of the system under test with a reference system, such as the Modulated Noise Reference Unit (MNRU, as described in Recommendation P.810), it is possible to determine the value of the reference condition (Q for digital processes) which equals the performance of the system under test.

Information on other types of subjective test methods, which include scaling methods, can be found in 2.6 of the Handbook on Telephonometry.

Listening tests have direct applications in the assessment of
physical transmission systems which are essentially unidirectional. Examples include broadcast circuits, public address systems and recorded announcement systems in which listening degradations such as loss, noise and distortion may be present.

Results of listening-only tests can be applied, but only with certain reservations, to the prediction of the assessment for conversation conducted over a two-way system, such as a connection in a public switched telephone network. The provisos are that the effects of the following additional factors are duly taken into account: