INTERNATIONAL TELECOMMUNICATION UNION

ITU-T P.832
TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU
(05/2000)

SERIES P: TELEPHONE TRANSMISSION QUALITY, TELEPHONE INSTALLATIONS, LOCAL LINE NETWORKS
Methods for objective and subjective assessment of quality

Subjective performance evaluation of hands-free terminals

ITU-T Recommendation P.832
(Formerly CCITT Recommendation)

ITU-T P-SERIES RECOMMENDATIONS
TELEPHONE TRANSMISSION QUALITY, TELEPHONE INSTALLATIONS, LOCAL LINE NETWORKS

Vocabulary and effects of transmission parameters on customer opinion of transmission quality: Series P.10
Subscribers' lines and sets: Series P.30, P.300
Transmission standards: Series P.40
Objective measuring apparatus: Series P.50, P.500
Objective electro-acoustical measurements: Series P.60
Measurements related to speech loudness: Series P.70
Methods for objective and subjective assessment of quality: Series P.80, P.800
Audiovisual quality in multimedia services: Series P.900

For further details, please refer to the list of ITU-T Recommendations.

Summary

This ITU-T Recommendation describes methods and procedures
for conducting subjective performance evaluations of hands-free terminals. The use of hands-free terminals in communication has numerous advantages for telephone users, especially for all “non-traditional” types of terminals such as car phones, computer/laptop-type terminals and others. Due to the complex acoustical situation, a wide variety of signal processing, which may be non-linear and/or time-variant, is expected. ITU-T Recommendation P.340 describes measurement techniques for hands-free terminals, ITU-T Recommendation P.581 describes the use of the HATS for the evaluation of terminals, and ITU-T Recommendations P.501 and P.502 describe measurement signals and analysis procedures. Using these methods, a minimum performance of hands-free terminals should be ensured. However, there is always the possibility that those tests do not fully address the impact of all kinds of signal processing
in a hands-free terminal and their effect on speech transmission quality. Subjective testing is a commonly used method of assessing the performance of terminals, including digital speech codecs, voice-operated signal processing, echo cancellation, noise reduction and other types of signal processing. This ITU-T Recommendation defines methods for the subjective evaluation of all kinds of hands-free terminals.

Source

ITU-T Recommendation P.832 was prepared by ITU-T Study Group 12 (1997-2000) and approved under the WTSC Resolution 1 procedure on 18 May 2000.

Keywords

Hands-free terminals, speech transmission quality, subjective performance.

FOREWORD

The International Telecommunication Union (ITU) is the United Nations specialized agency in the field of telecommunications. The ITU Telecommunication Standardization Sector (ITU-T) is a permanent organ of ITU. ITU-T is responsible for studying technical, operating and tariff questions and issuing Recommendations on them with a view to standardizing telecommunications on a worldwide basis.

The World Telecommunication Standardization Conference (WTSC), which meets every four years, establishes the topics for study by
the ITU-T study groups which, in turn, produce Recommendations on these topics.

The approval of ITU-T Recommendations is covered by the procedure laid down in WTSC Resolution 1.

In some areas of information technology which fall within ITU-T's purview, the necessary standards are prepared on a collaborative basis with ISO and IEC.

NOTE - In this Recommendation, the expression “Administration” is used for conciseness to indicate both a telecommunication administration and a recognized operating agency.

INTELLECTUAL PROPERTY RIGHTS

ITU draws attention to the possibility that the practice or implementation of this Recommendation may involve the use of a claimed Intellectual Property Right. ITU takes no position concerning the evidence, validity or applicability of claimed Intellectual Property Rights, whether asserted by ITU members or others outside of the Recommendation development process.
As of the date of approval of this Recommendation, ITU had not received notice of intellectual property, protected by patents, which may be required to implement this Recommendation. However, implementors are cautioned that this may not represent the latest information and are therefore strongly urged to consult the TSB patent database.

© ITU 2001

All rights reserved. No part of this publication may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without permission in writing from the ITU.

CONTENTS

1 General
  1.1 Scope
  1.2 References
  1.3 Terms and definitions
  1.4 Abbreviations
2 Overview on test procedures
3 General considerations
  3.1 Hands-free parameters to evaluate
  3.2 General considerations about test equipment and calibration
  3.3 Selection of subjects
4 Conversational test procedure
  4.1 Purpose
    4.1.1 Benefits
    4.1.2 Drawbacks
  4.2 Test parameters
  4.3 Test set-up
  4.4 Description of test procedure
  4.5 Reference conditions
5 Double talk test procedure
  5.1 Purpose
    5.1.1 Benefits
    5.1.2 Drawbacks
  5.2 Test parameters
  5.3 Test set-up
  5.4 Description of test procedure
  5.5 Reference conditions
6 Third-party listening test procedure
  6.1 Purpose
    6.1.1 Benefits
    6.1.2 Drawbacks
  6.2 Test parameters and scaling
  6.3 Test set-up and recording parameters
  6.4 Description of test procedure
  6.5 Reference conditions
Annex A - Corpus of the source signals
  A.1 Size and parameters of the corpus
  A.2 Design of each script

1 General

1.1 Scope

This ITU-T Recommendation describes procedures to be used to assess the subjective performance of hands-free terminals. The methods defined here may be used to assess the extent to which a hands-free terminal operates effectively for speech. This ITU-T Recommendation does not define specific values for hands-free terminal parameters (e.g. convergence time of echo cancellers) to yield satisfactory subjective performance. The procedures defined here may also be appropriate for evaluating the subjective performance of other types of terminals and signal processing devices. A complete subjective
evaluation of hands-free telephones can be performed by combining three types of tests: a conversational test, a double talk test and a third-party listening test (listening-only test). In general, the evaluation of hands-free phone performance must take into account conversational interactions between subjects; a conversational test is the only type of subjective test which allows such an evaluation. If a more detailed evaluation of a hands-free terminal is needed, it is recommended to additionally perform double talk tests and/or third-party listening tests.

1.2 References

The following Recommendations and other references contain provisions which, through reference in this text, constitute provisions of this Recommendation. At the time of publication, the editions indicated were valid. All Recommendations and other references are subject to revision; all users of this Recommendation are therefore encouraged to investigate the possibility of applying the most recent edition of the Recommendations and other references listed below. A list of the currently valid ITU-T Recommendations is regularly published.

[1] ITU-T Recommendation P.10 (1998), Vocabulary of terms on telephone transmission quality and telephone sets.
[2] ITU-T Recommendation P.340 (2000), Transmission characteristics and speech quality parameters of hands-free terminals.
[3] ITU-T Recommendation P.501 (2000), Test signals for use in telephonometry.
[4] ITU-T Recommendation P.502 (2000), Objective test methods for
speech communication systems using complex test signals.
[5] ITU-T Recommendation P.51 (1996), Artificial mouth.
[6] ITU-T Recommendation P.56 (1993), Objective measurement of active speech level.
[7] ITU-T Recommendation P.57 (1996), Artificial ears.
[8] ITU-T Recommendation P.58 (1996), Head and torso simulator for telephonometry.
[9] ITU-T Recommendation P.800 (1996), Methods for subjective determination of transmission quality.
[10] ITU-T Recommendation P.810 (1996), Modulated noise reference unit (MNRU).
[11] ITU-T Recommendation P.830 (1996), Subjective performance assessment of telephone-band and wideband digital codecs.
[12] ITU-T Recommendation P.581 (2000), Use of head and torso simulator (HATS) for hands-free terminal testing.
[13] ITU Handbook on Telephonometry, 2nd edition, Geneva 1992.

1.3 Terms and definitions

This ITU-T Recommendation defines the following terms:

1.3.1 double talk: When near-end and far-end speech occur simultaneously at a given point, typically the terminal under test.
1.3.2 near end: The end of a network connection to which the HFT, whose characteristics are evaluated, is attached.
1.3.3 far end: The end of the network which is opposite to the near end.
1.3.4 syllable clipping or temporal clipping: Loss of speech energy caused by voice/speech activated devices. For echo cancellers, the primary source of temporal clipping is the NLP. In this instance, clipping does not refer to amplitude limiting.
1.3.5 third-party listening test: A listening-only subjective test (see ITU-T Recommendation P.800) in which the listener hears as an “ear witness” the acoustical recordings of the connection under evaluation. In conventional listening-only tests, the listener is positioned at one end of the connection under study.
1.3.6 conversation test: A subjective test in which two participants have a conversation, as described in Annex A/P.800 and in the Handbook on Telephonometry.
1.3.7 double talk test: A subjective test in which the participants are forced to talk simultaneously while at the same time listening for impairments (e.g. echo).
1.3.8 untrained subject: See 3.3.1.
1.3.9 experienced subject: See 3.3.2.
1.3.10 experts: See 3.3.3.
1.3.11 ear signal: Signal recorded in the ear canal of a listener's ear.

1.4 Abbreviations

This ITU-T Recommendation uses the following abbreviations:

ACR   Absolute Category Rating
DCR   Degradation Category Rating
DMOS  Degradation Mean Opinion Score
GAT   Group Audio Terminal
HATS  Head And Torso Simulator (Recommendation P.58)
HFT   Hands-free Terminal
LRGP  Loudness Rating Guard Position
MNRU  Modulated Noise Reference Unit
MOS   Mean Opinion Score
MRP   Mouth Reference Point
NLP   Non-Linear Processor

2 Overview on test procedures

The test procedures suitable for the assessment of speech quality performance of hands-free terminals can be classified into three categories:

1) Conversational tests (see clause 4).
2) Double talk tests (see clause 5).
3) Third-party listening tests (see clause 6).

NOTE - Although headsets allow “hands-free” operation and may in general be tested using the same procedures, they are not covered by this Recommendation.

To guide the selection of an appropriate test procedure, Table 1 lists the advantages and disadvantages of each test procedure.

Table 1/P.832 - Advantages and disadvantages of different test procedures

Conversational tests
  Advantages:
    - Very close to a real conversation
    - Preparation time is relatively short (compared to third-party listening tests)
  Disadvantages:
    - Subjects tend to behave differently in a conversation (due to culture, personality, etc.), which creates more response variability in assessing speech quality aspects
    - Since subjects have to concentrate on both running the conversation and attending to the quality performance, they may be less sensitive to performance or quality
    - Devices under test and simulation tools must be available at the testing lab and must run in real time

Double talk tests
  Advantages:
    - Preparation time is relatively short (compared to third-party listening tests)
    - Evaluation of double talk capability in more detail than in conversational tests
    - Due to standardized dialogue structures, the influence of individual behaviour (depending on culture and/or personality) on double talk is reduced
  Disadvantages:
    - Subjects have to concentrate on both reading their text and attending to the quality performance
    - Devices under test and simulation tools must be available at the testing lab and must run in real time

Third-party listening tests
  Advantages:
    - Evaluation of specific speech quality parameters
    - Processing and assessment of offline simulations
    - Speech processing is reproducible under the same test conditions
    - Efficient listening test management in listening labs (e.g. 6 or 8 persons in a listening group)
    - Application of standardized test and evaluation procedures
  Disadvantages:
    - Subjects are not actively involved in the conversation
    - Speech processing requires measurement and recording equipment
    - Preparation of a third-party listening test is more time consuming than for a conversational or double talk test

3 General considerations

Unless otherwise noted, the general considerations described in this clause apply to each of the test methods described in clauses 4-6.

3.1 Hands-free parameters to evaluate

Evaluating a specific set of speech quality aspects requires different levels of experience from the subjects who conduct a dedicated test procedure. Table 2 provides the parameters to be evaluated by different levels of experience of the
subjects (see 3.3.1 and 3.3.2).

Table 2/P.832 - Parameters to be evaluated by different levels of experience of the subjects

Conversational tests
  Parameters:
    - Overall quality
    - Difficulties in talking or hearing
    - Dialogue capability
    - Speech sound quality
    - Transmission of background noise
    - Variations of loudness during single and/or double talk
    - Impairments caused by echoes during single and/or double talk
  Types of subjects:
    - Untrained subjects: Evaluation of overall impressions (see clause 4), typically only a few (overall) parameters can be judged at one time.
    - Experienced subjects: More detailed evaluation (see clause 4).

Double talk tests
  Parameters:
    - Overall speech quality
    - Speech sound quality
    - Dialogue capability
    - Transmission of background noise
    - Completeness of speech transmission
    - Variations of loudness during single and/or double talk
    - Impairments caused by echoes during single and/or double talk
  Types of subjects:
    - Untrained subjects: Ratings typical for the average telephone user.
    - Experienced subjects: Detailed information about individual degradations.

Table 2/P.832 Parameters to be evaluated by different levels of experience of the subjects (concluded) Para