International Telecommunication Union

ITU-T P.913
TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU
(03/2016)

SERIES P: TERMINALS AND SUBJECTIVE AND OBJECTIVE ASSESSMENT METHODS
Audiovisual quality in multimedia services

Methods for the subjective assessment of video quality, audio quality and audiovisual quality of Internet video and distribution quality television in any environment

Recommendation ITU-T P.913

ITU-T P-SERIES RECOMMENDATIONS
TERMINALS AND SUBJECTIVE AND OBJECTIVE ASSESSMENT METHODS

Series P.10          Vocabulary and effects of transmission parameters on customer opinion of transmission quality
Series P.30, P.300   Voice terminal characteristics
Series P.40          Reference systems
Series P.50, P.500   Objective measuring apparatus
Series P.60          Objective electro-acoustical measurements
Series P.70          Measurements related to speech loudness
Series P.80, P.800   Methods for objective and subjective assessment of speech quality
Series P.900         Audiovisual quality in multimedia services
Series P.1000        Transmission performance and QoS aspects of IP end-points
Series P.1100        Communications involving vehicles
Series P.1200        Models and tools for quality assessment of streamed media
Series P.1300        Telemeeting assessment
Series P.1400        Statistical analysis, evaluation and reporting guidelines of quality measurements
Series P.1500        Methods for objective and subjective assessment of quality of services other than voice services

For further details, please refer to the list of ITU-T Recommendations.

Recommendation ITU-T P.913

Methods for the subjective assessment of video quality, audio quality and audiovisual quality of Internet video and distribution quality television in any environment

Summary

Recommendation ITU-T P.913 describes non-interactive subjective assessment methods for evaluating the one-way overall video quality, audio quality or audiovisual quality for applications such as Internet video and distribution quality video. These methods can be used for several different purposes including, but not limited to, comparing the quality of multiple devices, comparing the performance of a device in multiple environments, and subjective assessment where the quality impact of the device and the audiovisual material is confounded.

History

Edition  Recommendation  Approval    Study Group  Unique ID*
1.0      ITU-T P.913     2014-01-13  9            11.1002/1000/12106
2.0      ITU-T P.913     2016-03-15  9            11.1002/1000/12775

* To access the Recommendation, type the URL http://handle.itu.int/ in the address field of your web browser, followed by the Recommendation's unique ID. For example, http://handle.itu.int/11.1002/1000/11830-en.

FOREWORD

The International Telecommunication Union (ITU) is the United Nations specialized agency in the field of telecommunications, information and communication technologies (ICTs). The ITU Telecommunication Standardization Sector (ITU-T) is a permanent organ of ITU. ITU-T is responsible for studying technical, operating and tariff questions and issuing Recommendations on them with a view to standardizing telecommunications on a worldwide basis.

The World Telecommunication Standardization Assembly (WTSA), which meets every four years, establishes the topics for study by the ITU-T study groups which, in turn, produce Recommendations on these topics.

The approval of ITU-T Recommendations is covered by the procedure laid down in WTSA Resolution 1.

In some areas of information technology which fall within ITU-T's purview, the necessary standards are prepared on a collaborative basis with ISO and IEC.

NOTE: In this Recommendation, the expression “Administration” is used for conciseness to indicate both a telecommunication administration and a recognized operating agency.

Compliance with this Recommendation is voluntary. However, the Recommendation may contain certain mandatory provisions (to ensure, e.g., interoperability or applicability) and compliance with the Recommendation is achieved when all of these mandatory provisions are met. The words “shall” or some other obligatory language such as “must” and the negative equivalents are used to express requirements. The use of such
words does not suggest that compliance with the Recommendation is required of any party.

INTELLECTUAL PROPERTY RIGHTS

ITU draws attention to the possibility that the practice or implementation of this Recommendation may involve the use of a claimed Intellectual Property Right. ITU takes no position concerning the evidence, validity or applicability of claimed Intellectual Property Rights, whether asserted by ITU members or others outside of the Recommendation development process.

As of the date of approval of this Recommendation, ITU had not received notice of intellectual property, protected by patents, which may be required to implement this Recommendation. However, implementers are cautioned that this may not represent the latest information and are therefore strongly urged to consult the TSB patent database at http://www.itu.int/ITU-T/ipr/.

© ITU 2016

All rights reserved. No part of this
publication may be reproduced, by any means whatsoever, without the prior written permission of ITU.

Table of Contents

1 Scope
    1.1 Limitations
2 References
3 Definitions
    3.1 Terms defined elsewhere
    3.2 Terms defined in this Recommendation
4 Abbreviations and acronyms
5 Conventions
6 Source stimuli
    6.1 Source signal recordings
    6.2 Video considerations
    6.3 Audio considerations
    6.4 Audiovisual considerations
    6.5 Duration of stimuli
    6.6 Number of source stimuli
7 Test methods, rating scales and allowed changes
    7.1 List of methods
    7.2 Acceptable changes to the methods
    7.3 Discouraged but acceptable changes to the methods
8 Environment
    8.1 Controlled environment
    8.2 Public environment
    8.3 Viewing distance
9 Subjects
    9.1 Number of subjects
    9.2 Subject population
    9.3 Sampling subjects
    9.4 Sampling techniques
10 Experimental design
    10.1 Size of the experiment and subject fatigue
    10.2 Special considerations for transmission error, rebuffering and audiovisual synchronization impairments
    10.3 Special considerations for longer stalling events
    10.4 Pre-pilot testing and pilot testing
    10.5 Study design
11 Experiment implementation
    11.1 Informed consent
    11.2 Overview of subject screening
    11.3 Optional pre-screening of subjects
    11.4 Post-screening of subjects
    11.5 Instructions and training
    11.6 Study duration, sessions and breaks
    11.7 Stimuli play mechanism
    11.8 Voting
    11.9 Questionnaire or interview
12 Data analysis
    12.1 Documenting the experiment
    12.2 Calculate MOS or DMOS
    12.3 Evaluating objective metrics
    12.4 Significance testing, subject bias and standard deviation of scores
    12.5 Ratings from multiple laboratories
13 Elements of subjective test reporting
    13.1 Documenting the test design
    13.2 Documenting the subjective testing
    13.3 Data analysis
    13.4 Additional information
Annex A Method for post-experimental screening of subjects using Pearson linear correlation
    A.1 Screen by PVS
    A.2 Screen by PVS and HRC
Appendix I Sample informed consent form
Appendix II Sample instructions
Bibliography

Introduction

[ITU-T P.910], [b-ITU-T P.911] and [ITU-R BT.500-13] have been successfully used for many years to perform video quality and audiovisual quality subjective assessments. These Recommendations were initially designed around the paradigm of a fixed video service that transmits video over a reliable link to an immobile cathode ray tube (CRT) television located in a quiet and non-distracting environment, such as a living room or office. These Recommendations have been updated and expanded as technology shifted, and they have proved to be valuable and useful for the displays and questions addressed in their original scopes.

However, the initial premise of these Recommendations does not include the new paradigms of Internet video and distribution quality television. One new paradigm of video watching is an on-demand video service transmitted over an unreliable link to a variety of mobile and immobile devices located in a distracting environment, using liquid crystal displays (LCDs) and other flat-screen devices. This new paradigm impacts key characteristics of the subjective test, such as the viewing environment, the listening environment and the questions to be answered.

Users of Internet video and distribution quality television
are moving from one device to another and from one environment to another throughout the day, perhaps even observing the same video using multiple devices. For example, someone might start watching a sporting event on their computer using Internet protocol television (IPTV), move to an over-the-air broadcast in their living room when the IPTV connection displays a rebuffering event and then switch to a mobile Internet device (MID) or even a smart phone when leaving the house.

Thus, subjective quality assessments into Internet video and distribution quality television pose unique questions that are not considered in the existing Recommendations. These questions may require situation-specific modifications to the subjective scale (e.g., presentation of additional information defining what “good” means in this context).

Consider the pristine viewing environment defined by [ITU-R BT.500-13], with its exact lighting conditions and non-distracting walls. The intention is to remove the impact of the viewing and listening environment from the experiment. For some subjective audiovisual quality experiments, this is not appropriate. First, consider an experiment that investigates the quality of service observed by video-conferencing users in an office with fluorescent lights and the steady hum of a computer. Second, consider an experiment that analyses a communications device for emergency personnel. A highly distracting background may be a critical element of the experimental design (e.g., to simulate video watched inside a moving fire truck with sirens blaring). The impact of environment is an integral part of these experiments.

These questions and environments cannot be accommodated with the existing subjective assessment Recommendations. Modifying these Recommendations would reduce the value of the intended experiments and paradigms addressed therein.

The main differences in this Recommendation when compared to existing ITU subjective assessment Recommendations are:
1) inclusion of multiple testing environment options (e.g., pristine laboratory environment, simulated office within a laboratory, public environment);
2) flexibility for the user to modify the subjective scale (e.g., modified words, added information);
3) applicability for interaction effects that confound the data (e.g., evaluating a device that can only accept compressed material, impact of mobility on quality perception);
4) mandatory reporting requirement (e.g., choices made where this Recommendation allows for flexibility, experimental variables that cannot be separated due to the experiment design); and
5) inclusion of multiple display technologies (e.g., flat screen, 2D, 3D).

Recommendation ITU-T P.913

Methods for the subjective assessment of video quality, audio quality and audiovisual quality of Internet video and distribution quality television in any environment

1 Scope

This Recommendation describes methods to be used for subjective assessment of the audiovisual quality of Internet video and distribution quality television. This may include assessment of visual quality only, audio quality only or the overall audiovisual quality. This Recommendation may be used to compare audiovisual device performance in multiple environments and to compare the quality impact of multiple audiovisual devices. It is appropriate for subjective assessment of devices where the quality impact of the device and the material is confounded. It is appropriate for a wide variety of display technologies, including flat screen, 2D, 3D, multi-view and autostereoscopic. The devices
and usage scenarios of interest herein are Internet video and distribution quality television. The focus is on the quality perceived by the end user.

1.1 Limitations

This Recommendation does not address the specialized needs of broadcasters and contribution quality television.

This Recommendation is not intended for the evaluation of audio-only stimuli in isolation; it is intended for audiovisual subjective assessments that may or may not include audio-only sessions.

Caution should be taken when examining adaptive streaming impairments, due to the slow variations in quality within one stimulus over a long period of time.

2 References

The following ITU-T Recommendations and other references contain provisions which, through reference in this text, constitute provisions of this Recommendation. At the time of publication, the editions indicated were valid. All Recommendations and other references are subject to revision; users of this Recommendation are therefore encouraged to investigate the possibility of applying the most recent edition of the Recommendations and other references listed below. A list of the currently valid ITU-T Recommendations is regularly published. The reference to a
document within this Recommendation does not give it, as a stand-alone document, the status of a Recommendation.

[ITU-T J.340]      Recommendation ITU-T J.340 (2010), Reference algorithm for computing peak signal to noise ratio of a processed video sequence with compensation for constant spatial shifts, constant temporal shift, and constant luminance gain and offset.
[ITU-T P.78]       Recommendation ITU-T P.78 (1996), Subjective testing method for determination of loudness ratings in accordance with Recommendation P.76.
[ITU-T P.800]      Recommendation ITU-T P.800 (1996), Methods for subjective determination of transmission quality.
[ITU-T P.800.2]    Recommendation ITU-T P.800.2 (2013), Mean opinion score interpretation and reporting.
[ITU-T P.910]      Recommendation ITU-T P.910 (2008), Subjective video quality assessment methods for multimedia applications.
[ITU-T P.916]      Recommendation ITU-T P.916 (2016), Information and guidelines for assessing and minimizing visual discomfort and visual fatigue from 3D video.
[ITU-T P.1401]     Recommendation ITU-T P.1401 (2012), Methods, metrics and procedures for statistical evaluation, qualification and comparison of objective quality prediction models.
[ITU-R BS.1534-3]  Recommendation ITU-R BS.1534-3 (2015), Method for the subjective assessment of intermediate quality level of coding systems.
[ITU-R BT.500-13]  Recommendation ITU-R BT.500-13 (2012), Methodology for