Recommendation ITU-R BT.2021-1
(02/2015)

Subjective methods for the assessment of stereoscopic 3DTV systems

BT Series
Broadcasting service (television)

Foreword

The role of the Radiocommunication Sector is to ensure the rational, equitable, efficient and economical use of the radio-frequency spectrum by all radiocommunication services, including satellite services, and carry out studies without limit of frequency range on the basis of which Recommendations are adopted.

The regulatory and policy functions of the Radiocommunication Sector are performed by World and Regional Radiocommunication Conferences and Radiocommunication Assemblies supported by Study Groups.

Policy on Intellectual Property Right (IPR)

ITU-R policy on IPR is described in the Common Patent Policy for ITU-T/ITU-R/ISO/IEC referenced in Annex 1 of Resolution ITU-R 1. Forms to be used for the submission of patent statements and licensing declarations by patent holders are available from http://www.itu.int/ITU-R/go/patents/en where the Guidelines for Implementation of the Common Patent Policy for ITU-T/ITU-R/ISO/IEC and the ITU-R patent information database can also be found.

Series of ITU-R Recommendations
(Also available online at http://www.itu.int/publ/R-REC/en)

Series  Title
BO      Satellite delivery
BR      Recording for production, archival and play-out; film for television
BS      Broadcasting service (sound)
BT      Broadcasting service (television)
F       Fixed service
M       Mobile, radiodetermination, amateur and related satellite services
P       Radiowave propagation
RA      Radio astronomy
RS      Remote sensing systems
S       Fixed-satellite service
SA      Space applications and meteorology
SF      Frequency sharing and coordination between fixed-satellite and fixed service systems
SM      Spectrum management
SNG     Satellite news gathering
TF      Time signals and frequency standards emissions
V       Vocabulary and related subjects

Note: This ITU-R Recommendation was approved in English under the procedure detailed in Resolution ITU-R 1.

Electronic Publication
Geneva, 2015

© ITU 2015

All rights reserved. No part of this publication may be
reproduced, by any means whatsoever, without written permission of ITU.

RECOMMENDATION ITU-R BT.2021-1

Subjective methods for the assessment of stereoscopic 3DTV systems

(2012-2015)

Scope

This Recommendation provides methodologies for the assessment of stereoscopic 3DTV systems, including general test methods, the grading scales and the viewing conditions.

The ITU Radiocommunication Assembly,

considering

a) that a large amount of information has been collected about the methods used in various laboratories for the assessment of critical performance characteristics of 3DTV systems;
b) that examination of these methods shows that there exists a considerable measure of agreement between the different laboratories about a number of aspects of the tests;
c) that the adoption of standardized methods is of importance in the exchange of information between various laboratories;
d) that the introduction of 3DTV services might require the development of new image formats, image processing and transmission techniques, whose performance will need to be evaluated through subjective methodologies,

recommends

1 that the general methods of test, the grading scales and the viewing conditions for the assessment of stereoscopic 3DTV picture quality, described in Annex 1, should be used for laboratory experiments and, whenever possible, for operational assessments.

Annex 1

1 Assessment (perceptual) dimensions

Stereoscopic 3DTV exploits the characteristics of the human binocular visual system by recreating the conditions that bring about the perception of the relative depth of objects in the visual scene. The main requirement of current stereoscopic imaging is the capture of at least two views of the same scene from two horizontally aligned cameras. The images of the objects depicted in the scene will have different relative positions in the left and right views. This difference in relative positions in the two views is typically called image disparity (or parallax), and it is usually expressed in pixels, physical distances (e.g. mm), or relative measures (e.g. percentage of screen width).

Image disparity should be distinguished from angular (retinal) disparity: the same image disparity produces different angular (retinal) disparities at different viewing distances. The magnitude and direction of the perception of depth
are based on the magnitude and direction of the retinal disparities elicited by the stereoscopic image.

Assessment factors generally applied to monoscopic television pictures, such as resolution, colour rendition, motion portrayal, overall quality and sharpness, could be applied to stereoscopic television systems as well. In addition, there are many factors peculiar to stereoscopic television systems. These might include depth resolution (the spatial resolution in the depth direction), depth motion (whether motion along the depth direction is reproduced smoothly) and spatial distortions. Two well-known examples of the latter are the puppet theatre effect, in which objects are perceived as unnaturally large or small, and the cardboard effect, in which objects are perceived stereoscopically but appear unnaturally thin.

Three basic perceptual dimensions collectively affect the quality of experience provided by a stereoscopic system: picture quality, depth quality, and visual comfort. Some researchers have argued that the psychological impact of stereoscopic imaging technologies might also be measured in terms of more general concepts such as naturalness and sense of presence.

Primary perceptual dimensions

Picture quality refers to the perceived quality of the picture provided by the system. This is a main determinant of the performance of a video system. Picture quality is mainly affected by technical parameters and errors introduced by, for example, encoding and/or transmission processes.

Depth quality refers to the ability of the system to deliver an enhanced sensation of depth. The presence of monocular cues, such as linear perspective, blur and gradients, conveys some sensation of depth even in standard 2D images. However, stereoscopic 3D images also contain disparity information, which provides additional depth information and thus an enhanced sense of depth compared to 2D.

Visual (dis)comfort refers to the subjective sensation of (dis)comfort that can be associated
with the viewing of stereoscopic images. Improperly captured or improperly displayed stereoscopic images could be a serious source of discomfort.

Additional perceptual dimensions

Naturalness refers to the perception of the stereoscopic image as being a truthful representation of reality (i.e. perceptual realism). The stereoscopic image may present different types of distortions which make it less natural. For example, stereoscopic objects are sometimes perceived as unnaturally large or small (puppet theatre effect), or they appear unnaturally thin (cardboard effect).

Sense of presence refers to the subjective experience of being in one place or environment even when one is situated in another.

This Recommendation presents information regarding methods and procedures for the assessment of the three primary dimensions outlined above: picture quality, depth quality and visual comfort. Methodologies for the assessment of naturalness and sense of presence are not included in the present Recommendation, but they are planned for inclusion at a later stage.

2 Subjective methodologies

Recommendation ITU-R BT.500 outlines numerous methodologies for the assessment of picture quality. In all methods, a set of video sequences that have been processed with the systems under investigation (e.g. an algorithm with different parameters, an encoding technology at different bit rates, or different transmission scenarios) is shown to a panel of viewers in a series of judgement trials. In each trial, the viewers are asked to assess a relevant characteristic (e.g. picture quality) of the video sequence(s) using a prescribed scale. The various methods differ from one another mostly in terms of the mode of presentation, i.e. the way the video sequences are presented to the viewers, and
the scale used by the viewers to rate those sequences.

The test images are binocular stereo images selected on the basis of the items described in § 4. The assessors assess the following three items:

- picture quality: the effect on resolution of stereoscopic 3D images by a system having a path between test images and the monitor used for displaying the images to be assessed;
- depth quality: the effect on depth perception with respect to stereoscopic 3D images by a system having a path between test images and the monitor used for displaying the images to be assessed;
- visual comfort: the effect on ease-of-viewing with respect to stereoscopic 3D images by a system having a path between test images and the monitor used for displaying the images to be assessed.

This Recommendation includes six methods from Recommendation ITU-R BT.500; these methods have been successfully used in the last two decades to address relevant research issues related to the picture quality, depth quality and visual comfort of stereoscopic imaging technologies. The methods are:

- the single-stimulus (SS) method;
- the double-stimulus impairment scale (DSIS) method;
- the double-stimulus continuous quality scale (DSCQS) method;
- the stimulus-comparison (SC) method;
- the single-stimulus continuous quality evaluation (SSCQE) method;
- the simultaneous double stimulus for continuous evaluation (SDSCE) method.

When appropriate, the methods have been used in a slightly modified form, e.g. with different scales for visual comfort. The mode of presentation and scales associated with each method for the assessment of the picture quality, depth quality and visual comfort are summarized in Tables 1, 2 and 3, respectively. A short description of each methodology is presented next in this section. Methodological elements which are common to all methods are presented in the following sections.

2.1 Single stimulus (SS) method

The procedure consists of a series of judgement trials which might be divided, when appropriate, into several test sessions separated by breaks. In each trial, only one “Test” video sequence, i.e. a sequence that has been processed with a system under investigation, is presented and rated independently on the prescribed scale.

2.1.1 Trial structure of the SS method

In each trial, the presentation of the “Test” video sequence to be assessed is preceded and followed by the presentation of a mid-grey field. The preceding mid-grey field may contain a fixation target, e.g. the trial number, at zero disparity and should last 3 s. The following mid-grey field may contain a reminder to rate, e.g. the words “vote now”, and should last long enough for the viewer to provide a rating (e.g.
10 s). The duration of the “Test” video sequence should generally be around 10 s¹. The structure of a typical SS trial is shown in Fig. 1.

¹ Some researchers have advocated the use of sequences of longer duration, mostly based on the assumption that the full appreciation of stereoscopic content takes a longer time than the appreciation of normal monoscopic (2D) content. To date, there is little empirical evidence for or against such a claim.

2.1.2 Grading scales of the SS method

For picture quality assessment, two labeled scales can be used: the discrete five-grade scale and the standard ITU continuous quality scale (see Table 1). The quality labels are “Excellent”, “Good”, “Fair”, “Poor” and “Bad”. The same scales can be used for depth quality assessment (see Table 2). In this case, the viewers are asked to assess the quality of the depth representation rather than the quality of the picture itself.

For the assessment of visual comfort, two labeled scales can be used: a discrete five-grade scale and a continuous comfort scale (see Table 3). The comfort labels are “Very comfortable”, “Comfortable”, “Mildly uncomfortable”, “Uncomfortable”, and “Extremely uncomfortable”.

2.1.3 Opinion score data of the SS method

The rating provided for each sequence under examination is termed the “opinion score”. The mean of such scores, generally obtained for each system under investigation, is termed the mean opinion score (MOS). The “Reference” video sequences, which
are versions of the test sequences that have not undergone any processing (see § 8), may be included in the sequence set. The inclusion of the “Reference” allows computing the “difference opinion score”, which is the arithmetic difference between the ratings given to the “Test” and “Reference” versions of each sequence in the study. The mean of the difference opinion scores obtained for each system under investigation is termed the difference mean opinion score (DMOS).

FIGURE 1
Single stimulus method: trial structure
[Figure BT.2021-01: timeline of one trial. Grey field showing “Trial #” (3 s), sequence under test (10 s), grey field showing “Vote” (10 s).]

2.2 The double-stimulus impairment scale (DSIS) method (the EBU method)

The double-stimulus (EBU) method is cyclic in that the assessor is first presented with an unimpaired reference, then with the same picture impaired. Following this, the assessor is asked to vote on the second, keeping in mind the first. The assessor is presented with a series of pictures or sequences in random order, with random impairments covering all required combinations, in sessions that last up to half an hour. The unimpaired picture is included in the pictures or sequences to be assessed.
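The requirements just stated (random order, all required combinations of picture and impairment, the unimpaired picture included, sessions of at most half an hour) can be turned into a concrete session plan. The Python sketch below is one possible way to do so and is not part of this Recommendation. The function name `plan_dsis_sessions`, the label `"reference"` for the unimpaired condition, the content and condition names, and the fixed per-presentation duration are all assumptions made for illustration; only the 30-minute limit comes from the text.

```python
import itertools
import random


def plan_dsis_sessions(sequences, impairments, presentation_s,
                       max_session_s=30 * 60, seed=None):
    """Return a list of sessions; each session is a list of (sequence, condition).

    Assumptions (not from the Recommendation): every presentation takes
    `presentation_s` seconds, and the unimpaired picture shown in the test
    slot is listed as the extra condition "reference".
    """
    if presentation_s <= 0 or presentation_s > max_session_s:
        raise ValueError("presentation_s must be in (0, max_session_s]")

    # All required combinations, plus the unimpaired picture for each sequence.
    conditions = ["reference"] + list(impairments)
    trials = list(itertools.product(sequences, conditions))

    # Random presentation order; a seed makes the plan reproducible.
    random.Random(seed).shuffle(trials)

    # Cut the shuffled list into sessions that fit within the time limit.
    per_session = int(max_session_s // presentation_s)
    return [trials[i:i + per_session]
            for i in range(0, len(trials), per_session)]


sessions = plan_dsis_sessions(
    sequences=["seq_A", "seq_B", "seq_C"],        # hypothetical content names
    impairments=["codec_2Mbps", "codec_4Mbps"],   # hypothetical test conditions
    presentation_s=40,                            # assumed length of one presentation
    seed=1,
)
```

A single shuffle followed by fixed-size cuts keeps the sketch short. A real test plan may need further constraints, for example avoiding the same sequence in consecutive presentations, which this sketch does not attempt.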
The mean score for each test condition and test picture is calculated at the end of the series of sessions.

The method uses an impairment scale, in which the stability of results is usually greater for smaller impairments than for larger ones. Although the method has sometimes been used with limited ranges of impairments, it is more appropriately used with a full range of impairments.

The generalized arrangement for the test system should be that shown in Fig. 2.

2.2.1 Presentation of the test material

A test session comprises a number of presentations. There are two variants to the structure of presentations, I and II, outlined below.

Variant I: The reference picture or sequence and the test picture or sequence are presented only once, as shown in Fig. 3a).
Variant II: The reference picture or sequence and the test picture or sequence are presented twice, as shown in Fig. 3b).

Variant II, which is more time consuming than variant I, may be applied if the discrimination of very small impairments is required or moving sequences are under test.

FIGURE 2
General arrangement for test system for DSIS method
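The score summaries named in this section are simple averages: the mean score per test condition and test picture (§ 2.2), and the MOS and DMOS of § 2.1.3. The Python sketch below illustrates them under stated assumptions. Ratings are coded numerically (for example 5 for the top grade down to 1 for the bottom grade, a coding this section does not prescribe), the tuple layout and function names are hypothetical, and the difference opinion score is taken as Test minus Reference, a sign convention the text leaves open.

```python
from collections import defaultdict
from statistics import mean


def mean_scores(ratings):
    """Mean score for each (condition, sequence) pair.

    `ratings` is an iterable of (viewer, condition, sequence, score) tuples;
    this layout is an assumption made for the example.
    """
    groups = defaultdict(list)
    for _viewer, condition, sequence, score in ratings:
        groups[(condition, sequence)].append(score)
    return {key: mean(scores) for key, scores in groups.items()}


def mos_and_dmos(ratings, reference="reference"):
    """Return ({condition: MOS}, {condition: DMOS}).

    The difference opinion score is computed per viewer and per sequence as
    Test minus Reference (assumed sign convention); DMOS is its mean.
    """
    by_key = {(v, c, s): score for v, c, s, score in ratings}
    scores = defaultdict(list)
    diffs = defaultdict(list)
    for (viewer, condition, sequence), score in by_key.items():
        scores[condition].append(score)
        ref = by_key.get((viewer, reference, sequence))
        if condition != reference and ref is not None:
            diffs[condition].append(score - ref)
    mos = {c: mean(v) for c, v in scores.items()}
    dmos = {c: mean(v) for c, v in diffs.items()}
    return mos, dmos


# Hypothetical ratings from two viewers on one sequence.
ratings = [
    ("v1", "reference", "seq_A", 5), ("v1", "codec_2Mbps", "seq_A", 3),
    ("v2", "reference", "seq_A", 4), ("v2", "codec_2Mbps", "seq_A", 3),
]
per_pair = mean_scores(ratings)
mos, dmos = mos_and_dmos(ratings)
```

Pairing each Test rating with the same viewer's Reference rating for the same sequence follows the definition of the difference opinion score in § 2.1.3. Ratings without a matching Reference are left out of the DMOS rather than guessed.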