INTERNATIONAL TELECOMMUNICATION UNION

HANDBOOK
SUBJECTIVE ASSESSMENT METHODOLOGY IN TELEVISION

RADIOCOMMUNICATION BUREAU
Geneva, 1996

THE RADIOCOMMUNICATION SECTOR OF THE ITU

The role of the Radiocommunication Sector is to ensure the rational, equitable, efficient and economical use of the radio-frequency spectrum by all radiocommunication services, including satellite services, and to carry out studies without limit of frequency range on the basis of which Recommendations are adopted.

The regulatory and policy functions of the Radiocommunication Sector are performed by World and Regional Radiocommunication Conferences and Radiocommunication Assemblies supported by Study Groups.

Contact address for inquiries about radiocommunication matters:

ITU Radiocommunication Bureau
Place des Nations
CH-1211 Geneva 20
Switzerland
Telephone: +41 22 730 5800
Fax: +41 22 730 5785
Internet: brmail@itu.ch
X.400: S=brmail; P=itu; A=400net; C=ch

Contact address for orders of ITU publications:

ITU Sales and Marketing Service
Place des Nations
CH-1211 Geneva 20
Switzerland
Telephone: +41 22 730 6141 (English)
Telephone: +41 22 730 6142 (French)
Telephone: +41 22 730 6143 (Spanish)
Fax: +41 22 730 5194
Telex: 421 000 uit ch
Telegram: ITU GENEVE
Internet: sales@itu.ch
X.400: S=sales; P=itu; A=400net; C=ch

© ITU 1996

All rights reserved. No part of this publication may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without written permission from the ITU.
ITU-R HANDBOOK ON SUBJECTIVE ASSESSMENT METHODOLOGY IN TELEVISION (1996)

TABLE OF CONTENTS

PART 1 - General methods, viewing conditions, and data processing methods
1. Introduction
2. General methods of assessment
3. Analysis and presentation of results
4. Experimental results

PART 2 - Application of subjective evaluation methods to particular types of television systems
Assessment of high-definition television
Assessment of digital coding systems
Assessment of alphanumeric and graphic systems

PART 1
CHAPTER 1

1 General methods, viewing conditions, and data processing methods

1.1 Introduction

As readers will find stated in many ways in this handbook, subjective evaluations of picture quality are an unavoidable part of the research and development of television systems. There is no other way to reliably compare and evaluate television systems.

A great wealth of know-how and experience has been developed throughout the world in subjective assessment methodology. The single worldwide forum at which scientists and engineers can discuss their work and agree common methods is the ITU Radiocommunication Sector, Study Group 11, Working Party 11E. This volume has been assembled by members of 11E.

As in all fields of science and technology, the state of the art in subjective evaluation methodology continues to advance. Nevertheless, an attempt has been made in this handbook to assemble material which will have a usefully long lifetime and relevance.

The structure of the handbook is in two parts. The text moves from the general to the particular. In Part 1, general methods of performing subjective assessments are described. These techniques apply whatever quality window is being evaluated (low definition, conventional definition, enhanced definition, high definition) and whatever technology is used for signal processing (analogue or digital). In the following chapter of Part 1, methods for processing results are outlined. Once again, these are largely generic tools which can be generally applied. Finally, the last chapter of Part 1 gives some experimental results of conventional quality subjective evaluations, which have been used, for example, for defining objective measurement methods such as test signals. This is not, strictly speaking, a generic chapter but, as will be seen, this material does not directly concern the topics covered in Part 2 of the handbook, and therefore this chapter finds a more comfortable home in Part 1.

In Part 2, application-specific elements are considered. Three particular applications are given: digital systems, high-definition television, and alphanumeric system evaluations.

A handbook like this can only be part of the story. Television systems continue to evolve, and bring new challenges for the science of picture quality evaluation. However, whatever turn events take, much of the material in this handbook will form the basis for the evaluations. We hope you find it useful.
David Wood
Special Rapporteur, WP 11E

PART 1
CHAPTER 2

2 General methods of assessment

The goal of subjective testing is to establish, by empirical means, a basis for informed decision-making in television design and maintenance. As such, it is essential that the methods and measures used yield results that are both valid (representative of opinions during normal viewing) and reliable (repeatable across viewers and occasions). It should be noted that reliability does not imply validity.

The design of experiments has been well considered and documented; the amount of data which needs to be collected depends upon such interrelated factors as the confidence level which is needed in the answer, the standard deviation in the measurements, and the relative magnitude of the effect which it is required to detect.

Although the purpose of the study constrains the choice of method and of judgement criterion, it may not be obvious which judgement criterion should be measured. In general, if an experimenter is causing different amounts of degradation to a picture, the difference between the original (unimpaired) picture and the impaired one is the relevant criterion and, therefore, an impairment scale should be used. Conversely, if an experimenter is not causing a picture to be degraded (e.g. assessments of different scanning algorithms), there is no unimpaired reference, and a quality scale is
the relevant criterion. Nevertheless, it is not inappropriate to use a quality scale when assessing impairments to a picture. Here it is a matter of the question being asked: how annoying? which is better? how much better? The question being asked will often determine which scale or method is best suited to the problem. The following sections summarise recommended methods and the principles of their use.

2.1 Common features

Some basic features are common to all subjective procedures when they are used to measure television picture quality. These are described below, although possible variations on these outlines are reported in the detailed descriptions of the procedures.

2.1.1 Viewing conditions

The assessors' viewing conditions should be arranged as follows:

For conventional television:

General conditions

  Ratio of viewing distance to picture height: 4H and 6H
  Peak luminance: 70 cd/m²
  Ratio of luminance of inactive tube screen to peak luminance: ≤ 0.02
  Ratio of the luminance of the screen, when displaying only black level in a completely dark room, to that corresponding to peak white: approx. 0.01
  Ratio of luminance of background behind picture monitor to peak luminance of picture: approx. 0.15
  Other room illumination: low
  Chromaticity of background: D65
  Ratio of solid angle subtended by that part of the background which satisfies this specification to that subtended by the picture: ≥ 9

Special conditions

  Typical number of assessors at 4H per monitor: 2 (for half of the sessions), 3 (for the other half)
  Typical number of assessors at 6H per monitor: as above
  Monitor: high quality, 22"-26" screen size (50 cm-60 cm)
  Display brightness and contrast: set up via PLUGE signal
  Typical number of assessors per monitor: 5 (2 at 4H and 3 at 6H for the first session, 3 at 4H and 2 at 6H for the next session, and so on)
  Nature of viewing room(s): a room with 3 sides draped in white and the 4th side (rear) draped in grey

6H is the preferred distance for assessments of conventional systems (625/50, 525/60). Using assessors at 4H is also acceptable, provided either that the results are given separately or that there is clearly no significant difference in the means obtained. Where more than one viewing room is used, monitors should be carefully matched.

For high-definition television:

Condition, item and value (1)

  a  Ratio of viewing distance to picture height: 3
  b  Peak luminance on the screen (cd/m²) (2): 150-250
  c  Ratio of luminance of inactive tube screen (beams cut off) to peak luminance (3): ≤ 0.02
  d  Ratio of the luminance of the screen, when displaying only black level in a completely dark room, to that corresponding to peak white (4): approximately 0.01
  e  Ratio of luminance of background behind picture monitor to peak luminance of picture: approximately 0.15
  f  Illumination from other sources (5): low
  g  Chromaticity of background: D65
  h  Angle subtended by that part of the background which satisfies the specification above (6); this should be preserved for all observers: 53° high × 83° wide
  i  Arrangement of observers: within ±30° horizontally from the centre of the display; the vertical limit is under study
  j  Display size (7): 1.4 m (55 in)

(1) Values b and j are derived from former ITU-R definitions of HDTV. As it may not currently be possible to achieve these conditions fully for tests, alternative values are given on an interim basis. It should be recognised, however, that the results of tests conducted under the interim conditions may not, in general, be comparable with those obtained in situations in which former ITU-R definitions apply.
(2) Peak luminance on the screen corresponding to the video signal with 100% amplitude. Values ≥ 70 cd/m² should be used until the specified level becomes technically feasible.
(3) This item could be influenced by the room illumination, as well as the contrast range of the display.
(4) Black level corresponds to the video signal with 0% amplitude.
(5) Room illumination should be set in order to make it possible to satisfy the conditions c and e.
(6) A minimum of 28° high × 48° wide is recommended.
(7) Values ≥ 76.2 cm (30 in) should be used if displays of the specified size are not available.

Note: the relative quality appropriate to LDTV, SDTV, EDTV, and HDTV. An ITU-R Recommendation has also been prepared on the subject of Enhanced Television and on

2.1.2 Source signals

The source signal provides the reference picture directly, and the input for the system under test. It should be of optimum quality for the television standard used. The absence of defects (at the viewing distance used) in the reference part of the presentation pair in double-stimulus tests is crucial to obtaining stable results.

Digitally stored pictures and sequences are the most reproducible source signals, and these are therefore the preferred type. They can be exchanged between laboratories, to make system comparisons more meaningful. The D-1 4:2:2 tape format (Recommendation ITU-R BR.657) should provide a basis for the exchange of conventional quality source pictures and sequences when such machines are widely and economically available. Computer tape formats are also possible.

Currently 35 mm slide-scanners provide a preferred source for still pictures. The resolution available is adequate for evaluation of conventional television. The colorimetry and other characteristics of film
give a subjective appearance different from that of studio camera pictures. If this affects the results, direct studio sources should be used, although this is often much less convenient. As a general rule, slide-scanners should be adjusted picture by picture for best possible subjective picture quality, since this would be the situation in practice in programme production.

Assessments of downstream processing capacity are often made with colour-matte. In studio operations, colour-matte is very sensitive to studio lighting. Assessments should therefore preferably use a special colour-matte slide pair, which will consistently give high-quality results. Movement can be introduced into the foreground slide if needed.

2.1.3 Selection of test material

Some test parameters may give rise to a similar magnitude of impairments for most pictures or sequences. In such cases, results obtained with a small number of pictures or sequences (e.g. two) may still provide a meaningful evaluation of a system.

However, parameters for bandwidth compression systems frequently have an impact which depends heavily on the scene or sequence content. In such cases, there will be, for the totality of programme hours, a
statistical distribution of impairment probability and picture or sequence content. When the form of this distribution is unknown, as is often the case, the selection of test material and the interpretation of results must be done very carefully.

In general, it is essential to include critical material in subjective evaluations, because it is possible to take this into account when interpreting results, but it is not possible to extrapolate from non-critical material. In cases where scene or sequence content affects results, the material should be chosen to be "critical but not unduly so" for the system under test. The phrase "not unduly so" implies that the pictures could still conceivably form part of normal programme hours. At least four items should be used in such cases: for example, half of which are definitely critical, and half of which are moderately critical.

A number of
organisations have developed test pictures and sequences. It is hoped to organise these in the framework of the ITU-R in the future. A 4:2:2 D-1 test tape containing stills and moving pictures is available from the EBU. The ITU-R has proposed material for assessing digital systems where bit-rate reduction is applied to Recommendation ITU-R BT.601 signals.

The evaluation of many systems needs to include the capacity for various downstream processing operations, such as colour-matte. In such cases, the colour-matte system needs to be included in both the direct and test system signal paths. These signals can then be included in the assessment presentations. With this method it is important, however, to avoid reference pictures or sequences which are in themselves impaired. If it is of interest to evaluate the additional deterioration caused to an already impaired picture, both should be used as test sequences.

2.1.4 Observers

At least 15 observers should be used. They should be non-expert, in the sense that they are not directly concerned with television picture quality as part of their normal work