INTERNATIONAL STANDARD ISO 20462-2
First edition 2005-11-01

Photography - Psychophysical experimental methods for estimating image quality - Part 2: Triplet comparison method

Photographie - Méthodes psychophysiques expérimentales pour estimer la qualité d'image - Partie 2: Méthode comparative du triplet

Reference number ISO 20462-2:2005(E)
© ISO 2005

PDF disclaimer

This PDF file may contain embedded typefaces. In accordance with Adobe's licensing policy, this file may be printed or viewed but shall not be edited unless the typefaces which are embedded are licensed to and installed on the computer performing the editing. In downloading this file, parties accept therein the responsibility of not infringing Adobe's licensing policy. The ISO Central Secretariat accepts no liability in this area.

Adobe is a trademark of Adobe Systems Incorporated.

Details of the software products used to create this PDF file can be found in the General Info relative to the file; the PDF-creation parameters were optimized for printing. Every care has been taken to ensure that the file is suitable for use by ISO member bodies. In the unlikely event that a problem relating to it is found, please inform the Central Secretariat at the address given below.

© ISO 2005

All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or ISO's member body in the country of the requester.

ISO copyright office
Case postale 56
CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org

Published in Switzerland
Contents

Foreword
Introduction
1 Scope
2 Terms and definitions
3 Two-step psychophysical method
4 Experimental procedure
4.1 Step 1
4.2 Step 2
Annex A (informative) Comparison between a paired comparison and a triplet comparison technique
Annex B (informative) Number of sample combinations for triplet comparison
Annex C (informative) Standard portrait images
Annex D (informative) Performance of the triplet comparison method
Annex E (informative) Scheffé's method
Annex F (informative) Conversion of Scheffé's scale to JND
Bibliography

Foreword

ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies (ISO member bodies). The work of preparing International Standards is normally carried out through ISO technical committees. Each member body interested in a subject for which a technical committee has been established has the right to be represented on that committee. International organizations, governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.

The main task of technical committees is to prepare International Standards. Draft International Standards adopted by the technical committees are circulated to the member bodies for voting. Publication as an International Standard requires approval by at least 75 % of the member bodies casting a vote.

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO shall not be held responsible for identifying any or all such patent rights.

ISO 20462-2 was prepared by Technical Committee ISO/TC 42, Photography.

ISO 20462 consists of the following parts, under the general title Photography - Psychophysical experimental methods for estimating image quality:

- Part 1: Overview of psychophysical elements
- Part 2: Triplet comparison method
- Part 3: Quality ruler method

Introduction

This part of ISO 20462 is necessary to provide a basis for visually assessing photographic image quality in a precise, repeatable and efficient manner. This part of ISO 20462 is needed in order to evaluate various test methods or image processing algorithms that may be used in other international and industry standards. For example, it should be used to perform subjective evaluation of exposure series images from digital cameras as part of the work needed for future revisions of ISO 12232.

The opportunities to create and observe images using different types of hard copy media and soft copy displays have increased significantly with advances in computer-based digital imaging technology.
As a result, there is a need to develop requirements for obtaining colour-appearance matches between images produced using various media and display technologies under a variety of viewing conditions. To develop the necessary requirements, organizations, including the CIE and the ICC, are developing methods to compensate for the effect of different viewing conditions, and to map colours optimally across disparate media having different colour gamuts. Such technical activities are often faced with the need to evaluate proposed methods or algorithms by visual assessment based on psychophysical experiments.

K.M. Braun et al. [1] examined five viewing techniques for cross-media image comparisons in terms of sensitivity of scaling, and mental and physical stress for the observers. CIE TC1-27 “Specification of Colour Appearance for Reflective Media and Self-Luminous Display Comparisons” proposed
guidelines for conducting psychophysical experiments for the evaluation of colorimetric and colour-appearance models [6]. Accordingly, for the design and evaluation of digital imaging systems, it is of great importance to develop a methodology for subjective visual assessment, so that reliable and stable results can be derived with minimum observer stress.

When performing a psychophysical experiment, it is highly desirable to obtain results that are precise and reproducible. In order to derive statistically reliable results, large numbers of observers are required and careful attention should be paid to the experimental setup. Multiple (repeated) assessments are also useful. Observer stress during the visual assessment process can adversely affect the results. The order of image presentation, and the types of questions or questionnaires addressed by the observers, can also affect the results.

Table 1 gives a comparison of three visual assessment techniques commonly used for image quality evaluation. The advantages of the category methods include low stress and high stability, since the observer's task is to assign each image to one of typically five or seven categories. However, their scaling within a category is less precise. One of the most common techniques for image quality assessment is the paired comparison method. This method is particularly suited to assessing image quality when precise scalability is required. However, a serious problem with the paired comparison method is
that the number of samples that can be examined is relatively limited. As the number of samples increases, the number of combinations becomes extensive. This causes excessive observer stress, which can affect the accuracy and repeatability of the results. The third method is magnitude estimation, commonly known as magnitude scaling. This method is extremely difficult to apply when the psychophysical experiments are conducted using ordinary (non-expert) observers to perform the image quality assessment.

Table 1 - Comparison of typical psychophysical experimental methods

Name of method         Scalability   Stability   Stress
Category               Low           High        Low
Magnitude estimation   Medium        Low         Medium
Paired comparison      High          High        High

G. Johnson et al. [3] have proposed “A sharpness rule”, where the magnitude of sharpness was analyzed in terms of resolution, contrast, noise and degree of sharpness-enhancement. Likewise, preferred skin colour may be considered not only from the viewpoint of chromaticity, but also with respect to the lightness, background and white point of the display media [4]. These examples show that image quality is not always evaluated by a single attribute, but may vary in combination with multiple attributes. In cases where a psychophysical experiment is designed for a new application, the experimenter may need to vary many attributes simultaneously during the course of the experiment. In these situations, the number of samples to be examined becomes excessively large, making it difficult to employ the paired comparison technique.

Photography - Psychophysical experimental methods for estimating image quality - Part 2: Triplet comparison method

1 Scope

This part of
ISO 20462 defines a standard psychophysical experimental method for subjective image quality assessment of soft copy and hard copy still picture images.

2 Terms and definitions

For the purposes of this document, the following terms and definitions apply.

2.1
just noticeable difference
JND
stimulus difference that would lead to a 75:25 proportion of responses in a paired comparison task

2.2
psychophysical experimental method
experimental technique for subjective evaluation of image quality or attributes thereof, from which stimulus differences in units of JNDs may be estimated
cf. categorical sort (2.5), paired comparison (2.3) and triplet comparison (2.4) methods

2.3
paired comparison method
psychophysical method involving the choice of which of two simultaneously presented stimuli exhibits greater or lesser image quality or an attribute thereof, in accordance with a set of instructions
given to the observer

NOTE Two limitations of the paired comparison method are as follows.

a) If all possible stimulus comparisons are done, as is usually the case, a large number of assessments are required for even modest numbers of experimental stimulus levels: if N levels are to be studied, N(N - 1)/2 paired comparisons are needed.

b) If a stimulus difference exceeds approximately 1,5 JNDs, the magnitude of the stimulus difference cannot be directly estimated reliably because the response saturates as the proportions approach unanimity. However, if a series of stimuli having no large gaps are assessed, the differences between more widely separated stimuli may be deduced indirectly by summing smaller, reliably determined (unsaturated) stimulus differences. The standard methods for transformation of paired comparison data to an interval scale (a scale linearly related to JNDs) perform statistically optimized procedures for inferring the stimulus differences, but they may yield unreliable results when too many of the stimulus differences are large enough (> 1,5 JNDs) that they produce saturated responses.

2.4
triplet comparison
psychophysical method that involves the simultaneous scaling of three test stimuli with respect to image quality or an attribute thereof, in accordance with a set of instructions given to the observer

2.5
categorical sort method
psychophysical method involving the classification of a stimulus into one of
several ordered categories, at least some of which are identified by adjectives or phrases that describe different levels of image quality or attributes thereof

NOTE The application of adjectival descriptors is strongly affected by the range of stimuli presented, so that it is difficult to compare the results of one categorical sort experiment to another. Range effects and the coarse quantization of categorical sort experiments also hinder conversion of the responses to JND units. Given these limitations, it is not possible to unambiguously map adjectival descriptors to JND units, but it is worth noting that in some experiments where a broad range of stimuli have been presented, the categories “excellent”, “very good”, “good”, “fair”, “poor” and “not worth keeping” have been found to provide very roughly comparable intervals that average about six quality JNDs in width.

2.6
observer
individual performing the subjective evaluation task in a psychophysical method

3 Two-step psychophysical method

This part of ISO 20462 defines a new psychophysical experimental method, which satisfies the following requirements:

- enables a large number of samples to be examined;
- provides precise scalability;
- provides low observer stress;
- is suitable for ordinary (non-expert) observers;
- provides high repeatability of the results.

The method comprises two steps. The first step is a “category step”, and the second step is a “triplet comparison step” which is newly developed for this purpose.

The reason for applying the “category step” is to reduce the number of samples to an appropriate number which is determined by the purpose of each experiment. Typically this number is less than 27 samples. Category scaling using three categories, such as “favourable”, “acceptable” and “unacceptable” (or “acceptable”,
“just acceptable” and “unacceptable”) is used for the first step, and samples are selected according to the number of samples required in the following step. If the number of test samples examined is relatively small, then the first step should be omitted, and the psychophysical experiment should start directly from the second step.

The second step is conducted in order to derive a precise scaling based on an interval scale. The present proposal is to use a newly developed triplet comparison method. In this method three samples are compared at a time, thereby achieving high assessment accuracy while keeping the experimental scale realistic.

NOTE If the normal paired comparison method were used with 21 samples, a total of 210 combinations would need to be examined. This is time-consuming and imposes excessive stress upon the observers. Furthermore, paired comparison methods require a significant number of observers in order that a precise scaling can be derived. This results in an experiment that is excessively large and impractical.

4 Experimental procedure

4.1 Step 1

Proceed as follows.

a) Prepare the test images to be examined.

b) Observe each sample and assign it to one of three categories: “favourable”, “acceptable” or “unacceptable”.

c) Count the number of test images in each category.

d) Select the samples that will be used in Step 2 (4.2) from the upper category.

It is recommended that the number of samples, N, be less than 27 in order to avoid observer stress during the experiment. The number of samples should obey the following equations:

N = 6K + 1 or N = 6K + 3    (1)

where
N is the number of samples;
K is an integer.
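
The sample-count rule in Equation (1) and the pair count quoted in Clause 3 can be checked numerically. The following Python sketch is illustrative only and is not part of ISO 20462-2; all function names are hypothetical. It takes K to be a non-negative integer. The triplet count N(N - 1)/6 rests on an assumption that the text above does not state: that the triplets are arranged so that every pair of samples appears together in exactly one triplet, an arrangement that exists exactly when N = 6K + 1 or N = 6K + 3.

```python
def is_valid_sample_count(n: int) -> bool:
    """Return True when n = 6K + 1 or n = 6K + 3 for some integer K >= 0."""
    return n >= 1 and n % 6 in (1, 3)


def paired_comparisons(n: int) -> int:
    """Number of comparisons when every pair is judged once: N(N - 1)/2."""
    return n * (n - 1) // 2


def triplet_comparisons(n: int) -> int:
    """Number of triplets, ASSUMING each pair occurs in exactly one triplet.

    Each triplet covers three pairs, so the count is N(N - 1)/6.
    This assumption is the sketch's own, not a statement from the text.
    """
    if not is_valid_sample_count(n):
        raise ValueError("n must be of the form 6K + 1 or 6K + 3")
    return paired_comparisons(n) // 3


# Sample counts of at least 3 and below the recommended limit of 27.
valid_counts = [n for n in range(3, 27) if is_valid_sample_count(n)]

if __name__ == "__main__":
    print("valid sample counts:", valid_counts)
    for n in valid_counts:
        print(n, paired_comparisons(n), triplet_comparisons(n))
```

For 21 samples the pair count reproduces the 210 combinations given in the NOTE in Clause 3, and under the stated assumption the triplet design needs one third as many presentations.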