NO COPYING WITHOUT BSI PERMISSION EXCEPT AS PERMITTED BY COPYRIGHT LAW

BRITISH STANDARD
BS ISO 20462-2:2005
Incorporating corrigendum no. 1

Photography - Psychophysical experimental methods for estimating image quality - Part 2: Triplet comparison method

ICS 37.040.01

This British Standard was published under the authority of the Standards Policy and Strategy Committee on 30 January 2006.

© BSI 2007
ISBN 978 0 580 59685 8

Amendments issued since publication

Amd. No.                   Date             Comments
17318 Corrigendum No. 1    31 August 2007   Changes to Clause 3, 4.1, 4.2 and C.2

Compliance with a British Standard cannot confer immunity from legal obligations.

National foreword

This British Standard is the UK implementation of ISO 20462-2:2005, incorporating corrigendum July 2007.

The UK participation in its preparation was entrusted to Technical Committee CPW/42, Photography. A list of organizations represented on this committee can be obtained on request to its secretary.

This publication does not purport to include all the necessary provisions of a contract. Users are responsible for its correct application.

INTERNATIONAL STANDARD
ISO 20462-2
First edition 2005-11-01
Reference number ISO 20462-2:2005(E)

Photography - Psychophysical experimental methods for estimating image quality - Part 2: Triplet comparison method

Photographie - Méthodes psychophysiques expérimentales pour estimer la
qualité d'image - Partie 2: Méthode comparative du triplet

Contents

Foreword
Introduction
1 Scope
2 Terms and definitions
3 Two-step psychophysical method
4 Experimental procedure
4.1 Step 1
4.2 Step 2
Annex A (informative) Comparison between a paired comparison and a triplet comparison technique
Annex B (informative) Number of sample combinations for triplet comparison
Annex C (informative) Standard portrait images
Annex D (informative) Performance of the triplet comparison method
Annex E (informative) Scheffé's method
Annex F
(informative) Conversion of Scheffé's scale to JND
Bibliography

Foreword

ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies (ISO member bodies). The work of preparing International Standards is normally carried out through ISO technical committees. Each member body interested in a subject for which a technical committee has been established has the right to be represented on that committee. International organizations, governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.

The main task of technical committees is to prepare International Standards. Draft International Standards adopted by the technical committees are circulated to the member bodies for voting. Publication as an International Standard requires approval by at least 75 % of the member bodies casting a vote.

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO shall not be held responsible for identifying any or all such patent rights.

ISO 20462-2 was prepared by Technical Committee ISO/TC 42, Photography.

ISO 20462 consists of the following parts, under the general title Photography - Psychophysical experimental methods for estimating image quality:

- Part 1: Overview of psychophysical elements
- Part 2: Triplet comparison method
- Part 3: Quality ruler method

Introduction

This part of ISO 20462 is necessary to provide a basis for visually assessing photographic image quality in a precise, repeatable and efficient manner. This part of ISO 20462 is needed in order to evaluate various test methods or image processing algorithms that may be used in other international and industry standards. For example, it should be used to perform subjective evaluation of exposure series images from digital cameras as part of the work needed for future revisions of ISO 12232.

The opportunities to create and observe images using different types of hard copy media and soft copy displays have increased significantly with advances in computer-based digital imaging technology. As a result, there is a need to develop requirements for obtaining colour-appearance matches between images produced using various media and display technologies under a variety of viewing conditions. To develop the necessary requirements, organizations, including the CIE and the ICC, are developing methods
to compensate for the effect of different viewing conditions, and to map colours optimally across disparate media having different colour gamuts. Such technical activities are often faced with the need to evaluate proposed methods or algorithms by visual assessment based on psychophysical experiments. K.M. Braun et al. [1] examined five viewing techniques for cross-media image comparisons in terms of sensitivity of scaling, and mental and physical stress for the observers. CIE TC1-27 “Specification of Colour Appearance for Reflective Media and Self-Luminous Display Comparisons” proposed guidelines for conducting psychophysical experiments for the evaluation of colorimetric and colour-appearance models [6]. Accordingly, for the design and evaluation of digital imaging systems, it is of great importance to develop a methodology for subjective visual assessment, so that reliable and stable results can be derived with minimum observer stress.

When performing a psychophysical experiment, it is highly desirable to obtain results that are precise and reproducible. In order to derive statistically reliable results, large numbers of observers are required and careful attention should be paid to the experimental setup. Multiple (repeated) assessments are also useful. Observer stress during the visual assessment process can adversely affect the results. The order of image presentation, and the types of questions or questionnaires addressed by the observers, can also affect the results.

Table
1 gives a comparison of three visual assessment techniques commonly used for image quality evaluation. The advantages of the category method include low stress and high stability, since the observer's task is to rank each image using typically five or seven categories. However, its scalability within a category is less precise.

One of the most common techniques for image quality assessment is the paired comparison method. This method is particularly suited to assessing image quality when precise scalability is required. However, a serious limitation of the paired comparison method is that the number of samples to be examined must be kept relatively small. As the number of samples increases, the number of combinations becomes extensive. This causes excessive observer stress, which can affect the accuracy and repeatability of the results.

The third method, magnitude estimation, is also commonly known as magnitude scaling. This method is extremely difficult to apply when the image quality assessment is performed by ordinary (non-expert) observers.

Table 1 - Comparison of typical psychophysical experimental methods

Name of method          Scalability   Stability   Stress
Category                Low           High        Low
Magnitude estimation    Medium        Low         Medium
Paired comparison       High          High        High

G. Johnson et al. [3] have proposed “A sharpness rule”, where the magnitude of sharpness was analyzed in terms of resolution, contrast, noise and degree of sharpness-enhancement. Likewise, preferred skin colour may be considered not only from the viewpoint of chromaticity, but also with respect to the lightness, background and white point of the display media [4]. These examples show that image quality is not always evaluated by a single attribute, but may vary in combination with multiple attributes. In cases where a psychophysical experiment is designed for a new application, the experimenter may need to vary many attributes simultaneously during the course of the experiment. In these situations, the number of samples to be examined becomes excessively large, making it difficult to employ the paired comparison technique.

Photography - Psychophysical experimental methods for estimating image quality - Part 2: Triplet comparison method

1 Scope

This part of ISO 20462 defines a standard psychophysical experimental method for subjective image quality assessment of soft copy and hard copy still picture images.

2 Terms and definitions

For the purposes of this document, the following terms and definitions apply.

2.1
just noticeable difference
JND
stimulus difference that would lead to a 75:25 proportion of responses in a paired comparison task

2.2
psychophysical experimental method
experimental technique for subjective evaluation of image quality or attributes thereof, from which stimulus differences in units of JNDs may be estimated
cf. categorical sort (2.5), paired comparison (2.3) and triplet comparison methods (2.4)

2.3
paired comparison method
psychophysical method involving the choice of which of two simultaneously presented stimuli exhibits greater or lesser image quality or an attribute thereof, in accordance with a set of instructions given to the observer

NOTE Two limitations of the paired comparison method are as follows.

a) If
all possible stimulus comparisons are done, as is usually the case, a large number of assessments are required for even modest numbers of experimental stimulus levels: if N levels are to be studied, N(N - 1)/2 paired comparisons are needed.

b) If a stimulus difference exceeds approximately 1,5 JNDs, the magnitude of the stimulus difference cannot be directly estimated reliably because the response saturates as the proportions approach unanimity. However, if a series of stimuli having no large gaps are assessed, the differences between more widely separated stimuli may be deduced indirectly by summing smaller, reliably determined (unsaturated) stimulus differences. The standard methods for transformation of paired comparison data to an interval scale (a scale linearly related to JNDs) perform statistically optimized procedures for inferring the stimulus differences, but they may yield unreliable results when too many of the stimulus differences are large enough (> 1,5 JNDs) that they produce saturated responses.

2.4
triplet comparison
psychophysical method that involves the simultaneous scaling of three test stimuli with respect to image quality or an attribute thereof, in accordance with a set of instructions given to the observer

2.5
categorical sort method
psychophysical method involving the classification of a stimulus into one of several ordered categories, at least some of which are identified by adjectives or phrases that describe different levels of image quality or attributes thereof

NOTE The application of adjectival descriptors is strongly affected by the range of stimuli presented, so that it is difficult to compare the results of one categorical sort experiment to another. Range effects and the coarse quantization of categorical sort experiments also hinder conversion of the responses to JND units. Given these limitations, it is not possible to unambiguously map adjectival descriptors to JND units, but it is worth noting that in some experiments where a broad range of stimuli have been presented, the categories excellent, very good, good, fair, poor, and not worth keeping have been found to provide very roughly comparable intervals that average about six quality JNDs in width.

2.6
observer
individual performing the subjective evaluation task in a psychophysical method

3 Two-step psychophysical method

This part of ISO 20462 defines a new psychophysical experimental method, which satisfies the following requirements:

- enables a large number of samples to be examined;
- provides precise scalability;
- provides low observer stress;
- suitable for ordinary (non-expert) observers;
- provides high repeatability of the results.

The method
comprises two steps. The first step is a “category step”, and the second step is a “triplet comparison step”, which is newly developed for this purpose. The reason for applying the “category step” is to reduce the number of samples to an appropriate number, which is determined by the purpose of each experiment. Typically this number is less than 27 samples. Category scaling using three categories, such as “favourable”, “acceptable” and “unacceptable” (or “acceptable”, “just acceptable” and “unacceptable”), is used for the first step, and samples are selected according to the number of samples
required in the following step. If the number of test samples examined is relatively small, then the first step should be omitted, and the psychophysical experiment should start directly from the second step.

The second step is conducted in order to derive a precise scaling based on an interval scale. The present proposal is to use a newly developed triplet comparison method. In this method three samples are compared at a time, thereby achieving high assessment accuracy while keeping the experimental scale realistic.

NOTE If the normal paired comparison method were used with 21 samples, a total of 210 combinations would need to be examined. This is time-consuming and imposes excessive stress upon the observers. Furthermore, paired comparison methods require a significant number of observers in order that a precise scaling can be derived. This will result in an experiment that is excessively large and unrealizable.

NOTE See Annex A.

4 Experimental procedure

4.1 Step 1

Proceed as follows.

a) Prepare the test images to be examined.
b) Observe each sample and sort it into one of three categories: “favourable”, “acceptable” or “unacceptable”.
c) Count the number of test images in each category.
d) Select the samples that will be used in Step 2 (4.2) from the upper category. It is recommended that the number of samples, N, be less than 27 in order to avoid observer stress during the experiment. The number of samples should obey one of the following equations:

    N = 6K + 1 or N = 6K + 3    (1)

where
N is the number of samples;
K is an integer.

NOTE 1 It is possible to use 5 or 7 categories in the case of many samples.

4.2 Step 2

Proceed as follows.

a) Create combinations of samples for use in the triplet comparison step. Each combination shall consist of three samples. If the total number of the samples selected for the triplet comparison step satisfies Equation (1), then it is possible to arrange each combination of samples such that each pair of samples will only ever