Designation: E 2263-04

Standard Test Method for Paired Preference Test

This standard is issued under the fixed designation E 2263; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.

1. Scope

1.1 This document covers a procedure for determining preference between two products using a two-alternative forced-choice task, which may or may not include the option of choosing no preference.

1.2 A paired preference test determines whether there is a statistically significant preference between two products for a given population of respondents. The target population must be carefully considered.

1.3 This method establishes preference in a single evaluation context. Replicated tests are outside the scope of this document.

1.4 Paired preference testing can address overall preference or preference for a specified sensory attribute.

1.5 The method does not directly determine the magnitude of preference.

1.6 This method does not address whether or not two samples are perceived as different. See Test Method E 2164.

1.7 A paired preference test is a simple task for respondents, and can be used with populations that have minimal reading or comprehension skills, or both.

1.8 Preference is not an intrinsic attribute of the product, as hue is, but is a subjective measure relating to respondents' affective or hedonic response. It differs from paired comparison testing, which measures objective characteristics of the product. Preference results are always dependent on the population sampled.

1.9 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety and health practices and determine the applicability of regulatory limitations prior to use.

2. Referenced Documents

2.1 ASTM Standards:
E 253 Terminology Relating to Sensory Evaluation of Materials and Products
E 456 Terminology Relating to Quality and Statistics
E 1858 Test Method for Determining Oxidation Induction Time of Hydrocarbons by Differential Scanning Calorimetry
E 1871 Practice for Serving Protocol for Sensory Evaluation of Foods and Beverages
E 2164 Test Method for Directional Difference Test

2.2 ASTM Publication:
Manual 26 Sensory Testing Methods, 2nd Edition

2.3 ISO Standard:
ISO 5495 Sensory Analysis - Methodology - Paired Comparison

3. Terminology

3.1 For definition of terms relating to sensory analysis, see Terminology E 253, and for terms relating to statistics, see Terminology E 456.

3.2 Definitions of Terms Specific to This Standard:

3.2.1 α (alpha) risk: the probability of concluding that a preference exists when, in reality, one does not. (Also known as Type I Error or significance level.)

3.2.2 β (beta) risk: the probability of concluding that no preference exists when, in reality, one does. (Also known as Type II Error.)

3.2.3 common responses: for a one-sided test, the number of respondents selecting the product that is expected to be preferred. For a two-sided test, the larger number of respondents selecting either product.

3.2.4 one-sided test: a test in which the researcher has an a priori interest concerning the direction of the preference. In this case, the alternative hypothesis will express that a specific product is preferred over another product (that is, A > B or A < B) [...] 65 % represents “large” values.

8.1.5 For example, if a researcher is planning a test to support a superior preference claim for a product over the major competitor's product, the researcher might choose the following values for the test-sensitivity parameters: α = 0.05, β = 0.20, and Pmax = 60 %. The test is one-sided because the researcher is only interested in the situation where their product is preferred.

8.2 Having defined the required sensitivity for the test using 8.1, use Table X1.1 to determine the number of respondents necessary for a one-sided test, or Table X1.2 to determine the number of respondents necessary for a two-sided test. Select the section of the table corresponding to the selected Pmax value and the column corresponding to the selected β value. The minimum required number of respondents is found in the row corresponding to the selected value of α. Alternatively, Table X1.1 can be used to develop a set of values for Pmax, α, and β that provide acceptable sensitivity while maintaining the number of respondents within practical limits.

8.2.1 Using the values from the example in 8.1.5, the researcher would use the section of Table X1.1 corresponding to Pmax = 60 % and the column corresponding to β = 0.20. In the row corresponding to α = 0.05, it is found that 158 respondents will be needed for the test.
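The table lookup in 8.2 can be approximated in software. The sketch below is not the procedure used to build Table X1.1; it assumes a one-sided exact binomial test in which the proportion of common responses is 50 % under no preference and Pmax under the alternative, and it returns the first n that meets both risks. The function names are illustrative only. For the 8.1.5 values the result should fall close to the tabulated 158, although the table may have been constructed under different conventions.

```python
from math import comb

def binom_pmf(n, p):
    """Probabilities P(X = k), k = 0..n, for X ~ Binomial(n, p)."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def critical_count(n, alpha):
    """Smallest count c with P(X >= c) <= alpha when p = 0.5 (no preference).
    Returns n + 1 when no count can reach significance."""
    pmf = binom_pmf(n, 0.5)
    tail, c = 0.0, n + 1
    for k in range(n, -1, -1):      # accumulate the upper tail from the top
        if tail + pmf[k] > alpha:
            break
        tail += pmf[k]
        c = k
    return c

def power(n, alpha, p_max):
    """P(reject no-preference hypothesis) when the true proportion is p_max."""
    c = critical_count(n, alpha)
    return sum(binom_pmf(n, p_max)[c:])

def min_respondents(alpha, beta, p_max, n_limit=1000):
    """First n whose one-sided exact test has alpha-risk <= alpha and power >= 1 - beta."""
    for n in range(1, n_limit + 1):
        if power(n, alpha, p_max) >= 1 - beta:
            return n
    return None  # not reached within n_limit

n = min_respondents(alpha=0.05, beta=0.20, p_max=0.60)
print("respondents needed (exact binomial sketch):", n)
```

Because the binomial distribution is discrete, power does not rise smoothly with n, so the first qualifying n can differ by a few respondents from a table built on another rule.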
8.3 Often in practice, the number of respondents is determined by project constraints (for example, duration of the experiment, number of available respondents, quantity of sample, budgetary restraints). The power of the test should then be computed. For this purpose, the following parameters need to be defined: α, observed Pmax, and the number of respondents, n. The observed Pmax corresponds to the observed proportion of common responses, n is determined by the test realization, and α should be fixed by the experimenter prior to conducting the test. With this information, an exact power computation can be achieved using appropriate software. However, an approximate value can be inferred by reverse lookup using Table X1.1 or Table X1.2, depending on whether the alternative is one- or two-sided. First, use the value of Pmax closest to the observed one to select a group of rows, then select among these rows the one corresponding to the selected value of α. Finally, select the cell having the number of assessors closest to the actual number of assessors. The corresponding column heading will give a close estimate of the actual power of the test (1 − β). Smaller sample sizes reduce the power of the test.
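The exact power computation mentioned in 8.3 can be sketched as follows. This is one possible implementation, not the software the standard has in mind: it assumes an exact binomial test with a 50 % proportion under no preference, and for the two-sided case it assumes α is split equally between the two tails. The values n = 100 and Pmax = 0.60 in the usage lines are hypothetical.

```python
from math import comb

def binom_pmf(n, p):
    """Probabilities P(X = k), k = 0..n, for X ~ Binomial(n, p)."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def exact_power(n, alpha, p_max, two_sided=False):
    """Power (1 - beta) of an exact binomial preference test with n respondents.

    Assumption: a two-sided test spends alpha / 2 in each tail.
    """
    tail_alpha = alpha / 2 if two_sided else alpha
    null_pmf = binom_pmf(n, 0.5)
    # Smallest count c with P(X >= c | p = 0.5) <= tail_alpha.
    tail, c = 0.0, n + 1
    for k in range(n, -1, -1):
        if tail + null_pmf[k] > tail_alpha:
            break
        tail += null_pmf[k]
        c = k
    alt_pmf = binom_pmf(n, p_max)
    result = sum(alt_pmf[c:])                 # upper rejection region
    if two_sided:
        result += sum(alt_pmf[: n - c + 1])   # mirrored lower rejection region
    return result

# Hypothetical project constraint: only 100 respondents are available.
one_sided = exact_power(100, alpha=0.05, p_max=0.60)
two_sided = exact_power(100, alpha=0.05, p_max=0.60, two_sided=True)
print(f"one-sided power: {one_sided:.3f}, two-sided power: {two_sided:.3f}")
```

Comparing the result against the planned 1 − β shows how much sensitivity was given up by running the test with fewer respondents.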
9. Procedure

9.1 Paired preference can be used in either CLT (Central Location Test) or IHUT (In-home Use Test) designs. The following discussion focuses on CLT testing procedures; however, randomizations and data analyses would be similar for IHUTs.

9.2 Prepare the serving order worksheet and ballot in advance of the test to ensure a balanced order of presentation of the two samples. Balance the serving sequences of the samples (AB and BA) across all respondents. Serving order worksheets should also include complete sample identification information, either by product name or by coded reference for double-blind studies. See Appendix X1.

9.3 It is critical to the validity of the test that respondents cannot differentiate the samples based on the way they are presented. For example, in a test evaluating flavor differences, one should avoid any subtle differences in temperature or appearance caused by factors such as the time sequence of preparation. Code the vessels containing the samples in a uniform manner, using three-digit numbers chosen at random for each test. Prepare samples out of sight and in an identical manner, that is, same apparatus, same vessels, same quantities of sample (see Practice E 1871).
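The balancing in 9.2 and the random three-digit coding in 9.3 can be prepared programmatically. The following is a minimal sketch, assuming one pair of codes per test and an equal split of AB and BA sequences (with one randomly chosen extra sequence when the number of respondents is odd). The function name and the worksheet layout are illustrative and are not taken from Appendix X1.

```python
import random

def serving_worksheet(n_respondents, seed=None):
    """Build a balanced AB/BA serving order with random three-digit sample codes."""
    rng = random.Random(seed)

    # Two distinct three-digit codes, drawn at random for this test.
    code_a, code_b = rng.sample(range(100, 1000), 2)
    codes = {"A": code_a, "B": code_b}

    # Equal numbers of AB and BA; an odd respondent gets a randomly chosen order.
    orders = ["AB", "BA"] * (n_respondents // 2)
    if n_respondents % 2:
        orders.append(rng.choice(["AB", "BA"]))
    rng.shuffle(orders)  # randomize which respondent receives which sequence

    rows = [
        {
            "respondent": i,
            "order": order,
            "first_code": codes[order[0]],
            "second_code": codes[order[1]],
        }
        for i, order in enumerate(orders, start=1)
    ]
    return codes, rows

codes, rows = serving_worksheet(10, seed=2263)
print("sample codes:", codes)
for row in rows:
    print(row)
```

Passing a fixed `seed` makes the worksheet reproducible for the test record; omit it to draw a fresh randomization.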
9.4 Present the pair of samples simultaneously if possible, following the same spatial arrangement for each assessor (on a line to be sampled always from left to right, or from front to back, and so forth). Respondents are typically allowed to try each sample more than once. If the conditions of the samples restrict retrying the samples (for example, if samples are bulky, leave an aftertaste, or show slight differences in appearance that cannot be masked), present the samples sequentially and do not allow repeated evaluations.

9.5 It is not recommended that more than one question be asked about the samples, because the selection the assessor has made on the initial question may bias the reply to subsequent questions. Responses to additional questions may be obtained through separate tests for acceptance, degree of difference, and so forth (see Manual 26). A section soliciting comments may be included following the initial preference question.

9.6 The paired preference test can be either forced-choice or have the option of no preference.

9.6.1 When using the paired preference test as a forced-choice procedure, respondents are not allowed the option of reporting “no preference.” An assessor who has no preference for either of the samples should be instructed to randomly select one of the samples, and can indicate in the comments section that they had no preference.

10. Analysis and Interpretation of Results

10.1 The procedure used to analyze the results of a paired preference test depends on whether or not a “no preference” option is allowed.

10.1.1 If a forced-choice procedure is used, analyze as detailed in 10.2.

10.1.2 If a “no preference” option is allowed, then there are various ways to handle the data depending on the test objectives. Typically the no preference data are split in some manner between “A” and “B.” Regardless of how the no preference data are handled, it is always important to report the percentage of no preference responses and take those into account for your final action steps.

10.1.2.1 For ad claim testing for superiority, “no preference” responses go against your company's product superiority. Therefore, those responses are given to the competitive product.

10.1.2.2 For ad claim testing for parity, “no preference” responses are arguments against the competitive product's superiority. Therefore, those responses are given to your company's product.

10.1.2.3 For cost reduction or ingredient/supplier changes, “no preference” responses are split between the current and test product.

10.1.2.4 For product improvement, “no preference” responses are handled similarly to an ad claim for superiority and are given to the current (not “improved”) product.

10.1.2.5 For comparison of formulation options, where there is no control or current product, no preference responses are split equally between the two products. It is important to also report the percentage of no preference responses and take those into account for your final action step.
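The allocation rules in 10.1.2.1 through 10.1.2.5 can be summarized in a small helper. This is a sketch only: the objective labels and argument names are invented for illustration, "focus" stands for your company's, test, or improved product and "other" for the competitive or current product, and the split in 10.1.2.3 is assumed to be equal because the text does not state a proportion. Equal splits are returned as fractional counts rather than rounded.

```python
def allocate_no_preference(focus, other, no_pref, objective):
    """Reassign 'no preference' responses according to the test objective.

    Returns (focus_total, other_total, percent_no_preference).
    """
    total = focus + other + no_pref
    pct_no_pref = 100 * no_pref / total  # always report this percentage

    if objective in ("superiority_claim", "product_improvement"):
        # Counted against the focus product: given to competitor / current product.
        other += no_pref
    elif objective == "parity_claim":
        # Counted against competitor superiority: given to your company's product.
        focus += no_pref
    elif objective in ("cost_reduction", "formulation_options"):
        # Split between the two products (equal split assumed for cost reduction).
        focus += no_pref / 2
        other += no_pref / 2
    else:
        raise ValueError(f"unknown objective: {objective}")
    return focus, other, pct_no_pref

# Hypothetical counts: 60 prefer the focus product, 30 the other, 10 no preference.
for objective in ("superiority_claim", "parity_claim", "formulation_options"):
    print(objective, allocate_no_preference(60, 30, 10, objective))
```

The adjusted counts then feed the analysis in 10.2 or 10.3, with the no preference percentage reported alongside the conclusion.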
10.2 Analysis for Preference: Different analyses are used depending on whether the number of respondents is equal to or greater than planned, or fewer than planned.

10.2.1 When the actual number of respondents is equal to or greater than planned, refer to Table X1.3 (one-tailed) or Table X1.4 (two-tailed) to analyze the data. If the number of common responses is equal to or greater than the number given in the table, conclude that there is a preference between the products. If the number of common responses is fewer than the number given in the table, conclude that there is no preference. The conclusions, “preference” or “no preference,” are based on the predetermined α, β, and Pmax levels.

10.2.2 When the actual number of respondents is fewer than planned, the data analysis is the same as in 10.2.1. Understand that the β-risk is now larger than the value chosen because a smaller number of respondents participated in the test.
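Where the tables are not at hand, the decision in 10.2.1 can be approximated with an exact binomial p-value. The sketch below assumes that each respondent chooses either product with probability 0.5 when there is no preference, and that a two-sided p-value is twice the upper-tail probability of the larger count; whether Tables X1.3 and X1.4 were built this way is not confirmed by the text. The counts in the usage lines are hypothetical.

```python
from math import comb

def preference_p_value(common, n, two_sided=False):
    """Exact binomial p-value for `common` common responses out of n.

    One-sided: P(X >= common) with X ~ Binomial(n, 0.5).
    Two-sided (assumption): twice that tail, capped at 1.
    """
    upper_tail = sum(comb(n, k) for k in range(common, n + 1)) / 2**n
    return min(1.0, 2 * upper_tail) if two_sided else upper_tail

def conclude_preference(common, n, alpha, two_sided=False):
    """True when the data support a preference at the chosen alpha."""
    return preference_p_value(common, n, two_sided) <= alpha

# Hypothetical result: 95 of 158 respondents chose the product expected to be preferred.
p = preference_p_value(95, 158)
print(f"one-sided p-value: {p:.4f}; preference: {conclude_preference(95, 158, 0.05)}")
```

A p-value at or below α corresponds to the number of common responses reaching the tabulated minimum.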
10.3 Analysis for Parity: Different analyses are used depending on whether the number of respondents is equal to or greater than planned, or fewer than planned.

10.3.1 When the actual number of respondents is equal to or greater than planned, then the analysis is conducted as outlined in 10.2.1.

10.3.2 When the number of respondents is fewer than planned, then data analysis consists of calculating a confidence interval. A confidence interval is calculated because the α, β, and Pmax levels are different in parity preference testing. The calculations are as follows, where c = the number of common responses, and n = the total number of respondents:

Proportion of common responses:
$$P_c = \frac{c}{n}$$

Standard deviation of $P_c$:
$$S_c = \sqrt{\frac{P_c (1 - P_c)}{n}}$$

Confidence limit:
$$\text{Confidence Limit} = P_c + z_\beta S_c$$

10.3.3 $z_\beta$ is the critical value of the standard normal distribution. Values of $z_\beta$ for some commonly used values of β-risk are:

β-risk    zβ
0.50      0.000
0.40      0.253
0.20      0.842
0.10      1.282
0.05      1.645
0.01      2.326
0.001     3.090

Given the values chosen for β and Pmax, if the confidence limit is less than Pmax, then conclude that there is parity (that is, no more than Pmax of the population would have a preference at the β-level of significance). If the confidence limit is greater than Pmax, then conclude that the products are not at parity. Understand that the α-risk is larger than the value chosen when a smaller number of respondents than planned participate in the test.
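The parity calculation in 10.3.2 and 10.3.3 can be sketched in a few lines. The code computes the proportion of common responses, its standard deviation, and the upper confidence limit, then compares the limit with Pmax. It obtains the critical value from the standard normal distribution via Python's `statistics.NormalDist` instead of the lookup table; the function name and the example counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def parity_confidence_limit(c, n, beta):
    """Upper confidence limit on the proportion of common responses.

    c: number of common responses, n: total respondents, beta: chosen beta-risk.
    """
    p_c = c / n                                 # proportion of common responses
    s_c = sqrt(p_c * (1 - p_c) / n)             # standard deviation of p_c
    z_beta = NormalDist().inv_cdf(1 - beta)     # e.g. about 1.645 for beta = 0.05
    return p_c + z_beta * s_c

def conclude_parity(c, n, beta, p_max):
    """True when the confidence limit falls below Pmax."""
    return parity_confidence_limit(c, n, beta) < p_max

# Hypothetical data: 50 common responses from 100 respondents, beta = 0.05, Pmax = 60 %.
limit = parity_confidence_limit(50, 100, beta=0.05)
print(f"confidence limit: {limit:.4f}; parity: {conclude_parity(50, 100, 0.05, 0.60)}")
```

With these hypothetical counts the limit is 0.50 + 1.645 × 0.05, which is below 0.60, so the sketch would report parity.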
10.4 If desired, calculate a two-sided confidence interval on the proportion of common responses.

11. Report

11.1 Report the test objective, the results, and the conclusions. The following additional information is recommended:

11.1.1 The purpose of the test and the nature of the treatment studied;

11.1.2 Full identification of the samples: origin, method of preparation, quantity, shape, storage prior to testing, serving size, and temperature. (Sample information should communicate that all storage, handling, and preparation was done in such a way as to yield samples that differed only in the variable of interest, if at all.);

11.1.3 The number of respondents, recruitment criteria, the number of selections of each sample, and the result of the statistical analysis;

11.1.4 Respondents: age, gender, and frequency of product usage (typical or usual product consumption in the category, for example, brand loyal or rotators);

11.1.5 Any information or instructions given to the assessor in connection with the test, including how the product was identified when presented;

11.1.6 The test environment: use of booths, simultaneous or sequential presentation, light conditions, whether the identity of samples was disclosed after the test and the manner in which this was done; and

11.1.7 The location and date of the test and name of the test administrator.

12. Precision and Bias

12.1 Because results of paired preference tests are a function of individual preferences, a general statement regarding the precision of results that is applicable to all populations of respondents cannot be made. Unless the demographics of the test population are matched to the U.S. census, results are not projected to the total U.S. population. However, adherence to the recommendations stated in this standard should increase the reproducibility of results and minimize bias.

13. Keywords

13.1 paired preference; preference; sensory; test method

APPENDIXES

(Nonmandatory Information)

X1. EXAMPLE 1 - PRODUCT IMPROVEMENT: FORCED CHOICE PROCEDURE

X1.1 Background

X1.1.1 A beverage manufacturer wants to determine if a new chocolate flavoring “A” is preferred over the current chocolate flavor “B” in a milk alternative beverage prior to fielding a more expensive in-home consumer test. It was decided to force a choice between the two flavors.

X1.2 Test Objective

X1.2.1 To determine if chocolate flavoring “A” is preferred over “B” in a milk alternative beverage. This is a one-tailed test.

X1.3 Number of Respondents

X1.3.1 To protect the