ASTM E2943-2014 Standard Guide for Two-Sample Acceptance and Preference Testing with Consumers

Designation: E2943-14

Standard Guide for Two-Sample Acceptance and Preference Testing with Consumers¹

This standard is issued under the fixed designation E2943; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.

INTRODUCTION

This guide is intended to be used by sensory consumer and marketing research professionals (referred to as the “researcher” or “research professional”) as an aid to understanding issues associated with and to conducting two-sample acceptance and preference tests with consumers. This guide includes a general summary of considerations and practices for conducting hedonic tests followed by specific considerations and practices for both acceptance and preference testing, including pros and cons of each method. Final sections consider the incorporation of both acceptance and preference testing into the research plan and discuss potential lack of linkage in output/results between them. A flowchart outlining a summary of these methods and references for further reading are also included.

1. Scope

1.1 This guide covers acceptance and preference measures when each is used in an unbranded, two-sample, product test. Each measure, acceptance and preference, may be used alone or together in a single test or separated by time. This guide covers
how to establish a product's hedonic or choice status based on sensory attributes alone, rather than brand, positioning, imagery, packaging, pricing, emotional-cultural responses, or other nonsensory aspects of the product. The most commonly used measures of acceptance and preference will be covered, that is, product liking overall as measured by the nine-point hedonic scale and preference measured by choice, either two-alternative forced choice or two-alternative with a “no preference” option.

1.2 Three of the biggest challenges in measuring a product's hedonic (overall liking or acceptability) or choice status (preference selection) are determining how many respondents and who to include in the respondent sample, setting up the questioning sequence, and interpreting the data to make product decisions.

1.3 This guide covers:
1.3.1 Definition of each type of measure,
1.3.2 Discussion of the advantages and disadvantages of each,
1.3.3 When to use each,
1.3.4 Practical considerations in test execution,
1.3.5 Risks associated with each,
1.3.6 Relationship between the two when administered in the same test, and
1.3.7 Recommended interpretations of results for product decisions.

1.4 The intended audience for this guide is the sensory consumer professional or marketing research professional (“the researcher”) who is designing, executing, and interpreting data from product tests with acceptance or choice measures, or both.

1.5 Only two-sample product tests will be covered in this guide. However, the issues and recommended practices raised in this guide often apply to multi-sample tests as well. Detailed coverage of execution tactics, optional types of scales, various approaches to data analysis, and extensive discussions of the reliability and validity of these measures are all outside of the scope of this guide.

1.6 Units: The values stated in SI units are to be regarded as the standard. No other units of measurement are included in this standard.

1.7 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety and health practices and determine the applicability of regulatory limitations prior to use.

¹ This guide is under the jurisdiction of ASTM Committee E18 on Sensory Evaluation and is the direct responsibility of Subcommittee E18.04 on Fundamentals of Sensory. Current edition approved Sept. 1, 2014. Published September 2014. DOI: 10.1520/E2943-14.

Copyright ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959. United States

2. Referenced Documents

2.1 ASTM Standards:²
E253 Terminology Relating to Sensory Evaluation of Materials and Products
E456 Terminology Relating to Quality and Statistics
E1871 Guide for Serving Protocol for Sensory Evaluation of Foods and Beverages
E1958 Guide for Sensory Claim Substantiation
E2263 Test Method for Paired Preference Test
E2299 Guide for Sensory Evaluation of Products by Children and Minors

3. Terminology

3.1 For definitions of terms relating to sensory analysis, see Terminology E253, and for terms relating to statistics, see Terminology E456.

3.2 Definitions of Terms Specific to This Standard:
3.2.1 α (alpha) risk, n: probability of concluding that a difference in liking or preference exists, when, in reality, one does not.
3.2.1.1 Discussion: Also known as Type I error or significance level.
3.2.2 β (beta) risk, n: probability of concluding that no difference in liking or preference exists, when, in reality, one does.
3.2.2.1 Discussion: Also known as Type II error.
3.2.3 hedonic continuum, n: hypothesized underlying continuous dimension measured by acceptance scales.
3.2.3.1 Discussion: It is presumed to run from strong disliking through a neutral region and on to strong liking.
3.2.4 labeled affective magnitude scale, n: labeled magnitude scale (LMS) is a hybrid scaling technique using a verbally labeled line with quasi-logarithmic spacing between each label; the scale consists of a vertical line marked with verbal anchors describing different intensities (for example, “weak,” “strong”).
3.2.4.1 Discussion: Typically, subjects are instructed to place a mark on the line where their perceived intensity of sensation lies, with the upper limit of the scale being the strongest imaginable sensation (1).³
3.2.5 Likert scale, n: attitude scales that can be constructed in an “agree-disagree” format (2).
3.2.5.1 Discussion: The Likert-type scale calls for a graded response to each statement. The response is usually expressed in terms of the following five categories: strongly agree (SA), agree (A), undecided (U), disagree (D), and strongly disagree (SD). The individual statements are either clearly favorable or clearly unfavorable (2 and 3).
3.2.6 Pmax, n: used in forced choice preference measures; a test sensitivity parameter established before testing and used along with the selected values of α and β to determine the number of respondents needed in a study.
3.2.6.1 Discussion: Pmax is the proportion of common responses that the researcher wants the test to be able to detect with a probability of 1 − β. For example, if a
researcher wants to have a 90 % confidence level of detecting a 60:40 split in preference, then Pmax = 60 % and β = 0.10.
3.2.7 risk, n: possible consequences to the researcher's client when the test leads to an incorrect conclusion.
3.2.7.1 Discussion: Risk around decisions made based on research test results can be grouped into two types, loosely called a “false positive” (when the test detects a difference that does not exist) and a “false negative” (when the study does not detect a true difference). In the case of a false positive, the company spends development time and resources on an alternative that does not deliver the intended effect. In the case of a false negative, the product developer or the company will miss a product opportunity and waste resources developing alternatives.
3.2.8 sequential monadic, adj: refers to the presentation or ordering in which respondents evaluate products or stimuli.
3.2.8.1 Discussion: In a sequential monadic test, the respondent is presented with one product at a time to evaluate.
3.2.9 sign test, n: statistical hypothesis test that can be used to compare two samples or a sample with a standard.
3.2.9.1 Discussion: No assumption is made about the shape or parameters of the
population frequency distribution with the sign test and only the sign of the difference is considered.
3.2.10 Student's t test, n: statistical hypothesis test used to compare the means of two samples or a sample mean to a standard value.
3.2.10.1 Discussion: It is appropriate when the measure of interest is normally distributed in small samples and, more generally, for continuous, unbounded, symmetric measurements when the sample size is larger. Assumptions include no ties in the data.
3.2.11 Type I error, n: see alpha risk.
3.2.12 Type II error, n: see beta risk.
3.2.13 Wilcoxon-Mann-Whitney test, WMW, n: rank-based independent sampling alternative to the Student's t-test that is appropriate when the data are measured on a common continuous scale that is not normally distributed.
3.2.13.1 Discussion: In these situations, it can be more efficient (increased statistical power to find a difference at a given sample size) than a Student's t-test. Like the Student's t-test, it requires the assumption that the data have no ties.

4. Summary of Guide

4.1 This guide covers the similarities and differences between acceptance and preference measures when used alone and together in a two-sample test (see Fig. 1). The two measures provide different information about respondents' subjective responses to products and should be deployed to meet different research or business objectives. Acceptance measures are recommended when there is a need to obtain information on intensity of liking/disliking and determine the relative hedonic status of two products. Preference measures are recommended when there is a need to obtain information on choice behavior or determine an ordinal relationship between two products. Correct sampling of respondents is critical in both types of test. The researcher shall carefully prepare the research learning plan and thoroughly review the pros and cons of the specific research design chosen (that is, measuring acceptance, measuring preference, measuring both) against the decision risks associated with each measurement. Acceptance and preference measures, while imperfect, continue to be extremely useful in managing the risk in developing and delivering new products to the marketplace.

² For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard's Document Summary page on the ASTM website.
³ The boldface numbers in parentheses refer to a list of references at the end of this standard.

5. Significance and Use

5.1 Acceptance and preference are the key measurements taken in consumer product testing as either a new product idea is developed into testable prototypes or existing products are evaluated for potential improvements, cost reductions, or other business reasons. Developing products that are preferred overall, or liked as well as, or better, on average, compared to a standard or a competitor, among a defined target consumer group, is usually the main goal of the product development process. Thus, it is necessary to test the consumer acceptability or the preference of a product or prototype compared to other prototypes or potential products, a standard product, or other products in the market. The researcher, with input from her/his stakeholders, has the
responsibility to choose appropriate comparison products and scaling or test methods to evaluate them. In the case of a new-to-the-world product, there may or may not be a relevant product for comparison. In this case, a benchmark score or rating may be used to determine acceptability. A product or prototype that is acceptable to the target consumer is one that meets a minimum criterion for liking, and a product that is preferred over an existing product has the potential to be chosen more often than the less-preferred product by the consumer in the marketplace, when all other factors are equal.

5.2 The external validity (the extent to which the results of a study can be generalized) of both acceptance and preference measures to manage decision risk at all stages of the development cycle is dependent on the ability of the researcher to generalize the results from the respondent sample to the target population at large. This depends both upon the sample of respondents and the way the test is constructed. Within the context of a single test, acceptance measures tell the relative hedonic status of the two samples, quantitatively, as well as where on the hedonic continuum each of the samples falls, that is, “disliked,” “neutral,” or “liked.” In contrast, preference measures tell the relative choice status of two samples within a specific respondent group. Results from these measures can and will vary from test to test depending on the number and type of respondents serving in each test, the size and nature of the sensory differences between the two samples, the method of executing the test, and any error present in the test. The identification, control, measurement, and tracking of variables that may influence results across tests (for example, production location, sample age, and storage conditions) are the responsibility of the researcher.

5.3 While measures of acceptance and preference are both subjective responses to products, and can be somewhat related, they provide different information. A product may be “acceptable” but still not be preferred by the consumer over other alternatives, and
conversely, a product may be preferred over another but still not be acceptable to the consumer. These two terms, therefore, should not be used interchangeably. When a bipolar hedonic scale with multipoint options is used, the researcher should specifically refer to “liking,” “acceptance,” or “hedonic ratings.” When preference measures are used, the researcher should refer to “preference,” “product selection,” or “choice.” Research professionals themselves should be precise in their usage of the terms “acceptance” and “liking,” to refer only to scaling of liking. These researchers should use the terms “preference” and “choice” to refer to two (“Prefer A” or “Prefer B”) or three-choice (“Prefer A” or “Prefer B” or “No Preference”) response options given in a preference test. In addition to having different meanings, the two measures also do not always provide similar results. This guide will cover the similarities and differences in information each provides, some guidelines around implementation, and interpretation of findings. This guide will thus give users an understanding of the issues at hand when planning, designing, implementing, and interpreting results from acceptance and preference tests with consumers.

5.4 While both measures are commonly used to provide information for product development decisions and evaluating a product's competitive status, it is important to remember that pricing, positioning, competitive options, product availability, and other marketplace factors also impact a product's success.

6. Hedonic Testing: Steps in Planning and Conducting an Acceptance or Preference Test

6.1 Decide on the Key Question to be Answered: Liking or Choice or Both. Before planning and implementing a test, the researcher should determine

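Illustrative note (not part of the standard): the terms defined in 3.2.1, 3.2.2, and 3.2.6 (alpha risk, beta risk, and Pmax) jointly determine how many respondents a two-alternative forced choice preference test needs. The Python sketch below shows one way that relationship could be explored. It rests on assumptions that the guide does not state: a one-sided exact binomial test against a 50:50 null, and alpha = 0.05. The example in 3.2.6.1 fixes only Pmax = 60 % and beta = 0.10. All function names are hypothetical.

```python
from math import comb


def upper_tails(n, p):
    """Return t where t[k] = P(X >= k) for X ~ Binomial(n, p), k = 0..n+1."""
    tails = [0.0] * (n + 2)  # t[n + 1] = 0 by definition
    for k in range(n, -1, -1):
        tails[k] = tails[k + 1] + comb(n, k) * p**k * (1 - p) ** (n - k)
    return tails


def critical_count(n, alpha):
    """Smallest count k with P(X >= k | p = 0.5) <= alpha.

    Assumption: one-sided test of "no preference" (50:50) as the null.
    """
    null_tails = upper_tails(n, 0.5)
    return next(k for k in range(n + 2) if null_tails[k] <= alpha)


def power(n, p_max, alpha):
    """Probability of reaching the critical count when the true split is p_max."""
    return upper_tails(n, p_max)[critical_count(n, alpha)]


def respondents_needed(p_max, alpha, beta, n_limit=1000):
    """First n whose power reaches 1 - beta, or None if n_limit is exceeded."""
    for n in range(1, n_limit + 1):
        if power(n, p_max, alpha) >= 1 - beta:
            return n
    return None


if __name__ == "__main__":
    # Pmax and beta follow the example in 3.2.6.1; alpha = 0.05 is assumed.
    n = respondents_needed(p_max=0.60, alpha=0.05, beta=0.10)
    print("respondents:", n, "power:", round(power(n, 0.60, 0.05), 3))
```

Because the binomial distribution is discrete, power does not rise smoothly with n. The function returns the first n that reaches the target, so a researcher may prefer a somewhat larger sample. A two-sided test or a "no preference" option would change the calculation.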