Designation: E2139-05 (Reapproved 2018)

Standard Test Method for
Same-Different Test¹

This standard is issued under the fixed designation E2139; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.

1. Scope
1.1 This test method describes a procedure for comparing two products.
1.2 This test method does not describe the Thurstonian modeling approach to this test.
1.3 This test method is sometimes referred to as the simple-difference test.
1.4 A same-different test determines whether two products are perceived to be the same or different overall.
1.5 The procedure of the test described in this test method consists of presenting a single pair of samples to each assessor. The presentation of multiple pairs would require different statistical treatment and it is outside of the scope of this test method.
1.6 This test method is not attribute-specific, unlike the directional difference test.
1.7 This test method is not intended to determine the magnitude of the difference; however, statistical methods may be used to estimate the size of the difference.
1.8 This test method may be chosen over the triangle or duo-trio tests where sensory fatigue or carry-over are a concern, or where a simpler task is needed.
1.9 This standard may involve hazardous materials, operations, and equipment. This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety, health, and environmental practices and determine the applicability of regulatory limitations prior to use.
1.10 This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee.

2. Referenced Documents
2.1 ASTM Standards:²
E253 Terminology Relating to Sensory Evaluation of Materials and Products
E456 Terminology Relating to Quality and Statistics
E1871 Guide for Serving Protocol for Sensory Evaluation of Foods and Beverages
2.2 ASTM Publications:²
Manual 26 Sensory Testing Methods, 2nd Edition
STP 758 Guidelines for the Selection and Training of Sensory Panel Members
STP 913 Guidelines for Physical Requirements for Sensory Evaluation Laboratories
2.3 ISO Standard:³
ISO 5495 Sensory Analysis - Methodology - Paired Comparison

¹ This test method is under the jurisdiction of ASTM Committee E18 on Sensory Evaluation and is the direct responsibility of Subcommittee E18.04 on Fundamentals of Sensory.
Current edition approved Aug. 1, 2018. Published August 2018. Originally approved in 2005. Last previous edition approved in 2011 as E2139-05 (2011). DOI: 10.1520/E2139-05R18.
² For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard's Document Summary page on the ASTM website.
³ Available from American National Standards Institute (ANSI), 25 W. 43rd St., 4th Floor, New York, NY 10036, http://www.ansi.org.
Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959. United States

3. Terminology
3.1 For definition of terms relating to sensory analysis, see Terminology E253, and for terms relating to statistics, see Terminology E456.
3.2 Definitions of Terms Specific to This Standard:
3.2.1 α (alpha) risk: probability of concluding that a perceptible difference exists when, in reality, one does not (also known as Type I Error or significance level).
3.2.2 β (beta) risk: probability of concluding that no perceptible difference exists when, in reality, one does (also known as Type II Error).
3.2.3 chi-square test: statistical test used to test hypotheses on frequency counts and proportions.
3.2.4 Δ (delta): test sensitivity parameter established prior to testing and used along with the selected values of α, β, and an estimated value of p1 to determine the number of assessors needed in a study. Delta (Δ) is the minimum difference in proportions that the researcher wants to detect, where the difference is Δ = p2 - p1. Δ is not a standard measure of sensory difference. The same value of Δ may
correspond to different sensory differences for different values of p1 (see 9.5 for an example).
3.2.5 Fisher's Exact Test (FET): statistical test of the equality of two independent binomial proportions.
3.2.6 p1: proportion of assessors in the population who would respond "different" to the matched sample pair. Based on experience with using the same-different test and possibly with the same type of products, the user may have a priori knowledge about the value of p1.
3.2.7 p2: proportion of assessors in the population who would respond "different" to the unmatched sample pair.
3.2.8 power, 1 - β (beta) risk: probability of concluding that a perceptible difference exists when, in reality, one of size Δ does.
3.2.9 product: material to be evaluated.
3.2.10 sample: unit of product prepared, presented, and evaluated in the test.
3.2.11 sensitivity: term used to summarize the performance characteristics of this test. The sensitivity of the test is defined by the four values selected for α, β, p1, and Δ.

4. Summary of Test Method
4.1 Clearly define the test objective in writing.
4.2 Choose the number of assessors based on the sensitivity desired for the test. The sensitivity of the test is in part related to two competing risks: the risk of declaring a difference when there is none (that is, α-risk), and the risk of not declaring a difference when there is one (that is, β-risk). Acceptable values of α and β vary depending on the test objective. The values should be agreed upon by all parties affected by the results of the test.
4.3 The two products of interest (A and B) are selected. Assessors are presented with one of four possible pairs of samples: A/A, B/B, A/B, and B/A. The total number of same pairs (A/A and B/B) usually equals the total number of different pairs (A/B and B/A). The assessor's task is to categorize the given pair of samples as "same" or "different."
4.4 The data are summarized in a two-by-two table where the columns show the type of pair received (same or different) and the rows show the assessor's response (same or different). A Fisher's Exact Test (FET) is used to determine whether the samples are perceptibly different. Other statistical methods that approximate the FET can sometimes be used.

5. Significance and Use
5.1 This overall difference test method is used when the test objective is to determine whether a sensory difference exists or does not exist between two samples. It is also known as the simple difference test.
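As a hedged illustration of the analysis summarized in 4.4, the sketch below computes a one-sided Fisher's Exact Test p-value from the counts of a two-by-two table, using only the Python standard library. The function name, the argument names, and the example counts are assumptions made for illustration; they are not part of this test method, and the counts are not data from any study.

```python
from math import comb


def fisher_one_sided_p(diff_matched, n_matched, diff_unmatched, n_unmatched):
    """One-sided FET p-value for H0: p1 = p2 against Ha: p1 < p2.

    diff_matched / n_matched: count of "different" responses and number of
    assessors among those who received a matched pair (A/A or B/B).
    diff_unmatched / n_unmatched: the same counts for unmatched pairs
    (A/B or B/A).
    """
    total = n_matched + n_unmatched
    total_diff = diff_matched + diff_unmatched
    # With all margins fixed, the count of "different" responses among the
    # unmatched pairs follows a hypergeometric distribution under H0.
    upper = min(n_unmatched, total_diff)
    tail = sum(
        comb(n_unmatched, k) * comb(n_matched, total_diff - k)
        for k in range(diff_unmatched, upper + 1)
    )
    return tail / comb(total, total_diff)


# Hypothetical counts: 42 assessors per pair type, 12 "different" responses
# to matched pairs and 26 "different" responses to unmatched pairs.
p_value = fisher_one_sided_p(12, 42, 26, 42)
print(f"one-sided FET p-value: {p_value:.4f}")
```

A p-value at or below the α-risk agreed on in 4.2 would lead to the conclusion that the samples are perceptibly different.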
5.2 The test is appropriate in situations where samples have extreme intensities, give rapid sensory fatigue, have long lingering flavors, or cannot be consumed in large quantities, or a combination thereof.
5.3 The test is also appropriate for situations where the stimulus sites are limited to two (for example, two hands, each side of the face, two ears).
5.4 The test provides a measure of the bias where judges perceive two same products to be different.
5.5 The test has the advantage of being a simple and intuitive task.

6. Apparatus
6.1 Carry out the test under conditions that prevent contact between assessors until the evaluations have been completed, for example, booths that comply with STP 913.
6.2 For food and beverage tests, sample preparation and serving sizes should comply with Practice E1871, or see Refs (1) or (2).⁴

⁴ The boldface numbers in parentheses refer to the list of references at the end of this standard.

7. Definition of Hypotheses
7.1 This test can be characterized by a two-by-two table of probabilities according to the sample pair that the assessors in the population would receive and their responses, as follows:

                          Assessor Would Receive
Assessor's Response    Matched Pair (AA or BB)    Unmatched Pair (AB or BA)
Same:                  1 - p1                     1 - p2
Different:             p1                         p2 (= p1 + Δ)
Total:                 1                          1

where p1 and p2 are the probabilities of responding "different" for those who would receive the matched pairs and the unmatched pairs, respectively.
7.2 To determine whether the samples are perceptibly different with a given sensitivity, the following one-sided statistical hypothesis is tested:

H0: p1 = p2
Ha: p1 < p2 (Δ > 0).

Delta (Δ) will equal 0 and p1 will equal p2 if there is no detectable difference between the samples. This test addresses whether or not Δ is greater than 0. Thus, the hypothesis is one-sided because it is not of interest in this test to consider that responding "different" to the matched pair could be more likely than responding "different" to the unmatched pair.

8. Assessors
8.1 All assessors must be familiar with the mechanics of the same-different test (the format, the task, and the procedure of evaluation). Greater test sensitivity, if needed, may be achieved through selection of assessors who demonstrate above average individual sensitivity (see STP 758).
8.2 In order to perform this test, assessors do not require special sensory training on the samples in question. For example, they do not need to be able to recognize any specific attribute.
8.3 The assessors must be sampled from a homogeneous population that is well-defined. The population must be chosen on the basis of the test objective. Defining characteristics of the population can be, for example, training level, gender, experience with the product, and so forth.
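The sensitivity of the one-sided test in 7.2 depends jointly on α, p1, Δ, and the number of assessors (see 3.2.11). The following is a minimal sketch of how that relationship could be explored numerically. It assumes equal numbers of matched and unmatched pairs and independent binomial responses, and it enumerates every possible outcome to obtain the exact power of the one-sided FET. The function names and default values are illustrative assumptions, and the result is not claimed to reproduce the entries of Table A1.1.

```python
from math import comb


def fet_p(x1, n1, x2, n2):
    """One-sided FET p-value (Ha: p1 < p2).

    x1 of n1 matched-pair assessors and x2 of n2 unmatched-pair assessors
    responded "different".
    """
    m = x1 + x2
    tail = sum(comb(n2, k) * comb(n1, m - k) for k in range(x2, min(n2, m) + 1))
    return tail / comb(n1 + n2, m)


def binom_pmf(x, n, p):
    """Probability of exactly x "different" responses out of n assessors."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def fet_power(n_per_group, p1, delta, alpha=0.05):
    """Probability that the one-sided FET rejects H0 at level alpha when the
    true proportions are p1 and p2 = p1 + delta, with n_per_group assessors
    receiving each type of pair."""
    n = n_per_group
    p2 = p1 + delta
    power = 0.0
    for x1 in range(n + 1):
        w1 = binom_pmf(x1, n, p1)
        for x2 in range(n + 1):
            # Add the probability of every outcome that leads to rejection.
            if fet_p(x1, n, x2, n) <= alpha:
                power += w1 * binom_pmf(x2, n, p2)
    return power


# Hypothetical scenario: 42 assessors per pair type, p1 = 0.3, delta = 0.3.
print(f"estimated power: {fet_power(42, p1=0.3, delta=0.3):.3f}")
```

Setting `delta=0.0` gives the actual Type I error rate of the test, which for the FET does not exceed the nominal α.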
9. Number of Assessors
9.1 Choose all the sensitivity parameters that are needed to choose the number of assessors for the test. Choose the α-risk and the β-risk. Based on experience, choose the expected value for p1. Choose Δ, p2 - p1, the minimum difference in proportions that the researcher wants to detect. The most commonly used values for α-risk, β-risk, p1, and Δ are α = 0.05, β = 0.20, p1 = 0.3, and Δ = 0.3. These values can be adjusted on a case-by-case basis to reflect the sensitivity desired versus the number of assessors.
9.2 Having defined the required sensitivity (α-risk, β-risk, p1, and Δ), determine the corresponding sample size from Table A1.1 (see Ref (3)). This is done by first finding the section of the table with a p1 value corresponding to the proportion of assessors in the population who would respond "different" to the matched sample pair. Second, locate the total sample size from the intersection of the desired α, p2 (or Δ), and β values. In the case of the most commonly used values listed in 9.1, Table A1.1 indicates that 84 assessors are needed. The sample size n is based on the number of same and different samples being equal. The sample sizes listed are the total sample size rounded up to the nearest number evenly divisible by 4 since there are four possible combinations of the samples. To determine the number of same and different pairs to prepare, divide n by two.
9.3 If the user has no prior experience with the same-different test and has no specific expectation for the value of p1, then two options are available. Either use p1 = 0.3 and proceed as indicated in 9.2, or use the last section of Table A1.1. This section gives sample sizes that are the largest required, given α and β, regardless of p1.
9.4 Often in practice, the number of assessors is determined by practical conditions (for example, duration of the experiment, number of available assessors, quantity of product, and so forth). However, increasing the number of assessors increases the likelihood of detecting small differences. Thus, one should expect to use larger numbers of assessors when trying to demonstrate that products are similar compared to when one is trying to demonstrate that they are different.
9.4.1 When the number of assessors is fixed, the power of the test (1 - β) may be calculated by establishing a value for p1, defining the required sensitivity for α-risk and the Δ, locating the number of assessors nearest the fixed amount, and then following up the column to the listed β-risk.
9.5 If a researcher wants to be 90 % certain of detecting response proportions of p2 = 60 % versus the expected p1 = 40 % with an α-risk of 5 %, then Δ = 0.60 - 0.40 = 0.20 and β = 0.10 or 90 % power. The number of assessors needed in this case is 232 (Table A1.1). If a researcher wants to be 90 % certain of detecting response proportions of p2 = 70 % versus the expected p1 = 50 % with an α-risk of 5 %, then Δ = 0.70 - 0.50 = 0.20 and β = 0.10 or 90 % power. The number of assessors needed in this case is 224 (Table A1.1).

10. Procedure
10.1 Determine the number of assessors needed for the test as well as the population that they should represent (for example, assessors selected for a specific sensory sensitivity).
10.2 It is critical to the validity of the test that assessors cannot identify the samples from the way in which they are presented. One should avoid any subtle differences in temperature or appearance, especially color, caused by factors such as the time sequence of preparation. It may be possible to mask color differences using light filters, subdued illumination, or colored vessels. Prepare samples out of sight and in an identical manner: same apparatus, same vessels, same quantities of product (see Practice E1871). The samples may be prepared in advance; however, this may not be possible for all types of products. It is essential that the samples cannot be recognized from the way they are presented.
10.3 Prepare serving order worksheet and ballot in advance of the test to ensure a balanced order of sample presentation of the two products, A and B. One of four possible pairs (A/A, B/B, A/B, and B/A) is assigned to each assessor. Make sure this assignment is done randomly. Design the test so that the number of same pairs equals the number of different pairs. The presentation order of the different pairs should be balanced as much as possible. Serving order worksheets should also include the identification of the samples for each set.
10.4 Prepare the response ballots in a way consistent with the product you are evaluating. For example, in a taste test, give the following instructions: (1) you will receive two samples. They may be the same or different; (2) evaluate the samples from left to right; and (3) determine whether they are the same or different.
10.4.1 The researcher can choose to add an instruction to the ballot indicating whether the assessor may re-evaluate the samples or not.
10.4.2 The ballot should also identify the assessor and date of test, as well as a ballot number that must be related to the sample set identification on the wo