Designation: E2139 - 05 (Reapproved 2011)

Standard Test Method for Same-Different Test¹

This standard is issued under the fixed designation E2139; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.

1. Scope

1.1 This test method describes a procedure for comparing two products.

1.2 This test method does not describe the Thurstonian modeling approach to this test.

1.3 This test method is sometimes referred to as the simple-difference test.

1.4 A same-different test determines whether two products are perceived to be the same or different overall.

1.5 The procedure of the test described in this test method consists of presenting a single pair of samples to each assessor. The presentation of multiple pairs would require different statistical treatment and it is outside of the scope of this test method.

1.6 This test method is not attribute-specific, unlike the directional difference test.

1.7 This test method is not intended to determine the magnitude of the difference; however, statistical methods may be used to estimate the size of the difference.

1.8 This test method may be chosen over the triangle or duo-trio tests where sensory fatigue or carry-over are a concern, or where a simpler task is needed.

1.9 This standard may involve hazardous materials, operations, and equipment. This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety and health practices and determine the applicability of regulatory limitations prior to use.
2. Referenced Documents

2.1 ASTM Standards:²
E253 Terminology Relating to Sensory Evaluation of Materials and Products
E456 Terminology Relating to Quality and Statistics
E1871 Guide for Serving Protocol for Sensory Evaluation of Foods and Beverages

2.2 ASTM Publications:²
Manual 26 Sensory Testing Methods, 2nd Edition
STP 758 Guidelines for the Selection and Training of Sensory Panel Members
STP 913 Guidelines for Physical Requirements for Sensory Evaluation Laboratories

2.3 ISO Standard:³
ISO 5495 Sensory Analysis - Methodology - Paired Comparison

3. Terminology

3.1 For definition of terms relating to sensory analysis, see Terminology E253, and for terms relating to statistics, see Terminology E456.

3.2 Definitions of Terms Specific to This Standard:

3.2.1 α (alpha) risk: probability of concluding that a perceptible difference exists when, in reality, one does not (also known as Type I Error or significance level).

3.2.2 β (beta) risk: probability of concluding that no perceptible difference exists when, in reality, one does (also known as Type II Error).

3.2.3 chi-square test: statistical test used to test hypotheses on frequency counts and proportions.

3.2.4 Δ (delta): test sensitivity parameter established prior to testing and used along with the selected values of α, β, and an estimated value of p1 to determine the number of assessors needed in a study. Delta (Δ) is the minimum difference in proportions that the researcher wants to detect, where the difference is Δ = p2 - p1. Δ is not a standard measure of sensory difference. The same value of Δ may correspond to different sensory differences for different values of p1 (see 9.5 for an example).

3.2.5 Fisher's Exact Test (FET): statistical test of the equality of two independent binomial proportions.

3.2.6 p1: proportion of assessors in the population who would respond "different" to the matched sample pair. Based on experience with using the same-different test and possibly with the same type of products, the user may have a priori knowledge about the value of p1.

3.2.7 p2: proportion of assessors in the population who would respond "different" to the unmatched sample pair.
¹ This test method is under the jurisdiction of ASTM Committee E18 on Sensory Evaluation and is the direct responsibility of Subcommittee E18.04 on Fundamentals of Sensory. Current edition approved Aug. 1, 2011. Published August 2011. Originally approved in 2005. Last previous edition approved in 2005 as E2139 - 05. DOI: 10.1520/E2139-05R11.

² For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard's Document Summary page on the ASTM website.

³ Available from American National Standards Institute (ANSI), 25 W. 43rd St., 4th Floor, New York, NY 10036, http://www.ansi.org.

3.2.8 power (1 - β risk): probability of concluding that a perceptible difference exists when, in reality, one of size Δ does.

3.2.9 product: material to be evaluated.

3.2.10 sample: unit of product prepared, presented, and evaluated in the test.

3.2.11 sensitivity: term used to summarize the performance characteristics of this test. The sensitivity of the test is defined by the four values selected for α, β, p1, and Δ.
4. Summary of Test Method

4.1 Clearly define the test objective in writing.

4.2 Choose the number of assessors based on the sensitivity desired for the test. The sensitivity of the test is in part related to two competing risks: the risk of declaring a difference when there is none (that is, α-risk), and the risk of not declaring a difference when there is one (that is, β-risk). Acceptable values of α and β vary depending on the test objective. The values should be agreed upon by all parties affected by the results of the test.

4.3 The two products of interest (A and B) are selected. Assessors are presented with one of four possible pairs of samples: A/A, B/B, A/B, and B/A. The total number of same pairs (A/A and B/B) usually equals the total number of different pairs (A/B and B/A). The assessor's task is to categorize the given pair of samples as "same" or "different."

4.4 The data are summarized in a two-by-two table where the columns show the type of pair received (same or different) and the rows show the assessor's response ("same" or "different"). A Fisher's Exact Test (FET) is used to determine whether the samples are perceptibly different. Other statistical methods that approximate the FET can sometimes be used.
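The tabulation and test described in 4.4 can be carried out with widely available statistical software. The sketch below is illustrative only and is not part of the standard; it assumes Python with scipy installed, and the counts shown are hypothetical (their total of 84 simply echoes the commonly used sample size discussed in 9.2).

```python
# Illustrative sketch (not part of the standard): one-sided Fisher's Exact Test
# on hypothetical same-different counts, tabulated as described in 4.4.
from scipy.stats import fisher_exact

# Hypothetical counts; rows = assessor's response, columns = pair received.
#                       unmatched pair   matched pair
responded_different = [30,              20]
responded_same      = [12,              22]

# One-sided alternative (see 7.2): the odds of responding "different" are
# higher for assessors who received the unmatched pair (that is, p2 > p1).
table = [responded_different, responded_same]
odds_ratio, p_value = fisher_exact(table, alternative="greater")

print(f"one-sided FET p-value = {p_value:.4f}")
# Conclude that a perceptible difference exists if p_value <= the chosen alpha-risk.
```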
5. Significance and Use

5.1 This overall difference test method is used when the test objective is to determine whether a sensory difference exists or does not exist between two samples. It is also known as the simple difference test.

5.2 The test is appropriate in situations where samples have extreme intensities, give rapid sensory fatigue, have long lingering flavors, or cannot be consumed in large quantities, or a combination thereof.

5.3 The test is also appropriate for situations where the stimulus sites are limited to two (for example, two hands, each side of the face, two ears).

5.4 The test provides a measure of the bias where judges perceive two same products to be different.

5.5 The test has the advantage of being a simple and intuitive task.

6. Apparatus

6.1 Carry out the test under conditions that prevent contact between assessors until the evaluations have been completed, for example, booths that comply with STP 913.
6.2 For food and beverage tests, sample preparation and serving sizes should comply with Practice E1871, or see Refs (1) or (2).⁴

⁴ The boldface numbers in parentheses refer to the list of references at the end of this standard.

7. Definition of Hypotheses

7.1 This test can be characterized by a two-by-two table of probabilities according to the sample pair that the assessors in the population would receive and their responses, as follows:

                          Assessor Would Receive
                       Matched Pair        Unmatched Pair
 Assessor's Response   (AA or BB)          (AB or BA)
   Same:               1 - p1              1 - p2
   Different:          p1                  p2 (= p1 + Δ)
   Total:              1                   1

where p1 and p2 are the probabilities of responding "different" for those who would receive the matched pairs and the unmatched pairs, respectively.

7.2 To determine whether the samples are perceptibly different with a given sensitivity, the following one-sided statistical hypothesis is tested:

Ho: p1 = p2
Ha: p1 < p2 (that is, Δ > 0)

Delta (Δ) will equal 0 and p1 will equal p2 if there is no detectable difference between the samples. This test addresses whether or not Δ is greater than 0. Thus, the hypothesis is one-sided because it is not of interest in this test to consider that responding "different" to the matched pair could be more likely than responding "different" to the unmatched pair.
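As a concrete illustration of the probability table in 7.1, the short simulation below (not part of the standard) generates hypothetical responses under assumed values p1 = 0.3 and p2 = 0.6 and tallies the resulting two-by-two counts; the variable names, the assumed proportions, and the group size are all illustrative assumptions.

```python
# Illustrative sketch (not part of the standard): simulate responses under the
# probability model in 7.1, using assumed values p1 = 0.3 and p2 = 0.6.
import random

random.seed(0)
p1, p2 = 0.3, 0.6       # P("different" | matched pair), P("different" | unmatched pair)
n_per_pair_type = 42    # half of the 84 assessors used as an example in 9.2

counts = {("matched", "same"): 0, ("matched", "different"): 0,
          ("unmatched", "same"): 0, ("unmatched", "different"): 0}

for pair_type, p_diff in (("matched", p1), ("unmatched", p2)):
    for _ in range(n_per_pair_type):
        response = "different" if random.random() < p_diff else "same"
        counts[(pair_type, response)] += 1

# The tallies form the two-by-two count table that is analyzed with the FET.
for (pair_type, response), count in sorted(counts.items()):
    print(f"{pair_type:9s} pair, responded {response:9s}: {count}")
```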
8. Assessors

8.1 All assessors must be familiar with the mechanics of the same-different test (the format, the task, and the procedure of evaluation). Greater test sensitivity, if needed, may be achieved through selection of assessors who demonstrate above average individual sensitivity (see STP 758).

8.2 In order to perform this test, assessors do not require special sensory training on the samples in question. For example, they do not need to be able to recognize any specific attribute.

8.3 The assessors must be sampled from a homogeneous population that is well-defined. The population must be chosen on the basis of the test objective. Defining characteristics of the population can be, for example, training level, gender, experience with the product, and so forth.

9. Number of Assessors

9.1 Choose all the sensitivity parameters that are needed to choose the number of assessors for the test. Choose the α-risk and the β-risk. Based on experience, choose the expected value for p1. Choose Δ, p2 - p1, the minimum difference in proportions that the researcher wants to detect. The most commonly used values for α-risk, β-risk, p1, and Δ are α = 0.05, β = 0.20, p1 = 0.3, and Δ = 0.3. These values can be adjusted on a case-by-case basis to reflect the sensitivity desired versus the number of assessors.

9.2 Having defined the required sensitivity (α-risk, β-risk, p1, and Δ), determine the corresponding sample size from Table A1.1 (see Ref (9)). This is done by first finding the section of the table with a p1 value corresponding to the proportion of assessors in the population who would respond "different" to the matched sample pair. Second, locate the total sample size from the intersection of the desired α, p2 (or Δ), and β values. In the case of the most commonly used values listed in 9.1, Table A1.1 indicates that 84 assessors are needed. The sample size n is based on the number of same and different samples being equal. The sample sizes listed are the total sample size rounded up to the nearest number evenly divisible by 4, since there are four possible combinations of the samples. To determine the number of same and different pairs to prepare, divide n by two.
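Table A1.1 is the normative source for sample sizes. For readers who want to see how the four sensitivity parameters interact, the sketch below (not part of the standard) computes an approximate total sample size using a textbook normal approximation with continuity correction for comparing two proportions. Because Table A1.1 is based on exact-test calculations, this approximation tracks, but does not reproduce, the tabulated values; the function name and rounding rule are illustrative assumptions.

```python
# Illustrative sketch (not part of the standard): approximate total sample
# size for a one-sided comparison of two proportions via the normal
# approximation with continuity correction. Use Table A1.1 for the actual test.
import math
from statistics import NormalDist

def approx_total_n(p1, delta, alpha=0.05, beta=0.20):
    """Approximate total n (same pairs plus different pairs), rounded up to
    a multiple of 4 as described in 9.2."""
    p2 = p1 + delta
    z_a = NormalDist().inv_cdf(1 - alpha)   # one-sided alpha-risk
    z_b = NormalDist().inv_cdf(1 - beta)
    p_bar = (p1 + p2) / 2
    n = ((z_a * math.sqrt(2 * p_bar * (1 - p_bar))
          + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) / delta) ** 2
    n_cc = n / 4 * (1 + math.sqrt(1 + 4 / (n * delta))) ** 2  # per pair type
    total = 2 * math.ceil(n_cc)
    return total + (-total) % 4

# Common case from 9.1: alpha = 0.05, beta = 0.20, p1 = 0.3, delta = 0.3.
print(approx_total_n(0.3, 0.3))   # approximation only; Table A1.1 lists 84
```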
9.3 If the user has no prior experience with the same-different test and has no specific expectation for the value of p1, then two options are available. Either use p1 = 0.3 and proceed as indicated in 9.2, or use the last section of Table A1.1. This section gives sample sizes that are the largest required, given α, β, and Δ, regardless of p1.

9.4 Often in practice, the number of assessors is determined by practical conditions (for example, duration of the experiment, number of available assessors, quantity of product, and so forth). However, increasing the number of assessors increases the likelihood of detecting small differences. Thus, one should expect to use larger numbers of assessors when trying to demonstrate that products are similar compared to when one is trying to demonstrate that they are different.

9.4.1 When the number of assessors is fixed, the power of the test (1 - β) may be calculated by establishing a value for p1, defining the required sensitivity for the α-risk and the Δ, locating the number of assessors nearest the fixed amount, and then following up the column to the listed β-risk.

9.5 If a researcher wants to be 90 % certain of detecting response proportions of p2 = 60 % versus the expected p1 = 40 % with an α-risk of 5 %, then Δ = 0.60 - 0.40 = 0.20 and β = 0.10, or 90 % power. The number of assessors needed in this case is 232 (Table A1.1). If a researcher wants to be 90 % certain of detecting response proportions of p2 = 70 % versus the expected p1 = 50 % with an α-risk of 5 %, then Δ = 0.70 - 0.50 = 0.20 and β = 0.10, or 90 % power. The number of assessors needed in this case is 224 (Table A1.1).
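Power figures of the kind used in 9.4.1 and 9.5 can also be checked by Monte Carlo simulation of the one-sided FET. The sketch below is not part of the standard; the function name, number of simulation runs, and seed are illustrative assumptions, and with enough runs the estimate should approximate, though not exactly match, the tabulated values.

```python
# Illustrative sketch (not part of the standard): Monte Carlo estimate of the
# power of the one-sided Fisher's Exact Test for a fixed number of assessors
# and assumed values of p1 and p2 (see 9.4.1 and 9.5).
import numpy as np
from scipy.stats import fisher_exact

def simulated_power(total_n, p1, p2, alpha=0.05, n_sims=2000, seed=1):
    """Estimate power with total_n assessors split equally between matched
    and unmatched pairs (total_n should be a multiple of 4, per 9.2)."""
    rng = np.random.default_rng(seed)
    n_per_type = total_n // 2
    rejections = 0
    for _ in range(n_sims):
        diff_matched = rng.binomial(n_per_type, p1)    # "different" responses
        diff_unmatched = rng.binomial(n_per_type, p2)
        table = [[diff_unmatched, diff_matched],
                 [n_per_type - diff_unmatched, n_per_type - diff_matched]]
        _, p_value = fisher_exact(table, alternative="greater")
        rejections += p_value <= alpha
    return rejections / n_sims

# First example in 9.5: p1 = 0.40, p2 = 0.60, 232 assessors, alpha-risk = 0.05;
# the corresponding tabulated power in 9.5 is 90 %.
print(simulated_power(232, 0.40, 0.60))
```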
10. Procedure

10.1 Determine the number of assessors needed for the test as well as the population that they should represent (for example, assessors selected for a specific sensory sensitivity).

10.2 It is critical to the validity of the test that assessors cannot identify the samples from the way in which they are presented. One should avoid any subtle differences in temperature or appearance, especially color, caused by factors such as the time sequence of preparation. It may be possible to mask color differences using light filters, subdued illumination, or colored vessels. Prepare samples out of sight and in an identical manner: same apparatus, same vessels, same quantities of product (see Practice E1871). The samples may be prepared in advance; however, this may not be possible for all types of products. It is essential that the samples cannot be recognized from the way they are presented.

10.3 Prepare the serving order worksheet and ballot in advance of the test to ensure a balanced order of sample presentation of the two products, A and B. One of four possible pairs (A/A, B/B, A/B, and B/A) is assigned to each assessor. Make sure this assignment is done randomly. Design the test so that the number of same pairs equals the number of different pairs. The presentation order of the different pairs should be balanced as much as possible. Serving order worksheets should also include the identification of the samples for each set.
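Serving order worksheets such as those described in 10.3 are often generated with a short script. The sketch below (not part of the standard) assigns the four pair types to assessors in equal numbers, so that same and different pairs are balanced, and then randomizes the order; the function name, seed, and output layout are illustrative assumptions, and a real worksheet would also carry the blinding codes for each sample.

```python
# Illustrative sketch (not part of the standard): generate a randomized,
# balanced serving order for n assessors (n a multiple of 4, per 9.2).
import random

def serving_plan(n_assessors, seed=2024):
    """Return a list of (assessor_number, pair) tuples with equal counts of
    A/A, B/B, A/B, and B/A pairs, in random order."""
    if n_assessors % 4:
        raise ValueError("n_assessors should be a multiple of 4")
    pairs = [("A", "A"), ("B", "B"), ("A", "B"), ("B", "A")] * (n_assessors // 4)
    random.Random(seed).shuffle(pairs)
    return list(enumerate(pairs, start=1))

for assessor, (first, second) in serving_plan(12):
    # Each line corresponds to one row of the serving order worksheet.
    print(f"assessor {assessor:3d}: serve {first} then {second}")
```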
10.4 Prepare the response ballots in a way consistent with the product you are evaluating. For example, in a taste test, give the following instructions: (1) you will receive two samples; they may be the same or different; (2) evaluate the samples from left to right; and (3) determine whether they are the same or different.

10.4.1 The researcher can choose to add an instruction to the ballot indicating whether the assessor may re-evaluate the samples or not.

10.4.2 The ballot should also identify the assessor and date of test, as well as a ballot number that must be related to the sample set identification on the worksheet.

10.4.3 A section soliciting comments may be included following the initial forced-choice question.

10.4.4 An example of a ballot is provided in Fig. X2.2.

10.5 When possible, present both samples at the same time, along with the response ballot. In some instances, the samples may be presented sequentially if required by the type of product or the way they need to be presented, or both. This may be the case, for example, for the evaluation of a fragrance in a room, where the assessor must change rooms to evaluate the second sample.

10.6 Collect all ballots and tabulate results for analysis.

11. Analysis and Interpretation of Results

11.1 The data from the test