Designation: E 2139 - 05

Standard Test Method for Same-Different Test

This standard is issued under the fixed designation E 2139; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.

Footnote 1: This test method is under the jurisdiction of ASTM Committee E18 on Sensory Evaluation of Materials and Products and is the direct responsibility of Subcommittee E18.04 on Fundamentals of Sensory. Current edition approved Nov. 1, 2005. Published November 2005.

Copyright ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959, United States.

1. Scope

1.1 This test method describes a procedure for comparing two products.
1.2 This test method does not describe the Thurstonian modeling approach to this test.
1.3 This test method is sometimes referred to as the simple-difference test.
1.4 A same-different test determines whether two products are perceived to be the same or different overall.
1.5 The procedure of the test described in this test method consists of presenting a single pair of samples to each assessor. The presentation of multiple pairs would require different statistical treatment and it is outside of the scope of this test method.
1.6 This test method is not attribute-specific, unlike the directional difference test.
1.7 This test method is not intended to determine the magnitude of the difference; however, statistical methods may be used to estimate the size of the difference.
1.8 This test method may be chosen over the triangle or duo-trio tests where sensory fatigue or carry-over are a concern, or where a simpler task is needed.
1.9 This standard may involve hazardous materials, operations, and equipment. This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety and health practices and determine the applicability of regulatory limitations prior to use.

2. Referenced Documents

2.1 ASTM Standards:
E 253 Terminology Relating to Sensory Evaluation of Materials and Products
E 456 Terminology Relating to Quality and Statistics
E 1871 Practice for Serving Protocol for Sensory Evaluation of Foods and Beverages
2.2 ASTM Publications:
Manual 26 Sensory Testing Methods, 2nd Edition
STP 758 Guidelines for the Selection and Training of Sensory Panel Members
STP 913 Guidelines for Physical Requirements for Sensory Evaluation Laboratories
2.3 ISO Standard:
ISO 5495 Sensory Analysis - Methodology - Paired Comparison

Footnote 2: For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard's Document Summary page on the ASTM website.
Footnote 3: Available from American National Standards Institute (ANSI), 25 W. 43rd St., 4th Floor, New York, NY 10036.

3. Terminology

3.1 For definition of terms relating to sensory analysis, see Terminology E 253, and for terms relating to statistics, see Terminology E 456.
3.2 Definitions of Terms Specific to This Standard:
3.2.1 α (alpha) risk – probability of concluding that a perceptible difference exists when, in reality, one does not (also known as Type I Error or significance level).
3.2.2 β (beta) risk – probability of concluding that no perceptible difference exists when, in reality, one does (also known as Type II Error).
3.2.3 chi-square test – statistical test used to test hypotheses on frequency counts and proportions.
3.2.4 Δ (delta) – test sensitivity parameter established prior to testing and used along with the selected values of α, β, and an estimated value of p1 to determine the number of assessors needed in a study. Delta (Δ) is the minimum difference in proportions that the researcher wants to detect, where the difference is Δ = p2 - p1. Δ is not a standard measure of sensory difference. The same value of Δ may correspond to different sensory differences for different values of p1 (see 9.5 for an example).
3.2.5 Fisher's Exact Test (FET) – statistical test of the equality of two independent binomial proportions.
3.2.6 p1 – proportion of assessors in the population who would respond "different" to the matched sample pair. Based on experience with using the same-different test and possibly with the same type of products, the user may have a priori knowledge about the value of p1.
3.2.7 p2 – proportion of assessors in the population who would respond "different" to the unmatched sample pair.
3.2.8 power, 1 - β (beta) risk – probability of concluding that a perceptible difference exists when, in reality, one of size Δ does.
3.2.9 product – material to be evaluated.
3.2.10 sample – unit of product prepared, presented, and evaluated in the test.
3.2.11 sensitivity – term used to summarize the performance characteristics of this test. The sensitivity of the test is defined by the four values selected for α, β, p1, and Δ.
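
As an informal aid (the prose definitions in 3.2 govern), the sensitivity parameters can be summarized in LaTeX notation:

```latex
% p1 = P(respond "different" | matched pair), p2 = P(respond "different" | unmatched pair)
\begin{align*}
\Delta &= p_2 - p_1 && \text{minimum difference in proportions to be detected}\\
\alpha &= P(\text{conclude a difference exists} \mid \text{none exists}) && \text{Type I error}\\
\beta  &= P(\text{conclude no difference exists} \mid \text{a difference of size } \Delta \text{ exists}) && \text{Type II error}\\
\text{power} &= 1 - \beta
\end{align*}
```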

4. Summary of Test Method

4.1 Clearly define the test objective in writing.
4.2 Choose the number of assessors based on the sensitivity desired for the test. The sensitivity of the test is in part related to two competing risks: the risk of declaring a difference when there is none (that is, α-risk), and the risk of not declaring a difference when there is one (that is, β-risk). Acceptable values of α and β vary depending on the test objective. The values should be agreed upon by all parties affected by the results of the test.
4.3 The two products of interest (A and B) are selected. Assessors are presented with one of four possible pairs of samples: A/A, B/B, A/B, and B/A. The total number of same pairs (A/A and B/B) usually equals the total number of different pairs (A/B and B/A). The assessor's task is to categorize the given pair of samples as same or different.
4.4 The data are summarized in a two-by-two table where the columns show the type of pair received (same or different) and the rows show the assessor's response (same or different). A Fisher's Exact Test (FET) is used to determine whether the samples are perceptibly different. Other statistical methods that approximate the FET can sometimes be used.

5. Significance and Use

5.1 This overall difference test method is used when the test objective is to determine whether a sensory difference exists or does not exist between two samples. It is also known as the simple difference test.
5.2 The test is appropriate in situations where samples have extreme intensities, give rapid sensory fatigue, have long lingering flavors, or cannot be consumed in large quantities, or a combination thereof.
5.3 The test is also appropriate for situations where the stimulus sites are limited to two (for example, two hands, each side of the face, two ears).
5.4 The test provides a measure of the bias where judges perceive two same products to be different.
5.5 The test has the advantage of being a simple and intuitive task.

6. Apparatus

6.1 Carry out the test under conditions that prevent contact between assessors until the evaluations have been completed, for example, booths that comply with STP 913.
6.2 For food and beverage tests, sample preparation and serving sizes should comply with Practice E 1871, or see Refs (1) or (2).

Footnote 4: The boldface numbers in parentheses refer to the list of references at the end of this standard.

7. Definition of Hypotheses

7.1 This test can be characterized by a two-by-two table of probabilities according to the sample pair that the assessors in the population would receive and their responses, as follows:

                              Assessor Would Receive
                              Matched Pair (AA or BB)    Unmatched Pair (AB or BA)
Assessor's      Same:         1 - p1                     1 - p2
Response        Different:    p1                         p2 (= p1 + Δ)
                Total:        1                          1

where p1 and p2 are the probabilities of responding "different" for those who would receive the matched pairs and the unmatched pairs, respectively.
7.2 To determine whether the samples are perceptibly different with a given sensitivity, the following one-sided statistical hypothesis is tested: Ho: p1 = p2 versus Ha: p1 < p2 (that is, Δ > 0). Delta (Δ) will equal 0 and p1 will equal p2 if there is no detectable difference between the samples. This test addresses whether or not Δ is greater than 0. Thus, the hypothesis is one-sided because it is not of interest in this test to consider that responding "different" to the matched pair could be more likely than responding "different" to the unmatched pair.
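
Restated informally in LaTeX notation (the prose in 7.1 and 7.2 governs):

```latex
% One-sided hypotheses for the same-different test (restating 7.2)
\[
H_0\colon\; p_1 = p_2
\qquad\text{versus}\qquad
H_a\colon\; p_1 < p_2 \quad (\text{that is, } \Delta = p_2 - p_1 > 0).
\]
```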

8. Assessors

8.1 All assessors must be familiar with the mechanics of the same-different test (the format, the task, and the procedure of evaluation). Greater test sensitivity, if needed, may be achieved through selection of assessors who demonstrate above average individual sensitivity (see STP 758).
8.2 In order to perform this test, assessors do not require special sensory training on the samples in question. For example, they do not need to be able to recognize any specific attribute.
8.3 The assessors must be sampled from a homogeneous population that is well-defined. The population must be chosen on the basis of the test objective. Defining characteristics of the population can be, for example, training level, gender, experience with the product, and so forth.

9. Number of Assessors

9.1 Choose all the sensitivity parameters that are needed to choose the number of assessors for the test. Choose the α-risk and the β-risk. Based on experience, choose the expected value for p1. Choose Δ, p2 - p1, the minimum difference in proportions that the researcher wants to detect. The most commonly used values for α-risk, β-risk, p1, and Δ are α = 0.05, β = 0.20, p1 = 0.3, and Δ = 0.3. These values can be adjusted on a case-by-case basis to reflect the sensitivity desired versus the number of assessors.
9.2 Having defined the required sensitivity (α-risk, β-risk, p1, and Δ), determine the corresponding sample size from Table A1.1 (see Ref (9)). This is done by first finding the section of the table with a p1 value corresponding to the proportion of assessors in the population who would respond "different" to the matched sample pair. Second, locate the total sample size from the intersection of the desired α, p2 (or Δ), and β values. In the case of the most commonly used values listed in 9.1, Table A1.1 indicates that 84 assessors are needed. The sample size n is based on the number of same and different samples being equal. The sample sizes listed are the total sample size rounded up to the nearest number evenly divisible by 4, since there are four possible combinations of the samples. To determine the number of same and different pairs to prepare, divide n by two.
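
Table A1.1 is not reproduced in this excerpt. As a rough cross-check only, the sketch below approximates the required total sample size using the normal approximation for two independent proportions with the Casagrande-Pike-Smith continuity correction, assuming a one-sided test, equal numbers of matched and unmatched pairs, and that scipy is available. The exact-test values in Table A1.1 remain the ones to use for planning; for the default parameters in 9.1 the approximation gives 80 rather than the tabulated 84.

```python
# Rough sample-size approximation for the same-different test, based on the
# normal approximation for two independent proportions (one-sided) with the
# Casagrande-Pike-Smith continuity correction. This is NOT the exact-test
# basis of Table A1.1; use the table for actual studies.
import math
from scipy.stats import norm

def approx_total_assessors(p1, delta, alpha=0.05, beta=0.20):
    """Approximate total number of assessors (matched plus unmatched pairs)."""
    p2 = p1 + delta
    z_a, z_b = norm.ppf(1 - alpha), norm.ppf(1 - beta)
    p_bar = (p1 + p2) / 2
    n0 = (z_a * math.sqrt(2 * p_bar * (1 - p_bar))
          + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2 / delta ** 2
    n_cc = (n0 / 4) * (1 + math.sqrt(1 + 4 / (n0 * delta))) ** 2  # continuity correction
    total = 2 * math.ceil(n_cc)          # equal numbers of same and different pairs
    return math.ceil(total / 4) * 4      # round up to a multiple of 4 (see 9.2)

if __name__ == "__main__":
    print(approx_total_assessors(0.3, 0.3))              # about 80; Table A1.1 gives 84
    print(approx_total_assessors(0.4, 0.2, beta=0.10))   # 232, matching the first example in 9.5
```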

9.3 If the user has no prior experience with the same-different test and has no specific expectation for the value of p1, then two options are available. Either use p1 = 0.3 and proceed as indicated in 9.2, or use the last section of Table A1.1. This section gives sample sizes that are the largest required, given α, β, and Δ, regardless of p1.
9.4 Often in practice, the number of assessors is determined by practical conditions (for example, duration of the experiment, number of available assessors, quantity of product, and so forth). However, increasing the number of assessors increases the likelihood of detecting small differences. Thus, one should expect to use larger numbers of assessors when trying to demonstrate that products are similar compared to when one is trying to demonstrate that they are different.
9.4.1 When the number of assessors is fixed, the power of the test (1 - β) may be calculated by establishing a value for p1, defining the required sensitivity for the α-risk and the Δ, locating the number of assessors nearest the fixed amount, and then following up the column to the listed β-risk.
9.5 If a researcher wants to be 90 % certain of detecting response proportions of p2 = 60 % versus the expected p1 = 40 % with an α-risk of 5 %, then Δ = 0.60 - 0.40 = 0.20 and β = 0.10, or 90 % power. The number of assessors needed in this case is 232 (Table A1.1). If a researcher wants to be 90 % certain of detecting response proportions of p2 = 70 % versus the expected p1 = 50 % with an α-risk of 5 %, then Δ = 0.70 - 0.50 = 0.20 and β = 0.10, or 90 % power. The number of assessors needed in this case is 224 (Table A1.1).
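
Where Table A1.1 is not at hand, the power for a fixed number of assessors (9.4.1) can also be estimated by simulation. The sketch below, which assumes numpy and scipy are available and uses the one-sided Fisher's Exact Test of Section 7, is an illustration rather than a substitute for the table:

```python
# Monte Carlo estimate of the power of the same-different test for a fixed
# number of assessors, using the one-sided Fisher's Exact Test of Section 7.
# Sketch only: Table A1.1 is the normative source for planning.
import numpy as np
from scipy.stats import fisher_exact

def simulated_power(n_total, p1, p2, alpha=0.05, n_sim=2000, seed=0):
    """Fraction of simulated tests that reject Ho when the true proportions
    responding "different" are p1 (matched pairs) and p2 (unmatched pairs)."""
    rng = np.random.default_rng(seed)
    n_half = n_total // 2              # equal numbers of matched and unmatched pairs
    rejections = 0
    for _ in range(n_sim):
        diff_matched = rng.binomial(n_half, p1)
        diff_unmatched = rng.binomial(n_half, p2)
        table = [[diff_unmatched, diff_matched],
                 [n_half - diff_unmatched, n_half - diff_matched]]
        # alternative="greater": odds of a "different" response are higher
        # for unmatched pairs than for matched pairs (Ha: p1 < p2)
        _, p_value = fisher_exact(table, alternative="greater")
        if p_value <= alpha:
            rejections += 1
    return rejections / n_sim

if __name__ == "__main__":
    # Values from the first example in 9.5; expect an estimate near 0.90
    print(simulated_power(232, p1=0.40, p2=0.60))
```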

10. Procedure

10.1 Determine the number of assessors needed for the test as well as the population that they should represent (for example, assessors selected for a specific sensory sensitivity).
10.2 It is critical to the validity of the test that assessors cannot identify the samples from the way in which they are presented. One should avoid any subtle differences in temperature or appearance, especially color, caused by factors such as the time sequence of preparation. It may be possible to mask color differences using light filters, subdued illumination, or colored vessels. Prepare samples out of sight and in an identical manner: same apparatus, same vessels, same quantities of product (see Practice E 1871). The samples may be prepared in advance; however, this may not be possible for all types of products. It is essential that the samples cannot be recognized from the way they are presented.
10.3 Prepare the serving order worksheet and ballot in advance of the test to ensure a balanced order of sample presentation of the two products, A and B. One of four possible pairs (A/A, B/B, A/B, and B/A) is assigned to each assessor. Make sure this assignment is done randomly. Design the test so that the number of same pairs equals the number of different pairs. The presentation order of the different pairs should be balanced as much as possible. Serving order worksheets should also include the identification of the samples for each set.
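
As an illustration of 10.3, the sketch below generates a balanced, randomized assignment of the four possible pairs to assessors. The worksheet format and pair codes are placeholders rather than a layout prescribed by the standard; in practice, blinding codes and sample identification would be added.

```python
# Illustration of 10.3: assign one of the four possible pairs (A/A, B/B,
# A/B, B/A) to each assessor so that same and different pairs occur equally
# often and the presentation orders of the different pairs are balanced.
import random

def serving_plan(n_assessors, seed=None):
    if n_assessors % 4 != 0:
        raise ValueError("use a number of assessors divisible by 4 (see 9.2)")
    pairs = [("A", "A"), ("B", "B"), ("A", "B"), ("B", "A")] * (n_assessors // 4)
    random.Random(seed).shuffle(pairs)   # random assignment of pairs to assessors
    return pairs

if __name__ == "__main__":
    for i, (left, right) in enumerate(serving_plan(12, seed=1), start=1):
        print(f"Assessor {i:2d}: left sample = {left}, right sample = {right}")
```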

10.4 Prepare the response ballots in a way consistent with the product you are evaluating. For example, in a taste test, give the following instructions: (1) you will receive two samples; they may be the same or different; (2) evaluate the samples from left to right; and (3) determine whether they are the same or different.
10.4.1 The researcher can choose to add an instruction to the ballot indicating whether or not the assessor may re-evaluate the samples.
10.4.2 The ballot should also identify the assessor and the date of the test, as well as a ballot number that must be related to the sample set identification on the worksheet.
10.4.3 A section soliciting comments may be included following the initial forced-choice question.
10.4.4 An example of a ballot is provided in Fig. X2.2.
10.5 When possible, present both samples at the same time, along with the response ballot. In some instances, the samples may be presented sequentially if required by the type of product or the way they need to be presented, or both. This may be the case, for example, for the evaluation of a fragrance in a room, where the assessor must change rooms to evaluate the second sample.
10.6 Collect all ballots and tabulate results for analysis.

11. Analysis and Interpretation of Results

11.1 The data from the test are summarized in a two-by-two table, as in 7.1, where the columns show the pair the assessor received, Matched Pair (AA or BB) or Unmatched Pair (AB or BA), and the rows show the assessor's responses.
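
The one-sided Fisher's Exact Test described in 4.4 and 7.2 can be applied to the tabulated counts as sketched below. The counts shown are hypothetical and are used only to illustrate the call (assuming scipy is available):

```python
# Illustration of the analysis in Section 11: one-sided Fisher's Exact Test
# on the two-by-two table of counts. The counts below are hypothetical and
# serve only to show the call.
from scipy.stats import fisher_exact

# Responses to matched pairs (AA or BB) and unmatched pairs (AB or BA)
diff_matched, same_matched = 13, 29        # hypothetical counts
diff_unmatched, same_unmatched = 26, 16    # hypothetical counts

table = [[diff_unmatched, diff_matched],
         [same_unmatched, same_matched]]

# Ho: p1 = p2 versus Ha: p1 < p2 (Section 7.2); alternative="greater"
# because under Ha the odds of a "different" response are larger for
# unmatched pairs than for matched pairs.
odds_ratio, p_value = fisher_exact(table, alternative="greater")
print(f"odds ratio = {odds_ratio:.2f}, one-sided p-value = {p_value:.4f}")
```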
