Designation: E2935-16
An American National Standard

Standard Practice for
Conducting Equivalence Testing in Laboratory Applications¹

This standard is issued under the fixed designation E2935; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.

1. Scope

1.1 This practice provides statistical methodology for conducting equivalence testing on numerical data
from two sources to determine if their true means or variances differ by no more than predetermined limits.

1.2 Applications include (1) equivalence testing for bias against an accepted reference value, (2) determining means equivalence of two test methods, test apparatus, instruments, reagent sources, or operators within a laboratory or equivalence of two laboratories in a method transfer, and (3) determining non-inferiority of a modified test procedure versus a current test procedure with respect to a performance characteristic.

1.3 The guidance in this standard applies only to experiments conducted on a single material at a given level of the test result.

1.4 Guidance is given for determining the amount of data required for an equivalence trial. The control of risks associated with the equivalence decision is discussed.

1.5 The values stated in SI units are to be regarded as standard. No other
units of measurement are included in this standard.

1.6 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety and health practices and determine the applicability of regulatory limitations prior to use.

2. Referenced Documents

2.1 ASTM Standards:²
E177 Practice for Use of the Terms Precision and Bias in ASTM Test Methods
E456 Terminology Relating to Quality and Statistics
E2282 Guide for Defining the Test Result of a Test Method
E2586 Practice for Calculating and Using
Basic Statistics

2.2 USP Standard:³
USP Validation of Alternative Microbiological Methods

3. Terminology

3.1 Definitions: See Terminology E456 for a more extensive listing of statistical terms.

3.1.1 accepted reference value, n: a value that serves as an agreed-upon reference for comparison, and which is derived as: (1) a theoretical or established value, based on scientific principles, (2) an assigned or certified value, based on experimental work of some national or international organization, or (3) a consensus or certified value, based on collaborative experimental work under the auspices of a scientific or engineering group. E177

3.1.2 bias, n: the difference between the expectation of the test results and an accepted reference value. E177

3.1.3 confidence interval, n: an interval estimate [L, U] with the statistics L and U as limits for the parameter θ and with confidence level 1 - α, where Pr(L ≤ θ ≤ U) ≥ 1 - α. E2586

3.1.3.1 Discussion: The confidence level, 1 - α, reflects the proportion of cases that the confidence interval [L, U] would contain or cover the true parameter value in a series of repeated random samples under identical conditions. Once L and U are given values, the resulting confidence interval either does or does not contain it. In this sense “confidence” applies not to the particular interval but only to the long run proportion of cases when repeating the procedure many times.

3.1.4 confidence level, n: the value, 1 - α, of the probability associated with a confidence interval, often expressed as a percentage. E2586

3.1.4.1 Discussion: α is generally a small number. Confidence level is often 95 % or 99 %.

3.1.5 confidence limit, n: each of the limits, L and U, of a confidence interval, or the limit of a one-sided confidence interval. E2586

¹ This test method is under the jurisdiction of ASTM Committee E11 on Quality and Statistics and is the direct responsibility of Subcommittee E11.20 on Test Method Evaluation and Quality Control. Current edition approved Nov. 15, 2016. Published January 2017. Originally approved in 2013. Last previous edition approved in 2015 as E2935-15. DOI: 10.1520/E2935-16.

² For referenced ASTM
standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard's Document Summary page on the ASTM website.

³ Available from U.S. Pharmacopeial Convention (USP), 12601 Twinbrook Pkwy., Rockville, MD 20852-1790, http://www.usp.org.

Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959. United States

This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on
Principles for the Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee.

3.1.6 degrees of freedom, n: the number of independent data points minus the number of parameters that have to be estimated before calculating the variance. E2586

3.1.7 equivalence, n: condition that two population parameters differ by no more than predetermined limits.

3.1.8 intermediate precision conditions, n: conditions under which test results are obtained with the same test method using test units or test specimens taken at random
from a single quantity of material that is as nearly homogeneous as possible, and with changing conditions such as operator, measuring equipment, location within the laboratory, and time. E177

3.1.9 mean, n: of a population, μ, average or expected value of a characteristic in a population; of a sample, X̄, sum
of the observed values in the sample divided by the sample size. E2586

3.1.10 percentile, n: quantile of a sample or a population, for which the fraction less than or equal to the value is expressed as a percentage. E2586

3.1.11 population, n: the totality of items or units of material under consideration. E2586

3.1.12 population parameter, n: summary measure of the values of some characteristic of a population. E2586

3.1.13 precision, n: the closeness of agreement between independent test results obtained under stipulated conditions. E177

3.1.14 quantile, n: value such that a fraction f of the sample or population
is less than or equal to that value. E2586

3.1.15 repeatability, n: precision under repeatability conditions. E177

3.1.16 repeatability conditions, n: conditions where independent test results are obtained with the same method on identical test items in the same laboratory by the same operator using the same equipment within short intervals of time. E177

3.1.17 repeatability standard deviation (s_r), n: the standard deviation of test results obtained under repeatability conditions. E177

3.1.18 sample, n: a group of observations or test results, taken from a larger collection of observations or test results, which serves to provide information that may be used as a basis for making a decision concerning the larger collection. E2586

3.1.19 sample size, n, n: number of observed values in the sample. E2586

3.1.20 sample statistic, n: summary measure of the observed values of a sample. E2586

3.1.21 standard deviation: of
a population, σ, the square root of the average or expected value of the squared deviation of a variable from its mean; of a sample, s, the square root of the sum of the squared deviations of the observed values in the sample from their mean divided by the sample size minus 1. E2586

3.1.22 test result, n: the value of a characteristic obtained by carrying out a specified test method. E2282

3.1.23 test unit, n: the total quantity of material (containing one or more test specimens) needed to obtain a test result as specified in the test method. See test result. E2282

3.1.24 variance, σ^2, s^2, n: square of the standard deviation of the population or sample. E2586

3.2 Definitions of Terms Specific to This Standard:

3.2.1 bias equivalence, n: equivalence of a population mean with an accepted reference value.

3.2.2 equivalence limit, E, n: in equivalence testing, a limit on the difference between two population parameters.

3.2.2.1 Discussion: In certain applications, this may be termed practical limit or practical difference.

3.2.3 equivalence test, n: a statistical test conducted within predetermined risks to confirm equivalence of two population parameters.

3.2.4 means equivalence, n: equivalence of two population means.

3.2.5 non-inferiority, n: condition that the difference in means or variances of test results between a modified testing process and a current testing process with respect to a performance characteristic is no greater than a predetermined limit in the direction of inferiority of the modified process to the current
process.

3.2.5.1 Discussion: Other terms used for non-inferior are “equivalent or better” or “at least equivalent as.”

3.2.6 paired samples design, n: in means equivalence testing, single samples are taken from the two populations at a number of sampling points.

3.2.6.1 Discussion: This design is termed a randomized block design for a general number of populations sampled, and each group of data within a sampling point is termed a block.

3.2.7 power, n: in equivalence testing, the probability of accepting equivalence, given the true difference between two population means.

3.2.7.1 Discussion: In the case of testing for bias equivalence the power is the probability of accepting equivalence, given the true difference between a population mean and an accepted reference value.

3.2.8 two independent samples design, n: in means equivalence testing, replicate test results are determined independently from two populations at a single sampling time for each population.

3.2.8.1 Discussion: This design is termed a completely randomized design for a general number of populations sampled.

3.2.9 two one-sided tests (TOST) procedure, n: a statistical procedure used for testing the equivalence of the parameters from two distributions (see equivalence).

3.3 Symbols:
B = bias (7.1.1)
d_j = difference between a pair of test results at sampling point j (7.1.1)
d̄ = average difference (7.1.1)
D = difference in sample means (6.1.2) (X1.1.2)
E = equivalence limit (5.2)
E_1 = lower equivalence limit (5.2.1)
E_2 = upper equivalence limit (5.2.1)
f = degrees of freedom for s (8.1.1) (X1.1.2)
F_{1-α} = (1-α)th percentile of the F distribution (9.3.1)
f_i = degrees of freedom for s_i (6.1.1)
f_p = degrees of freedom for s_p (6.1.2)
F(·) = the cumulative F distribution function (X1.6.3)
H_0: = null hypothesis (X1.1.1)
H_A: = alternate hypothesis (X1.1.1)
n = sample size (number of test results) from a population (5.4) (6.1.3) (7.1.1) (8.1.1)
n_i = sample size from ith population (6.1.1)
n_1 = sample size from population 1 (6.1.2)
n_2 = sample size from population 2 (6.1.2)
R = ratio of two sample variances (5.5.3)
ℛ = ratio of two population variances (X1.6.3)
s = sample standard deviation (8.1.1)
s_B = sample standard deviation for bias (8.1.2)
s_d = standard deviation of the difference between two test results (7.1.1)
s_D = sample standard deviation for mean difference (6.1.3) (X1.1.2)
s_i = sample standard deviation for ith population (6.1.1)
s_i^2 = sample variance for ith population (6.1.1)
s_1^2 = sample variance for population 1 (6.1.2)
s_1^2 = variance of test results from the current process (5.5.3)
s_2^2 = sample variance for population 2 (6.1.2)
s_2^2 = variance of test results from the modified process (5.5.3)
s_p = pooled sample standard deviation (6.1.2)
s_r = repeatability sample standard deviation (6.2)
t = Student's t statistic (6.1.4) (7.1.3) (8.1.3)
t_{1-α,f} = (1-α)th percentile of the Student's t distribution with f degrees of freedom (X1.1.2)
X_ij = jth test result from the ith population (6.1)
UCL_R = upper confidence limit for ℛ (9.3.1)
X̄ = test result average (8.1.1)
X̄_i = test result average for the ith population (6.1.1)
X̄_1 = test result average for population 1 (6.1.3)
X̄_2 = test result average for population 2 (6.1.3)
Z_{1-α} = (1-α)th percentile of the standard normal distribution (X1.6.1)
α = consumer's risk (5.2.3) (6.2) (7.2)
β = producer's risk (5.4.1)
Δ = true mean difference between populations (5.4.1)
μ = population mean (X1.4.1)
μ_i = ith population mean (X1.1.1)
ν = approximate degrees of freedom for s_D (X1.1.4)
σ = standard deviation of the test method (5.2)
σ_d = standard deviation of the true difference between two populations (7.2)
Φ(·) = standard normal cumulative distribution function (X1.6.1)

3.4 Acronyms:

3.4.1 ARV, n: accepted reference value (5.3.3) (8.1) (X1.4)
3.4.2 CRM, n: certified reference material (5.3.3) (8.1)
3.4.3 ILS, n: interlaboratory study (6.2)
3.4.4 LCL, n: lower confidence limit (6.2.5) (7.2.3)
3.4.5 TOST, n: two one-sided tests (5.5.1) (Section 6) (Section 7) (Section 8) (Appendix X1)
3.4.6 UCL, n: upper
confidence limit (6.2.5) (7.2.3)

4. Significance and Use

4.1 Laboratories conducting routine testing have a continuing need to make improvements in their testing processes. In these situations it must be demonstrated that any changes will not cause an undesirable shift in the test results from the current testing process nor substantially affect a performance characteristic of the test method. This standard provides guidance on experiments and statistical methods needed to demonstrate that the test results from a modified testing process are equivalent to those from the current testing process, where equivalence is defined as agreement within a prescribed limit, termed an equivalence limit.

4.1.1 Examples of modifications to the testing process include, but are not limited to, the following:
(1) Changes to operating levels in the steps of the test method procedure,
(2) Installation of new instruments,
apparatus, or sources of reagents and test materials,
(3) Evaluation of new personnel performing the testing, and
(4) Transfer of testing to a new location.

4.1.2 The equivalence limit, which represents a worst-case difference, is determined prior to the equivalence test and its value is usually set by consensus among subject-matter experts.

4.2 Two principal types of equivalence are covered in the practice: means equivalence and non-inferiority. Means equivalence treats a sustained shift in test results between the modified and current testing processes as an absolute difference, meaning a difference in either direction from zero. Non-inferiority is concerned with a difference only in the direction of an inferior outcome in a performance characteristic of the modified testing procedure versus the current testing procedure.

4.2.1 Equivalence testing is performed by an experiment that generates test results from the modified and current testing procedures on the same materials that are routinely tested. An exception is bias equivalence, where the experiment consists of conducting multiple testing on a certified reference material (CRM) having an accepted reference value (ARV) to evaluate the test method bias.

4.2.2 Examples of performance characteristics directly applicable to the test method are bias, precision, sensitivity, specificity, linearity, and range. Additional ch