Designation: E2489-11
An American National Standard

Standard Practice for
Statistical Analysis of One-Sample and Two-Sample Interlaboratory Proficiency Testing Programs¹

This standard is issued under the fixed designation E2489; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.

1. Scope
1.1 This practice describes methods for the statistical analysis of
laboratory results obtained from interlaboratory proficiency testing programs. In accordance with Guide E1301, proficiency testing is the use of interlaboratory comparisons for the determination of laboratory testing or measurement performance. Conversely, a collaborative study (or collaborative trial) is the use of interlaboratory comparisons for the determination of the performance characteristics of a method, as covered by Practice E691.
1.1.1 Method A covers testing programs using single test results obtained by testing a single sample (each laboratory submits a single test result).
1.1.2 Method B covers testing programs using paired test results obtained by testing two samples (each laboratory submits one test result for each of the two samples). The two samples should be of the same material or two materials similar enough to have approximately the same degree of variation in test results.
1.2 Methods A and B are applicable to proficiency testing programs containing a minimum of 10 participating laboratories.
1.3 The methods provide direction for assessing and categorizing the performance of individual laboratories based on the relative likelihood of occurrence of their test results,
and for determining estimates of testing variation associated with repeatability and reproducibility. Assumptions are that a majority of the participating laboratories execute the test method properly and that samples are of sufficient homogeneity that the testing results represent results obtained from each laboratory testing essentially the same material. Each laboratory receives the same instructions or protocol.
1.4 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety and health practices and determine the applicability of regulatory limitations prior to use.

2. Referenced Documents
2.1 ASTM Standards:²
E177 Practice for Use of the Terms Precision and Bias in ASTM Test Methods
E178 Practice for Dealing With Outlying Observations
E456 Terminology Relating to Quality and Statistics
E691 Practice for Conducting an Interlaboratory Study to Determine the Precision of a Test Method
E1301 Guide for Proficiency Testing by Interlaboratory Comparisons

3. Terminology
3.1 Definitions: The terminology defined in Terminology E456 applies to this practice unless modified herein.
3.1.1 collaborative study, n: interlaboratory study in which each laboratory uses the defined method of analysis to analyze identical portions of homogeneous materials to assess the performance characteristics obtained for that method of analysis.³
3.1.2 collaborative trial, n: see collaborative study.
3.1.3 interlaboratory comparison, n: organization, performance, and evaluation of tests on the same or similar test items by two or more laboratories in accordance with predetermined conditions.
3.1.4 median, n: middle value of a data set when the data is arranged in order of size, or the average of the middle two values when there is an even number of items in the data set.
3.1.5 outlier, n: see outlying observation. E178
3.1.6 outlying observation, n: observation that appears to deviate markedly in value from other members of the sample in which it appears. E178
3.1.7 proficiency testing, n: determination of laboratory testing performance by means of interlaboratory comparisons.
3.1.8 repeatability standard deviation (Sr), n: standard deviation of test results obtained under repeatability conditions. E177

¹ This practice is under the jurisdiction of ASTM Committee E11 on Quality and Statistics and is the direct responsibility of Subcommittee E11.20 on Test Method Evaluation and Quality Control.
Current edition approved Oct. 1, 2011. Published October 2011. Originally approved in 2006. Last previous edition as E2489-06ε1. DOI: 10.1520/E2489-11.
² For referenced ASTM standards, visit the ASTM website, www.astm.org, or
contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard's Document Summary page on the ASTM website.
³ Horwitz, W., "Protocol for the Design, Conduct and Interpretation of Collaborative Studies," Pure and Applied Chemistry, Vol 60, No. 6, 1988, pp. 855-864.
Copyright ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959, United States.

3.1.9 reproducibility standard deviation (SR), n: standard deviation of test results obtained under reproducibility conditions. E177
3.2 Definitions of Terms Specific to This Standard:
3.2.1 hinge (upper or lower), n: median of the upper or lower half of a set of data when the data is arranged in order of size.
3.2.1.1 Discussion: When there is an odd number of items in the data set, the middle value is included in both the upper and lower halves. The upper hinge is an estimate of the 75th percentile; the lower hinge is an estimate of the 25th percentile.
3.2.2 inner fence (upper or lower), n: value equal to the upper or lower hinge of a data set plus (upper) or minus (lower) 1.5 times the difference between upper and lower hinges.
3.2.3 interquartile range, n: distance between the upper and lower hinges of a data set.
3.2.4 outer fence (upper or lower), n: value equal to the upper or lower hinge of a data set plus (upper) or minus (lower) three times the difference between upper and lower hinges.

4. Summary of Practice
4.1 This practice describes methods of displaying interlaboratory data that visually show individual laboratory results.
4.2 The methods described in this practice can be applied to large and small sample populations from any distribution expected to have a general mound shape. Where the data are suspected to be highly unsymmetrical or very unusual in some other manner, it is recommended that a statistician be consulted regarding the applicability of the analysis method.
4.2.1 The median is used as the "consensus" value of the measured test property.
4.2.2 The interquartile range (IQR) is used as the basis for estimating the spread in the data. Because the median and the interquartile range are not affected by the magnitude of extreme values of a data set, the analysis approach presented in this practice effectively eliminates the need to identify outlying observations (outliers).
4.3 Laboratory results are categorized according to how far the results lie outside of the interquartile range.
4.4 The upper and lower ends of the interquartile range are referred to as the hinges. The limits for categorizing laboratory results lying outside of the interquartile range are determined by multiplying the extent of the interquartile range by
the fixed factors of 1.5 and 3.0. The upper and lower limits lying a distance of 1.5 times the IQR beyond the hinges are referred to as the inner fences. The upper and lower limits lying a distance of 3.0 times the IQR beyond the hinges are referred to as the outer fences.
4.5 Guidance is provided for proficiency testing programs wishing to establish additional limits (or fences). The user is directed to Guide E1301 for additional guidance.
4.6 When using the methods in this practice, the number of participating laboratories should be at least ten. Since the degree of confidence is lower for analyses performed on small sample populations, caution should be used in applying statistics obtained from small sample populations.
4.7 When possible, it is generally desirable to have 30 or more participants when estimating the precision of test methods.
4.8 Estimates of the repeatability standard deviation and the reproducibility standard deviation are determined by dividing the interquartile ranges of appropriate data sets by a factor of 1.35.
4.8.1 The number 1.35 used in determining the repeatability and reproducibility standard deviations is based on an assumption of similarity to a normal distribution. Therefore, the estimate of the standard deviation using the methods described in this practice may not supply the desired accuracy if the distribution differs too much from the general shape of a normal curve. It is beyond the scope of this practice to describe procedures for
determining when the analysis methods described in this practice are not applicable.

5. Significance and Use
5.1 This practice is specifically designed to describe simple robust statistical methods for use in proficiency testing programs.
5.2 Proficiency testing programs can use the methods in this practice for the purpose of comparing testing results obtained from a group of participating laboratories. The laboratory comparisons can then be used for evaluation of individual laboratory performance.
5.3 In addition, the data obtained in proficiency testing programs may contain information regarding repeatability (within-lab) and reproducibility (between-lab) testing variation. Repeatability information is possible only if the program uses more than one sample. See Method B. Proficiency testing programs often have a greater number of participants than might be available for conducting an interlaboratory study to determine the precision of a test method (such as described in Practice E691). Precision estimates obtained for the larger number of participants in a proficiency testing program, along with the corresponding wider variation of test conditions, can provide useful information to standards developers regarding the precision of test results that can be expected for a test method when in actual use in the general testing community.
5.4 To estimate the precision of a test method, the participants must use the same test method to obtain their test results, and testing must be performed under the conditions required for repeatability and reproducibility. The precision estimates are applicable to the property levels and material types included in the testing program. The precision of a test method may vary considerably for different material types and at different property levels.
5.5 This practice may be useful to proficiency testing program administrators and provides examples of statistical methods along with explanations of some of the advantages of the suggested methods of analysis. The analyses resulting from the application of methods described in this practice may be used by laboratories as part of their quality control procedures, by accrediting bodies to assist in the evaluation of laboratory performance, and by ASTM International technical committees (and other organizations charged with the task of writing, maintaining, or improving test methods) to obtain information regarding reproducibility and repeatability.
5.6 There are many types of proficiency testing programs in existence and many methods exist for analyzing the data resulting from the interlaboratory testing. It is not the intention of this practice to call into question the integrity of programs using other methods of analysis. Testing programs using replicate testing of one or more samples (each laboratory submits two or more results for each sample) are directed to Practice E691 or other practices for the description of a method of analysis that may be more suitable to that type of program.

6. Analysis of a One-Sample Program (Method A)
6.1 Display of Data:
6.1.1 When possible, display the data in a table to show the actual results submitted by each laboratory. This may not be practical if the number of participants is too large.
6.1.1.1 To assist in maintaining confidentiality, give each laboratory an identification number if one does not already exist.
6.1.1.2 List the laboratory results in increasing order by laboratory identification number to make it easy to locate the results for a particular laboratory. See Table 1.
6.1.2 Sort the laboratory results in decreasing order by test result to show the range and distribution of the test results. See Table 2. Besides the laboratory identification number and corresponding test results, Table 2 contains columns of additional information that will be explained in the following sections of this practice.
6.1.3 Display the data in a dot diagram to show the location of
each laboratory's test result in the distribution of all test results. For each test result, plot the occurrence number of that test result value versus the value of the test result. As points are plotted from the top of Table 2 to the bottom, the first time a test value occurs assign it an occurrence of "one." The next time that test result value occurs, assign it an occurrence of "two." If the test result value appears a third time, assign it an occurrence of "three" and so forth. If a test result value appears three times in the data, plot the test result value three times, once with an occurrence of "one," once with an occurrence of "two," and once with an occurrence of "three." The consequence is that each laboratory's test result will be plotted as an individual dot and no dots will be concealed by being plotted on top of one another.
6.1.3.1 Fig. 1 shows the dot diagram for the data in Table 2. There are no repeat values in the test results, so Column 3 of Table 2 shows that the number of occurrences is "one" for each test result and the dots in Fig. 1 appear in a single horizontal row. The dot diagram in Fig. 1 also shows that the test result for

TABLE 1 Original Data for a One-Sample Program

Lab    Test Result
  1    1.22
  2    1.62
  3    1.82
  4    0.60
  5    2.75
  6    1.55
  7    1.17
  8    1.76
  9    1.35
 10    1.18
 11    1.19
 12    1.71
 13    2.03
 14    1.10
 15    1.84
 16    1.39
 17    1.13
 18    1.66
 19    1.28
 20    1.24
 21    0.69
 22    1.54
 23    1.43
 24    0.84
 25    0.98
 26    1.97
 27    4.89
 28    1.85
 29    1.09
 30    1.07

TABLE 2 Data in Descending Order for One-Sample Program

Count of Labs      Lab    Test Result    Number of Occurrences    Category
                    27    4.89           1                        Extremely Unusual
                     5    2.75           1                        Unusual
                    13    2.03           1                        Typical
                    26    1.97           1                        Typical
                    28    1.85           1                        Typical
                    15    1.84           1                        Typical
                     3    1.82           1                        Typical
8th from Top         8    1.76           1                        Typical
                    12    1.71           1                        Typical
                    18    1.66           1                        Typical
                     2    1.62           1                        Typical
                     6    1.55           1                        Typical
                    22    1.54           1                        Typical
                    23    1.43           1                        Typical
15th from Top       16    1.39           1                        Typical
16th from Top        9    1.35           1                        Typical
                    19    1.28           1                        Typical
                    20    1.24           1                        Typical
                     1    1.22           1                        Typical
                    11    1.19           1                        Typical
                    10    1.18           1                        Typical
                     7    1.17           1                        Typical
8th from Bottom     17    1.13           1                        Typical
                    14    1.10           1                        Typical
                    29    1.09           1                        Typical
                    30    1.07           1                        Typical
                    25    0.98           1                        Typical
                    24    0.84           1                        Typical
                    21    0.69           1                        Typical
                     4    0.60           1                        Typical

Shown Below Is Determination of "Fences" for Data Above
Median of All Test Results = 1.37
Upper Hinge (Median of Top Half) = 1.76
Lower Hinge (Median of Bottom Half) = 1.13
Interquartile Range (IQR) = (1.76 - 1.13) = 0.63
(3 × IQR) = 1.89
Outer Fence (Upper) = (1.76 + 1.89) = 3.65
Oute
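
As an illustration only, the hinge and fence determination shown for Table 2 can be sketched in Python using the Table 1 data. This sketch is not part of the practice. The function names (hinges, fences, categorize) are hypothetical, and the mapping of category labels to fences (beyond an outer fence is "Extremely Unusual," between an inner and an outer fence is "Unusual," otherwise "Typical") is an assumption inferred from the Table 2 entries.

```python
from statistics import median

# Table 1 test results keyed by laboratory identification number.
RESULTS = {
    1: 1.22, 2: 1.62, 3: 1.82, 4: 0.60, 5: 2.75, 6: 1.55,
    7: 1.17, 8: 1.76, 9: 1.35, 10: 1.18, 11: 1.19, 12: 1.71,
    13: 2.03, 14: 1.10, 15: 1.84, 16: 1.39, 17: 1.13, 18: 1.66,
    19: 1.28, 20: 1.24, 21: 0.69, 22: 1.54, 23: 1.43, 24: 0.84,
    25: 0.98, 26: 1.97, 27: 4.89, 28: 1.85, 29: 1.09, 30: 1.07,
}


def hinges(values):
    """Return (lower hinge, upper hinge): the medians of the lower and upper
    halves of the ordered data. With an odd count, the middle value is
    included in both halves."""
    ordered = sorted(values)
    half = (len(ordered) + 1) // 2
    return median(ordered[:half]), median(ordered[-half:])


def fences(values):
    """Return inner (1.5 x IQR) and outer (3.0 x IQR) fences as (lower, upper)."""
    lower, upper = hinges(values)
    iqr = upper - lower  # interquartile range: distance between the hinges
    return {
        "inner": (lower - 1.5 * iqr, upper + 1.5 * iqr),
        "outer": (lower - 3.0 * iqr, upper + 3.0 * iqr),
    }


def categorize(value, limits):
    """Label a result by its position relative to the fences.
    The label mapping is an assumption inferred from Table 2."""
    lower_outer, upper_outer = limits["outer"]
    lower_inner, upper_inner = limits["inner"]
    if value < lower_outer or value > upper_outer:
        return "Extremely Unusual"
    if value < lower_inner or value > upper_inner:
        return "Unusual"
    return "Typical"


if __name__ == "__main__":
    limits = fences(RESULTS.values())
    # List results in descending order of test result, as in Table 2.
    for lab, value in sorted(RESULTS.items(), key=lambda item: -item[1]):
        print(lab, value, categorize(value, limits))
```

Applied to the Table 1 data, the sketch gives hinges of 1.13 and 1.76 and, to rounding, an upper outer fence of 3.65. It labels Lab 27 as "Extremely Unusual" and Lab 5 as "Unusual," in agreement with Table 2.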