Designation: E2489-16
An American National Standard

Standard Practice for
Statistical Analysis of One-Sample and Two-Sample Interlaboratory Proficiency Testing Programs¹

This standard is issued under the fixed designation E2489; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.

1. Scope
1.1 This practice describes methods for the statistical analysis of
laboratory results obtained from interlaboratory proficiency testing programs. In accordance with Guide E1301, proficiency testing is the use of interlaboratory comparisons for the determination of laboratory testing or measurement performance. Conversely, collaborative study (or collaborative trial) is the use of interlaboratory comparisons for the determination of the precision of a test method, as covered by Practice E691.
1.1.1 Method A covers testing programs using single test results obtained by testing a single sample (each laboratory submits a single test result).
1.1.2 Method B covers testing programs using paired test results obtained by testing two samples (each laboratory submits one test result for each of the two samples). The two samples should be of the same material or two materials similar enough to have approximately the same degree of variation in test results.
1.2 Methods A and B are applicable to proficiency testing programs containing a minimum of 10 participating laboratories.
1.3 The methods provide direction for assessing and categorizing the performance of individual laboratories based on the relative likelihood of occurrence of their test results, and for determining estimates of testing variation associated with repeatability and reproducibility. Assumptions are that a majority of the participating laboratories execute the test method properly and that samples are of sufficient homogeneity that the testing results represent results obtained from each laboratory testing essentially the same material. Each laboratory receives the same instructions or protocol.
1.4 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety and health practices and determine the applicability of regulatory limitations prior to use.

2. Referenced Documents
2.1 ASTM Standards:²
E177 Practice for Use of the Terms Precision and Bias in ASTM Test Methods
E178 Practice for Dealing With Outlying Observations
E456 Terminology Relating to Quality and Statistics
E691 Practice for Conducting an Interlaboratory Study to Determine the Precision of a Test Method
E1301 Guide for Proficiency Testing by Interlaboratory Comparisons (Withdrawn 2012)³
E2586 Practice for Calculating and Using Basic Statistics

3. Terminology
3.1 Definitions: The terminology defined in Terminology E456 applies to this practice unless modified herein.
3.1.1 collaborative study, n: interlaboratory study in which each laboratory uses the defined method of analysis to analyze identical portions of homogeneous materials to assess the performance characteristics obtained for that method of analysis. Horwitz⁴
3.1.2 collaborative trial, n: see collaborative study.
3.1.3 interlaboratory comparison, n: organization, performance, and evaluation of tests on the same or similar test items by two or more laboratories in accordance with predetermined conditions.
3.1.4 median, X, n: the 50th percentile in a population or sample. E2586
3.1.4.1 Discussion: The sample median is the (n + 1)/2 order statistic if the sample size n is odd and is the average of the n/2 and n/2 + 1 order statistics if n is even.
3.1.5 outlier, n: see outlying observation. E178
¹ This practice is under the jurisdiction of ASTM Committee E11 on
Quality and Statistics and is the direct responsibility of Subcommittee E11.20 on Test Method Evaluation and Quality Control.
Current edition approved Nov. 15, 2016. Published November 2016. Originally approved in 2006. Last previous edition approved in 2011 as E2489-11. DOI: 10.1520/E2489-16.
² For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard's Document Summary page on the ASTM website.
³ The last approved version of this historical standard is referenced on
www.astm.org.
⁴ Horwitz, W., “Protocol for the Design, Conduct and Interpretation of Collaborative Studies,” Pure and Applied Chemistry, Vol 60, No. 6, 1988, pp. 855-864.
Copyright ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959. United States
3.1.6 outlying observation, n: observation that appears to deviate markedly in value from other members of the sample in which it appears. E178
3.1.7 proficiency testing, n: determination of laboratory testing performance by means of interlaboratory comparisons.
3.1.8 repeatability standard deviation (Sr), n: standard deviation of test results obtained under repeatability conditions. E177
3.1.9 reproducibility standard deviation (SR), n: standard deviation of test results obtained under reproducibility conditions. E177
3.2 Definitions of Terms Specific to This Standard:
3.2.1 hinge (upper or lower), n: median of the upper or lower
half of a set of data when the data is arranged in order of size.
3.2.1.1 Discussion: When there is an odd number of items in the data set, the middle value is included in both the upper and lower halves. The upper hinge is an estimate of the 75th percentile; the lower hinge is an estimate of the 25th percentile.
3.2.2 inner fence (upper or lower), n: value equal to the upper or lower hinge of a data set plus (upper) or minus (lower) 1.5 times the difference between upper and lower hinges.
3.2.3 interquartile range, n: distance between the upper and lower hinges of a data set.
3.2.4 outer fence (upper or lower), n: value equal to the upper or lower hinge of a data set plus (upper) or minus (lower) three times the difference between upper and lower hinges.

4. Summary of Practice
4.1 This practice describes methods of displaying interlaboratory data that visually show individual laboratory results.
4.2 The methods described in this practice can be applied to large and small sample populations from any distribution expected to have a general mound shape. It is recommended that in cases in which it is suspected that the data may be highly unsymmetrical or very unusual in some other manner a statistician should be
consulted regarding the applicability of the analysis method.
4.2.1 The median is used as the “consensus” value of the measured test property.
4.2.2 The interquartile range (IQR) is used as the basis for estimating the spread in the data. Because the median and the interquartile range are not affected by
the magnitude of extreme values of a data set, the analysis approach presented in this practice effectively eliminates the need to identify outlying observations (outliers).
4.3 Laboratory results are categorized according to how far the results lie outside of the interquartile range.
4.4 The upper and lower ends of the interquartile range are referred to as the hinges. The limits for categorizing laboratory results lying outside of the interquartile range are determined by multiplying the extent of the interquartile range by the fixed factors of 1.5 and 3.0. The upper and lower limits lying a distance
of 1.5 times the IQR beyond the hinges are referred to as the inner fences. The upper and lower limits lying 3.0 times the IQR beyond the hinges are referred to as the outer fences.
4.5 Guidance is provided for proficiency testing programs wishing to establish additional limits (or fences). The user is directed to Guide E1301 for additional guidance.
4.6 When using the methods in this practice, the number of participating laboratories should be at least ten. Since the degree of confidence is lower for analyses performed on small sample populations, caution should
be used in applying statistics obtained from small sample populations.
4.7 When possible, it is generally desirable to have 30 or more participants when estimating the precision of test methods.
4.8 Estimates of the repeatability standard deviation and the reproducibility standard deviation are determined by dividing the interquartile ranges of appropriate data sets by a factor of 1.35.
4.8.1 The number 1.35 used in determining the repeatability and reproducibility standard deviations is based on an assumption of similarity to a normal distribution, for which the interquartile range is approximately 1.349 standard deviations. Therefore, the estimate of the standard deviation
using the methods described in this practice may not supply the desired accuracy if the distribution differs too much from the general shape of a normal curve. It is beyond the scope of this practice to describe procedures for determining when the analysis methods described in this practice are not applicable.

5. Significance and Use
5.1 This practice is specifically designed to describe simple robust statistical methods for use in proficiency testing programs.
5.2 Proficiency testing programs can use the methods in this practice for the purpose of comparing testing results obtained from a group of participating laboratories. The laboratory comparisons can then be used for evaluation of individual laboratory performance.
5.3 In addition, the data obtained in proficiency testing programs may contain information regarding repeatability (within-lab) and reproducibility (between-lab) testing variation.
Repeatability information is possible only if the program uses more than one sample. See Method B. Proficiency testing programs often have a greater number of participants than might be available for conducting an interlaboratory study to determine the precision of a test method (such as described in Practice E691). Precision estimates obtained for the larger number of participants in a proficiency testing program, along with the corresponding wider variation of test conditions, can provide useful information to standards developers regarding the precision of test results that can be expected for a test method when in actual use in the general testing community.
5.4 To estimate the precision of a test method, the participants must use the same test method to obtain their test results, and testing must be performed under the conditions required for repeatability and reproducibility. The precision estimates are applicable to the property levels and material types included in the testing program. The precision of a test method may vary considerably for different material types and at different property levels.
5.5 This practice may be useful to proficiency testing program administrators and provides examples of statistical methods along with explanations of some of the advantages of the suggested methods of analysis. The analyses resulting from the application of methods described in this practice may be used by laboratories as part of their quality control procedures, accrediting bodies to
assist in the evaluation of laboratory performance, and ASTM International technical committees (and other organizations charged with the task of writing, maintaining, or improving test methods) to obtain information regarding reproducibility and repeatability.
5.6 There are many types of proficiency testing programs in existence and many methods exist for analyzing the data resulting from the interlaboratory testing. It is not the intention of this practice to call into question the integrity of programs using other methods of analysis. Testing programs using replicate testing of one or more samples (each laboratory submits two or more results for each sample) are directed to Practice E691 or other practices for the description of a method of analysis that may be more suitable to that type of program.

6. Analysis of a One-Sample Program (Method A)
6.1 Display of Data:
6.1.1 When possible, display the data in a table to show the actual results submitted by each laboratory. This may not be practical if the number of participants is too large.
6.1.1.1 To assist in maintaining confidentiality, give each laboratory an identification number if one does not already exist.
6.1.1.2 List the laboratory results in increasing order by laboratory identification number to make it easy to locate the results for a particular laboratory. See Table 1.
6.1.2 Sort the laboratory results in decreasing order by test result to show the range and distribution of the test results. See Table 2. Besides the laboratory identification number and corresponding test results, Table 2 contains columns of additional information that will be explained in the following sections of this practice.
6.1.3 Display the data in a dot diagram to show the location of each laboratory's test result in the distribution of all test results. For each test result, plot the occurrence number of that test result value versus the value of the test result. As points are plotted from the top of Table 2 to the bottom, the first time a test value occurs assign it an occurrence of “one.” The next time that test result value occurs, assign it an occurrence of “two.” If the test result value appears a third time, assign it an occurrence of “three” and so forth. If a test result value appears three times in the data, plot the test result value three times, once with an occurrence of “one,” once with an occurrence of “two,” and once with an occurrence of “three.”
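The occurrence-numbering rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the practice: the function name occurrence_numbers and the sample values are hypothetical, and it assumes that results are reported to a fixed number of decimal places so that repeated values compare as exactly equal.

```python
from collections import defaultdict


def occurrence_numbers(results):
    """Pair each test result with its occurrence number for a dot diagram.

    `results` is a sequence of test results in table order (for example,
    sorted in decreasing order). The k-th time a value is encountered it
    receives occurrence number k, so no two plotted points coincide.
    """
    seen = defaultdict(int)  # how many times each value has appeared so far
    points = []
    for value in results:
        seen[value] += 1
        points.append((value, seen[value]))
    return points


# Hypothetical results, not taken from the tables of this practice.
example = [3.6, 3.4, 3.4, 3.4, 3.1, 3.1, 2.9]
points = occurrence_numbers(example)
# Each pair is (x, y) for the dot diagram: x = test result, y = occurrence.
for value, occurrence in points:
    print(value, occurrence)
```

Each returned pair gives the horizontal position (the test result) and the vertical position (the occurrence number) of one dot.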
The consequence is that each laboratory's test result will be plotted as an individual dot and no dots will be concealed by being plotted on top of one another.
6.1.3.1 Fig. 1 shows the dot diagram for the data in Table 2. There are no repeat values in the test results, so Column 3 of Table 2 shows that
the number of occurrences is “one” for each test result and the dots in Fig. 1 appear in a single horizontal row. The dot diagram in Fig. 1 also shows that the test result for Laboratory 5, at (2.75, 1), is slightly removed from the rest of the data. The test result for Laboratory 27, at (4.89, 1), is farther removed.
6.1.3.2 A dot diagram with a different appearance can be obtained by classifying the results into multiple contiguous size classes such that each class contains a portion of the data, but together, the classes cover the entire data range. Table 3 shows the number of occurrences in each size class when the range of each class is 0.10. When the numbers of occurrences in each size class are plotted versus the corresponding values of the lower ends of each size class (see Fig. 2), the display has the advantage of being more compact, and it is more apparent how test results are clustered. The dot diagram in Fig. 2 still shows that the test result for Laboratory 5 is slightly removed from the rest of the data and that the test result for Laboratory 27 is farther removed.
6.1.3.3 Other ranges for the size classes are permitted to be used to classify the tes