ASTM D6708-2016a red 1365 Standard Practice for Statistical Assessment and Improvement of Expected Agreement Between Two Test Methods that Purport to Measure the Same Property of a.pdf

Designation: D6708-16a [previously D6708-16]

An American National Standard

Standard Practice for Statistical Assessment and Improvement of Expected Agreement Between Two Test Methods that Purport to Measure the Same Property of a Material¹

This standard is issued under the fixed designation D6708; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.

1. Scope*

1.1 This practice covers statistical methodology for assessing the expected agreement between two standard test methods that purport to measure the same property of a material, and deciding if a simple linear bias correction can further improve the expected agreement. It is intended for use with results collected from an interlaboratory study meeting the requirement of Practice D6300 or equivalent (for example, ISO 4259). The interlaboratory study must be conducted on at least ten materials that span the intersecting scopes of the test methods, and results must be obtained from at least six laboratories using each method.

1.2 The statistical methodology is based on the premise that a bias correction will not be needed. In the absence of strong statistical evidence that a bias correction would result in better agreement between the two methods, a bias correction is not made. If a bias correction is required, then the parsimony principle is followed whereby a simple correction is to be favored over a more complex one.

NOTE 1: Failure to adhere to the parsimony principle generally results in models that are over-fitted and do not perform well in practice.

1.3 The bias corrections of this practice are limited to a constant correction, proportional correction or a linear (proportional + constant) correction.

1.4 The bias-correction methods of this practice are method symmetric, in the sense that equivalent corrections are obtained regardless of which method is bias-corrected to match the other.

1.5 A methodology is presented for establishing the 95 % confidence limit (designated by this practice as the between methods reproducibility) for the difference between two results where each result is obtained by a different operator using different apparatus and each applying one of the two methods X and Y on identical material, where one of the methods has been appropriately bias-corrected in accordance with this practice.

NOTE 2: In earlier versions of this standard practice, the term “cross-method reproducibility” was used in place of the term “between methods reproducibility.” The change was made because the “between methods reproducibility” term is more intuitive and less confusing. It is important to note that these two terms are synonymous and interchangeable with one another, especially in cases where the “cross-method reproducibility” term was subsequently referenced by name in methods where a D6708 assessment was performed, before the change in terminology in this standard practice was adopted.

NOTE 3: Users are cautioned against applying the between methods reproducibility as calculated from this practice to materials that are significantly different in composition from those actually studied,
as the ability of this practice to detect and address sample-specific biases (see 6.8) is dependent on the materials selected for the interlaboratory study. When sample-specific biases are present, the types and ranges of samples may need to be expanded significantly from the minimum of ten specified in this practice in order to obtain more comprehensive and reliable 95 % confidence limits for between methods reproducibility that adequately cover the range of sample-specific biases for different types of materials.

1.6 This practice is intended for test methods which measure quantitative (numerical) properties of petroleum or petroleum products.

1.7 The statistical methodology outlined in this practice is also applicable for assessing the expected agreement between any two test methods that purport to measure the same property of a material, provided the results are obtained on the same comparison sample set, the standard error associated with each test result is known, the sample set design meets the requirement of this practice, and the statistical degree of freedom of the data set exceeds 30.

¹ This practice is under the jurisdiction of ASTM Committee D02 on Petroleum Products, Liquid Fuels, and Lubricants and is the direct responsibility of Subcommittee D02.94 on Coordinating Subcommittee on Quality Assurance and Statistics. Current edition approved April 1, 2016 [previously Jan. 1, 2016]. Published April 2016 [previously February 2016]. Originally approved in 2001. Last previous edition approved in 2016 as D6708-16 [previously in 2015 as D6708-15]. DOI: 10.1520/D6708-16A [previously 10.1520/D6708-16].

This document is not an ASTM standard and is intended only to provide the user of an ASTM standard an indication of what changes have been made to the previous version. Because it may not be technically possible to adequately depict all
changes accurately, ASTM recommends that users consult prior editions as appropriate. In all cases only the current version of the standard as published by ASTM is to be considered the official document.

*A Summary of Changes section appears at the end of this standard.

Copyright ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959. United States

2. Referenced Documents

2.1 ASTM Standards:²
D5580 Test Method for Determination of Benzene, Toluene, Ethylbenzene, p/m-Xylene, o-Xylene, C9 and Heavier Aromatics, and Total Aromatics in Finished Gasoline by Gas Chromatography
D5769 Test Method for Determination of Benzene, Toluene, and Total Aromatics in Finished Gasolines by Gas Chromatography/Mass Spectrometry
D6299 Practice for Applying Statistical Quality Assurance and Control Charting Techniques to Evaluate Analytical Measurement System Performance
D6300 Practice for Determination of Precision and Bias Data for Use in Test Methods for Petroleum Products and Lubricants
D7372 Guide for Analysis and Interpretation of Proficiency Test Program Results

2.2 ISO Standard:³
ISO 4259 Petroleum Products - Determination and application of precision data in relation to methods
of test.

3. Terminology

3.1 Definitions:

3.1.1 between-method bias, n: a quantitative expression for the mathematical correction that can statistically improve the degree of agreement between the expected values of two test methods which purport to measure the same property.

3.1.2 between methods reproducibility ($R_{XY}$), n: a quantitative expression of the random error associated with the difference between two results obtained by different operators using different apparatus and applying the two methods X and Y, respectively, each obtaining a single result on an identical test sample, when the methods have been assessed and an appropriate bias-correction has been applied in accordance with this practice; it is defined as the 95 % confidence limit for the difference between two such single and independent results.

3.1.2.1 Discussion: A statement of between methods reproducibility must include a description of any bias correction used in accordance with this practice.

3.1.2.2 Discussion: Between methods reproducibility is a meaningful concept only if there are no statistically observable sample-specific relative biases between the two methods, or if such biases vary from one sample to another in such a way that they may be considered random effects. (See 6.7.)

3.1.3 closeness sum of squares (CSS), n: a statistic used to quantify the degree of agreement between the results from two test methods after bias-correction using the methodology of this practice.

3.1.4 total sum of squares (TSS), n: a statistic used
to quantify the information content from the inter-laboratory study in terms of total variation of sample means relative to the standard error of each sample mean.

² For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard's Document Summary page on the ASTM website.
³ Available from American National Standards Institute (ANSI), 25 W. 43rd St., 4th Floor, New York, NY 10036.

3.2 Symbols:
X, Y = single X-method and Y-method results, respectively
$X_{ijk}, Y_{ijk}$ = single results from the X-method and Y-method round robins, respectively
$\bar{X}_i, \bar{Y}_i$ = means of results on the ith round robin sample
S = the number of samples in the round robin
$L_{Xi}, L_{Yi}$ = the numbers of laboratories that returned results on the ith round robin sample
$R_X, R_Y$ = the reproducibilities of the X- and Y-methods, respectively
$s_{RXi}, s_{RYi}$ = the reproducibility standard deviations, evaluated at the means of the ith round robin sample
$s_{rXi}, s_{rYi}$ = the repeatability standard deviations, evaluated at the means of the ith round robin sample
$s_{Xi}, s_{Yi}$ = standard errors of the means of the ith round robin sample
$\bar{X}, \bar{Y}$ = the weighted means of round robins (across samples)
$x_i, y_i$ = deviations of the means of the ith round robin sample results from $\bar{X}$ and $\bar{Y}$, respectively
$TSS_X, TSS_Y$ = total sums of squares, around $\bar{X}$ and $\bar{Y}$
F = a ratio for comparing variances; more than one use [previously: not unique]
$\nu_X, \nu_Y$ = the degrees of
freedom for reproducibility variances from the round robins
$w_i$ = weight associated with the difference between mean results (or corrected mean results) from the ith round robin sample
CSS = weighted sum of squared differences between (possibly corrected) mean results from the round robin
a, b = parameters of a linear correction: $\hat{Y} = a + bX$
$t_1, t_2$ = ratios for assessing reductions in sums of squares
$R_{XY}$ = estimate of between methods reproducibility
$\hat{Y}$ = Y-method value predicted from X-method result
$\hat{Y}_i$ = ith round robin sample Y-method mean, predicted from corresponding X-method mean
$\varepsilon_i$ = standardized difference between $\bar{Y}_i$ and $\hat{Y}_i$
$L_X, L_Y$ = harmonic mean numbers of laboratories submitting results on round robin samples, by X- and Y-methods, respectively
RX Y = estimate of between methods reproducibility, computed from an X-method result only

4. Summary of Practice

4.1 Precisions of the two methods are quantified using inter-laboratory studies meeting the requirements of Practice D6300 or equivalent, using at least ten samples in common that span the intersecting scopes of the methods. The arithmetic means of the results for each common sample obtained by each method are calculated. Estimates of the standard errors of these means are computed.

NOTE 4: For established standard test methods, new precision studies generally will be required in order to meet the common sample requirement.

NOTE 5: Both test methods do not need to be run by the same laboratory. If they are, care should be taken to ensure the independent test result requirement of Practice D6300 is met (for example, by double-blind testing of samples in random order).

4.2 Weighted sums of squares are computed for the total variation of the mean results across all common samples for each method. These sums of squares are assessed against the
standard errors of the mean results for each method to ensure that the samples are sufficiently varied before continuing with the practice.

4.3 The closeness of agreement of the mean results by each method is evaluated using appropriate weighted sums of squared differences. Such sums of squares are computed from the data first with no bias correction, then with a constant bias correction, then, when appropriate, with a proportional correction, and finally with a linear (proportional + constant) correction.

4.4 The weighted sum of squared differences for the linear correction is assessed against the total variation in the mean results for both methods to ensure that there is sufficient correlation between the two methods.

4.5 The most parsimonious bias correction is selected.

4.6 The weighted sum of squares of differences, after applying the selected bias correction, is assessed to determine whether additional unexplained sources of variation remain in the residual (that is, the individual $\bar{Y}_i$ minus bias-corrected $\bar{X}_i$) data. Any remaining, unexplained variation is attributed to sample-specific biases (also known as method-material interactions, or matrix effects). In the absence of sample-specific biases, the between methods reproducibility is estimated.

4.7 If sample-specific biases are present, the residuals (that is, the individual $\bar{Y}_i$ minus bias-corrected $\bar{X}_i$) are tested for randomness. If they are found to be consistent with a random-effects model, then their contribution to the between
methods reproducibility is estimated, and accumulated into an all-encompassing between methods reproducibility estimate.

4.8 Refer to Fig. 1 for a simplified flow diagram of the process described in this practice.

5. Significance and Use

5.1 This practice can be used to determine if a constant, proportional, or linear bias correction can improve the degree of agreement between two methods that purport to measure the same property of a material.

5.2 The bias correction developed in this practice can be applied to a single result (X) obtained from one test method (method X) to obtain a predicted result ($\hat{Y}$) for the other test method (method Y).

NOTE 6: Users are cautioned to ensure that $\hat{Y}$ is within the scope of method Y before its use.

5.3 The between methods reproducibility established by this practice can be used to construct an interval around $\hat{Y}$ that would contain the result of test method Y, if it were conducted, with about 95 % confidence.

5.4 This practice can be used to guide commercial agreements and product disposition decisions involving test methods that have been evaluated relative to each other in accordance with this practice.

5.5 The magnitude of a statistically detectable bias is directly related to the uncertainties of the statistics from the experimental study. These uncertainties are related to both the size of the data set and the precision of the processes being studied. A large data set, or highly precise test method(s), or both, can reduce the uncertainties of experimental statistics to the point where the “statistically detectable” bias can become “trivially small,” or be considered of no practical consequence in the intended use of the test method under study. Therefore, users of this practice are advised to determine in advance as to the magnitude of bias
correction below which they would consider it to be unnecessary, or of no practical concern for the intended application, prior to execution of this practice.

NOTE 7: It should be noted that the determination of this minimum bias of no practical concern is not a statistical decision, but rather, a subjective decision that is directly dependent on the application requirements of the users.

6. Procedure

NOTE 8: For an in-depth statistical discussion of the methodology used in this section, see Appendix X1. For a worked example, see Appendix X2.

6.1 Calculate
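The weighted sums of squares named in 3.1.3, 3.1.4, 4.2, and 4.3 are described only in words in this excerpt; their formulas are not shown here. The Python sketch below is one plausible reading, not the practice's official equations. It assumes that TSS sums the squared deviations of the sample means from a precision-weighted mean, each scaled by its standard error, and that CSS sums squared differences between Y-method means and (possibly corrected) X-method means using caller-supplied weights. The function names and the inverse-variance weighting are hypothetical.

```python
from typing import Sequence


def weighted_mean(means: Sequence[float], std_errs: Sequence[float]) -> float:
    """Weighted mean of per-sample means.

    ASSUMPTION: each sample mean is weighted by 1 / (standard error)**2.
    The excerpt only says the means across samples are "weighted".
    """
    if not means or len(means) != len(std_errs):
        raise ValueError("means and std_errs must be non-empty and equal in length")
    weights = [1.0 / s ** 2 for s in std_errs]
    return sum(w * m for w, m in zip(weights, means)) / sum(weights)


def total_sum_of_squares(means: Sequence[float], std_errs: Sequence[float]) -> float:
    """TSS: variation of sample means relative to their standard errors.

    ASSUMPTION: TSS = sum(((mean_i - weighted_mean) / std_err_i) ** 2).
    """
    center = weighted_mean(means, std_errs)
    return sum(((m - center) / s) ** 2 for m, s in zip(means, std_errs))


def closeness_sum_of_squares(
    x_means: Sequence[float],
    y_means: Sequence[float],
    weights: Sequence[float],
    a: float = 0.0,
    b: float = 1.0,
) -> float:
    """CSS: weighted sum of squared differences between Y-method means and
    corrected X-method means, using the linear form a + b * x.

    The defaults a=0, b=1 correspond to no bias correction. The weights w_i
    are taken as inputs because their definition is not given in the excerpt.
    """
    if not (len(x_means) == len(y_means) == len(weights)):
        raise ValueError("x_means, y_means and weights must be equal in length")
    return sum(
        w * (y - (a + b * x)) ** 2
        for x, y, w in zip(x_means, y_means, weights)
    )
```

Calling `closeness_sum_of_squares` once with the defaults and again with candidate values of `a` and `b` mirrors the sequence in 4.3, where the sum is computed first with no correction and then with progressively richer corrections.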

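Paragraphs 5.2 and 5.3 describe how a finished assessment is used: a single X-method result is converted to a predicted Y-method value, and the between methods reproducibility gives an interval around that prediction. The Python sketch below illustrates this under stated assumptions. The linear form a + bX comes from the symbol list in 3.2; the symmetric interval (prediction plus or minus the between methods reproducibility) is an assumed reading of 5.3, and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class BiasCorrection:
    """Linear bias correction with parameters a and b (see 3.2).

    a=0, b=1 means no correction; b=1 alone gives a constant correction;
    a=0 alone gives a proportional correction.
    """
    a: float = 0.0
    b: float = 1.0

    def predict(self, x: float) -> float:
        """Predicted Y-method value for a single X-method result."""
        return self.a + self.b * x


def predicted_y_interval(
    x: float, correction: BiasCorrection, r_xy: float
) -> Tuple[float, float]:
    """Interval expected to contain a Y-method result with about 95 % confidence.

    ASSUMPTION: the interval is symmetric, prediction +/- r_xy, where r_xy is
    the between methods reproducibility supplied by the caller.
    """
    if r_xy < 0:
        raise ValueError("r_xy must be non-negative")
    y_hat = correction.predict(x)
    return (y_hat - r_xy, y_hat + r_xy)
```

As NOTE 6 cautions, the predicted value should be checked against the scope of method Y before the interval is used; that check is left to the caller here because the scope limits are specific to each method.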