Designation: D6708 – 16b
An American National Standard

Standard Practice for
Statistical Assessment and Improvement of Expected Agreement Between Two Test Methods that Purport to Measure the Same Property of a Material¹

This standard is issued under the fixed designation D6708; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.

1. Scope*

1.1 This practice covers statistical methodology for assessing the expected agreement between two standard test methods that purport to measure the same property of a material, and deciding if a simple linear bias correction can further improve the expected agreement. It is intended for use with results collected from an interlaboratory study meeting the requirements of Practice D6300 or equivalent (for example, ISO 4259). The interlaboratory study must be conducted on at least ten materials that span the intersecting scopes of the test methods, and results must be obtained from at least six laboratories using each method.

1.2 The statistical methodology is based on the premise that a bias correction will not be needed. In the absence of strong statistical evidence that a bias correction would result in better agreement between the two methods, a bias correction is not made. If a bias correction is required, then the parsimony principle is followed whereby a simple correction is to be favored over a more complex one.

NOTE 1: Failure to adhere to the parsimony principle generally results in models that are over-fitted and do not perform well in practice.

1.3 The bias corrections of this practice are limited to a constant correction, a proportional correction, or a linear (proportional + constant) correction.

1.4 The bias-correction methods of this practice are method symmetric, in the sense that equivalent corrections are obtained regardless of which method is bias-corrected to match the other.

1.5 A methodology is presented for establishing the 95 % confidence limit (designated by this practice as the between methods reproducibility) for the difference between two results where each result is obtained by a different operator using different apparatus and each applying one of the two methods X and Y on identical material, where one of the methods has been appropriately bias-corrected in accordance with this practice.

NOTE 2: In earlier versions of this standard practice, the term “cross-method reproducibility” was used in place of the term “between methods reproducibility.” The change was made because the “between methods reproducibility” term is more intuitive and less confusing. It is important to note that these two terms are synonymous and interchangeable with one another, especially in cases where the “cross-method reproducibility” term was subsequently referenced by name in methods where a D6708 assessment was performed, before the change in terminology in this standard practice was adopted.

NOTE 3: Users are cautioned against applying the between methods reproducibility as calculated from this practice to materials that are significantly different in composition from those actually studied, as the ability of this practice to detect and address sample-specific biases (see 6.8) is dependent on the materials selected for the interlaboratory study. When sample-specific biases are present, the types and ranges of samples may need to be expanded significantly from the minimum of ten as specified in this practice in order to obtain more comprehensive and reliable 95 % confidence limits for between methods reproducibility that adequately cover the range of sample-specific biases for different types of materials.

1.6 This practice is intended for test methods which measure quantitative (numerical) properties of petroleum or petroleum products.

1.7 The statistical methodology outlined in this practice is also applicable for assessing the expected agreement between any two test methods that purport to measure the same property of a material, provided the results are obtained on the same comparison sample set, the standard error associated with each test result is known, and the sample set design meets the requirements of this practice, in particular that the statistical degrees of freedom associated with all standard errors are 30 or greater.
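As an illustration of 1.3 and 1.4, the three permitted correction forms can all be written as Ŷ = a + bX, with b = 1 for a constant correction and a = 0 for a proportional one. The sketch below is a minimal illustration and not part of the practice: the class name `BiasCorrection`, its methods `apply` and `inverse`, and all numeric values are assumptions made for this example. Inverting the linear form shows one sense in which correcting method X to match method Y corresponds to an equivalent correction of Y to match X. How the practice estimates a and b is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BiasCorrection:
    """Hypothetical container for a correction of the form Y_hat = a + b * X."""

    a: float = 0.0  # constant term (a = 0 for a purely proportional correction)
    b: float = 1.0  # proportional term (b = 1 for a purely constant correction)

    def apply(self, x: float) -> float:
        """Predict a method-Y result from a single method-X result."""
        return self.a + self.b * x

    def inverse(self) -> "BiasCorrection":
        """Return the equivalent correction mapping method-Y results onto method X."""
        if self.b == 0:
            raise ValueError("b must be non-zero to invert the correction")
        # Solving Y = a + b*X for X gives X = -a/b + (1/b)*Y.
        return BiasCorrection(a=-self.a / self.b, b=1.0 / self.b)


# Illustrative values only; they do not come from any interlaboratory study.
constant = BiasCorrection(a=0.5)        # Y_hat = X + 0.5
proportional = BiasCorrection(b=1.02)   # Y_hat = 1.02 * X
linear = BiasCorrection(a=0.5, b=1.02)  # Y_hat = 0.5 + 1.02 * X

x_result = 10.0
y_predicted = linear.apply(x_result)
x_recovered = linear.inverse().apply(y_predicted)
```

Keeping all three forms in one parameterization means that moving from no correction to a constant, proportional, or linear correction only changes which of `a` and `b` are free, which matches the parsimony ordering described in 1.2.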
¹ This practice is under the jurisdiction of ASTM Committee D02 on Petroleum Products, Liquid Fuels, and Lubricants and is the direct responsibility of Subcommittee D02.94 on Coordinating Subcommittee on Quality Assurance and Statistics.
Current edition approved June 15, 2016. Published August 2016. Originally approved in 2001. Last previous edition approved in 2016 as D6708 – 16a. DOI: 10.1520/D6708-16B.

This document is not an ASTM standard and is intended only to provide the user of an ASTM standard an indication of what changes have been made to the previous version. Because it may not be technically possible to adequately depict all changes accurately, ASTM recommends that users consult prior editions as appropriate. In all cases only the current version of the standard as published by ASTM is to be considered the official document.

*A Summary of Changes section appears at the end of this standard.
Copyright ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959. United States

2. Referenced Documents

2.1 ASTM Standards:²
D5580 Test Method for Determination of Benzene, Toluene, Ethylbenzene, p/m-Xylene, o-Xylene, C9 and Heavier Aromatics, and Total Aromatics in Finished Gasoline by Gas Chromatography
D5769 Test Method for Determination of Benzene, Toluene, and Total Aromatics in Finished Gasolines by Gas Chromatography/Mass Spectrometry
D6299 Practice for Applying Statistical Quality Assurance and Control Charting Techniques to Evaluate Analytical Measurement System Performance
D6300 Practice for Determination of Precision and Bias Data for Use in Test Methods for Petroleum Products and Lubricants
D7372 Guide for Analysis and Interpretation of Proficiency Test Program Results

2.2 ISO Standard:³
ISO 4259
Petroleum Products: Determination and application of precision data in relation to methods of test.

3. Terminology

3.1 Definitions:

3.1.1 between ILCP method-averages reproducibility ($R_{ILCP\_X,\,ILCP\_Y}$), n: a quantitative expression of the random error associated with the difference between the bias-corrected ILCP average of method X versus the ILCP average of method Y from a Proficiency Testing program, when method X has been assessed versus method Y, and an appropriate bias-correction has been applied to all method X results in accordance with this practice; it is defined as the 95 % confidence limit for the difference between two such averages.

3.1.2 between-method bias, n: a quantitative expression for the mathematical correction that can statistically improve the degree of agreement between the expected values of two test methods which purport to measure the same property.

3.1.3 between methods reproducibility ($R_{XY}$), n: a quantitative expression of the random error associated with the difference between two results obtained by different operators using different apparatus and applying the two methods X and Y, respectively, each obtaining a single result on an identical test sample, when the methods have been assessed and an appropriate bias-correction has been applied in accordance with this practice; it is defined as the 95 % confidence limit for the difference between two such single and independent results.

3.1.3.1 Discussion: A statement of between methods reproducibility must include a description of any bias correction used in accordance with this practice.

3.1.3.2 Discussion: Between methods reproducibility is a meaningful concept only if there are no statistically observable sample-specific relative biases between the two methods, or if such biases vary from one sample to another in such a way that they may be considered random effects (see 6.7).

3.1.4 closeness sum of squares (CSS), n: a statistic used to quantify the degree of agreement between the results from two test methods after bias-correction using the methodology of this practice.

3.1.5 Interlaboratory Crosscheck Program (ILCP), n: ASTM International Proficiency Test Program sponsored by Committee D02 on Petroleum Products, Liquid Fuels, and Lubricants; see ASTM website for current details. D7372

3.1.6 total sum of squares (TSS), n: a statistic used to quantify the information content from the inter-laboratory study in terms
of total variation of sample means relative to the standard error of each sample mean.

3.2 Symbols:
$X, Y$ = single X-method and Y-method results, respectively
$X_{ijk}, Y_{ijk}$ = single results from the X-method and Y-method round robins, respectively
$X_i, Y_i$ = means of results on the ith round robin sample
$S$ = the number of samples in the round robin
$L_{Xi}, L_{Yi}$ = the numbers of laboratories that returned results on the ith round robin sample
$R_X, R_Y$ = the reproducibilities of the X- and Y-methods, respectively
$R_{Xi}, R_{Yi}$ = the reproducibilities of methods X and Y, evaluated at the method X and Y means of the ith round robin sample, respectively
$R_{ILCP\_X,\,ILCP\_Y}$ = estimate of between ILCP method-averages reproducibility
$s_{RXi}, s_{RYi}$ = the reproducibility standard deviations, evaluated at the method X and Y means of the ith round robin sample
$s_{rXi}, s_{rYi}$ = the repeatability standard deviations, evaluated at the method X and Y means of the ith round robin sample
$s_{Xi}, s_{Yi}$ = standard errors of the means of the ith round robin sample
$\bar{X}, \bar{Y}$ = the weighted means of round robins (across samples)
$x_i, y_i$ = deviations of the means of the ith round robin sample results from $\bar{X}$ and $\bar{Y}$, respectively
$TSS_X, TSS_Y$ = total sums of squares, around $\bar{X}$ and $\bar{Y}$
$F$ = a ratio for comparing variances; more than one use
$v_X, v_Y$ = the degrees of freedom for reproducibility variances from the round robins
$w_i$ = weight associated with the difference between mean results (or corrected mean results) from the ith round robin sample
$CSS$ = weighted sum of squared differences between (possibly corrected) mean results from the round robin
$a, b$ = parameters of a linear correction: $\hat{Y} = a + bX$
$t_1, t_2$ = ratios for assessing reductions in sums of squares
$R_{XY}$ = estimate of between methods reproducibility
$\hat{Y}$ = predicted Y-method value for a sample, obtained by applying the bias correction established from this practice to an actual X-method result for the same sample
$\hat{Y}_i$ = predicted ith round robin sample Y-method mean, obtained by applying the bias correction established from this practice to its corresponding X-method mean
$\varepsilon_i$ = standardized difference between $Y_i$ and $\hat{Y}_i$
$L_X, L_Y$ = harmonic mean numbers of laboratories submitting results on round robin samples, by X- and Y-methods, respectively
$R_{X\hat{Y}}$ = estimate of between methods reproducibility, computed from an X-method result only

² For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard's Document Summary page on the ASTM website.
³ Available from American National Standards Institute (ANSI), 25 W. 43rd St., 4th Floor, New York, NY 10036.

4. Summary of Practice

4.1 Precisions of the two methods are quantified using inter-laboratory studies meeting the requirements of Practice D6300 or equivalent, using at least ten samples in common that span
the intersecting scopes of the methods. The arithmetic means of the results for each common sample obtained by each method are calculated. Estimates of the standard errors of these means are computed.

NOTE 4: For established standard test methods, new precision studies generally will be required in order to meet the common sample requirement.

NOTE 5: The two test methods do not need to be run by the same laboratory. If they are, care should be taken to ensure the independent test result requirement of Practice D6300 is met (for example, by double-blind testing of samples in random order).

4.2 Weighted sums of squares are computed for the total variation of the mean results across all common samples for each method. These sums of squares are assessed against the standard errors of the mean results for each method to ensure that the samples are sufficiently varied before continuing with the practice.

4.3 The closeness of agreement of the mean results by each method is evaluated using appropriate weighted sums of squared differences. Such sums of squares are computed from the data first with no bias correction, then with a constant bias correction, then, when appropriate, with a proportional correction, and finally with a linear (proportional + constant) correction.

4.4 The weighted sum of squared differences for the linear correction is assessed against the total variation in the mean results for both methods to ensure that there is sufficient correlation between the two methods.

4.5 The most parsimonious bias correction is selected.

4.6 The weighted sum of squares of differences, after applying the selected bias correction, is assessed to determine whether additional unexplained sources of variation remain in the residual (that is, the individual $Y_i$ minus bias-corrected $X_i$) data. Any remaining, unexplained variation is attributed to sample-specific biases (also known as method-material interactions, or matrix effects). In the absence of sample-specific biases, the between methods reproducibility is estimated.

4.7 If sample-specific biases are present, the residuals (that is, the individual $Y_i$ minus bias-corrected $X_i$) are tested for randomness. If they are found to be consistent with a random-effects model, then their contribution to the between methods reproducibility is estimated, and accumulated into an all-encompassing between methods reproducibility estimate.

4.8 Refer to Fig. 1 for a simplified flow diagram of the process described in this practice.

5. Significance and Use

5.1 This practice can be used to determine if a constant, proportional, or linear bias correction can improve the degree of agreement between two methods that purport to measure the same property of a material.

5.2 The bias correction developed in this practice can be applied to a single result ($X$) obtained from one test method (method X) to obtain a predicted result ($\hat{Y}$) for the other test method (method Y).

NOTE 6: Users are cautioned to ensure that $\hat{Y}$ is within the scope of method Y before its use.

5.3 The between methods reproducibility established by this practice can be used to construct an interval around $\hat{Y}$ that would contain the result of test method Y, if it were conducted, with about 95 % confidence.

5.4 This practice can be used to guid
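To illustrate 5.2 and 5.3, the following sketch applies a linear correction to a single method-X result and builds an interval around the predicted method-Y value. It is a hedged illustration only: the function names `predicted_interval` and `within_scope`, their parameters, and the numbers are assumptions made for this example. Using the between methods reproducibility as the half-width of the interval is inferred from its definition in 3.1.3 as the 95 % limit on the difference between two results, not from a formula quoted in this section.

```python
from typing import Tuple


def predicted_interval(x: float, a: float, b: float, r_xy: float) -> Tuple[float, float, float]:
    """Return (y_hat, lower, upper) for a single method-X result.

    a, b  : parameters of an already-established correction Y_hat = a + b * X
    r_xy  : between methods reproducibility, in the units of method Y
    """
    if r_xy < 0:
        raise ValueError("r_xy must be non-negative")
    y_hat = a + b * x
    # Assumption: the interval is symmetric, y_hat +/- r_xy.
    return y_hat, y_hat - r_xy, y_hat + r_xy


def within_scope(y_hat: float, scope_min: float, scope_max: float) -> bool:
    """Check, as NOTE 6 cautions, that the prediction lies inside method Y's scope."""
    return scope_min <= y_hat <= scope_max


# Illustrative values only; they do not come from any interlaboratory study.
y_hat, lower, upper = predicted_interval(x=10.0, a=0.5, b=1.02, r_xy=0.8)
usable = within_scope(y_hat, scope_min=1.0, scope_max=50.0)
```

A caller would normally check `within_scope` before reporting the interval, since a prediction outside the scope of method Y should not be used.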