Designation: D 6708 – 08
An American National Standard

Standard Practice for Statistical Assessment and Improvement of Expected Agreement Between Two Test Methods that Purport to Measure the Same Property of a Material¹

This standard is issued under the fixed designation D 6708; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.

¹ This practice is under the jurisdiction of ASTM Committee D02 on Petroleum Products and Lubricants and is the direct responsibility of Subcommittee D02.94 on Coordinating Subcommittee on Quality Assurance and Statistics. Current edition approved Dec. 15, 2008. Published February 2009. Originally approved in 2001. Last previous edition approved in 2007 as D 6708 – 07.

Copyright ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959, United States.

1. Scope

1.1 This practice covers statistical methodology for assessing the expected agreement between two standard test methods that purport to measure the same property of a material, and deciding if a simple linear bias correction can further improve the expected agreement. It is intended for use with results collected from an interlaboratory study meeting the requirement of Practice D 6300 or equivalent (for example, ISO 4259). The interlaboratory study must be conducted on at least ten materials that span the intersecting scopes of the test methods, and results must be obtained from at least six laboratories using each method.

NOTE 1: Examples of standard test methods are those developed by voluntary consensus standards bodies such as ASTM, IP/BSI, DIN, AFNOR, CGSB.

1.2 The statistical methodology is based on the premise that a bias correction will not be needed. In the absence of strong statistical evidence that a bias correction would result in better agreement between the two methods, a bias correction is not made. If a bias correction is required, then the parsimony principle is followed whereby a simple correction is to be favored over a more complex one.

NOTE 2: Failure to adhere to the parsimony principle generally results in models that are over-fitted and do not perform well in practice.

1.3 The bias corrections of this practice are limited to a constant correction, proportional correction, or a linear (proportional + constant) correction.

1.4 The bias-correction methods of this practice are method-symmetric, in the sense that equivalent corrections are obtained regardless of which method is bias-corrected to match the other.

1.5 A methodology is presented for establishing the 95 % confidence limit (designated by this practice as the between methods reproducibility) for the difference between two results where each result is obtained by a different operator using different apparatus and each applying one of the two methods X and Y on identical material, where one of the methods has been appropriately bias-corrected in accordance with this practice.

NOTE 3: In earlier versions of this standard practice, the term "cross-method reproducibility" was used in place of the term "between methods reproducibility." The change was made because the "between methods reproducibility" term is more intuitive and less confusing. It is important to note that these two terms are synonymous and interchangeable with one another, especially in cases where the "cross-method reproducibility" term was subsequently referenced by name in methods where a D 6708 assessment was performed, before the change in terminology in this standard practice was adopted.

NOTE 4: Users are cautioned against applying the between methods reproducibility as calculated from this practice to materials that are significantly different in composition from those actually studied, as the ability of this practice to detect and address sample-specific biases (see 6.8) is dependent on the materials selected for the interlaboratory study. When sample-specific biases are present, the types and ranges of samples may need to be expanded significantly from the minimum of ten as specified in this practice in order to obtain more comprehensive and reliable 95 % confidence limits for between methods reproducibility that adequately cover the range of sample-specific biases for different types of materials.

1.6 This practice is intended for test methods which measure quantitative (numerical) properties of petroleum or petroleum products.

1.7 The statistical methodology outlined in this practice is also applicable for assessing the expected agreement between any two test methods that purport to measure the same property of a material, provided the results are obtained on the same comparison sample set, the standard error associated with each test result is known, the sample set design meets the requirement of this practice, and the statistical degrees of freedom of the data set exceed 30.

2. Referenced Documents

2.1 ASTM Standards:²
D 5580 Test Method for Determination of Benzene, Toluene, Ethylbenzene, p/m-Xylene, o-Xylene, C9 and Heavier Aromatics, and Total Aromatics in Finished Gasoline by Gas Chromatography
D 5769 Test Method for Determination of Benzene, Toluene, and Total Aromatics in Finished Gasolines by Gas Chromatography/Mass Spectrometry
D 6299 Practice for Applying Statistical Quality Assurance and Control Charting Techniques to Evaluate Analytical Measurement System Performance
D 6300 Practice for Determination of Precision and Bias Data for Use in Test Methods for Petroleum Products and Lubricants

2.2 ISO Standard:³
ISO 4259 Petroleum Products: Determination and application of precision data in relation to methods of test.

² For referenced ASTM standards, visit the ASTM website, or contact ASTM Customer Service. For Annual Book of ASTM Standards volume information, refer to the standard's Document Summary page on the ASTM website.

³ Available from American National Standards Institute (ANSI), 25 W. 43rd St., 4th Floor, New York, NY 10036.

3. Terminology

3.1 Definitions:

3.1.1 closeness sum of squares (CSS), n: a statistic used to quantify the degree of agreement between the results from two test methods after bias-correction using the methodology of this practice.

3.1.2 between methods reproducibility ($R_{XY}$), n: a quantitative expression of the random error associated with the difference between two results obtained by different operators using different apparatus and applying the two methods X and Y, respectively, each obtaining a single result on an identical test sample, when the methods have been assessed and an appropriate bias-correction has been applied in accordance with this practice; it is defined as the 95 % confidence limit for the difference between two such single and independent results.

Discussion: A statement of between methods reproducibility must include a description of any bias correction used in accordance with this practice.

Discussion: Between methods reproducibility is a meaningful concept only if there are no statistically observable sample-specific relative biases between the two methods, or if such biases vary from one sample to another in such a way that they may be considered random effects. (See 6.7.)

3.1.3 total sum of squares (TSS), n: a statistic used to quantify the information content from the interlaboratory study in terms of total variation of sample means relative to the standard error of each sample mean.

3.2 Symbols:

$X$, $Y$ = single X-method and Y-method results, respectively
$X_{ijk}$, $Y_{ijk}$ = single results from the X-method and Y-method round robins, respectively
$X_i$, $Y_i$ = means of results on the ith round robin sample
$S$ = the number of samples in the round robin
$L_{Xi}$, $L_{Yi}$ = the numbers of laboratories that returned results on the ith round robin sample
$R_X$, $R_Y$ = the reproducibilities of the X- and Y-methods, respectively
$s_{RXi}$, $s_{RYi}$ = the reproducibility standard deviations, evaluated at the means of the ith round robin sample
$s_{rXi}$, $s_{rYi}$ = the repeatability standard deviations, evaluated at the means of the ith round robin sample
$s_{Xi}$, $s_{Yi}$ = standard errors of the means of the ith round robin sample
$\bar{X}$, $\bar{Y}$ = the weighted means of round robins (across samples)
$x_i$, $y_i$ = deviations of the means of the ith round robin sample results from $\bar{X}$ and $\bar{Y}$, respectively
$TSS_X$, $TSS_Y$ = total sums of squares, around $\bar{X}$ and $\bar{Y}$
$F$ = a ratio for comparing variances; not unique (more than one use)
$\nu_X$, $\nu_Y$ = the degrees of freedom for reproducibility variances from the round robins
$w_i$ = weight associated with the difference between mean results (or corrected mean results) from the ith round robin sample
$CSS$ = weighted sum of squared differences between (possibly corrected) mean results from the round robin
$a$, $b$ = parameters of a linear correction: $\hat{Y} = a + bX$
$t_1$, $t_2$ = ratios for assessing reductions in sums of squares
$R_{XY}$ = estimate of between methods reproducibility
$\hat{Y}$ = Y-method value predicted from X-method result
$\hat{Y}_i$ = ith round robin sample Y-method mean, predicted from corresponding X-method mean
$\varepsilon_i$ = standardized difference between $Y_i$ and $\hat{Y}_i$
$L_X$, $L_Y$ = harmonic mean numbers of laboratories submitting results on round robin samples, by X- and Y-methods, respectively
$R_{XY}$ = estimate of between methods reproducibility, computed from an X-method result only

4. Summary of Practice

4.1 Precisions of the two methods are quantified using interlaboratory studies meeting the requirements of Practice D 6300 or equivalent, using at least ten samples in common that span the intersecting scopes of the methods. The arithmetic means of the results for each common sample obtained by each method are calculated. Estimates of the standard errors of these means are computed.

NOTE 5: For established standard test methods, new precision studies generally will be required in order to meet the common sample requirement.

NOTE 6: Both test methods do not need to be run by the same laboratory. If they are, care should be taken to ensure the independent test result requirement of Practice D 6300 is met (for example, by double-blind testing of samples in random order).

4.2 Weighted sums of squares are computed for the total variation of the mean results across all common samples for each method. These sums of squares are assessed against the standard errors of the mean results for each method to ensure that the samples are sufficiently varied before continuing with the practice.

4.3 The closeness of agreement of the mean results by each method is evaluated using appropriate weighted sums of squared differences. Such sums of squares are computed from the data first with no bias correction, then with a constant bias correction, then, when appropriate, with a proportional correction, and finally with a linear (proportional + constant) correction.

4.4 The weighted sum of squared differences for the linear correction is assessed against the total variation in the mean results for both methods to ensure that there is sufficient correlation between the two methods.

4.5 The most parsimonious bias correction is selected.

4.6 The weighted sum of squares of differences, after applying the selected bias correction, is assessed to determine whether additional unexplained sources of variation remain in the residual (that is, the individual $Y_i$ minus bias-corrected $X_i$) data. Any remaining, unexplained variation is attributed to sample-specific biases (also known as method-material interactions, or matrix effects). In the absence of sample-specific biases, the between methods reproducibility is estimated.

4.7 If sample-specific biases are present, the residuals (that is, the individual $Y_i$ minus bias-corrected $X_i$) are tested for randomness. If they are found to be consistent with a random-effects model, then their contribution to the between methods reproducibility is estimated, and accumulated into an all-encompassing between methods reproducibility estimate.

4.8 Refer to Fig. 1 for a simplified flow diagram of the process described in this practice.

5. Significance and Use

5.1 This practice can be used to determine if a constant, proportional, or linear bias correction can improve the degree of agreement between two methods that purport to measure the same property of a material.

5.2 The bias correction developed in this practice can be applied to a single result ($X$) obtained from one test method (method X) to obtain a predicted result ($\hat{Y}$) for the other test method (method Y).

NOTE 7: Users are cautioned to ensure that $\hat{Y}$ is within the scope of method Y before its use.

5.3 The between methods reproducibility established by this practice can be used to construct an interval around $\hat{Y}$ that would contain the result of test method Y, if it were conducted, with about 95 % confidence.
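As a minimal sketch of 5.2 and 5.3, assuming the correction parameters a and b and the between methods reproducibility R_XY have already been established by this practice, the predicted Y-method result and an interval around it could be computed as shown. The function names and numeric values are hypothetical, and taking the interval as the prediction plus or minus R_XY is an assumption made here for illustration. A constant correction corresponds to b = 1 and a proportional correction to a = 0.

```python
def predict_y(x, a, b):
    """Apply the linear bias correction Y_hat = a + b*X to one X-method result."""
    return a + b * x


def interval_around_prediction(x, a, b, r_xy):
    """Return (lower, upper) = Y_hat -/+ R_XY.

    Assumption for this sketch: the interval expected to contain a Y-method
    result with about 95 % confidence is symmetric around Y_hat.
    """
    y_hat = predict_y(x, a, b)
    return y_hat - r_xy, y_hat + r_xy


# Hypothetical values, not taken from any real study.
low, high = interval_around_prediction(x=10.0, a=0.5, b=1.0, r_xy=2.0)
print(low, high)
```

The predicted value should still be checked against the scope of method Y before the interval is used.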

5.4 This practice can be used to guide commercial agreements and product disposition decisions involving test methods that have been evaluated relative to each other in accordance with this practice.

6. Procedure

NOTE 8: For an in-depth statistical discussion of the methodology used in this section, see Appendix X1. For a worked example, see Appendix X2.

6.1 Calculate sample means and standard errors from Practice D 6300 results.

6.1.1 The process of applying Practice D 6300 to the data may involve elimination of some results as outliers, and it may also involve applying a transformation to the data. For this practice, compute the mean results from data that have not been transformed, but with outliers removed in accordance with Practice D 6300. The precision estimates from Practice D 6300 are used to estimate the standard errors of these means.

6.1.2 Compute the means as follows:

Let $X_{ijk}$ represent the kth result on the ith common material by the jth lab in the round robin for method X. Similarly for $Y_{ijk}$. (The ith material is the same for both round robins, but the jth lab in one round robin is not necessarily the same lab as the jth lab in the other round robin.) Let $n_{Xij}$ be the number of results on the ith material from the jth X-method lab, after removing outliers; that is, the number of results in cell (i, j). Let $L_{Xi}$ be the number of laboratories in the X-method round robin that have at least one result on the ith material remaining in the data set, after removal of outliers. Let $S$ be the total number of materials common to both round robins.

The mean X-method result for the ith material is:

$$X_i = \frac{1}{L_{Xi}} \sum_j \frac{\sum_k X_{ijk}}{n_{Xij}} \tag{1}$$

where $X_i$ is the average of the cell averages on the ith material by method X.
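The mean in Eq 1 is an average of per-laboratory cell averages rather than a pooled average of all results, so a laboratory that retains more results does not carry more weight. A minimal sketch, assuming outliers have already been removed and that the results for one material are held in a dictionary keyed by laboratory (the data layout, function name, and numbers are hypothetical):

```python
def material_mean(results_by_lab):
    """Mean for one material: the average of the laboratory cell averages (Eq 1).

    results_by_lab maps a lab identifier to the list of results that lab
    retained for this material after outlier removal.
    """
    # Only labs with at least one remaining result count toward L_Xi.
    cells = [r for r in results_by_lab.values() if len(r) > 0]
    if not cells:
        raise ValueError("no laboratory has a result for this material")
    cell_averages = [sum(r) / len(r) for r in cells]
    return sum(cell_averages) / len(cell_averages)


# Hypothetical data: lab_c lost all of its results to outlier removal.
x_results = {"lab_a": [10.0, 12.0], "lab_b": [14.0], "lab_c": []}
print(material_mean(x_results))  # 12.5
```

In this example the pooled average of the three retained results would be 12.0, while the average of cell averages is 12.5, which shows why the two should not be confused.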

Similarly, the mean Y-method result for the ith material is:

$$Y_i = \frac{1}{L_{Yi}} \sum_j \frac{\sum_k Y_{ijk}}{n_{Yij}} \tag{2}$$

6.1.3 The standard errors (standard deviations of the means of the results) are computed as follows:

If $s_{RXi}$ is the estimated reproducibility standard deviation from the X-method round robin, and $s_{rXi}$ is the estimated repeatability standard deviation, then an estimate of the standard error for $X_i$ is given by:

$$s_{Xi} = \sqrt{\frac{1}{L_{Xi}} \left[ s_{RXi}^2 - s_{rXi}^2 \left( 1 - \frac{1}{L_{Xi}} \sum_j \frac{1}{n_{Xij}} \right) \right]} \tag{3}$$

NOTE 9: Since repeatability and reproducibility may vary with X, even if the $L_{Xi}$ were the same for all materials and the $n_{Xij}$ were the same for all laboratories and all mater


