Designation: D 6617-08
An American National Standard

Standard Practice for
Laboratory Bias Detection Using Single Test Result from Standard Material¹

This standard is issued under the fixed designation D 6617; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.

INTRODUCTION
Due to the inherent imprecision in all test methods, a laboratory cannot expect to obtain the numerically exact accepted reference value (ARV) of a check standard (CS) material every time one is tested. Results that are reasonably close to the ARV should provide assurance that the laboratory is performing the test method either without bias, or with a bias that is of no practical concern, hence requiring no intervention. Results differing from the ARV by more than a certain amount, however, should lead the laboratory to take corrective action.

1. Scope*
1.1 This practice covers a methodology for establishing an acceptable tolerance zone for the difference between the result obtained from a single implementation of a test method on a CS and its ARV, based on user-specified Type I error, the user-established test method precision, the standard error of the ARV, and a presumed hypothesis that the laboratory is performing the test method without bias.
NOTE 1: Throughout this practice, the term user refers to the user of this practice; and the term laboratory (see 1.1) refers to the organization or entity that is performing the test method.
1.2 For the tolerance zone established in 1.1, a methodology is presented to estimate the probability that the single test result will fall outside the zone, in the event that there is a bias (positive or negative) of a user-specified magnitude that is deemed to be of practical concern (that is, the presumed hypothesis is not true).
1.3 This practice is intended for ASTM Committee D02 test methods that produce results on a continuous numerical scale.
1.4 This practice assumes that the normal (Gaussian) model is adequate for the description and prediction of measurement system behavior when it is in a state of statistical control.
NOTE 2: While this practice does not cover scenarios in which multiple results are obtained on the same CS under site precision or repeatability conditions, the statistical concepts presented are applicable. Users wishing to apply these concepts for the scenarios described are advised to consult a statistician and to reference the CS methodology described in Practice D 6299.

2. Referenced Documents
2.1 ASTM Standards:²
D 2699 Test Method for Research Octane Number of Spark-Ignition Engine Fuel
D 6299 Practice for Applying Statistical Quality Assurance and Control Charting Techniques to Evaluate Analytical Measurement System Performance
E 178 Practice for Dealing With Outlying Observations

3. Terminology
3.1 Definitions for accepted reference value (ARV), accuracy, bias, check standard (CS), in statistical control, site precision, site precision standard deviation ($\sigma_{SITE}$), site precision conditions, repeatability conditions, and reproducibility conditions can be found in Practice D 6299.
3.2 Definitions of Terms Specific to
This Standard:
3.2.1 acceptable tolerance zone, n: a numerical zone bounded inclusively by zero $\pm k\varepsilon$ ($k$ is a value based on a user-specified Type I error; $\varepsilon$ is defined in 3.2.7) such that if the difference between the result obtained from a single implementation of a test method for a CS and its ARV falls inside this zone, the presumed hypothesis that the laboratory or testing organization is performing the test method without bias is accepted, and the difference is attributed to normal random variation of the test method. Conversely, if the difference falls outside this zone, the presumed hypothesis is rejected.

¹ This practice is under the jurisdiction of ASTM Committee D02 on Petroleum Products and Lubricants and is the direct responsibility of Subcommittee D02.94 on Coordinating Subcommittee on Quality Assurance and Statistics.
Current edition approved Dec. 1, 2008. Published January 2009. Originally approved in 2000. Last previous edition approved in 2005 as D 6617-05.
² For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard's Document Summary page on the ASTM website.
* A Summary of Changes section appears at the end of this standard.
Copyright ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959, United States.

3.2.2 consensus check standard (CCS), n: a special type of CS in which the ARV is assigned as the arithmetic average of at least 16 non-outlying (see Practice E 178 or equivalent) test results obtained under reproducibility conditions, and the results pass the Anderson-Darling normality test in Practice D 6299, or other statistical normality test at the 95 % confidence level.
3.2.2.1 Discussion: These may be production materials with unspecified composition, but are compositionally representative of material routinely tested by the test method, or materials with specified compositions that are reproducible, but may not be representative of routinely tested materials.
3.2.3 delta ($\Delta$), n: a signless quantity, to be specified by the user as the minimum magnitude of bias (either positive or negative) that is of practical concern.
3.2.4 power of bias detection, n: in applying the methodology of this practice, this refers to the long run probability of being able to correctly detect a bias of a magnitude of at least $\Delta$; given the acceptance tolerance zone set under the presumed hypothesis, and is defined as (1 - Type II error), for a user-specified $\Delta$.
3.2.4.1 Discussion: The quantity (1 - Type II error), commonly known as the power of the test in classical statistical hypothesis testing, refers to the probability of correctly rejecting the null hypothesis, given that the alternate hypothesis is true. In applying this SP, the power refers to the probability of detecting a positive or negative bias of at least $\Delta$.
3.2.5 standardized delta ($\Delta_S$), n: $\Delta$, expressed in units of total uncertainty ($\varepsilon$) per the equation:
$$\Delta_S = \Delta / \varepsilon \qquad (1)$$
3.2.6 standard error of ARV ($SE_{ARV}$), n: a statistic quantifying the uncertainty associated with the ARV in which the latter is used as an estimate for the true value of the property of interest. For a CCS, this is defined as:
$$SE_{ARV} = \sigma_{CCS} / \sqrt{N} \qquad (2)$$
where:
$N$ = total number of non-outlying results used to establish the ARV, collected under reproducibility conditions, and
$\sigma_{CCS}$ = the standard deviation of all the non-outlying results.
3.2.6.1 Discussion: Assuming a normal model, a 95 % confidence interval that would contain the true value of the property of interest can be constructed as follows:
$$ARV - 1.96\,SE_{ARV} \ \text{to} \ ARV + 1.96\,SE_{ARV} \qquad (3)$$
3.2.7 total uncertainty ($\varepsilon$), n: combined quantity of test method $\sigma_{SITE}$ and $SE_{ARV}$ as follows:
$$\varepsilon = \sqrt{\sigma_{SITE}^2 + SE_{ARV}^2} \qquad (4)$$
3.2.8 type I error, n: in applying the methodology of this practice, this refers to the theoretical long run probability of rejecting the presumed hypothesis that the test method is performed
without bias when in fact the hypothesis is true, hence, committing an error in decision.
3.2.8.1 Discussion: Type I error, commonly known as alpha ($\alpha$) error in classical statistical hypothesis testing, refers to the probability of incorrectly rejecting a presumed, or null hypothesis based on statistics generated from relevant data. In applying this practice, the null hypothesis is stated as: “The test method is being performed without bias”; or it can be equivalently stated as: $H_0$: bias = 0.
3.2.9 type II error, n: in applying the methodology of this practice, this refers to the long run probability of accepting (that is, not rejecting) the presumed hypothesis that the method is performed without bias, when in fact the presumed hypothesis is not true, and the test method is biased by a magnitude of at least $\Delta$, hence, committing an error in decision.
3.2.9.1 Discussion: Type II error, commonly known as beta ($\beta$) error in classical statistical hypothesis testing, refers to the probability of failure to reject the null hypothesis when it is not true, based on statistics generated from relevant data. To quantify Type II error, the user is required to declare a specific alternate hypothesis that is believed to be true. In applying this practice, the alternate hypothesis will take the form: “The test method is biased by at least $\Delta$”, where $\Delta$ is a priori decided by the user as the minimum amount of bias in either direction (positive or negative) that is of practical concern. The alternate hypothesis can be equivalently stated as: $H_1$: $|\text{bias}| \ge \Delta$.

4. Significance and Use
4.1 Laboratories performing petroleum test methods can use this practice to set an acceptable tolerance zone for infrequent testing of CS or CCS material, based on $\varepsilon$, and a desired Type I error, for the purpose of ascertaining if the test method is
being performed without bias.
4.2 This practice can be used to estimate the power of correctly detecting bias of different magnitudes, using the acceptable tolerance zone set in 4.1, and hence, gain insight into the limitation of the true bias detection capability associated with this acceptable tolerance zone. With this insight, trade-offs can be made between desired Type I error versus desired bias detection capability to suit specific business needs.
4.3 The CS testing activities described in this practice are intended to augment and not replace the regular statistical monitoring of test method performance as described in Practice D 6299.

5. General Requirement
5.1 Application of the methodology in this practice requires the following:
5.1.1 The standard material has an ARV and associated standard error ($SE_{ARV}$).
NOTE 3: For a given power of detection, the magnitude of the associated bias detectable is directly proportional to $\varepsilon = \sqrt{SE_{ARV}^2 + \sigma_{SITE}^2}$. Therefore, efforts should be made to keep the ratio ($SE_{ARV} / \sigma_{SITE}$) to as low a value as practical. A ratio of 0.5 or less is considered useful.
5.1.2 The user has a $\sigma_{SITE}$ for the test method that is reasonably suited for the standard material.
NOTE 4: It is recognized that there will be situations in which the CS may not be compositionally similar to or have property level similar to, or both, the materials regularly tested. For those situations, the site precision standard deviation ($\sigma_{SITE}$) estimated using regularly tested material at a property level closest to the check standard should be used.
5.1.3 The user should pre-specify the required Type I error and the minimum magnitude of bias that is of practical concern ($\Delta$).
5.1.4 The test method is in statistical control.
NOTE 5: Within the context of this practice, a test method can be in statistical control
(that is, mean is stable, under common cause variations), but can be biased.
NOTE 6: Generally, sites with $\sigma_{SITE}$ nominally less than 0.25 R, or equivalently, site precision ($2.77 \times \sigma_{SITE}$) less than 0.69 R (R is the published test method reproducibility, if available) are considered to be reasonably proficient in controlling the common cause or random variations associated with the execution of the test method.

6. Procedure
6.1 Confirm the usefulness of the CS by assessing the ratio $SE_{ARV} / \sigma_{SITE} \le 0.5$.
NOTE 7: A ratio of less than or equal to 0.5 is considered useful.
6.2 Calculate $\varepsilon = \sqrt{\sigma_{SITE}^2 + SE_{ARV}^2}$.
6.3 Specify the required Type I error rate.
NOTE 8: A suggested starting value is 0.05.
6.4 Specify required $\Delta$.
NOTE 9: The magnitude of $\Delta$ is usually specified based on nonstatistical considerations such as business risks or operational issues, or both.
6.5 Calculate $\Delta_S = \Delta / \varepsilon$.
6.6 See Table 1.
6.7 Look across the row with the $\Delta_S$ values and identify the column with a $\Delta_S$ value closest to the $\Delta_S$ calculated in 6.5.
6.8 Look down the column identified in 6.7 and locate the row with the value in Column A closest to the required Type I error. The value in the cell where the row and column intersect is the power of detection.
6.9 If the power of detection is not acceptable (typically it will be too low), iteratively change one or all of the following until all requirements are met.
6.9.1 Type I error.
6.9.2 Delta ($\Delta$).
6.9.3 Power of bias detection.
NOTE 10: For a single implementation of the test method, the power of bias detection will depend on the magnitude of $\Delta$ specified, the total uncertainty $\varepsilon$, and the specified Type I error rate. Power of bias detection will increase at the expense of an increase in Type I error rate or increase in $\Delta$.
6.10 Use the appropriate k value from Column B of Table 1 that met the specified Type I error and
power of bias detection to calculate the boundaries of the acceptable tolerance zone.
6.11 Construct the acceptable tolerance zone: $0 \pm k\varepsilon$.
6.12 When a single test result X for a CS is obtained, calculate the quantity (X - ARV).
6.13 If X - ARV falls inside the acceptable tolerance zone inclusively, accept the presumed hypothesis that the laboratory is performing the test method without bias.
6.14 If X - ARV falls outside the acceptable tolerance zone on the positive side, reject the presumed hypothesis that the laboratory is performing the test method without bias, and conclude that there is evidence to suggest the laboratory is performing the test method with a positive bias of at least the magnitude $\Delta$.
6.15 If X - ARV falls outside the acceptable tolerance zone on the negative side, reject the presumed hypothesis that the laboratory is performing the test method without bias, and conclude that there is evidence to suggest the laboratory is performing the test method with a negative bias of at least the magnitude $\Delta$.

7. Keywords
7.1 accepted reference value; bias; check standard; consensus; power of test; probability of bias detection; type I error; type II error

TABLE 1 Type I Error and Associated Power of Bias Detection for Various $\Delta_S$ Values

Column A: Required Type I Error Rate. Column B: k. Remaining columns: Power of Detecting |bias| = $\Delta_S$ᴬ, for the $\Delta_S$ value shown in the column heading.

A      B      0.5    0.75   1      1.25   1.5    1.75   2      2.25   2.5    2.75   3      3.5    4
0.05   1.96   0.07   0.11   0.17   0.24   0.32   0.42   0.52   0.61   0.71   0.79   0.85   0.94   0.98
0.10   1.64   0.13   0.19   0.26   0.35   0.44   0.54   0.64   0.73   0.80   0.87   0.91   0.97   0.99
0.15   1.44   0.17   0.25   0.33   0.42   0.52   0.62   0.71   0.79   0.86   0.90   0.94   0.98   0.99
0.20   1.28   0.22   0.30   0.39   0.49   0.59   0.68   0.76   0.83   0.89   0.93   0.96   0.99   1.00
0.25   1.15   0.26   0.34   0.44   0.54   0.64   0.73   0.80   0.86   0.91   0.95   0.97   0.99   1.00
0.30   1.04   0.30   0.39   0.49   0.58   0.68   0.76   0.83   0.89   0.93   0.96   0.98   0.99   1.00
0.35   0.93   0.33   0.43   0.53   0.62   0.71   0.79   0.86   0.91   0.94   0.97   0.98   0.99   1.00
0.40   0.84   0.37   0.46   0.56   0.66   0.74   0.82   0.88   0.92   0.95   0.97   0.98   1.00   1.00
0.45   0.76   0.40   0.50   0.60   0.69   0.77   0.84   0.89   0.93   0.96   0.98   0.99   1.00   1.00
0.50   0.67   0.43   0.53   0.63   0.72   0.80   0.86   0.91   0.94   0.97   0.98   0.99   1.00   1.00
0.55   0.60   0.46   0.56   0.66   0.74   0.82   0.88   0.92   0.95   0.97   0.98   0.99   1.00   1.00
0.60   0.52   0.49   0.59   0.68   0.77   0.84   0.89   0.93   0.96   0.98   0.99   0.99   1.00   1.00
0.65   0.45   0.52   0.62   0.71   0.79   0.85   0.90   0.94   0.96   0.98   0.99   0.99   1.00   1.00
0.70   0.39   0.55   0.64   0.73   0.81   0.87   0.91
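
The calculations in Section 6 can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the practice. The function names and the example numbers are hypothetical. The power formula (the standard normal probability of a value below the standardized delta minus k) is an assumption inferred from the tabulated values in Table 1, not a formula stated in the text. The k value is taken as the two-sided standard normal multiplier, which reproduces Column B.

```python
from math import sqrt
from statistics import NormalDist

# Standard normal distribution, per the Gaussian model assumed in 1.4.
_STD_NORMAL = NormalDist()


def total_uncertainty(sigma_site: float, se_arv: float) -> float:
    """Eq. 4: combine site precision std. dev. and standard error of the ARV."""
    return sqrt(sigma_site ** 2 + se_arv ** 2)


def k_factor(type_i_error: float) -> float:
    """Two-sided normal multiplier; matches Column B (e.g. 0.05 gives 1.96)."""
    return _STD_NORMAL.inv_cdf(1.0 - type_i_error / 2.0)


def power_of_detection(delta: float, epsilon: float, type_i_error: float) -> float:
    """ASSUMPTION: power = Phi(delta_s - k), inferred from Table 1 entries."""
    delta_s = delta / epsilon  # Eq. 1: standardized delta
    return _STD_NORMAL.cdf(delta_s - k_factor(type_i_error))


def assess(x: float, arv: float, sigma_site: float, se_arv: float,
           type_i_error: float = 0.05) -> str:
    """Steps 6.11 to 6.15: compare (x - arv) with the zone 0 +/- k * epsilon."""
    half_width = k_factor(type_i_error) * total_uncertainty(sigma_site, se_arv)
    diff = x - arv
    if abs(diff) <= half_width:  # the zone is inclusive of its boundaries
        return "accept: no evidence of bias"
    return "reject: positive bias" if diff > 0 else "reject: negative bias"


if __name__ == "__main__":
    # Hypothetical inputs chosen only to exercise the functions.
    eps = total_uncertainty(sigma_site=0.3, se_arv=0.1)
    print("SE_ARV / sigma_SITE ratio:", 0.1 / 0.3)  # step 6.1 check against 0.5
    print("k:", round(k_factor(0.05), 2))
    print("power:", round(power_of_detection(0.9, eps, 0.05), 2))
    print(assess(x=10.7, arv=10.0, sigma_site=0.3, se_arv=0.1))
```

Computing the power directly avoids the nearest-column and nearest-row lookups of 6.7 and 6.8, at the cost of relying on the inferred formula. Where the two disagree, Table 1 governs.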