Designation: D6617 – 13
An American National Standard

Standard Practice for
Laboratory Bias Detection Using Single Test Result from Standard Material¹

This standard is issued under the fixed designation D6617; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.

INTRODUCTION

Due to the inherent imprecision in all test methods, a laboratory cannot expect to obtain the numerically exact accepted reference value (ARV) of a check standard (CS) material every time one is tested. Results that are reasonably close to the ARV should provide assurance that the laboratory is performing the test method either without bias, or with a bias that is of no practical concern, hence requiring no intervention. Results differing from the ARV by more than a certain amount, however, should lead the laboratory to take corrective action.

1. Scope*

1.1 This practice covers a methodology for establishing an acceptable tolerance zone for the difference between the result obtained from a single implementation of a test method on a Check Standard (CS) and its ARV, based on user-specified Type I error, the user-established test method precision, the standard error of the ARV, and a presumed hypothesis that the laboratory is performing the test method without bias.

NOTE 1: Throughout this practice, the term user refers to the user of this practice; and the term laboratory (see 1.1) refers to the organization or entity that is performing the test method.

1.2 For the tolerance zone established in 1.1, a methodology is presented to estimate the probability that the single test result will
fall outside the zone, in the event that there is a bias (positive or negative) of a user-specified magnitude that is deemed to be of practical concern (that is, the presumed hypothesis is not true).

1.3 This practice is intended for ASTM Committee D02 test methods that produce results on a continuous numerical scale.

1.4 This practice assumes that the normal (Gaussian) model is adequate for the description and prediction of measurement system behavior when it is in a state of statistical control.

NOTE 2: While this practice does not cover scenarios in which multiple results are obtained on the same CS under site precision or repeatability conditions, the statistical concepts presented are applicable. Users wishing to apply these concepts for the scenarios described are advised to consult a statistician and to reference the CS methodology described in Practice D6299.

2. Referenced Documents

2.1 ASTM Standards:²
D2699 Test Method for Research Octane Number of Spark-Ignition Engine Fuel
D6299 Practice for Applying Statistical Quality Assurance and Control Charting Techniques to Evaluate Analytical Measurement System Performance
E178 Practice for Dealing With Outlying Observations

3. Terminology

3.1 Definitions for accepted reference value (ARV), accuracy, bias, check standard (CS), in statistical control, site precision, site precision standard deviation ($\sigma_{SITE}$), site precision conditions, repeatability conditions, and reproducibility conditions can be found in Practice D6299.

3.2 Definitions of Terms Specific to This Standard:

3.2.1 acceptable tolerance zone, n: a numerical zone bounded inclusively by zero $\pm\, k\varepsilon$ ($k$ is a value based on a user-specified Type I error; $\varepsilon$ is defined in 3.2.7) such that if the difference between the result obtained from a single implementation of a test method for a CS and its ARV
falls inside this zone, the presumed hypothesis that the laboratory or testing organization is performing the test method without bias is accepted, and the difference is attributed to normal random variation of the test method. Conversely, if the difference falls outside this zone, the presumed hypothesis is rejected.

¹ This practice is under the jurisdiction of ASTM Committee D02 on Petroleum Products and Lubricants and is the direct responsibility of Subcommittee D02.94 on Coordinating Subcommittee on Quality Assurance and Statistics. Current edition approved June 15, 2013. Published July 2013. Originally approved in 2000. Last previous edition approved in 2008 as D6617 – 08. DOI: 10.1520/D6617-13.

² For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard's Document Summary page on the ASTM website.

* A Summary of Changes section appears at the end of this standard.

Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959, United States.

3.2.2 consensus check standard (CCS), n: a special type of CS in which the ARV is assigned
as the arithmetic average of at least 16 non-outlying (see Practice E178 or equivalent) test results obtained under reproducibility conditions, and the results pass the Anderson-Darling normality test in Practice D6299, or other statistical normality test at the 95 % confidence level.

3.2.2.1 Discussion: These may be production materials with unspecified composition, but are compositionally representative of material routinely tested by the test method, or materials with specified compositions that are reproducible, but may not be representative of routinely tested materials.

3.2.3 delta ($\Delta$), n: a sign-less quantity, to be specified by the user as the minimum magnitude of bias in either direction (either positive or negative) that is of practical concern.

3.2.4 power of bias detection, n: in applying the methodology of this practice, this refers to the long run probability of being able to correctly detect a bias of a magnitude of at least $\Delta$ in the correct direction, using the acceptable tolerance zone set under the presumed hypothesis, and is defined as (1 − Type II error), for a user-specified $\Delta$.

3.2.4.1 Discussion: The quantity (1 − Type II error), commonly known as the power of the test in classical statistical hypothesis testing, refers to the probability of correctly rejecting the null hypothesis, given that the alternate hypothesis is true. In applying this standard practice, the power refers to the probability of correctly detecting a positive or negative bias of at least $\Delta$.

3.2.5 standardized delta
($\Delta_S$), n: $\Delta$, expressed in units of total uncertainty ($\varepsilon$) per the equation:

$$\Delta_S = \Delta / \varepsilon \qquad (1)$$

3.2.6 standard error of ARV ($SE_{ARV}$), n: a statistic quantifying the uncertainty associated with the ARV in which the latter is used as an estimate for the true value of the property of interest. For a CCS, this is defined as:

$$SE_{ARV} = \sigma_{CCS} / \sqrt{N} \qquad (2)$$

where:
$N$ = total number of non-outlying results used to establish the ARV, collected under reproducibility conditions, and
$\sigma_{CCS}$ = the standard deviation of all the non-outlying results.

3.2.6.1 Discussion: Assuming a normal model, a 95 % confidence interval that would contain the true value of the property of interest can be constructed as follows:

$$ARV - 1.96\, SE_{ARV} \quad \text{to} \quad ARV + 1.96\, SE_{ARV} \qquad (3)$$

3.2.7 total uncertainty ($\varepsilon$), n: combined quantity of test method $\sigma_{SITE}$ and $SE_{ARV}$ as follows:

$$\varepsilon = \sqrt{\sigma_{SITE}^2 + SE_{ARV}^2} \qquad (4)$$

3.2.8 type I error, n: in applying the methodology of this practice, this refers to the theoretical long run probability of rejecting the presumed hypothesis that the test method is performed without bias when in fact the hypothesis is true, hence, committing an error in decision.

3.2.8.1 Discussion: Type I error, commonly known as alpha ($\alpha$) error in classical statistical hypothesis testing, refers to the probability of incorrectly rejecting a presumed, or null hypothesis based on statistics generated from relevant data. In applying this practice, the null hypothesis is stated as: The test method is being performed without bias; or it can be equivalently stated as: $H_0$: bias = 0.

3.2.9 type II error, n: in applying the methodology of this practice, this refers to the long run probability of accepting (that is, not rejecting) the presumed hypothesis that the method is performed without bias, when in fact the presumed hypothesis is not true, and the test method is biased by a magnitude of at least $\Delta$, hence, committing an error in decision.

3.2.9.1 Discussion: Type II error, commonly known as beta ($\beta$) error in classical statistical hypothesis testing, refers to the probability of failure to reject the null hypothesis when it is not true, based on statistics generated from relevant data. To quantify Type II error, the user
is required to declare a specific alternate hypothesis that is believed to be true. In applying this practice, the alternate hypothesis will take the form: "The test method is biased by at least $\Delta$", where $\Delta$ is a priori decided by the user as the minimum amount of bias in either direction (positive or negative) that is of practical concern. The alternate hypothesis can be equivalently stated as: $H_1$: $|\text{bias}| \geq \Delta$.

4. Significance and Use

4.1 Laboratories performing petroleum test methods can use this practice to set an acceptable tolerance zone for infrequent testing of CS or CCS material, based on $\varepsilon$, and a desired Type I error, for the purpose of ascertaining if the test method is being performed without bias.

4.2 This practice can be used to estimate the power of correctly detecting bias of different magnitudes, using the acceptable tolerance zone set in 4.1, and hence, gain insight into the limitation of the true bias detection capability associated with this acceptable tolerance zone. With this insight, trade-offs can be made between desired Type I error versus desired bias detection capability to suit specific business needs.

4.3 The CS testing activities described in this practice are intended to augment
and not replace the regular statistical monitoring of test method performance as described in Practice D6299.

5. General Requirement

5.1 Application of the methodology in this practice requires the following:

5.1.1 The standard material has an ARV and associated standard error ($SE_{ARV}$).

NOTE 3: For a given power of detection, the magnitude of the associated bias detectable is directly proportional to $\varepsilon = \sqrt{SE_{ARV}^2 + \sigma_{SITE}^2}$. Therefore, efforts should be made to keep the ratio ($SE_{ARV}/\sigma_{SITE}$) to as low a value as practical. A ratio of 0.5 or less is considered useful.

5.1.2 The user has a $\sigma_{SITE}$ for the test method that is reasonably suited for the standard material.

NOTE 4: It is recognized that there will be situations in which the CS may not be compositionally similar to or have property level similar to, or both, the materials regularly tested. For those situations, the site precision standard deviation ($\sigma_{SITE}$) estimated using regularly tested material at a property level closest to the check standard should be used.

5.1.3 User-specified Type I error and the minimum magnitude of bias that is of practical concern ($\Delta$).

5.1.4 The test method is in statistical control.

NOTE 5: Within the context of this practice, a test method can be in statistical control (that is, mean is stable, under common cause variations), but can be biased.

NOTE 6: Generally, sites with $\sigma_{SITE}$ nominally less than 0.25 R, or equivalently, site precision ($2.77\,\sigma_{SITE}$) less than 0.69 R (R is the published test method reproducibility, if available) are considered to be reasonably proficient in controlling the common cause or random variations associated with the execution of the test method.

6. Procedure

6.1 Confirm the usefulness of the CS by assessing the ratio $SE_{ARV}/\sigma_{SITE}$.

NOTE 7: A ratio of less than or equal to 0.5 is considered useful.

6.2 Calculate $\varepsilon = \sqrt{\sigma_{SITE}^2 + SE_{ARV}^2}$.

6.3 Specify the required Type I error rate.

NOTE 8: A suggested starting value is 0.05.

6.4 Specify required $\Delta$.

NOTE 9: The magnitude of $\Delta$ is usually specified based on nonstatistical considerations such as business risks or operational issues, or both.

6.5 Calculate $\Delta_S = \Delta / \varepsilon$.

6.6 See Table 1.
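
The calculations in steps 6.1 through 6.5 can be sketched in Python as follows. This is a minimal illustration, not part of the practice: the function names and the numeric inputs are hypothetical. The two helper functions for k and for power rest on an assumption, namely that k is the two-sided standard normal quantile for the chosen Type I error and that power is the one-sided normal probability of exceeding k when the true standardized bias equals the standardized delta. Under that assumption the sketch gives about 0.169 for a Type I error of 0.05 and a standardized delta of 1, which agrees with the corresponding Table 1 entry; Table 1 remains the reference.

```python
from math import sqrt
from statistics import NormalDist


def total_uncertainty(sigma_site: float, se_arv: float) -> float:
    """Step 6.2 (Eq. 4): root-sum-of-squares of sigma_SITE and SE_ARV."""
    return sqrt(sigma_site ** 2 + se_arv ** 2)


def standardized_delta(delta: float, epsilon: float) -> float:
    """Step 6.5 (Eq. 1): delta expressed in units of total uncertainty."""
    return delta / epsilon


def k_for_type_i_error(alpha: float) -> float:
    """ASSUMPTION: k is the two-sided standard normal quantile for alpha."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def power_of_detection(delta_s: float, alpha: float) -> float:
    """ASSUMPTION: power = P(Z > k - delta_s), correct direction only."""
    return NormalDist().cdf(delta_s - k_for_type_i_error(alpha))


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to exercise the functions.
    sigma_site = 0.30   # site precision standard deviation
    se_arv = 0.12       # standard error of the ARV
    delta = 0.60        # minimum bias of practical concern
    alpha = 0.05        # Type I error (suggested starting value, Note 8)

    ratio = se_arv / sigma_site                    # step 6.1
    eps = total_uncertainty(sigma_site, se_arv)    # step 6.2
    delta_s = standardized_delta(delta, eps)       # step 6.5

    print(f"SE_ARV / sigma_SITE = {ratio:.2f} (useful if <= 0.5)")
    print(f"total uncertainty   = {eps:.4f}")
    print(f"standardized delta  = {delta_s:.3f}")
    print(f"k                   = {k_for_type_i_error(alpha):.2f}")
    print(f"estimated power     = {power_of_detection(delta_s, alpha):.3f}")
```

Computing the power directly avoids rounding the standardized delta to the nearest tabulated column, but the result should still be cross-checked against Table 1 before it is relied on.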
6.7 Look across the row with the $\Delta_S$ values and identify the column with a $\Delta_S$ value closest to the $\Delta_S$ calculated in 6.5.

6.8 Look down the column identified in 6.7 and locate the row with the value in Column A closest to the required Type I error. The value in the cell where the row and column intersect is the power of detection.

6.9 If the power of detection is not acceptable (typically it will be too low), iteratively change one or all of the following until all requirements are met.

6.9.1 Type I error.

6.9.2 Delta ($\Delta$).

6.9.3 Power of bias detection.

NOTE 10: For a single implementation of the test method, the power of bias detection will depend on the magnitude of $\Delta$ specified, the total uncertainty $\varepsilon$, and the specified Type I error rate. For a fixed magnitude of $\Delta$, power of bias detection (of magnitude $\Delta$) can be increased at the expense of an increase in Type I error rate. For a fixed Type I error, power of bias detection will increase as the magnitude of $\Delta$ increases.

6.10 Use the appropriate $k$ value from Column B of Table 1 that met the specified Type I error and power of bias detection to calculate the boundaries of the acceptable tolerance zone.

6.11 Construct the acceptable tolerance zone: $0 \pm k\varepsilon$.

6.12 When a single
test result X for a CS is obtained, calculate the quantity (X − ARV).

6.13 If X − ARV falls inside the acceptable tolerance zone inclusively, accept the presumed hypothesis that the laboratory is performing the test method without bias.

6.14 If X − ARV falls outside the acceptable tolerance zone on the positive side, reject the presumed hypothesis that the laboratory is performing the test method without bias, and conclude that there is evidence to suggest the laboratory is performing the test method with a positive bias of at least the magnitude $\Delta$.

6.15 If X − ARV falls outside the acceptable tolerance zone on the negative side, reject the presumed hypothesis that the laboratory is performing the test method without bias, and conclude that there is evidence to suggest the laboratory is performing the test method with a negative bias of at least the magnitude $\Delta$.

7. Keywords

7.1 accepted reference value; bias; check standard; consensus; power of test; probability of bias detection; type I error; type II error

TABLE 1 Type I Error and Associated Power of Bias Detection for Various $\Delta_S$ Values
($\Delta_S$ = magnitude of bias in standardized units; see 6.5. The columns to the right of Column B give the power of correctly detecting a bias of magnitude $\Delta_S$ in either direction, that is, either positive or negative.)

(Column A)    (Column B)  $\Delta_S$ value
Type I Error  k           0.5    0.75   1      1.25   1.5    1.75   2      2.25   2.5    2.75   3      3.25   3.5    4
0.01          2.58        0.019  0.034  0.058  0.092  0.141  0.204  0.282  0.372  0.470  0.569  0.664  0.750  0.822  0.923
0.025         2.24        0.041  0.068  0.107  0.161  0.229  0.312  0.405  0.503  0.602  0.694  0.776  0.843  0.896  0.961
0.05          1.96        0.072  0.113  0.169  0.239  0.323  0.417  0.516  0.614  0.705  0.785  0.851  0.901  0.938  0.979
0.1           1.64        0.126  0.185  0.260  0.346  0.442  0.542  0.639  0.727  0.804  0.865  0.912  0.946  0.968  0.991
0.15          1.44        0.174  0.245  0.330  0.425  0.524  0.622  0.712  0.791  0.856  0.905  0.941  0.965  0.980  0.995
0.2           1.28        0.217  0.298  0.389  0.487  0.586  0.680  0.764  0.834  0.888  0.929  0.957  0.975  0.987  0.997
0.25          1.15        0.258  0.344  0.440  0.540  0.637  0.726  0.802  0.864  0.911  0.945  0.968  0.982  0.991  0.998
0.3           1.04        0.296  0.387  0.485  0.585  0.679  0.762  0.832  0.888  0.