Designation: D6617-17

An American National Standard

Standard Practice for
Laboratory Bias Detection Using Single Test Result from Standard Material¹

This standard is issued under the fixed designation D6617; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon ($\varepsilon$) indicates an editorial change since the last revision or reapproval.

INTRODUCTION

Due to the inherent imprecision in all test methods, a laboratory cannot expect to obtain the numerically exact accepted reference value (ARV) of a check standard (CS) material every time one is tested. Results that are reasonably close to the ARV should provide assurance that the laboratory is performing the test method either without bias, or with a bias that is of no practical concern, hence requiring no intervention. Results differing from the ARV by more than a certain amount, however, should lead the laboratory to take corrective action.

1. Scope*

1.1 This practice covers a methodology for establishing an acceptable tolerance zone for the difference between the result obtained from a single implementation of a test method on a Check Standard (CS) and its ARV, based on user-specified Type I error, the user-established test method precision, the standard error of the ARV, and a presumed hypothesis that the laboratory is performing the test method without bias.

NOTE 1: Throughout this practice, the term “user” refers to the user of this practice, and the term “laboratory” (see 1.1) refers to the organization or entity that is performing the test method.

1.2 For the tolerance zone established in 1.1, a methodology is presented to estimate the probability that the single test result will fall outside the zone, in the event that the presumed hypothesis is not true and there is a bias (positive or negative) of a user-specified magnitude that is deemed to be of practical concern.

1.3 This practice is intended for ASTM Committee D02 test methods that produce results on a continuous numerical scale.

1.4 This practice assumes that the normal (Gaussian) model is adequate for the description and prediction of measurement system behavior when it is in a state of statistical control.

NOTE 2: While this practice does not cover scenarios in which multiple results are obtained on the same CS under
site precision or repeatability conditions, the statistical concepts presented are applicable. Users wishing to apply these concepts for the scenarios described are advised to consult a statistician and to reference the CS methodology described in Practice D6299.

1.5 This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee.

2. Referenced Documents

2.1 ASTM Standards:²
D2699 Test Method for Research Octane Number of Spark-Ignition Engine Fuel
D6299 Practice for Applying Statistical Quality Assurance and Control Charting Techniques to Evaluate Analytical Measurement System Performance
D7915 Practice for Application of Generalized Extreme Studentized Deviate (GESD) Technique to Simultaneously Identify Multiple Outliers in a Data Set

3. Terminology

3.1 Definitions for accepted reference value (ARV), accuracy, bias, check standard (CS), in statistical control, site precision, site precision standard deviation ($\sigma_{SITE}$), site precision conditions,
repeatability conditions, and reproducibility conditions can be found in Practice D6299.

¹ This practice is under the jurisdiction of ASTM Committee D02 on Petroleum Products, Liquid Fuels, and Lubricants and is the direct responsibility of Subcommittee D02.94 on Coordinating Subcommittee on Quality Assurance and Statistics. Current edition approved May 1, 2017. Published May 2017. Originally approved in 2000. Last previous edition approved in 2013 as D6617-13. DOI: 10.1520/D6617-17.

² For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard's Document Summary page on the ASTM website.

* A Summary of Changes section appears at the end of this standard.

Copyright ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959, United States.

3.2 Definitions of Terms Specific to This Standard:

3.2.1 acceptable tolerance zone, n: a numerical zone bounded inclusively by zero $\pm k\varepsilon$ ($k$ is a value based on a user-specified Type I error; $\varepsilon$ is defined in 3.2.7) such that if the difference between the result obtained from a single implementation of a test method for a CS and its ARV falls inside this zone, the presumed hypothesis that the laboratory or testing organization is performing the test method without bias is accepted, and the difference is attributed to normal random variation of the test method. Conversely, if the difference
falls outside this zone, the presumed hypothesis is rejected.

3.2.2 consensus check standard (CCS), n: a special type of CS in which the ARV is assigned as the arithmetic average of at least 16 non-outlying (see Practice D7915 or equivalent) test results obtained under reproducibility conditions, and the results pass the Anderson-Darling normality test in Practice D6299, or other statistical normality test at the 95 % confidence level.

3.2.2.1 Discussion: These may be production materials with unspecified composition, but are compositionally representative of material routinely tested by the test method, or materials with specified compositions that are reproducible, but may not be representative of routinely tested materials.

3.2.3 delta ($\Delta$), n: a sign-less quantity, to be specified by the user as the minimum magnitude of bias in either direction (either positive or negative) that is of practical concern.

3.2.4 power of bias detection, n: in applying the methodology of this practice, this refers to the long-run probability of being able to correctly detect a bias of a magnitude of at least $\Delta$ in the correct direction, using the acceptable tolerance zone set under the presumed hypothesis, and is defined as
(1 − Type II error), for a user-specified $\Delta$.

3.2.4.1 Discussion: The quantity (1 − Type II error), commonly known as the power of the test in classical statistical hypothesis testing, refers to the probability of correctly rejecting the null hypothesis, given that the alternate hypothesis is true. In applying this standard practice, the power refers to the probability of correctly detecting a positive or negative bias of at least $\Delta$.

3.2.5 standardized delta ($\Delta_S$), n: $\Delta$, expressed in units of total uncertainty ($\varepsilon$) per the equation:

$\Delta_S = \Delta / \varepsilon$    (1)

3.2.6 standard error of ARV ($SE_{ARV}$), n: a statistic quantifying the uncertainty associated with the ARV in which the latter is used as an estimate for the true value of the property of interest. For a CCS, this is defined as:

$SE_{ARV} = \sigma_{CCS} / \sqrt{N}$    (2)

where:
$N$ = total number of non-outlying results used to establish the ARV, collected under reproducibility conditions, and
$\sigma_{CCS}$ = the standard deviation of all the non-outlying results.

3.2.6.1 Discussion: Assuming a normal model, a 95 % confidence interval that would contain the true value of the property of interest can be constructed as follows:

$ARV - 1.96\,SE_{ARV}$ to $ARV + 1.96\,SE_{ARV}$    (3)

3.2.7 total uncertainty ($\varepsilon$), n: combined quantity of test method $\sigma_{SITE}$ and
$SE_{ARV}$ as follows:

$\varepsilon = \sqrt{\sigma_{SITE}^2 + SE_{ARV}^2}$    (4)

3.2.8 type I error, n: in applying the methodology of this practice, this refers to the theoretical long-run probability of rejecting the presumed hypothesis that the test method is performed without bias when in fact the hypothesis is true, hence, committing an error in decision.

3.2.8.1 Discussion: Type I error, commonly known as alpha ($\alpha$) error in classical statistical hypothesis testing, refers to the probability of incorrectly rejecting a presumed, or null hypothesis based on statistics generated from relevant data. In applying this practice, the null hypothesis is
stated as: The test method is being performed without bias; or it can be equivalently stated as: $H_0$: bias = 0.

3.2.9 type II error, n: in applying the methodology of this practice, this refers to the long-run probability of accepting (that is, not rejecting) the presumed hypothesis that the method is performed without bias, when in fact the presumed hypothesis is not true and the test method is performed with a bias, hence, committing an error in decision.

3.2.9.1 Discussion: Type II error, commonly known as beta ($\beta$) error in classical statistical hypothesis testing, refers to the probability of failure
to reject the null hypothesis when it is not true, based on statistics generated from relevant data. To quantify Type II error, the user is required to declare a specific alternate hypothesis that is believed to be true. In applying this practice, the alternate hypothesis will take the form: “The test method is biased by at least $\Delta$,” where $\Delta$ is a priori decided by the user as the minimum amount of bias in either direction (positive or negative) that is of practical concern. The alternate hypothesis can be equivalently stated as: $H_1$: |bias| $\geq \Delta$.

4. Significance and Use

4.1 Laboratories performing petroleum test methods can use this practice to set an acceptable tolerance zone for infrequent testing of CS or CCS material, based on $\varepsilon$, and a desired Type I error, for the purpose of ascertaining if the test method is being performed without bias.

4.2 This practice can be used to estimate the power of correctly detecting bias of different magnitudes, given the acceptable tolerance zone set in 4.1, and hence, gain insight into the limitation of the true bias detection capability associated with this acceptable tolerance zone. With this insight, trade-offs can be made between desired Type I error versus desired bias detection capability to suit specific business needs.

4.3 The CS testing activities described in this practice are intended to augment and not replace the regular statistical monitoring of test method performance as described in Practice D6299.

5. General Requirement

5.1 Application of the methodology in this practice requires the following:

5.1.1 The standard material has an ARV and associated standard error ($SE_{ARV}$).

NOTE 3: For a given power of detection, the magnitude of the associated bias detectable is directly proportional to $\varepsilon = \sqrt{SE_{ARV}^2 + \sigma_{SITE}^2}$. Therefore, efforts should be made to keep the ratio ($SE_{ARV}/\sigma_{SITE}$) to as low a value as practical. A ratio of 0.5 or less is considered useful.

5.1.2 The user has a $\sigma_{SITE}$ for the test method that is reasonably suited for the standard material.

NOTE 4: It is recognized that there will be situations in which the CS may not be compositionally similar to or have
property level similar to, or both, the materials regularly tested. For those situations, the site precision standard deviation ($\sigma_{SITE}$) estimated using regularly tested material at a property level closest to the check standard should be used.

5.1.3 User-specified Type I error and the minimum magnitude
of bias that is of practical concern ($\Delta$).

5.1.4 The test method is in statistical control.

NOTE 5: Within the context of this practice, a test method can be in statistical control (that is, mean is stable, under common cause variations), but can be biased.

NOTE 6: Generally, sites with $\sigma_{SITE}$ nominally less than 0.25 R, or equivalently, site precision (2.77 $\sigma_{SITE}$) less than 0.69 R (R is the published test method reproducibility, if available) are considered to be reasonably proficient in controlling the common cause or random variations associated with the execution of the test method.

6. Procedure

6.1 Confirm
the usefulness of the CS by assessing the ratio $SE_{ARV}/\sigma_{SITE}$.

NOTE 7: A ratio of less than or equal to 0.5 is considered useful.

6.2 Calculate $\varepsilon = \sqrt{\sigma_{SITE}^2 + SE_{ARV}^2}$.

6.3 Specify the required Type I error rate.

NOTE 8: A suggested starting value is 0.05.

6.4 Specify required $\Delta$.

NOTE 9: The magnitude of $\Delta$ is usually specified based on nonstatistical considerations such as business risks or operational issues, or both.

6.5 Calculate $\Delta_S = \Delta/\varepsilon$.

6.6 See Table 1.

6.7 Look across the row with the $\Delta_S$ values and identify the column with a $\Delta_S$ value closest to the $\Delta_S$ calculated in 6.5.

6.8 Look down the column identified in 6.7 and locate the
row with the value in Column A closest to the required Type I error. The value in the cell where the row and column intersect is the power of detection.

6.9 If the power of detection is not acceptable (typically it will be too low), iteratively change one or all of the following until all requirements are met.

6.9.1 Type I error.

6.9.2 Delta ($\Delta$).

6.9.3 Power of bias detection.

NOTE 10: For a single implementation of the test method, the power of bias detection will depend on the magnitude of $\Delta$ specified, the total uncertainty $\varepsilon$, and the specified Type I error rate. For a fixed magnitude of $\Delta$, power of bias detection (of magnitude $\Delta$) can be increased at the expense of an increase in Type I error rate. For a fixed Type I error, power of bias detection will increase as the magnitude of $\Delta$ increases.

6.10 Use the appropriate $k$ value from Column B of Table 1 that met the specified Type I error and power of bias detection to calculate the boundaries of the acceptable tolerance zone.

6.11 Construct the acceptable tolerance zone: $0 \pm k\varepsilon$.

6.12 When a single test result $X$ for a CS is obtained, calculate the quantity ($X - ARV$).

6.13 If $X - ARV$ falls inside the acceptable tolerance zone inclusively, accept the presumed hypothesis that the laboratory is performing the test method without bias.

6.14 If $X - ARV$ falls outside the acceptable tolerance zone on the positive side, reject the presumed hypothesis that the laboratory is performing the test method without bias, and conclude that there is evidence to suggest the laboratory is performing the test method with a positive bias of at least the magnitude $\Delta$.

6.15 If $X - ARV$ falls outside the acceptable tolerance zone on the negative side, reject the presumed hypothesis that the laboratory is performing the test method without bias, and conclude that there is evidence to suggest the laboratory is performing the test method with a negative bias of at least the magnitude $\Delta$.

7. Keywords

7.1 accepted reference value; bias; check standard; consensus; power of test; probability of bias detection; type I error; type II error

TABLE 1 Type I Error and Associated Power of Bias Detection for Various $\Delta_S$ Values
$\Delta_S$ = magnitude of bias expressed in units of $\varepsilon$ (see 6.5)
$\Delta_S$ column headings: 0.5, 0.75, 1, 1.25, 1.5, 1.75, 2, 2.25, 2.5, 2.75, 3, 3.25, 3.5, 4
Column A: Type I Error. Column B: $k$. Remaining columns: power of correctly detecting $\Delta$ (expressed as $\Delta_S$) in either direction.
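The calculations in Section 6 can be sketched in code as follows. This is a minimal, non-normative sketch and not part of the practice. It assumes that k in Column B is the two-sided standard normal quantile for the chosen Type I error (consistent with the normal model in 1.4 and the 1.96 multiplier in Eq 3), and that power is the probability of exceeding the tolerance limit on the side of the true bias. Neither assumption can be checked against the Table 1 entries here, so the table remains the reference. All function names and the numerical inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, per the Gaussian model assumed in 1.4


def standard_error_arv(sigma_ccs: float, n_results: int) -> float:
    """Eq (2): standard error of the ARV for a consensus check standard."""
    return sigma_ccs / sqrt(n_results)


def total_uncertainty(sigma_site: float, se_arv: float) -> float:
    """Eq (4): root-sum-of-squares of site precision std. dev. and SE of ARV."""
    return sqrt(sigma_site**2 + se_arv**2)


def k_factor(type_i_error: float) -> float:
    """ASSUMPTION: k is the two-sided standard normal quantile (0.05 gives about 1.96)."""
    return _STD_NORMAL.inv_cdf(1.0 - type_i_error / 2.0)


def power_of_detection(delta_s: float, k: float) -> float:
    """ASSUMPTION: probability that (X - ARV) exceeds the limit on the side of
    the true bias, when the true bias equals delta_s in units of total uncertainty."""
    return 1.0 - _STD_NORMAL.cdf(k - delta_s)


def assess(x: float, arv: float, k: float, epsilon: float) -> str:
    """Steps 6.12 to 6.15: compare (X - ARV) with the inclusive zone 0 +/- k*epsilon."""
    diff = x - arv
    limit = k * epsilon
    if diff > limit:
        return "positive bias"
    if diff < -limit:
        return "negative bias"
    return "no bias detected"


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    sigma_site, sigma_ccs, n_results = 0.30, 0.60, 16
    delta, alpha = 0.70, 0.05
    arv, x = 91.2, 91.9

    se_arv = standard_error_arv(sigma_ccs, n_results)
    ratio = se_arv / sigma_site              # step 6.1, compare with 0.5 (Note 7)
    eps = total_uncertainty(sigma_site, se_arv)  # step 6.2
    delta_s = delta / eps                    # step 6.5
    k = k_factor(alpha)                      # stands in for Column B lookup
    power = power_of_detection(delta_s, k)   # stands in for steps 6.7 and 6.8

    print(f"SE_ARV/sigma_SITE = {ratio:.2f}")
    print(f"total uncertainty = {eps:.4f}, standardized delta = {delta_s:.2f}")
    print(f"k = {k:.3f}, tolerance zone = 0 +/- {k * eps:.3f}")
    print(f"approximate power = {power:.2f}")
    print(f"decision: {assess(x, arv, k, eps)}")
```

Computing k and power from the normal distribution rather than reading them from Table 1 avoids rounding the standardized delta to the nearest tabulated column, but where the two disagree the tabulated values govern.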