TAPPI/ANSI T 1205 sp-14

TENTATIVE STANDARD 1963
RECOMMENDED PRACTICE 1982
REVISED 1987
REVISED 1992
REVISED 2000
REVISED 2005
REAFFIRMED 2009
REVISED 2014
2014 TAPPI

The information and data contained in this document were prepared by a technical committee of the Association. The committee and the Association assume no liability or responsibility in connection with the use of such information or data, including but not limited to any liability under patent, copyright, or trade secret laws. The user is responsible for determining that this document is the most recent edition published.

Approved by the Standard Specific Interest Group for this Standard Practice
TAPPI

CAUTION: This Test Method may include safety precautions which are believed to be appropriate at the time of publication of the method. The intent of these is to alert the user of the method to safety issues related to such use. The user is responsible for determining that the safety precautions are complete and are appropriate to their use of the method, and for ensuring that suitable safety practices have not changed since publication of the method. This method may require the use, disposal, or both, of chemicals which may present serious health hazards to humans. Procedures for the handling of such substances are set forth on Material Safety Data Sheets which must be developed by all manufacturers and importers of potentially hazardous chemicals and maintained by all distributors of potentially hazardous chemicals. Prior to the use of this method, the user must determine whether any of the chemicals to be used or disposed of are potentially hazardous and, if so, must follow strictly the procedures specified by both the manufacturer, as well as local, state, and federal authorities for safe use and disposal
of these chemicals.

Dealing with suspect (outlying) test determinations

1. Scope

1.1 This TAPPI Standard Practice provides a procedure for judging whether suspect test determinations should be investigated further for possible rejection. A suspect determination (apparent outlier) is one that appears to deviate markedly from other determinations on the same sample of material. An outlying determination (outlier) is a suspect determination for which the deviation has, in fact, been found to be significant using an appropriate statistical test.

1.2 Formal treatment of suspect test determinations, as specified in this document, is necessary only in critical situations (e.g., very critical research) or when required by a product specification or an official test method.

1.2.1 Formal treatment of suspect test determinations and test results is highly desirable in studies establishing the repeatability and reproducibility of a test method (see TAPPI T 1200 “Interlaboratory Evaluation of Test Methods to Determine TAPPI Repeatability and Reproducibility”).

1.3 Both nonstatistical and statistical rules for dealing with suspect test determinations are given. Basically, no test determination should be accepted, no matter how correct the value appears to be, if it is known that a faulty determination has been made, and no test determination should be completely rejected purely on a statistical significance test.

1.4 The statistical tests described in this practice have been selected from a
large number that are available. They apply to the simplest kind of experimental data, that is, replicate determinations of some property of a given sample of material.

NOTE 1: This practice applies to replicate test determinations, usually on several specimens taken under the same conditions and measured in a brief period of time. A test result, obtained in accordance with a TAPPI Test Method, is usually one or the average of two or more such test determinations (see definitions in TAPPI T 400 “Sampling and Accepting a Single Lot of Paper, Paperboard, Containerboard or Related Product”). This practice allows the examination and possible elimination of suspect test determinations (from sets of 3 to 30 determinations) before the calculation of the final test result.

NOTE 2: This practice may also be applied to suspect test results (by substituting the words “test results” for “test determinations” throughout this document), when a laboratory must evaluate a large shipment requiring the determination and calculation of several test results.

1.5 Three categories of suspect determinations are considered:

1.5.1 A single suspect determination;

1.5.2 Two suspect determinations, one the least and the other the greatest in the set of replicate determinations; and

1.5.3 Two suspect determinations, the two largest or the two smallest in the set.

2. Summary of procedure

2.1 First, nonstatistical reasons for rejecting or correcting test determinations are considered. Then the remaining apparent outliers are subjected to statistical tests of significance, and those found to be statistically significant are again examined for nonstatistical causes. Finally, appropriate action is taken as regards the further analysis
of the data in the presence of outlying test determinations for which no cause can be determined.

2.2 Only statistical tests relatively simple to calculate are included in this standard practice. One group of tests involving the Dixon criteria avoids calculation of the standard deviation (s) and permits quick judgment. Since s is now easily obtainable on a pocket or desk calculator, other tests are included that do require a calculation of s; these are generally more powerful in detecting outliers, except when the Dixon test avoids secondary outliers included in the other tests.

2.3 The tables reproduced in this recommended practice are limited to 25 or 30 replicate determinations.

2.3.1 Generally, it is not useful to make a larger number of determinations in order to reduce the effect of random errors or sample variability, because the effect of a constant error present in all the determinations will usually become more important than the residual effect of the random errors. However, if more than 25 or 30 replicate determinations have been made, the data may be divided into groups of less than 25 or 30 and each group separately tested for outliers.

2.3.2 The tables are based on an assumed underlying normal (Gaussian) distribution (1) of test determinations, and those involving the standard deviation assume that the estimate of the standard deviation must come from the group of replicate determinations being tested. If the distribution is not normal but is of the same general bell shape, the level of risk of an erroneous conclusion (see 4.2.1.3) will change somewhat but not seriously. However, if the distribution is markedly skewed, test determinations on the long-tailed side of the distribution will be disproportionately selected out. If the tail is on the high side, as for folding endurance, a log transformation of the data prior to application of the tests for outliers is suggested in order to induce a more normal distribution of the data.

3. Significance

Even a single aberrant determination among a number of replicate test determinations of a property of a sample of material may lead to an incorrect conclusion about the nature of the material. On the other hand, discarding determinations because they appear to be aberrant, when in fact they are not, can also result in false conclusions. This standard practice provides a procedure for reducing these two
dangers.

4. Procedure

4.1 Nonstatistical considerations

4.1.1 When it is clearly known that a deviation from the prescribed test method (e.g., a blunder) has taken place, discard the resultant determination even when it appears to agree with the rest of the data. Such a deviation might be the accidental rubbing of the observer's finger against the swinging pendulum during the tearing strength test. However, correct and retain the determination if a reliable procedure for doing so is available, such as correcting for a change in temperature.

4.2 Statistical tests for detecting outliers

4.2.1 General procedure

4.2.1.1 If the number of replicate test determinations in the set to be tested is greater than 25 (or 30), divide them into groups of less than 25 (or 30), depending on the statistical test to be used (see Tables 1-4).
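As an illustration only, the grouping step of 4.2.1.1 might be sketched in Python as follows. The function name `split_into_groups`, the `max_size` default of 25, and the `randomize`/`seed` options are assumptions made for this sketch and are not part of this practice; the appropriate limit depends on the table for the statistical test being used.

```python
import random


def split_into_groups(determinations, max_size=25, randomize=False, seed=None):
    """Split determinations into consecutive groups of at most max_size values.

    max_size is an assumed limit; use the limit of the applicable table.
    With randomize=True the values are shuffled before being grouped.
    """
    values = list(determinations)
    if randomize:
        # Seeded generator so a random assignment can be reproduced.
        random.Random(seed).shuffle(values)
    return [values[i:i + max_size] for i in range(0, len(values), max_size)]


# 60 determinations taken in the order obtained.
groups = split_into_groups(range(60), max_size=25)
print([len(g) for g in groups])  # prints [25, 25, 10]
```

Each resulting group would then be examined separately for outliers.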
Divide the determinations in the order in which they were obtained (i.e., first 25, second 25, etc.), unless trends (drifting, cycling, etc.) are observed, in which case assign the determinations at random to the groups.

4.2.1.2 Designate the number of determinations in the group to be tested by the letter n, and arrange the n determinations in order of magnitude: $x_1, x_2, x_3, \ldots, x_n$. The order of the arrangement may be either ascending or descending and the suspected determination(s) may be either the largest or the smallest, or the extremes, depending on the statistical test to be used, as explained
in the following sections.

4.2.1.3 Choose an appropriate significance level or risk (probability) of erroneously selecting a good determination. Unless otherwise required, it is conventional to use the 5% level.

4.2.1.4 The statistical test to be used depends on the number and location of the suspect test determinations. For each category of suspect determinations, the non-Dixon test is generally the more powerful in confirming outliers, but is frequently more sensitive to an error in selecting the category. An incorrect choice of statistical test will result in incorrect conclusions, as illustrated by the examples. Therefore, carefully examine the data (arranged as in 4.2.1.2) for suspect test determinations, and select the appropriate statistical test, as follows:

Category                                                          Dixon test   G statistic   w/s statistic   Modified Grubbs test
Single suspect determination                                      4.2.2        4.2.3         -               -
Two suspect determinations, the least and the greatest            4.2.4        -             4.2.5           -
Two suspect determinations, the two largest or two smallest       4.2.6        -             -               4.2.7

4.2.1.5 Calculate the statistic required by the statistical test to be used and compare it with the critical value given in the table for the chosen level of risk and the number n of replicate test determinations. If the statistic exceeds the critical value, conclude that the suspected determination(s) has (have) been confirmed to be an outlier; except that in 4.2.7, if the statistic is less than the critical value, conclude that the suspicion is confirmed.

4.2.2 The Dixon test for a single suspect determination

4.2.2.1 With the data arranged as in 4.2.1.2, assume that $x_1$ is the suspect test determination. Depending on the number of test determinations, calculate the appropriate statistic $r_1$, $r_2$, $r_3$, or $r_4$ (see Table 1, from Dixon (2)). Note that the statistic compares
the distance of this suspect determination from its neighbors with the range of all of the n determinations (n less than or equal to 7), or with all but one, two, or three of the n determinations (n equal to or greater than 8).

4.2.2.2 Compare the calculated value of r with the tabulated critical value (Table 1), confirming or rejecting the suspicion of an outlying test determination.

4.2.2.3 Examples:

(a) Five test determinations yielded the following values (rearranged in decreasing order of magnitude): 0.1064, 0.1057, 0.1056, 0.1055, and 0.1053. The determination 0.1064 is suspect. Since n = 5, the statistic $r_1$ applies (Table 1):

\[ r_1 = \frac{0.1057 - 0.1064}{0.1053 - 0.1064} = 0.636 \]

At the 5% risk level, the critical value for n = 5 is 0.642. The calculated statistic $r_1$ does not exceed this tabulated critical value, so it is concluded that the determination 0.1064 is not an outlier.

(b) Fourteen test determinations yielded the following values (rearranged in increasing order of magnitude): 0.6, 2.0, 2.0, 2.1, 2.1, 2.1, 2.2, 2.2, 2.2, 2.3, 2.3, 2.3, 3.0 and 4.0. Since n = 14, the statistic $r_4$ applies (Table 1). To check the inclusion of 0.6:

\[ r_4 = \frac{2.0 - 0.6}{2.3 - 0.6} = 0.824 \]

At the 5% risk level, the critical value for n = 14 is 0.546. The calculated statistic $r_4$ exceeds this tabulated critical value, so it is concluded that the determination 0.6 is an outlier. Indeed, it is an outlier even at the 1% risk level.

4.2.3 The ratio of deviation to standard deviation test for a single suspect determination (G-statistic)

4.2.3.1 With the data arranged as in 4.2.1.2, assume that $x_1$ is the suspect determination. Calculate the mean $\bar{x}$, the standard deviation s, and the statistic G, as follows:

\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \]

\[ s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} \]

\[ G = \frac{|\bar{x} - x_1|}{s} \]

The G-statistic is more powerful than the Dixon statistic for the single-outlier case.

4.2.3.2 Compare the calculated value of G with the tabulated critical value (Table 2, from Grubbs (3) and Grubbs and Beck (4)), confirming or rejecting the suspicion of an outlying test determination.

4.2.3.3 Examples:

(a) Using the same example as in 4.2.2.3 (a):

\[ \bar{x} = 0.1057, \qquad s = 0.0004183 \]

\[ G = \frac{|0.1057 - 0.1064|}{0.0004183} = 1.673 \]

At the 5% risk level, the critical value for n = 5 is 1.672. The calculated statistic G exceeds this tabulated critical value, so it is concluded that the determination 0.1064
is an outlier. It will be noted that this conclusion reverses that of the Dixon test. These results illustrate how borderline cases may be rejected under one test and accepted under another. Since the G-statistic is the best one to use for the single-outlier case, it should be used for the final statistical judgment.

(b) Using the same example as in 4.2.2.3 (b):

\[ \bar{x} = 2.2429, \qquad s = 0.7101 \]

\[ G = \frac{|2.2429 - 0.6|}{0.7101} = 2.314 \]

At the 5% risk level, the critical value for n = 14 is 2.371. The calculated statistic G does not exceed this tabulated critical value, so it is concluded that the determination 0.6 is not an outlier. Normally the more powerful G-statistic should have shown an outlier if the Dixon test does. However, closer examination of the data shows that the G-statistic was improperly applied to the case since the data include more than one suspect determination, the others (3.0 and 4.0) at the other extreme being avoided in this case by the Dixon test.

4.2.4 The Dixon test applied to two suspect determinations, the least and the greatest

4.2.4.1 With the data arranged as in 4.2.1.2, assume that $x_1$ and $x_n$ are the suspect determinations.

4.2.4.2 If n is greater than seven, the Dixon test excludes the opposite extreme so that each extreme may be separately tested following the procedure of 4.2.2.

4.2.4.3 If n is less than or equal to seven, examine the data to see which of the two extremes is farthest from its neighbor. Temporarily omit this extreme from the analysis, reduce n by one, and apply the procedure of 4.2.2 to the remaining determinations. If the second extreme is thereby confirmed as an outlier, the first extreme may also be accepted as