UDC 519.2 : 31 1.131
DEUTSCHE NORM
March 1985

DIN 53 804 Part 4
Statistical interpretation of data
Attribute characteristics

Statistische Auswertungen; Attributmerkmale

In keeping with current practice in standards published by the International Organization for Standardization (ISO), a comma has been used throughout as the decimal marker.

Contents
1 Scope and field of application
2 Concepts
3 Binomial distribution
4 Characteristic value of the sample
5 Estimated value and confidence interval for the proportion p of the population
6 Testing of proportions for binomial distribution
6.1 Comparison of a proportion with a specified value
6.2 Comparison of two proportions
6.2.1 Graphical method
6.2.2 Computational method
7 Nomographs and tables
7.1 Confidence interval for a binomial distribution with a confidence level $1 - \alpha = 0{,}95$
7.2 Confidence interval for a binomial distribution with a confidence level $1 - \alpha = 0{,}99$
7.3 Cumulative distribution function of the binomial distribution
7.4 Standardized normal distribution
7.5 One-sided F distribution
7.6 Two-sided F distribution
Appendix A Examples from textile technology
A.1 Calculation of the confidence interval for the proportion p of yarn spools without yarn reserve
A.2 Comparison of proportion p with a specified value for yarn spools without yarn reserve
A.3 Comparison of two proportions; graphical method; yarn spools without yarn reserve
A.4 Comparison of two proportions; computational method; difference in degree of mercerizing
Appendix B Key to symbols used
Standards and other documents referred to
Explanatory notes

1 Scope and field of application

The properties of products and activities are differentiated by characteristics. Values of a suitable scale are allocated to the values of a characteristic. The scale values are
- any real numbers (as values of physical quantities) when measurable (continuous) characteristics are concerned;
- whole numbers (integers) when countable (discrete) characteristics are concerned;
- property categories which follow a ranking (e.g. smooth, somewhat creased, heavily creased) when ordinal characteristics are concerned;
- attributes (e.g. red, yellow, blue; present, not present; green, not green) when attribute characteristics are concerned.

Measurable or countable characteristics are designated as quantitative, whilst ordinal characteristics and attribute characteristics are designated as qualitative (assessable). These types of characteristics correspond to the fundamental concepts in metrology: measuring, counting, sorting and classifying (see DIN 1319 Part 1).

It is generally not reasonable to determine characteristic values from all the units of a population and therefore samples are taken and the characteristic values of the samples determined. Parameters describing the behaviour of the particular characteristic of the population are estimated from the
characteristic values of the sample. These estimated values are subject to a definable uncertainty. Hypotheses concerned with a population investigated by way of a sample can be checked by means of statistical tests.

The statistical methods used depend on the type of characteristic and hence also on the type of scale used. This series of standards is therefore issued in four Parts, with DIN 53 804 Part 1, Part 2 and Part 3 dealing with measurable, countable and ordinal characteristics.

This standard covers attribute characteristics. The particular forms (values, attributes) of such characteristics are represented on a nominal scale [1]. Each unit of a population has only one particular form of the attribute. The properties deriving from this characteristic are described by means of the proportions (relative frequencies) with which the particular form of the attribute occurs in the population. Often only one particular form of the attribute and the proportion it constitutes of the population are of interest.

This standard describes methods of statistical processing of the number $x$ of the $n$ units examined having the particular form of the attribute. It also deals with estimates and tests of the proportion of units of the population having the observed form of the attribute. The methodology is based on the binomial distribution.

Beuth Verlag GmbH, Berlin 30, has exclusive sale rights for German Standards (DIN-Normen).
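As a minimal sketch of the binomial model on which this methodology rests, the following Python fragment evaluates the probability of finding exactly k units with the attribute among n units, and the corresponding cumulative probability. The function names and the example values (n = 100, p = 0,03) are assumptions chosen for illustration only; they are not part of the standard.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k units with the attribute among n units."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def binom_cdf(k, n, p):
    """Probability of at most k units with the attribute among n units."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


# Illustrative values only: sample of n = 100 units, proportion p = 0.03.
n, p = 100, 0.03
for k in range(7):
    print(k, binom_pmf(k, n, p), binom_cdf(k, n, p))
```

The sum of `k * binom_pmf(k, n, p)` over all k reproduces the expectation n times p, which is a convenient sanity check for such a helper.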
For cases where the proportions of more than two attributes have to be estimated or tested simultaneously, reference should be made to the literature on multi-way tables and the multinomial distribution. If it is possible to arrange the various forms of
the attributes according to a ranking, it will be possible to carry out a statistical interpretation as described in DIN 53 804 Part 3 (ordinal characteristics).

DIN 40 080 deals with acceptance inspection by random sampling on the basis of qualitative characteristics (attribute testing), the attribute tested being nonconformance of units (the two forms of the attribute being: defective, nonconforming).

2 Concepts

The statistical concepts used in this standard are to be found in DIN 13 303 Part 1 and Part 2 and DIN 55 350 Part 12, Part 14 (at present at the stage of draft), Part 21, Part 22, Part 23, Part 24 and DIN 55 303 Part 2.

The sample size $n$ is the number of individual units examined.

The absolute frequency $x$ is the number of units out of $n$ units examined that exhibit the particular form of the attribute.

Note. This number $x$ corresponds to the individual value $x_i$ in the case of
measurable characteristics (DIN 53 804 Part 1); $x$ can only be an integer ($x = 0, 1, 2, \ldots, n$).

3 Binomial distribution

In this standard, it is assumed for calculation of the confidence interval and for testing hypotheses that the number of units in the sample having a particular form of the attribute follows a binomial distribution.

This condition of a binomial distribution will be met if $n$ units are taken and returned each time to the population of $N$ units. If units are taken and not returned to the population, the exact probability distribution will be a hypergeometric distribution. Provided $n/N$ is less than 0,1 (i.e. if the sample size is less than 10 % of the population), the binomial distribution can be used as a good approximation to the actual distribution.

The probability function of the binomial distribution shows the probability that the observed form of the attribute will occur
zero times, once, twice, ..., $k$ times, ..., $n$ times in the $n$ units inspected. The parameter $p$ of the binomial distribution shows the proportion in which the particular form of the attribute will occur on average in a sample of $n$ units. The expectation for the number of units having the particular form of the attribute in a sample of $n$ units is $n \cdot p$. The expectation for the relative frequency of units having the particular form of the attribute is $p$.

4 Characteristic value of the sample

The proportion of units in the sample found to have the particular form of the attribute is $x/n$. This is the observed relative frequency in the sample of $n$ units.

5 Estimated value and confidence interval for the proportion p of the population

The estimated value for the proportion $p$ is

$$\hat{p} = \frac{x}{n}. \tag{1}$$

The two-sided confidence interval $p_{\mathrm{un}} \le p \le p_{\mathrm{ob}}$ for $p$ at confidence level $1 - \alpha$ has the confidence limits

$$p_{\mathrm{ob}} = \frac{(x + 1)\, F_{f_1, f_2;\, 1-\alpha/2}}{(n - x) + (x + 1)\, F_{f_1, f_2;\, 1-\alpha/2}} \tag{2}$$

where $f_1 = 2(x + 1)$ and $f_2 = 2(n - x)$, and

$$p_{\mathrm{un}} = \frac{x}{x + (n - x + 1)\, F_{f_1, f_2;\, 1-\alpha/2}} \tag{3}$$

where $f_1 = 2(n - x + 1)$ and $f_2 = 2x$.

For the confidence levels $1 - \alpha = 0{,}95$ or $1 - \alpha = 0{,}99$, these values can be taken from the nomographs in figures 3 and 4. The diagram for reading off the values is given in figure 5.

For the special case $\hat{p} = 0$, i.e. $x = 0$, it is easier to use

$$p_{\mathrm{ob}} = 1 - \sqrt[n]{\alpha/2}. \tag{4}$$

For the special case $\hat{p} = 1$, i.e. $x = n$, it is simpler to use

$$p_{\mathrm{un}} = \sqrt[n]{\alpha/2}. \tag{5}$$

The one-sided upper confidence interval $0 \le p \le p_{\mathrm{ob}}$ for $p$ at confidence level $1 - \alpha$ has the confidence limit

$$p_{\mathrm{ob}} = \frac{(x + 1)\, F_{f_1, f_2;\, 1-\alpha}}{(n - x) + (x + 1)\, F_{f_1, f_2;\, 1-\alpha}} \tag{6}$$

where $f_1 = 2(x + 1)$ and $f_2 = 2(n - x)$. For the special case $\hat{p} = 0$, i.e. $x = 0$, it is simpler to use

$$p_{\mathrm{ob}} = 1 - \sqrt[n]{\alpha}. \tag{7}$$

The one-sided lower confidence interval $p_{\mathrm{un}} \le p \le 1$ for $p$ at confidence level $1 - \alpha$ has the following confidence limit

$$p_{\mathrm{un}} = \frac{x}{x + (n - x + 1)\, F_{f_1, f_2;\, 1-\alpha}} \tag{8}$$

where $f_1 = 2(n - x + 1)$ and $f_2 = 2x$. For the special case $\hat{p} = 1$, i.e. $x = n$, it is simpler to use

$$p_{\mathrm{un}} = \sqrt[n]{\alpha}. \tag{9}$$

Values of the F distribution for confidence level $1 - \alpha = 0{,}95$ can be obtained from tables 2 and 3. More detailed tables giving values for other values of $1 - \alpha$ can also be found in the literature, for example in [2] and [3].

The confidence limits can be calculated approximately on the assumption of a normal distribution. For $n \cdot \hat{p}\,(1 - \hat{p}) > 9$, equations (10) to (13)
give an approximation for rough calculations, of the form $\hat{p} \pm u_{1-\alpha/2} \sqrt{\hat{p}\,(1 - \hat{p})/n}$ for the two-sided limits; these equations can also be obtained from [5] if the error term in equation 19 of that reference is written as equal to zero, with $a = 0$.

The confidence limits can be calculated using equations (14) to (18) with a relative error of no more than 1 % for all values of $x$ and $n$ with $-3{,}2 \cdot \ln \alpha \le x \le n + 3{,}2 \cdot \ln \alpha$; for example, $9{,}6 \le x \le n - 9{,}6$ for $1 - \alpha = 0{,}95$.

For the two-sided case, the auxiliary value of equation (14) is calculated first, where $u_{1-\alpha}$ and $u_{1-\alpha/2}$ are values (quantiles) of the standardized normal distribution; this is used for calculating the corrected $x$ values (equations (15) and (16)).
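The relationship between the exact limits and the approximations discussed here can be explored numerically. The following Python sketch is illustrative only: it obtains the exact two-sided limits (Clopper and Pearson) by bisection on the binomial distribution function instead of reading F values from tables, evaluates the rough normal approximation, and checks the range $-3{,}2 \cdot \ln \alpha \le x \le n + 3{,}2 \cdot \ln \alpha$ quoted above. All function names are assumptions made for this sketch, and the bisection is a substitute numerical technique, not the procedure laid down in the standard.

```python
from math import comb, log, sqrt
from statistics import NormalDist


def binom_cdf(k, n, p):
    """P(X <= k) for a binomial distribution with parameters n and p."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def _bisect_decreasing(f, lo=0.0, hi=1.0, iterations=60):
    """Root of a decreasing function f on [lo, hi] with f(lo) > 0 > f(hi)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def exact_limits(x, n, alpha=0.05):
    """Exact two-sided limits (p_un, p_ob) at confidence level 1 - alpha."""
    # Upper limit: the p at which P(X <= x) has fallen to alpha/2.
    p_ob = 1.0 if x == n else _bisect_decreasing(
        lambda p: binom_cdf(x, n, p) - alpha / 2.0)
    # Lower limit: the p at which P(X >= x) has risen to alpha/2.
    p_un = 0.0 if x == 0 else _bisect_decreasing(
        lambda p: alpha / 2.0 - (1.0 - binom_cdf(x - 1, n, p)))
    return p_un, p_ob


def normal_limits(x, n, alpha=0.05):
    """Rough two-sided limits from the normal approximation."""
    p_hat = x / n
    u = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # quantile u_(1 - alpha/2)
    half_width = u * sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def within_one_percent_range(x, n, alpha=0.05):
    """True if -3.2 ln(alpha) <= x <= n + 3.2 ln(alpha)."""
    bound = -3.2 * log(alpha)
    return bound <= x <= n - bound


for x, n in [(6, 100), (12, 200)]:
    print(x, n, exact_limits(x, n), normal_limits(x, n),
          within_one_percent_range(x, n))
```

For x = 12 and n = 200 the normal approximation gives a lower limit close to the 0,0271 quoted in Appendix A.1, while the exact limits lie noticeably higher, which is the kind of discrepancy the refined equations are meant to reduce.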
Finally, the confidence limits are calculated from equations (17) and (18).

To calculate the one-sided confidence limits, $\alpha/2$ is replaced by $\alpha$, i.e. for the upper confidence limit in equations (14), (15) and (17), and for the lower confidence limit in equations (14), (16) and (18).

6 Testing of proportions for binomial distribution

6.1 Comparison of a proportion with a specified value

This comparison is used to check whether the proportion $p$ of units of the population having the particular form of the attribute is different from a specified value $p_0$ (for example a required value or a value obtained from past experience), i.e. the null hypothesis $H_0$:
$p = p_0$ is tested against the alternative hypothesis $H_1$: $p \ne p_0$. For this purpose, the significance level $\alpha$ is established.

To carry out the test, the two-sided confidence interval for $p$ is to be determined at the confidence level $1 - \alpha$, obtained from the sample of size $n$ and the estimated value $\hat{p}$ (see clause 5). For this purpose, the confidence level $1 - \alpha$ is to be calculated from the required significance level $\alpha$.

If the specified value $p_0$ is within the confidence interval, the null hypothesis $H_0$: $p = p_0$ is not to be rejected. If however the specified value $p_0$ is outside the confidence interval, the null hypothesis is to be rejected in favour of the alternative hypothesis $H_1$: $p \ne p_0$. The relationships are shown in figures 1 and 2.

Figure 1. Specified value within the confidence interval (axis: proportion $p$, with $p_{\mathrm{un}}$, $p_0$ and $p_{\mathrm{ob}}$ marked)

Figure 2. Specified value outside the confidence interval (axis: proportion $p$, with $p_{\mathrm{un}}$, $p_{\mathrm{ob}}$ and $p_0$ marked)

The null hypothesis $H_0$: $p = p_0$ is not to be rejected in figure 1 but is to be rejected in figure 2.

6.2 Comparison of two proportions

This comparison is used to test whether the proportions $p_1$ and $p_2$ of units having the particular form of the attribute in two populations are different, i.e. the null hypothesis $H_0$: $p_1 = p_2$ is to be tested against the alternative hypothesis $H_1$: $p_1 \ne p_2$. The significance level $\alpha$ is to be specified for this purpose. In addition, two samples of size $n_1$ and $n_2$ are taken independently from the two populations. $x_1$ and $x_2$ are the numbers of units with the required form of the attribute observed respectively in the two samples. If the null hypothesis $H_0$: $p_1 = p_2 = p$ is not rejected,

$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2} \tag{19}$$

is the best estimate of $p$.

There are two commonly used methods of making this comparison, i.e.
a) a graphical method for cases where $\hat{p} \le 0{,}2$: for
this purpose, the nomograph of the binomial distribution, described in [6] and shown in figure 6, is used;
b) a computational solution on the basis of a normal distribution for cases where $\hat{p} > 0{,}2$.

For the computational method, the test statistic

$$u = \frac{|\hat{p}_1 - \hat{p}_2|}{\sqrt{\hat{p}\,(1 - \hat{p})\, \dfrac{n_1 + n_2}{n_1 \cdot n_2}}} \tag{23}$$

is calculated. If $u > u_{1-\alpha/2}$, the null hypothesis is to be rejected in favour of the alternative hypothesis $H_1$: $p_1 \ne p_2$.

7 Nomographs and tables

7.1 Confidence interval for a binomial distribution with a confidence level $1 - \alpha = 0{,}95$

Figure 3. Nomograph for two-sided confidence interval for the proportion $p$ for a binomial distribution with a confidence level $1 - \alpha = 0{,}95$ (according to Clopper and Pearson), taken from [2]. Horizontal axis: relative frequency $\hat{p}$ in the sample.

7.2 Confidence interval for a binomial distribution with a confidence level $1 - \alpha = 0{,}99$

... the condition $n \cdot \hat{p}\,(1 - \hat{p}) > 9$ is met, at a confidence level $1 - \alpha = 0{,}95$ with $\hat{p} = 0{,}06$, $n = 200$ and $u_{0{,}975} = 1{,}96$, the following are
obtained:

$$p_{\mathrm{ob}} = 0{,}06 + 1{,}96 \cdot \sqrt{\frac{0{,}06 \cdot 0{,}94}{200}} = 0{,}0929 = 9{,}3\ \%$$

and

$$p_{\mathrm{un}} = 0{,}06 - 1{,}96 \cdot \sqrt{\frac{0{,}06 \cdot 0{,}94}{200}} = 0{,}0271 = 2{,}7\ \%.$$

As a comparison, the following are obtained in this case with equations (17) and (18):

$$p_{\mathrm{ob}} = \frac{12{,}03 + 1{,}92 + 1{,}96 \cdot \sqrt{12{,}03 - 0{,}73 + 0{,}96}}{200 - 0{,}95 + 3{,}84} = 0{,}1026 = 10{,}3\ \%$$

and

$$p_{\mathrm{un}} = \frac{11{,}03 + 1{,}92 - 1{,}96 \cdot \sqrt{11{,}03 - 0{,}61 + 0{,}96}}{200 - 0{,}95 + 3{,}84} = 0{,}0312 = 3{,}1\ \%.$$

As a further comparison, using the nomograph shown in figure 3 the following values are obtained: $p_{\mathrm{ob}}$ = approx. 10 % and $p_{\mathrm{un}}$ = approx. 3 %.

A.2 Comparison of proportion p with a specified value for yarn spools without yarn reserve

From experience over a number of years it is known that in deliveries of yarn there is likely to be a proportion $p_0 = 3\ \%$ of spools without yarn reserve; it is required to test on the basis of a sample of size $n = 100$ whether a particular delivery is in accordance with this experience, for a significance level $\alpha = 0{,}05$.

As described in subclause 6.1, the proportion $p$ is compared with a specified value $p_0$ by using the confidence interval. In example A.1 with $n = 100$ and $x = 6$, the confidence limits at confidence level 0,95 were obtained as $p_{\mathrm{un}} = 2{,}2\ \%$ and $p_{\mathrm{ob}} = 12{,}6\ \%$. Since $p_0$ lies between $p_{\mathrm{un}}$ and $p_{\mathrm{ob}}$, the null hypothesis $H_0$: $p = p_0$ is not to be rejected. The delivery therefore matches the results of experience with regard to the fraction of spools without yarn reserve.

A.3 Comparison of two proportions; graphical method; yarn spools without yarn reserve

The delivery of spools as described in example A.1 was obtained from two suppliers. On one particular day, the processor received a delivery of spools from each of the suppliers, which he wished to compare in respect of the presence of the yarn reserve at a significance level of $\alpha = 0{,}05$; the null hypothesis is $H_0$: $p_1 = p_2$. The processor took one sample from each delivery and found:

Supplier                                     A                             B
Sample size                                  $n_A = 80$                    $n_B = 72$
Number of spools without yarn reserve        $x_A = 5$                     $x_B = 8$
Proportion of spools without yarn reserve    $x_A/n_A = 0{,}0625 = 6{,}3\ \%$    $x_B/n_B = 0{,}1111 = 11{,}1\ \%$

From equation (19) this gives $\hat{p} = 0{,}0855$.

... $u_{0{,}995} = 2{,}58$, the null hypothesis shall be rejected in favour of the alternative hypothesis $H_1$: $p_1 \ne p_2$. This means that there are fewer properly mercerized fibres in the light regions of the yarn than in the dark regions.

Appendix B Key to symbols used

auxiliary value
number of degrees of freedom
tabulated values of the F distribution for $f_1$ and $f_2$