Designation: E2586 – 12
An American National Standard
Standard Practice for Calculating and Using Basic Statistics¹
This standard is issued under the fixed designation E2586; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.
1. Scope
1.1 This practice covers methods and equations for computing and presenting basic descriptive statistics using a set of sample data containing a single variable. This practice includes simple descriptive statistics for variable data, tabular and graphical methods for variable data, and methods for summarizing simple attribute data. Some interpretation and guidance for use is also included.
1.2 The system
of units for this practice is not specified. Dimensional quantities in the practice are presented only as illustrations of calculation methods. The examples are not binding on products or test methods treated.
1.3 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety and health practices and determine the applicability of regulatory limitations prior to use.
2. Referenced Documents
2.1 ASTM Standards:²
E178 Practice for Dealing With Outlying Observations
E456 Terminology Relating to Quality and Statistics
E2282 Guide for Defining the Test Result of a Test Method
2.2 ISO Standards:³
ISO 3534-1 Statistics – Vocabulary and Symbols, part 1: Probability and General Statistical Terms
ISO 3534-2 Statistics – Vocabulary and Symbols, part 2: Applied Statistics
3. Terminology
3.1 Definitions:
3.1.1 Unless otherwise noted, terms relating to quality and statistics are as defined in Terminology E456.
3.1.2 characteristic, n: a property of items in a sample or population which, when measured, counted, or otherwise observed, helps to distinguish among the items. E2282
3.1.3 coefficient of
variation, CV, n: for a nonnegative characteristic, the ratio of the standard deviation to the mean for a population or sample.
3.1.3.1 Discussion: The coefficient of variation is often expressed as a percentage.
3.1.3.2 Discussion: This statistic is also known as the relative standard deviation, RSD.
3.1.4 confidence bound, n: see confidence limit.
3.1.5 confidence coefficient, n: see confidence level.
3.1.6 confidence interval, n: an interval estimate [L, U] with the statistics L and U as limits for the parameter $\theta$ and with confidence level $1-\alpha$, where $\Pr(L \le \theta \le U) \ge 1-\alpha$.
3.1.6.1 Discussion: The confidence level, $1-\alpha$, reflects the proportion of cases that the confidence interval [L, U] would contain or cover the true parameter value in a series of repeated random samples under identical conditions. Once L and U are given values, the resulting confidence interval either does or does not contain it. In this sense "confidence" applies not to the particular interval but only to the long run proportion of cases when repeating the procedure many times.
3.1.7 confidence level, n: the value, $1-\alpha$, of the probability associated with a confidence interval, often expressed as a percentage.
3.1.7.1 Discussion: $\alpha$ is generally a small number. Confidence level is often 95 % or 99 %.
¹ This practice is under the jurisdiction of ASTM Committee E11 on Quality and Statistics and is the direct responsibility of Subcommittee E11.10 on Sampling / Statistics.
Current edition approved Jan. 1, 2012. Published February
2012. Originally approved in 2007. Last previous edition approved in 2010 as E2586 – 10a. DOI: 10.1520/E2586-12.
² For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard's Document Summary page on the ASTM website.
³ Available from American National Standards Institute (ANSI), 25 W. 43rd St., 4th Floor, New York, NY 10036, http://www.ansi.org.
This document is not an ASTM standard and is intended only to provide the user of an ASTM standard an indication of what changes have been made to the previous version. Because it may not be technically possible to adequately depict all changes accurately, ASTM recommends that users consult prior editions as appropriate. In all cases only the current version of the standard as published by ASTM is to be considered the official document.
Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959, United States.
3.1.8 confidence limit, n: each of the limits, L and U, of a confidence interval, or the limit of a one-sided confidence interval.
3.1.9 degrees of freedom, n: the number of independent data points minus the number of parameters that have to be estimated before calculating the variance.
3.1.9.1 Discussion: The term degrees of freedom is best defined in the specific context of its use. For a general discussion, the following comments were reprinted from Box,
Hunter, and Hunter,
3.1.10 estimate, n: sample statistic used to approximate a population parameter.
3.1.11 histogram, n: graphical representation of the frequency distribution of a characteristic consisting of a set of rectangles with area proportional to the frequency. ISO 3534-1
3.1.11.1 Discussion: While not required, equal bar or class widths are recommended for histograms.
3.1.12 interquartile range, IQR, n: the 75th percentile (0.75 quantile) minus the 25th percentile (0.25 quantile), for a data set.
3.1.13 kurtosis, $\gamma_2$, $g_2$, n: for a population or a sample, a measure of the weight of the tails of a distribution relative to the center, calculated as the ratio of the fourth central moment (empirical if a sample, theoretical if a population applies) to the standard deviation (sample, $s$, or population, $\sigma$) raised to the fourth power, minus 3 (also referred to as excess kurtosis).
3.1.14 mean, n: of a population, $\mu$, average or expected value of a characteristic in a population; of a sample, $\bar{x}$, sum of the observed values in the sample divided by the sample size.
3.1.15 median, $\tilde{X}$, n: the 50th percentile in a population or sample.
3.1.15.1 Discussion: The sample median is the (n + 1)/2 order statistic if the sample size n is odd and is the average of the n/2 and n/2 + 1 order statistics if n is even.
3.1.16 midrange, n: average of the minimum and maximum values in a sample.
3.1.17 order statistic, $x_{(k)}$, n: value of the kth observed value in a sample after sorting by
order of magnitude.
3.1.17.1 Discussion: For a sample of size n, the first order statistic $x_{(1)}$ is the minimum value, $x_{(n)}$ is the maximum value.
3.1.18 parameter, n: see population parameter.
3.1.19 percentile, n: quantile of a sample or a population, for which the fraction less than or equal to the value is expressed as a percentage.
3.1.20 population, n: the totality of items or units of material under consideration.
3.1.21 population parameter, n: summary measure of the values of some characteristic of a population. ISO 3534-2
3.1.22 statistic, n: see sample statistic.
3.1.23 quantile, n: value such that a fraction f of the sample or population is less than or equal to that value.
3.1.24 range, R, n: maximum value minus the minimum value in a sample.
3.1.25 sample, n: a group of observations or test results, taken from a larger collection of observations or test results, which serves to provide information that may be used as a basis for making a decision concerning the larger collection.
3.1.26 sample size, n, n: number of observed values in the sample.
3.1.27 sample statistic, n: summary measure of the observed values of a sample.
3.1.28 skewness, $\gamma_1$, $g_1$, n: for population or sample, a measure of symmetry of a distribution, calculated as the ratio of the third central moment (empirical if a sample, and theoretical if a population applies) to the standard deviation (sample, $s$, or population, $\sigma$) raised to the third power.
3.1.29
standard error: standard deviation of the population of values of a sample statistic in repeated sampling, or an estimate of it.
3.1.29.1 Discussion: If the standard error of a statistic is estimated, it will itself be a statistic with some variance that depends on the sample size.
3.1.30 standard deviation: of a population, $\sigma$, the square root of the average or expected value of the squared deviation of a variable from its mean; of a sample, $s$, the square root of the sum of the squared deviations of the observed values in the sample divided by the sample size minus 1.
3.1.31 variance, $\sigma^2$, $s^2$, n: square of the standard deviation of the population or sample.
3.1.31.1 Discussion: For a finite population, $\sigma^2$ is calculated as the sum of squared deviations of values from the mean, divided by n. For a continuous population, $\sigma^2$ is calculated by integrating $(x - \mu)^2$ with respect to the density
function. For a sample, $s^2$ is calculated as the sum of the squared deviations of observed values from their average divided by one less than the sample size.
3.1.32 Z-score, n: the observed value minus the sample mean, divided by the sample standard deviation.
4. Significance and Use
4.1 This practice provides approaches for characterizing a sample of n observations that arrive in the form of a data set. Large data sets from organizations, businesses, and governmental agencies exist in the form of records and other empirical observations. Research institutions and laboratories at universities, government agencies, and the private sector also generate considerable amounts of empirical data.
4.1.1 A data set containing a single variable usually consists of a column of numbers. Each row is a separate observation or instance of measurement of the variable. The numbers themselves are the result of applying the measurement process to the variable being studied or observed. We may refer to each observation of a variable as an item in the data set. In many situations, there may be several variables defined for study.
4.1.2 The sample is selected from a larger set called the population. The population
can be a finite set of items, a very large or essentially unlimited set of items, or a process. In a process, the items originate over time and the population is dynamic, continuing to emerge and possibly change over time. Sample data serve as representatives of the population from which the sample originates. It is the population that is of primary interest in any particular study.
4.2 The data (measurements and observations) may be of the variable type or the simple attribute type. In the case of attributes, the data may be either binary trials or a count of a defined event over some interval (time, space, volume, weight, or area). Binary trials consist of a sequence of 0s and 1s in which a “1” indicates that the inspected item exhibited the attribute being studied and a “0” indicates the item did not exhibit the attribute. Each inspection item is assigned either a “0” or a “1.” Such data are often governed by the binomial distribution. For a count of events over some interval, the number of times the event is observed on the inspection interval is recorded for each of n inspection intervals. The Poisson distribution often governs counting events over an interval.
4.3 For sample data to be
used to draw conclusions about the population, the process of sampling and data collection must be considered, at least potentially, repeatable. Descriptive statistics are calculated using real sample data that will vary in repeating the sampling process. As such, a statistic is a random variable subject to variation in its own right. The sample statistic usually has a corresponding parameter in the population that is unknown (see Section 5). The point of using a statistic is to summarize the data set and estimate a corresponding population characteristic or parameter.
4.4 Descriptive statistics consider numerical, tabular, and graphical methods for summarizing a set of data. The methods considered in this practice are used for summarizing the observations from a single variable.
4.5 The descriptive statistics described in this practice are:
4.5.1 Mean, median, min, max, range, mid range, order
statistic, quartile, empirical percentile, quantile, interquartile range, variance, standard deviation, Z-score, coefficient of variation, skewness and kurtosis, and standard error.
4.6 Tabular methods described in this practice are:
4.6.1 Frequency distribution, relative frequency distribution, cumulative frequency distribution, and cumulative relative frequency distribution.
4.7 Graphical methods described in this practice are:
4.7.1 Histogram, ogive, boxplot, dotplot, normal probability plot, and q-q plot.
4.8 While the methods described in this practice may be used to summarize any set of observations, the results obtained by using them may be of little value from the standpoint of interpretation unless the data quality is acceptable and satisfies certain requirements. To be useful for inductive generalization, any sample of observations that is treated as a single group for presentation purposes must represent a series of measurements, all made under essentially the same test conditions, on a material or product, all of which have been produced under essentially the same conditions. When these criteria are met, we are minimizing the danger of mixing two or more distinctly different sets of data.
4.8.1 If a given collection of data consists of two or more samples collected under different test conditions or representing material produced under different conditions (that is, different populations), it should be considered as two or more separate subgroups of observations, each to be treated independently in a data analysis program. Merging of such subgroups, representing significantly different conditions, may lead to a presentation that will be of little practical value. Briefly, any sample of observations to which these methods are applied should be homogeneous or, in the
case of a process, have originated from a process in a state of statistical control.
4.9 The methods developed in Sections 6, 7, and 8 apply to the sample data. There will be no misunderstanding when, for example, the term “mean” is indicated, that the meaning is sample mean, not population mean, unless indicated otherwise. It is understood that there is a data set containing n observations. The data set may be denoted as:
$x_1, x_2, x_3, \ldots, x_n$ (1)
4.9.1 There is no order of magnitude implied by the subscript notation unless subscripts are contained in parentheses (see 6.7).
5. Characteristics of Populations
5.1 A population is the totality of a set of items under consideration. Populations may be finite or unlimited in size and may be existing or continuing to emerge as, for example, in a process. For continuous vari
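
As an illustration of the sample statistics defined in 3.1, the following Python sketch computes several of them for a single-variable data set. The function name describe and the example values are hypothetical and do not come from this practice. The empirical central moments are computed with a divisor of n, which is an assumption, because the terminology above does not fix that divisor. Percentile-based statistics such as the IQR are left out because this section does not specify a percentile convention.

```python
import math


def describe(x):
    """Basic sample statistics for one variable (hypothetical helper).

    Assumes at least two observations, a nonzero sample mean (for the CV),
    and a nonzero sample standard deviation (for Z-scores and moment ratios).
    """
    n = len(x)
    xs = sorted(x)                      # order statistics x(1) <= ... <= x(n)
    xbar = sum(xs) / n                  # sample mean

    # Sample median: the (n + 1)/2 order statistic if n is odd, otherwise
    # the average of the n/2 and n/2 + 1 order statistics (1-based indices).
    if n % 2 == 1:
        med = xs[(n + 1) // 2 - 1]
    else:
        med = (xs[n // 2 - 1] + xs[n // 2]) / 2

    # Sample variance: sum of squared deviations divided by n - 1.
    var = sum((v - xbar) ** 2 for v in xs) / (n - 1)
    s = math.sqrt(var)

    # Empirical third and fourth central moments. Divisor n is an assumption.
    m3 = sum((v - xbar) ** 3 for v in xs) / n
    m4 = sum((v - xbar) ** 4 for v in xs) / n

    return {
        "n": n,
        "mean": xbar,
        "median": med,
        "min": xs[0],
        "max": xs[-1],
        "range": xs[-1] - xs[0],
        "midrange": (xs[0] + xs[-1]) / 2,
        "variance": var,
        "std_dev": s,
        "cv": s / xbar,                      # coefficient of variation (RSD)
        "skewness": m3 / s ** 3,
        "kurtosis": m4 / s ** 4 - 3,         # excess kurtosis
        "z_scores": [(v - xbar) / s for v in x],  # in the original data order
    }


data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up example values
stats = describe(data)
print(stats["mean"], stats["median"], stats["std_dev"])
```

The Z-scores are returned in the original observation order, while the median, minimum, and maximum are read from the sorted copy, which mirrors the distinction between $x_i$ and $x_{(k)}$ in 4.9.1.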