Designation: G 16 – 95 (Reapproved 2004)

Standard Guide for Applying Statistics to Analysis of Corrosion Data¹

This standard is issued under the fixed designation G 16; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.

1. Scope

1.1 This guide presents briefly some generally accepted methods of statistical analyses which are useful in the interpretation of corrosion test results.

1.2 This guide does not cover detailed calculations and methods, but rather covers a range of approaches which have found application in corrosion testing.

1.3 Only those statistical methods that have found wide acceptance in corrosion testing have been considered in this guide.

2. Referenced Documents

2.1 ASTM Standards:²
E 178 Practice for Dealing with Outlying Observations
E 380 Practice for Use of the International System of Units (SI) (the Modernized Metric System)³
E 691 Practice for Conducting an Interlaboratory Study to Determine the Precision of a Test Method
G 46 Guide for Examination and Evaluation of Pitting Corrosion

3. Significance and Use

3.1 Corrosion test results often show more scatter than many other types of tests because of a variety of factors, including the fact that minor impurities often play a decisive role in controlling corrosion rates. Statistical analysis can be very helpful in allowing investigators to interpret such results, especially in determining when test results differ from one another significantly. This can be a difficult task when a variety of materials are under test, but statistical methods provide a rational approach to this problem.

3.2 Modern data reduction programs in combination with computers have allowed sophisticated statistical analyses on data sets with relative ease. This capability permits investigators to determine if associations exist between many variables and, if so, to develop quantitative expressions relating the variables.

3.3 Statistical evaluation is a necessary step in the analysis of results from any procedure which provides quantitative information. This analysis allows confidence intervals to be estimated from the measured results.

4. Errors

4.1 Distributions: In the measurement of values associated with the corrosion of metals, a variety of factors act to produce measured values that deviate from expected values for the conditions that are present. Usually the factors which contribute to the error of measured values act in a more or less random way, so that the average of several values approximates the expected value better than a single measurement. The pattern in which data are scattered is called its distribution, and a variety of distributions are seen in corrosion work.

4.2 Histograms: A bar graph called a histogram may be used to display the scatter of the data. A histogram is constructed by dividing the range of data values into equal intervals on the abscissa axis and then placing a bar over each interval of a height equal to the number of data points within that interval. The number of intervals should be few enough so that almost all intervals contain at least three points; however, there should be a sufficient number of intervals to facilitate visualization of the shape and symmetry of the bar heights. Twenty intervals are usually recommended for a histogram. Because so many points are required to construct a histogram, it is unusual to find data sets in corrosion work that lend themselves to this type of analysis.

4.3 Normal Distribution: Many statistical techniques are based on the normal distribution. This distribution is bell-shaped and symmetrical. Use of analysis techniques developed for the normal distribution on data distributed in another manner can lead to grossly erroneous conclusions. Thus, before attempting data analysis, the data should either be verified as being scattered like a normal distribution, or a transformation should be used to obtain a data set which is approximately normally distributed. Transformed data may be analyzed statistically and the results transformed back to give the desired results, although the process of transforming the data back can create problems in terms of not having symmetrical confidence intervals.

¹ This guide is under the jurisdiction of ASTM Committee G01 on Corrosion of Metals and is the direct responsibility of Subcommittee G01.05 on Laboratory Corrosion Tests. Current edition approved May 1, 2004. Published May 2004. Originally approved in 1971. Last previous edition approved in 1999 as G 16 – 95 (1999)ε1.
² For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard's Document Summary page on the ASTM website.
³ Withdrawn.

Copyright ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959, United States.

4.4 Normal Probability Paper: If the histogram is not confirmatory in terms of the shape of the distribution, the data may be examined further to see if they are normally distributed by constructing a normal probability plot as follows (1).⁴

4.4.1 It is easiest to construct a normal probability plot if normal probability paper is available. This paper has one linear axis, and one axis which is arranged to reflect the shape of the cumulative area under the normal distribution. In practice, the “probability” axis has 0.5 or 50 % at the center, a number approaching 0 % at one end, and a number approaching 1.0 or 100 % at the other end. The marks are spaced far apart in the center and close together at the ends. A normal probability plot may be constructed as follows with normal probability paper.

NOTE 1: Data that plot approximately on a straight line on the probability plot may be considered to be normally distributed. Deviations from a normal distribution may be recognized by the presence of deviations from a straight line, usually most noticeable at the extreme ends of the data.

4.4.1.1 Number the data points starting at the largest negative value and proceeding to the largest positive value. The numbers of the data points thus obtained are called the ranks of the points.

4.4.1.2 Arrange the data in order: y(1), y(2), y(3), ...; these values are called the order statistics. Plot each point on the normal probability paper so that the linear axis reflects the value of the data, while the probability axis location is calculated by subtracting 0.5 from the number (rank) of that point and dividing by the total number of points in the data set.

NOTE 2: Occasionally two or more identical values are obtained in a set of results. In this case, each point may be plotted, or a composite point may be located at the average of the plotting positions for all the identical values.

4.4.2 If normal probability paper is not available, the location of each point on the probability plot may be determined as follows:

4.4.2.1 Mark the probability axis using linear graduations from 0.0 to 1.0.

4.4.2.2 For each point, subtract 0.5 from the rank and divide the result by the total number of points in the data set. This is the area to the left of that value under the standardized normal distribution. The cumulative distribution function is the number, always between 0 and 1, that is plotted on the probability axis.

4.4.2.3 The value of the data point defines its location on the other axis of the graph.
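
The ranking and plotting-position steps in 4.4.1.1 through 4.4.2.3 can be sketched in Python as shown below. This is a minimal illustration, not part of the guide: the function name `probability_plot_points` and the sample values in `rates` are hypothetical. Converting each plotting position to a standard normal score z is an added assumption. It is a common way to imitate the nonlinear axis of probability paper on ordinary linear axes, so that normally distributed data fall near a straight line when value is plotted against z.

```python
from statistics import NormalDist


def probability_plot_points(data):
    """Return a list of (value, plotting_position, z) tuples, smallest value first."""
    ordered = sorted(data)  # order statistics y(1), y(2), ..., y(n)
    n = len(ordered)
    std_normal = NormalDist()  # standardized normal: mean 0, standard deviation 1
    points = []
    for rank, value in enumerate(ordered, start=1):
        position = (rank - 0.5) / n       # area to the left of the value (4.4.2.2)
        z = std_normal.inv_cdf(position)  # assumption: z stands in for the paper's nonlinear axis
        points.append((value, position, z))
    return points


# Hypothetical corrosion rates, for illustration only.
rates = [3.1, 2.4, 5.0, 4.2]
for value, position, z in probability_plot_points(rates):
    print(f"{value:5.2f}  {position:5.3f}  {z:+.3f}")
```

Tied values are plotted individually here, which NOTE 2 permits; a composite point at the averaged plotting position would need a small extension.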

4.5 Other Probability Paper: If the histogram is not symmetrical and bell-shaped, or if the probability plot shows nonlinearity, a transformation may be used to obtain a new, transformed data set that may be normally distributed. Although it is sometimes possible to guess at the type of distribution by looking at the histogram, and thus determine the exact transformation to be used, it is usually just as easy to use a computer to calculate a number of different transformations and to check each for the normality of the transformed data. Some transformations based on known non-normal distributions, or that have been found to work in some situations, are listed as follows:

$$y = \log x \qquad y = \exp x$$
$$y = \sqrt{x} \qquad y = x^{2}$$
$$y = 1/x \qquad y = \sin^{-1}\sqrt{x/n}$$

where:
y = transformed datum,
x = original datum, and
n = number of data points.

Time to failure in stress corrosion cracking usually is best fitted with a log x transformation (2, 3).

Once a set of transformed data is found that yields an approximately straight line on a probability plot, the statistical procedures of interest can be carried out on the transformed data. Results, such as predicted data values or confidence intervals, must be transformed back using the reverse transformation.
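
The approach in 4.5 of computing several transformations and checking each for normality might look like the following Python sketch. Several choices here are assumptions rather than anything the guide prescribes: the straightness of the probability plot is scored with a plain correlation coefficient between the ordered transformed data and their normal scores, base-10 logarithms are used for log x, and the names `TRANSFORMS`, `pearson_r`, `straightness`, and the sample `hours` are hypothetical. The exp x and arcsine forms are left out because they do not suit this made-up data set.

```python
import math
from statistics import NormalDist, mean

# Candidate transformations; "identity" is the untransformed baseline.
TRANSFORMS = {
    "identity": lambda x: x,
    "log x": math.log10,
    "sqrt x": math.sqrt,
    "x^2": lambda x: x * x,
    "1/x": lambda x: 1.0 / x,
}


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def straightness(data, transform):
    """Correlation between the ordered transformed data and their normal scores."""
    ordered = sorted(transform(x) for x in data)
    n = len(ordered)
    scores = [NormalDist().inv_cdf((rank - 0.5) / n) for rank in range(1, n + 1)]
    return pearson_r(ordered, scores)


# Hypothetical times to failure in hours, for illustration only.
hours = [12.0, 18.0, 25.0, 40.0, 66.0, 110.0, 190.0, 410.0]
for name, transform in TRANSFORMS.items():
    print(f"{name:8s} r = {straightness(hours, transform):.3f}")

# Analyze on the log scale, then apply the reverse transformation (10**y) to report.
logs = [math.log10(h) for h in hours]
print("back-transformed average:", round(10 ** mean(logs), 1))
```

A value of r closer to 1 indicates a straighter probability plot. Limits computed symmetrically on the transformed scale generally become asymmetric after the reverse transformation, which is the difficulty noted in 4.3.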

4.6 Unknown Distribution: If there are insufficient data points, or if for any other reason the distribution type of the data cannot be determined, then two possibilities exist for analysis:

4.6.1 A distribution type may be hypothesized based on the behavior of similar types of data. If this distribution is not normal, a transformation may be sought which will normalize that particular distribution. See 4.5 above for suggestions. Analysis may then be conducted on the transformed data.

4.6.2 Statistical analysis procedures that do not require any specific data distribution type, known as non-parametric methods, may be used to analyze the data. Non-parametric tests do not use the data as efficiently.

4.7 Extreme Value Analysis: In the case of determining the probability of perforation by a pitting or cracking mechanism, the usual descriptive statistics for the normal distribution are not the most useful. In this case, Guide G 46 should be consulted for the procedure (4).

4.8 Significant Digits: Practice E 380 should be followed to determine the proper number of significant digits when reporting numerical results.

4.9 Propagation of Variance: If a calculated value is a function of several independent variables and those variables have errors associated with them, the error of the calculated value can be estimated by a propagation of variance technique. See Refs. (5) and (6) for details.

4.10 Mistakes: Mistakes either in carrying out an experiment or in calculations are not a characteristic of the population and can preclude statistical treatment of data or lead to erroneous conclusions if included in the analysis. Sometimes mistakes can be identified by statistical methods by recognizing that the probability of obtaining a particular result is very low.

4.11 Outlying Observations: See Practice E 178 for procedures for dealing with outlying observations.

⁴ The boldface numbers in parentheses refer to the list of references at the end of this guide.

5. Central Measures

5.1 It is accepted practice to employ several independent (replicate) measurements of any experimental quantity to improve the estimate of precision and to reduce the variance of the average value. If it is assumed that the processes operating to create error in the measurement are random in nature and are as likely to overestimate the true unknown value as to underestimate it, then the average value is the best estimate of the unknown value in question. The average value is usually indicated by placing a bar over the symbol representing the measured variable.

NOTE 3: In this standard, the term “mean” is reserved to describe a central measure of a population, while average refers to a sample.

5.2 If processes operate to exaggerate the magnitude of the error either in overestimating or underestimating the correct measurement, then the median value is usually a better estimate.

5.3 If the processes operating to create error affect both the probability and magnitude of the error, then other approaches must be employed to find the best estimation procedure. A qualified statistician should be consulted in this case.

5.4 In corrosion testing, it is generally observed that average values are useful in characterizing corrosion rates. In cases of penetration from pitting and cracking, failure is often defined as the first through penetration, and in these cases average penetration rates or times are of little value. Extreme value analysis has been used in these cases; see Guide G 46.

5.5 When the average value is calculated and reported as the only result in experiments when several replicate runs were made, information on the scatter of data is lost.

6. Variability Measures

6.1 Several measures of distribution variability are available which can be useful in estimating confidence intervals and making predictions from the observed data. In the case of normal distribution, a number of procedures are available and can be handled with computer programs. These measures include the following: variance, standard deviation, and coefficient of variation. The range is a useful non-parametric estimate of variability and can be used with both normal and other distributions.

6.2 Variance: Variance, σ², may be estimated for an experimental data set of n observations by computing the sample estimated variance, S², assuming all observations are subject to the same errors:

$$S^{2} = \frac{\sum d^{2}}{n - 1} \tag{1}$$

where:
d = the difference between the average and the measured value,
n - 1 = the degrees of freedom available.

Variance is a useful measure because it is additive in systems that can be described by a normal distribution; however, the dimensions of variance are square of units. A procedure known as analysis of variance (ANOVA) has been developed for data sets involving several factors at different levels in order to estimate the effects of these factors. (See Section 9.)

6.3 Standard Deviation: Standard deviation, σ, is defined as the square root of the variance. It has the property of having the same dimensions as the average value and the original measurements from which it was calculated and is generally used to describe the scatter of the observations.

6.3.1 Standard Deviation of the Average: The standard deviation of an average, $S_{\bar{x}}$, is different from the standard deviation of a single measured value, but the two standard deviations are related as in (Eq 2):

$$S_{\bar{x}} = \frac{S}{\sqrt{n}} \tag{2}$$

where:
n = the total number of measurements which were used to calculate the average value.

When reporting standard deviation calculations, it is important to note clearly whether the value reported is the standard deviation of the average or of a single value. In either case, the number of measurements should also be reported. The sample estimate of the standard deviation is S.

6.4 Coefficient of Variation: The population coefficient of variation is defined as the standard deviation divided by the mean. The sample coefficient of variation may be calculated as $S/\bar{x}$ and is usually reported in percent. This measure of variability is particularly useful in cases where the size of the errors is proportional to the magnitude of the measured value so that the coefficient of variation is approximately constant over a wide range of values.

6.5 Range: The range is defined as the difference between the maximum and minimum values in a set of replicate data.
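
The variability measures of Section 6 can be computed with a few lines of Python, as in the sketch below. It is offered as an illustration under stated assumptions: the function name `variability_summary`, the dictionary keys, and the `replicates` values are hypothetical, and the standard-library `statistics.variance` function is used because it divides by n - 1, matching Eq 1.

```python
import math
import statistics


def variability_summary(values):
    """Return sample variability measures for a list of replicate measurements."""
    n = len(values)
    average = statistics.fmean(values)
    variance = statistics.variance(values)  # sum of d**2 divided by (n - 1), as in Eq 1
    std_dev = math.sqrt(variance)           # S, standard deviation of a single value
    return {
        "n": n,
        "average": average,
        "variance": variance,
        "std_dev": std_dev,
        "std_dev_of_average": std_dev / math.sqrt(n),  # Eq 2
        "coeff_of_variation_pct": 100.0 * std_dev / average,
        "range": max(values) - min(values),
    }


# Hypothetical replicate mass-loss results, for illustration only.
replicates = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
for name, value in variability_summary(replicates).items():
    print(f"{name:24s} {value:.4g}")
```

Reporting `n` alongside both standard deviations follows the recommendation in 6.3.1 to state which standard deviation is quoted and how many measurements were used.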