Designation: D6589 − 05 (Reapproved 2010)ε1

Standard Guide for Statistical Evaluation of Atmospheric Dispersion Model Performance¹

This standard is issued under the fixed designation D6589; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.

ε1 NOTE—Reapproved with editorial corrections in appendixes in April 2010.

1. Scope

1.1 This guide provides techniques that are useful for the comparison of modeled air concentrations with observed field data. Such comparisons provide a means for assessing a model's performance, for example, bias and precision or uncertainty, relative to other candidate models. Methodologies for such comparisons are still evolving; hence, modifications will occur in the statistical tests, procedures, and data analysis as work progresses in this area. Until the interested parties agree upon standard testing protocols, differences in approach will occur. This guide describes a framework, or philosophical context, within which one determines whether a model's performance is significantly different from that of other candidate models. It is suggested that the first step should be to determine which model's estimates are closest on average to the observations, and that the second step should then test whether the differences seen in the performance of the other
models are significantly different from the model chosen in the first step. An example procedure is provided in Appendix X1 to illustrate an existing approach for a particular evaluation goal. This example is not intended to inhibit alternative approaches or techniques that will produce equivalent or superior results. As discussed in Section 6, statistical evaluation of model performance is viewed as part of a larger process that collectively is referred to as model evaluation.

1.2 This guide has been designed with flexibility to allow expansion to address various characterizations of atmospheric dispersion, which might involve dose or concentration fluctuations, to allow development of application-specific evaluation schemes, and to allow use of various statistical comparison metrics. No assumptions are made regarding the manner in which the models characterize the dispersion.

1.3 The focus of this
guide is on end results, that is, the accuracy of model predictions and the discernment of whether differences seen between models are significant, rather than on operational details such as the ease of model implementation or the time required for model calculations to be performed.

1.4 This guide offers an organized collection of information or a series of options and does not recommend a specific course of action. This guide cannot replace education or experience and should be used in conjunction with professional judgment. Not all aspects of this guide may be applicable in all circumstances. This guide is not intended to represent or replace the standard of care by which the adequacy of a given professional service must be judged, nor should it be applied without consideration of a project's many unique aspects. The word “Standard” in the title of this guide means only that the document has been approved through the ASTM consensus process.

1.5 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety and health practices and to determine the applicability of regulatory
limitations prior to use.

2. Referenced Documents

2.1 ASTM Standards:²

D1356 Terminology Relating to Sampling and Analysis of Atmospheres

3. Terminology

3.1 Definitions—For definitions of terms used in this guide, refer to Terminology D1356.

3.2 Definitions of Terms Specific to This Standard:

3.2.1 atmospheric dispersion model, n—an idealization of atmospheric physics and processes to calculate the magnitude and location of pollutant concentrations based on fate, transport, and dispersion in the atmosphere. This may take the form of an equation, algorithm, or series of equations/algorithms used to calculate average or time-varying concentration. The model may involve numerical methods for solution.

¹ This guide is under the jurisdiction of ASTM Committee D22 on Air Quality and is the direct responsibility of Subcommittee D22.11 on Meteorology. Current edition approved April 1, 2010. Published July 2010. Originally approved in 2000. Last previous edition approved in 2005 as D6589 − 05. DOI: 10.1520/D6589-05R10E01.

² For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard's Document Summary page on the ASTM website.

Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959, United States.

3.2.2 dispersion, absolute, n—the characterization of the spreading of material released into the atmosphere based on a coordinate system fixed in space.

3.2.3 dispersion, relative, n—the characterization of the spreading of material released into the atmosphere based on a coordinate system that is relative to the local median position of the dispersing material.

3.2.4 evaluation objective, n—a feature or characteristic that can be defined through an analysis of the observed concentration pattern, for example, maximum centerline concentration or lateral extent of the average concentration pattern as a function of downwind distance, and that one desires to assess the models' skill in reproducing.

3.2.5 evaluation procedure, n—the analysis steps to be taken to compute the value of the evaluation objective from the observed and modeled patterns of concentration values.

3.2.6 fate, n—the destiny of a chemical or biological pollutant after release into the environment.

3.2.7 model input value, n—characterizations that must be estimated or provided by the model developer or user before model calculations can be performed.

3.2.8 regime, n—a repeatable narrow range of conditions, defined in terms of model input values (which may or may not be explicitly employed by all models being tested), needed for dispersion model calculations. It is envisioned
that the dispersion observed should be similar for all cases having similar model input values.

3.2.9 uncertainty, n—refers to a lack of knowledge about specific factors or parameters. This includes measurement errors, sampling errors, systematic errors, and differences arising from simplification of real-world processes. In principle, uncertainty can be reduced with further information or knowledge (1).³

3.2.10 variability, n—refers to differences attributable to true heterogeneity or diversity in atmospheric processes that result in part from natural random processes. Variability usually is not reducible by further increases in knowledge, but it can in principle be better characterized (1).

4. Summary of Guide

4.1 Statistical evaluation of dispersion model performance with field data is viewed as part of a larger process that collectively is called model evaluation. Section 6 discusses the components
of model evaluation.

4.2 To statistically assess model performance, one must define an overall evaluation goal or purpose. This will suggest features (evaluation objectives) within the observed and modeled concentration patterns to be compared, for example, maximum surface concentrations or the lateral extent of a dispersing plume. The selection and definition of evaluation objectives typically are tailored to the models' capabilities and intended uses. The very nature of the problem of characterizing air quality, and the way models are applied, makes it impossible to define a single or absolute evaluation objective that is suitable for all purposes. The definition of the evaluation objectives will be restricted by the limited range of conditions experienced in the available comparison data suitable for use. For each evaluation objective, a procedure will need to be defined that allows definition of the evaluation
objective from the available observations of concentration values.

4.3 In assessing the performance of air quality models to characterize a particular evaluation objective, one should consider what the models are capable of providing. As discussed in Section 7, most models attempt to characterize the ensemble-average concentration pattern. If such models should provide favorable comparisons with observed concentration maxima, this results from happenstance rather than from skill in the model; therefore, in this discussion, it is suggested that a model be assessed on its ability to reproduce what it was designed to produce, for at least in these comparisons one can be assured that zero bias with the least amount of scatter is by definition good model performance.

4.4 As an illustration of the principles espoused in this guide, a procedure is provided in Appendix X1 for comparison of observed and modeled
near-centerline concentration values, which accommodates the fact that observed concentration values include a large component of stochastic, and possibly deterministic, variability unaccounted for by current models. The procedure provides an objective statistical test of whether differences seen in model performance are significant.

5. Significance and Use

5.1 Guidance is provided on designing model performance evaluation procedures and on the difficulties that arise in the statistical evaluation of model performance owing to the stochastic nature of dispersion in the atmosphere. It is recognized that there are examples in the literature where, knowingly or unknowingly, models were evaluated on their ability to describe something which they were never intended to characterize. This guide attempts to heighten awareness and thereby to reduce the number of “unknowing” comparisons. A goal of this guide is to stimulate development and testing of evaluation procedures that accommodate the effects of natural variability. A technique is illustrated to provide information from which subsequent evaluation and standardization can be derived.

³ The boldface numbers in parentheses refer to the list of references at the end of this standard.

6. Model Evaluation

6.1 Background—Air quality simulation models have been used for many decades to characterize the transport and dispersion of material in the atmosphere (2-4). Early evaluations of model performance usually relied on linear least-squares analyses of observed versus modeled values, using traditional scatter plots of the values (5-7). During the 1980s, attempts were made to encourage the standardization of methods used to judge air quality model performance (8-11). Further development of these proposed statistical evaluation procedures was needed, as it was found that the rote application of statistical metrics, such as those listed in (8), was incapable of discerning differences in model performance (12), whereas if the evaluation results were sorted by stability and distance downwind, then differences in modeling skill could be discerned (13). It was becoming increasingly evident that the models were characterizing only a small portion of the observed variations in the concentration values (14). To better deduce the statistical significance of differences seen in model performance in the face of large unaccounted-for uncertainties and variations, investigators began to explore the use of bootstrap techniques (15). By the late 1980s, most model performance evaluations involved the use of bootstrap techniques in the comparison of maximum values of modeled and observed cumulative frequency distributions of the concentration values (16). Even though the procedures and metrics to be employed in describing the performance of air quality simulation models are still evolving (17-19), there has been general acceptance that defining the performance of air quality models needs to address the large uncertainties inherent
in attempting to characterize atmospheric fate, transport, and dispersion processes. There also has been a consensus reached on the philosophical reasons that models of earth science processes can never be validated, in the sense of claiming that a model truthfully represents natural processes. No general empirical proposition about the natural world can be certain, since there will always remain the prospect that future observations may call the theory into question (20). It is seen that numerical models of air pollution are a form of highly complex scientific hypothesis concerning natural processes that can be confirmed through comparison with observations, but never validated.

6.2 Components of Model Evaluation—A model evaluation includes science peer reviews and statistical evaluations with field data. The completion of each of these components assumes that specific model goals and evaluation objectives (see Section 10) have been defined.

6.3 Science Peer Reviews—Given the complexity of characterizing atmospheric processes, and the inevitable necessity of limiting model algorithms to a resolvable set, one component of a model evaluation is to review the model's science to confirm that the construct is reasonable and defensible for the defined evaluation objectives. A key part of the scientific peer review will include the review of residual plots in which modeled and observed evaluation objectives are compared over a range of model inputs, for example, maximum concentrations as a function of estimated plume rise or as a function of distance downwind.

6.4 Statistical Evaluations with Field Data—The objective comparison of modeled concentrations with observed field data provides a means for assessing model performance. Owing to the limited supply of evaluation data sets, there are severe practical limits in assessing model performance. For this reason, the conclusions reached in the science peer reviews (see 6.3) and the supportive analyses (see 6.5) have particular relevance in deciding whether a model can be applied for the defined model evaluation objectives. In order to conduct a statistical comparison, one will have to define one or more evaluation objectives for which objective comparisons are desired (Section 10). As discussed in 8.4.4, the process of summarizing the overall performance of a model over the range of conditions experienced within a field experiment typically involves determining two points for each of the model evaluation objectives: which of the models being assessed has on average the smallest combined bias and scatter in comparisons with observations, and whether the differences seen in the comparisons with the other models are statistically significant in light of the uncertainties in the observations.

6.5 Other Tasks Supportive to Model Evaluation—As atmospheric dispersion models become more sophisticated, it is not easy to detect coding errors in the implementation of the model algorithms. And as models become more complex, discer