Bias Correction Methods: Adjusting Moments

Bo Cui*, Zoltan Toth, Yuejian Zhu, Dingchen Hou*, and Richard Wobus
* Environmental Modeling Center, NCEP/NWS
* SAIC at Environmental Modeling Center, NCEP/NWS

Acknowledgements
- Zoltan Toth
- Yuejian Zhu
- Dingchen Hou
- Richard Wobus

Outline
- Tasks & Goals
- Bias-Correction Algorithm: Adjusting Moments
- Experimental Design
- Ensemble Forecast Verification
- Future Plans

Ensemble Postprocessing
- NWP models and ensemble formation are imperfect
  - Deficiencies due to various problems in NWP models
  - Systematic errors in the analysis, induced by observations and by the model
  - Ensemble formation
    - Inappropriate initial spread
    - Lack of representation of model-related uncertainty
    - Limited ensemble size
- Known model/ensemble problems are addressed at their sources, but no "perfect" solution exists
- Systematic errors remain and cause biases in the 1st and 2nd moments of the ensemble distribution

Tasks & Goals
Tasks
- Develop and implement a statistical post-processing scheme to reduce the biases in ensemble forecasts (height, temperature and other variables)
- Correct both the 1st and 2nd moments of the ensemble
Goals
- Bias-corrected forecasts will have reduced or no bias with respect to the verifying analysis fields, given on the model grid

Moment Adjustment
Bias Assessment
- FIRST MOMENT: B = difference between ensemble mean forecast and verifying analysis
- SECOND MOMENT: R = ratio between RMS error of ensemble mean and ensemble spread
Bias Correction
- 1st moment = ensemble mean - B
- 2nd moment = ensemble mean - B + (ensemble forecast - ensemble mean) * R

Implementation Facts
- Bias assessment
  - carried out separately at each forecast lead time and individual grid point
  - for the ensemble mean, GFS and ensemble control forecasts
- Bias correction tests
  - applied on all ensemble member forecasts
  - for the 00Z initial cycle only
  - 2.5x2.5 lat/lon resolution
  - 500 mb height, 850 mb temperature

Alternatives or Refinements of Bias-Correction Algorithm
- Adaptive methods:
  - Consider most recent past data with decaying averaging
  - Use data from surrounding grid points (with a Gaussian weighting function)
  - Use large (climatological) sample data if available and the forecast system is stable
  - Adjust temporal/spatial sampling domain to optimize performance
- Construct cumulative frequency distribution to match that of the observed: QPF calibration (Yuejian Zhu)
- Regime-dependent method (Jun Du): use correlation coefficients between the circulation field today and that in the recent past to determine the weights given to data in estimating bias

Experimental Design
Implementation of decaying averaging for 1st moment bias

decaying averaging mean error = (1 - w) * prior t.m.e. + w * (f - a)

(t.m.e. = time mean error; f = forecast; a = verifying analysis)

[Timeline: T0-46 days, T0-16 days, T0]

a) Prior estimate to start up the procedure: choose T0 as the current date (00Z) and calculate the time mean errors between T0-46 and T0-16 days.
b) Update: the prior estimate of the average state is multiplied by a factor 1 - w (< 1). Then the most recent verification error (f - a) is added to the decaying average for each lead time with a weight of w.
c) Cycling: repeat step (b) every day.

Three experiments with w of 1%, 2% and 10%.

Experimental Design
Centered running mean error test for 1st moment bias

[Timeline: T0-15 days, T0, T0+15 days]

- Define the +/- 15 day time average as the bias.
- Use this bias estimate (with dependent data) as the "optimal" benchmark.
- Implementation:
  - Four experiments: optimal test and three decaying averaging experiments (1%, 2% and 10% weight)
  - 8-month period for these experiments (Spring and Summer 2004)

[Figure: Temporal Cross Section: 500 mb Height Time Mean Error (40 N, 95 W, Jan. to Aug. 2004). Panels: OPT, W=1%, W=2%, W=10%. Marked dates: May 22, Jun. 11, Jun. 22.]

[Figure: Temporal Cross Section: 850 mb Temp. Time Mean Error (40 N, 95 W, Jan. to Aug. 2004). Panels: OPT, W=1%, W=2%, W=10%. Marked dates: May 1, May 10, Jun. 2.]

Ensemble Forecasts Verification
- Verification of ensemble mean: 500 mb height and 850 mb temperature
- Verification domains: NH, SH and Tropics
- Verification data set: GFS final analysis
- Verification scores:
  - AC = pattern anomaly correlation coefficient
  - RMS = root mean square error of ensemble mean
  - ROC = relative operating characteristics
  - RPSS = ranked probability skill score

AC and RMS: 500 mb Height, Summer 2004
[Figure panels: AC, RMS]
- RMS error slightly reduced for the first several days
- 3 bias-corrected ensembles with decaying average: AC scores slightly improved for week 1

ROC: 500 mb Height, Summer 2004
[Figure panels: NH, SH, TR]
- 2% weight experiment improves performance over NH, and slightly over SH, up to week 2
- 10% weight experiment improves performance over Tropics

ROC: 500 mb Height, Spring 2004
[Figure panels: NH, SH, TR]
- NH and SH: ROC improved with some weights for most lead times
- Tropics: ROC improved at all leads, indicating bias much reduced for sub-regions. The 10% weight experiment performs better.

RPSS: 500 mb Height, Summer 2004
[Figure panels: NH, SH, TR]
- 2% weight experiment improves performance over NH, and slightly over SH as well
- 10% weight experiment improves performance over Tropics, especially for week 2

Preliminary Results
- In general, the time mean errors of 500 mb height increase with forecast lead time. The growth of the time mean error with lead time is nearly linear in some cases. What determines linearity?
- The time mean error difference between the 1% and 2% weight experiments is small. The 10% weight experiment has higher-frequency details compared to the 1% and 2% experiments (better for short range?).
- The centered running mean error test (OPT) shows potential for significant improvement in the forecast of both 500 mb height and 850 mb temperature in terms of all verification scores, compared to the raw ensembles.

Preliminary Results
- For days 1 through 6, the AC scores for the raw ensemble and the three bias-corrected ensembles with decaying averaging are relatively close to each other on average. With some weights, AC and RMS performance can be improved.
- The 2% ensemble shows large improvements in ROC, RPSS and BSS scores over the Northern and Southern Hemispheres. The improvement in these scores is more significant in summer than in spring. On the other hand, the choice of 10% weight works better for the Tropics compared to 1% and 2%. Use different weights for Tropics?
- The decaying averaging approach to improving NCEP's global ensemble forecast system seems promising. Problems remain with estimating bias for longer lead times with a short sample.

Future Plans
- Test the 1st moment bias-correction algorithm on a longer period (four seasons, 5 years) for tuning.
- Start research on the 2nd moment calibration.
- Test the refinements of the bias-correction algorithm listed before.
- Run 4 cycles per day, adding 06Z, 12Z and 18Z forecasts, to provide more timely information and increase sample size.
- Use data with 1x1 lat/lon resolution.
- Add new ensemble forecast variables such as 2m temperature, U, V, and cumulative frequency distribution for forecast QPF.
- Consider other methods and/or use of a larger sample, especially for longer lead times.

Refinements of Bias-Correction Algorithm
Details: Decaying averaging
- Use recent verification statistics in the calibration process, accumulated in a decaying averaging sense
- Achieved by using a recursive averaging procedure (Kalman filtering)
- Toth, Z., and Y. Zhu, 2001

[Figure labels: 6.6%, 3.3%, 1.6%]

[Figure: Centered Running Mean Error, Summer 2004. Latitudinal Cross Section (95 W) and Longitudinal Cross Section (40 N). Panels: z500, T850.]
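
The decaying-average recursion and the moment adjustment described in these slides can be sketched in a few lines of Python for a single grid point and a single lead time. This is only an illustrative sketch, not the NCEP implementation: all function names are hypothetical, the data are plain floats rather than gridded fields, and the correction formulas are read as "ensemble mean - B" and "ensemble mean - B + (member - ensemble mean) * R", which assumes the operators lost in the slide text are minus and plus.

```python
from typing import Iterable, List, Tuple


def startup_prior(past_errors: Iterable[float]) -> float:
    """Startup: time mean of past (f - a) errors, e.g. days T0-46 to T0-16."""
    errors = list(past_errors)
    return sum(errors) / len(errors)


def update_decaying_average(prior_tme: float, forecast: float,
                            analysis: float, w: float) -> float:
    """Daily update: (1 - w) * prior time mean error + w * (f - a)."""
    return (1.0 - w) * prior_tme + w * (forecast - analysis)


def cycle(prior_tme: float, pairs: Iterable[Tuple[float, float]],
          w: float) -> float:
    """Cycling: apply the daily update once per (forecast, analysis) pair."""
    tme = prior_tme
    for forecast, analysis in pairs:
        tme = update_decaying_average(tme, forecast, analysis, w)
    return tme


def correct_first_moment(members: List[float], bias: float) -> List[float]:
    """Subtract the bias estimate B from every member (mean becomes mean - B)."""
    return [m - bias for m in members]


def correct_both_moments(members: List[float], bias: float,
                         ratio: float) -> List[float]:
    """Each member becomes (mean - B) + (member - mean) * R."""
    mean = sum(members) / len(members)
    return [(mean - bias) + (m - mean) * ratio for m in members]


if __name__ == "__main__":
    # Hypothetical 500 mb height values (m) at one grid point and lead time.
    prior = startup_prior([8.0, 10.0, 12.0])
    bias = cycle(prior, [(5640.0, 5620.0)], w=0.02)
    print(correct_both_moments([5600.0, 5640.0, 5680.0], bias, ratio=1.5))
```

With a weight of 2%, a prior time mean error of 10 and a new verification error of 20, one update moves the estimate to about 10.2, which shows how slowly a small w responds to recent errors. A weight of 10% would move it to 11, consistent with the higher-frequency detail noted for that experiment.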