Using Bayesian Model Averaging to Calibrate Forecast Ensembles

Raftery, Adrian E.; Gneiting, Tilmann; Balabdaoui, Fadoua; Polakowski, Michael
May 2005
Monthly Weather Review;May2005, Vol. 133 Issue 5, p1155
Academic Journal
Ensembles used for probabilistic weather forecasting often exhibit a spread-error correlation, but they tend to be underdispersive. This paper proposes a statistical method for postprocessing ensembles based on Bayesian model averaging (BMA), which is a standard method for combining predictive distributions from different sources. The BMA predictive probability density function (PDF) of any quantity of interest is a weighted average of PDFs centered on the individual bias-corrected forecasts, where the weights are equal to posterior probabilities of the models generating the forecasts and reflect the models' relative contributions to predictive skill over the training period. The BMA weights can be used to assess the usefulness of ensemble members and thus as a basis for selecting them, which is valuable given the cost of running large ensembles. The BMA PDF can be represented as an unweighted ensemble of any desired size by simulating from the BMA predictive distribution. The BMA predictive variance can be decomposed into two components, one corresponding to the between-forecast variability and the other to the within-forecast variability. Predictive PDFs or intervals based solely on the ensemble spread incorporate the first component but not the second. Thus BMA provides a theoretical explanation of the tendency of ensembles to exhibit a spread-error correlation yet be underdispersive. The method was applied to 48-h forecasts of surface temperature in the Pacific Northwest in January–June 2000 using the University of Washington fifth-generation Pennsylvania State University–NCAR Mesoscale Model (MM5) ensemble. The predictive PDFs were much better calibrated than the raw ensemble, and the BMA forecasts were sharp in that 90% BMA prediction intervals were 66% shorter on average than those produced by sample climatology. As a by-product, BMA yields a deterministic point forecast, which had root-mean-square errors 7% lower than the best of the ensemble members and 8% lower than the ensemble mean. Similar results were obtained for forecasts of sea level pressure. Simulation experiments show that BMA performs reasonably well when the underlying ensemble is calibrated, or even overdispersed.
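
The weighted-average PDF, the two-part variance decomposition, and the simulation of an unweighted ensemble described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes Gaussian component PDFs with one common standard deviation `sigma`, takes the bias-corrected forecasts and the BMA weights as already given (the fitting step over the training period is omitted), and uses hypothetical function names and made-up input values throughout.

```python
import math
import random


def bma_pdf(y, forecasts, weights, sigma):
    """Mixture density at y: a weighted average of PDFs centered on the forecasts.

    Assumption: each component is Gaussian with the same standard deviation.
    """
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return sum(
        w * norm * math.exp(-0.5 * ((y - f) / sigma) ** 2)
        for w, f in zip(weights, forecasts)
    )


def bma_mean_and_variance(forecasts, weights, sigma):
    """Return (mean, between-forecast variance, within-forecast variance).

    The weighted mean serves as a deterministic point forecast. The total
    predictive variance of the mixture is between + within.
    """
    mean = sum(w * f for w, f in zip(weights, forecasts))
    # Spread of the member forecasts around the weighted mean.
    between = sum(w * (f - mean) ** 2 for w, f in zip(weights, forecasts))
    # Uncertainty around each individual forecast (common to all members here).
    within = sigma ** 2
    return mean, between, within


def sample_bma(forecasts, weights, sigma, size, rng):
    """Draw an unweighted ensemble of the requested size from the mixture."""
    # Pick a member with probability equal to its weight, then perturb it.
    centers = rng.choices(forecasts, weights=weights, k=size)
    return [rng.gauss(c, sigma) for c in centers]


# Hypothetical inputs: three bias-corrected forecasts, weights summing to 1.
forecasts = [1.0, 2.0, 4.0]
weights = [0.5, 0.3, 0.2]
sigma = 1.5

mean, between, within = bma_mean_and_variance(forecasts, weights, sigma)
print("point forecast:", mean)
print("between-forecast variance:", between)
print("within-forecast variance:", within)
print("density at the mean:", bma_pdf(mean, forecasts, weights, sigma))
print("five simulated members:", sample_bma(forecasts, weights, sigma, 5, random.Random(0)))
```

In this sketch, an interval built only from the spread of `forecasts` would reflect the between-forecast term and ignore `sigma ** 2`, which is the sense in which a raw ensemble can be underdispersive even when its spread is informative.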


Related Articles

  • The Unbearable Lightness of Probabilities. de Elía, Ramón; Laprise, René // Bulletin of the American Meteorological Society;Sep2005, Vol. 86 Issue 9, p1224 

    This article comments on the complexities of using probabilities in weather forecasting. For years there has been debate regarding the interpretation of probability. It includes fields of study such as philosophy, risk theory, and artificial intelligence. Most members of the meteorological...

  • The ROC Curve and the Area under It as Performance Measures. Marzban, Caren // Weather & Forecasting;Dec2004, Vol. 19 Issue 6, p1106 

    The receiver operating characteristic (ROC) curve is a two-dimensional measure of classification performance. The area under the ROC curve (AUC) is a scalar measure gauging one facet of performance. In this short article, five idealized models are utilized to relate the shape of the ROC curve,...

  • A One-dimensional Ensemble Forecast and Assimilation System for Fog Prediction. Müller, M. D.; Schmutz, C.; Parlow, E. // Pure & Applied Geophysics;Jun2007, Vol. 164 Issue 6/7, p1241 

    A probabilistic fog forecast system was designed based on two high resolution numerical 1-D models called COBEL and PAFOG. The 1-D models are coupled to several 3-D numerical weather prediction models and thus are able to consider the effects of advection. To deal with the large uncertainty...

  • Calibrated Probabilistic Forecasting Using Ensemble Model Output Statistics and Minimum CRPS Estimation. Gneiting, Tilmann; Raftery, Adrian E.; Westveld III, Anton H.; Goldman, Tom // Monthly Weather Review;May2005, Vol. 133 Issue 5, p1098 

    Ensemble prediction systems typically show positive spread-error correlation, but they are subject to forecast bias and dispersion errors, and are therefore uncalibrated. This work proposes the use of ensemble model output statistics (EMOS), an easy-to-implement postprocessing technique that...

  • Increasing the Reliability of Reliability Diagrams. Bröcker, Jochen; Smith, Leonard A. // Weather & Forecasting;Jun2007, Vol. 22 Issue 3, p651 

    The reliability diagram is a common diagnostic graph used to summarize and evaluate probabilistic forecasts. Its strengths lie in the ease with which it is produced and the transparency of its definition. While visually appealing, major long-noted shortcomings lie in the difficulty of...

  • Diversity in Interpretations of Probability: Implications for Weather Forecasting. de Elía, Ramón; Laprise, René // Monthly Weather Review;May2005, Vol. 133 Issue 5, p1129 

    Over the last years, probability weather forecasts have become increasingly popular due in part to the development of ensemble forecast systems. Despite its widespread use in atmospheric sciences, probability forecasting remains a subtle and ambiguous way of representing the uncertainty related...

  • On the Proper Order of Markov Chain Model for Daily Precipitation Occurrence in the Contiguous United States. Schoof, J. T.; Pryor, S. C. // Journal of Applied Meteorology & Climatology;Sep2008, Vol. 47 Issue 9, p2477 

    Markov chains are widely used tools for modeling daily precipitation occurrence. Given the assumption that the Markov chain model is the right model for daily precipitation occurrence, the choice of Markov model order was examined on a monthly basis for 831 stations in the contiguous United...

  • Bayesian Retrieval of Complete Posterior PDFs of Oceanic Rain Rate from Microwave Observations. Chiu, J. Christine; Petty, Grant W. // Journal of Applied Meteorology & Climatology;Aug2006, Vol. 45 Issue 8, p1073 

    A new Bayesian algorithm for retrieving surface rain rate from Tropical Rainfall Measuring Mission (TRMM) Microwave Imager (TMI) over the ocean is presented, along with validations against estimates from the TRMM Precipitation Radar (PR). The Bayesian approach offers a rigorous basis for...

  • Aspects of Effective Mesoscale, Short-Range Ensemble Forecasting. Eckel, F. Anthony; Mass, Clifford F. // Weather & Forecasting;Jun2005, Vol. 20 Issue 3, p328 

    This study developed and evaluated a short-range ensemble forecasting (SREF) system with the goal of producing useful, mesoscale forecast probability (FP). Real-time, 0–48-h SREF predictions were produced and analyzed for 129 cases over the Pacific Northwest. Eight analyses from different...

