Economic and statistical forecasts are regularly assessed with strictly consistent scoring functions, such as the mean squared error for mean forecasts or the tick loss for quantile forecasts. While scoring functions allow for statistically meaningful rankings of overall forecast quality, they are silent about the specific deficiencies of the forecasts. We develop asymptotic inference for recently proposed score decompositions into miscalibration, discrimination and uncertainty terms, which enables hypothesis tests and confidence intervals for the miscalibration and discrimination terms. These methods deliver more detailed insights into forecast performance in applications to mean forecasts for inflation rates and volatility forecasts in risk management.
Co-author: Marius Puke, University of Hohenheim