Choosing a Strictly Proper Scoring Rule
Abstract
Strictly proper scoring rules, including the Brier score and the logarithmic score, are standard metrics by which probability forecasters are assessed and compared. Researchers often find that the choice of strictly proper scoring rule has minimal impact on their conclusions, but this finding is typically based on a small set of popular rules. In the context of forecasting world events, we use a recently proposed family of proper scoring rules to study the properties of a wide variety of strictly proper rules. The results indicate that conclusions vary greatly across scoring rules, so the choice of scoring rule should be informed by the forecasting domain. We then describe strategies for choosing a scoring rule that meets the needs of the forecast consumer, considering three distinct families of proper scoring rules.

