May 11, 2021 in Probability Management

Chancification: Wiring Your Organization for Probability

https://doi.org/10.1287/orms.2021.03.01

Editor's note. The main article is authored by Sam Savage, with sidebar contributions from Shayne Kavanagh and Aaron Brown.

In 1752, Benjamin Franklin performed the risky stunt of flying a kite in a thunderstorm, which in his own words proved “the sameness of the electric matter with that of lightning.” A mere 150 years later, electrification was delivering electric matter generated by engineers to the general public for lighting houses and powering factories. Similarly, “chancification” can now deliver stochastic information generated by analysts to managers for estimating the chances of meeting their goals.

For example, consider costing out a project to bring natural gas to a new housing development. A typical project cost plan would include columns for job code (type of pipe), number of feet required and cost per foot (see Figure 1).

Figure 1. A job cost calculation.

The SUMPRODUCT formula should calculate the total project cost, but the costs per foot are typically uncertain and replaced by averages, leading to the “Flaw of Averages.” And suppose there were a big penalty for exceeding a cost of $1 million. How would we estimate the chances of that? With chancification it takes only two buttons, as we will show below. But first, some groundwork.

The Arithmetic of Uncertainty

Arithmetic can tell us that X + Y = Z. The Arithmetic of Uncertainty says: “What do you want Z to be? Here are your chances.” The discipline of probability management powers the arithmetic of uncertainty by representing uncertainties as data that obey both the laws of arithmetic and the laws of probability. This data consists of vectors of statistically coherent Monte Carlo trials called SIPs (Stochastic Information Packets) [1, 2]. In 2014, the nonprofit ProbabilityManagement.org established the open, cross-platform 2.0 PM SIPmath Standard to convey SIP libraries between Excel, XML, CSV and JSON formats.
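As a minimal sketch of this idea in Python, two SIPs are simply lists of trials aligned by index. Adding them trial by trial gives the SIP of Z, and the chance of any outcome is the fraction of trials in which it occurs. The distributions, their parameters and the target of 250 are illustrative assumptions, not figures from this article.

```python
import random

TRIALS = 10_000
rng = random.Random(42)  # fixed seed so the trials are reproducible

# Two hypothetical SIPs: trial i of X and trial i of Y describe the same scenario.
sip_x = [rng.gauss(100, 15) for _ in range(TRIALS)]
sip_y = [rng.gauss(120, 25) for _ in range(TRIALS)]

# Arithmetic on SIPs is ordinary arithmetic applied trial by trial.
sip_z = [x + y for x, y in zip(sip_x, sip_y)]


def chance(sip, predicate):
    """Fraction of trials for which the predicate holds."""
    return sum(1 for value in sip if predicate(value)) / len(sip)


target = 250  # assumed goal for Z
chance_over_target = chance(sip_z, lambda z: z > target)
print(f"Chance that Z exceeds {target}: {chance_over_target:.1%}")
```

Because every SIP keeps its trials in the same order, any downstream formula built on X, Y or Z stays statistically coherent with the others.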

A Grand Alliance

I have recently been joined by two other battle-hardened veterans of the War on Averages to advance chancification to the next level. The 2.0 PM SIPmath Standard required storing up to hundreds of millions of Monte Carlo trials per SIP library, making them inefficient to distribute. Now, three complementary technologies have greatly improved the efficiency, much as the switch from direct to alternating current improved the distribution of electricity. They are roughly analogous to the AC standard, generators and transformers of electrification.

The 3.0 PM SIPmath Standard. At ProbabilityManagement.org, we extended the 2.0 standard beyond actual trials to also interpret inverse cumulative functions coupled to cross-platform pseudo-random number generators. These are joined through a “copula” layer that preserves interrelationships between variables. This requires a minuscule fraction of the previous storage.
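To illustrate the general pattern (not the 3.0 standard's actual file format or algorithms), the sketch below uses a Gaussian copula to produce correlated uniform numbers and then pushes them through two inverse cumulative functions. The choice of copula, the correlation of 0.8 and both distributions are assumptions made for the example. Only a seed, a correlation and a few distribution parameters would need to be stored, rather than the trials themselves.

```python
import math
import random
from statistics import NormalDist


def correlated_uniforms(n, rho, seed=1):
    """Gaussian copula: pairs of uniforms in (0, 1) whose dependence is set by rho."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0, 1)
        z2 = rho * z1 + math.sqrt(1 - rho**2) * rng.gauss(0, 1)
        pairs.append((std_normal.cdf(z1), std_normal.cdf(z2)))
    return pairs


def exponential_inv_cdf(u, mean):
    """Inverse cumulative function of an exponential distribution."""
    return -mean * math.log(1 - u)


def logistic_inv_cdf(u, location, scale):
    """Inverse cumulative function of a logistic distribution."""
    return location + scale * math.log(u / (1 - u))


def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid version-specific helpers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


pairs = correlated_uniforms(5_000, rho=0.8)
sip_a = [exponential_inv_cdf(u, mean=10.0) for u, _ in pairs]
sip_b = [logistic_inv_cdf(v, location=50.0, scale=5.0) for _, v in pairs]

print(f"Correlation carried into the two SIPs: {pearson(sip_a, sip_b):.2f}")
```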

The HDR Cross-Platform Generator. The counter-based HDR generator designed by Doug Hubbard [3], author of the popular “How to Measure Anything” series, performs well under the respected Dieharder random number tests, fits into a single cell in Excel, and is built into the SIPmath Tools from ProbabilityManagement.org. It can easily be used on its own in Excel, R, Python or any programming environment to generate either identical or independent streams of random numbers as required through a multidimensional seed. This maintains statistical coherence across simulations run on different platforms.
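The sketch below shows what "counter-based" means, and it is emphatically not the HDR formula from [3]. As a stand-in, it hashes a hypothetical multidimensional seed (trial number, variable ID, entity ID) with SHA-256 and turns the result into a uniform number. The function name and the seed fields are assumptions. The point is that each number depends only on its coordinates, so any platform that applies the same rule reproduces the same stream, and changing one coordinate gives a separate stream.

```python
import hashlib
import struct


def counter_uniform(trial, var_id, entity_id=0):
    """Uniform in (0, 1) computed purely from integer coordinates, with no internal state.

    Stand-in for illustration only: SHA-256 replaces the HDR arithmetic.
    """
    key = struct.pack("<3Q", trial, var_id, entity_id)
    digest = hashlib.sha256(key).digest()
    n = int.from_bytes(digest[:4], "little")  # 32-bit integer
    return (n + 0.5) / 2**32  # the half-step keeps the result strictly inside (0, 1)


# Same seed coordinates give the identical stream, wherever it is regenerated.
stream_a = [counter_uniform(t, var_id=1) for t in range(1, 6)]
stream_a_again = [counter_uniform(t, var_id=1) for t in range(1, 6)]

# A different variable ID gives a different stream.
stream_b = [counter_uniform(t, var_id=2) for t in range(1, 6)]

print(stream_a == stream_a_again, stream_a != stream_b)
```

Because there is no hidden state, trial 3 of a variable can be computed directly without first generating trials 1 and 2.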

The Metalog System. Invented by Tom Keelin, former worldwide managing director of the prestigious Strategic Decisions Group, the metalog distributions [4] are an elegant family of formulas based directly on data. They have already solved the open problem of analytically summing IID lognormal distributions [5]. They not only replicate traditional statistical distributions but can detect multiple populations within a single data set and provide analytical expressions for multimodal distributions. Crucially for probability management, their natural form is an inverse cumulative function, which can transform uniform random variables into other distributions.
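A minimal sketch of that inverse cumulative form, assuming the unbounded metalog quantile function as given in [4] and truncated to its first four terms. The coefficients are hypothetical and were chosen only so that the curve is increasing and skewed to the right.

```python
import math


def metalog_quantile(y, a):
    """Unbounded metalog quantile function using up to its first four terms.

    y is a cumulative probability in (0, 1); a holds the coefficients a1, a2, ...
    """
    if not 0.0 < y < 1.0:
        raise ValueError("y must be strictly between 0 and 1")
    logit = math.log(y / (1.0 - y))
    centered = y - 0.5
    basis = [1.0, logit, centered * logit, centered]
    return sum(coef * term for coef, term in zip(a, basis))


coefficients = [10.0, 1.0, 0.5]  # hypothetical a1, a2, a3

# Feeding evenly spaced probabilities through the function yields values
# distributed according to the metalog; uniform random numbers work the same way.
trials = [metalog_quantile((i + 0.5) / 1000, coefficients) for i in range(1000)]

median = metalog_quantile(0.5, coefficients)
p10 = metalog_quantile(0.1, coefficients)
p90 = metalog_quantile(0.9, coefficients)
print(f"10th, 50th, 90th percentiles: {p10:.2f}, {median:.2f}, {p90:.2f}")
```

With these coefficients the median equals a1, and the positive a3 stretches the upper tail more than the lower one.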

Implementation

The HDR and metalog formulas are implemented on the host computer in Excel, R, Python or other environments. Then the 3.0 PM SIPmath Standard can deliver identical streams of virtually any continuous random variate to any platform with roughly 20 numbers as shown below, instead of using separate formulas for every type of variable. See the sidebar on RSIPlibrary, in which the library created also had data for the sparkline (shown later in Figure 4).

[Figure: metadata bar]

Chancification in Action

Chancification allows simulations to be networked across and between enterprises to cure the Flaw of Averages on a broad scale [6]. Returning to the case above, distributions of cost per foot by job code are estimated with metalogs and stored in the cloud in 3.0 format. Next, we access them with ChanceCalc, the latest of the free tools from ProbabilityManagement.org (see the ribbon in Figure 2).

Figure 2. The ChanceCalc Ribbon.

Clicking the SIP Input button brings up a browser allowing us to search for Libraries behind our corporate firewall (see Figure 3). No need to worry about IT approval; just save a few Excel files to SharePoint, or some other collaborative network.

Figure 3. The SIPmath Network Browser.

Next, select the costs per foot in the project plan and paste in the Job Codes from the library. Sparklines of the distributions will appear in the cells. Then, add a “>” sign and $1,000,000 and invoke the Chance of Whatever Button (Figure 4).

Figure 4. Setup for the Chance of Whatever Button along with the result.

Click OK to see that the chance of exceeding $1 million is 30%. The resulting model is a stand-alone Excel model without macros and does not require ChanceCalc to run. Download it from Chancification.org and see what happens to that 30% if you increase the 1,500 feet of Job Code 1 to 1,800 feet. See the sidebar on the PRECISE Uncertainty Project for an application of chancification.
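For readers who prefer code to buttons, the same calculation can be sketched in Python. Only the $1 million limit and the 1,500 and 1,800 feet for Job Code 1 come from this article. The other footages, the triangular cost distributions and the assumption that the costs are independent are all hypothetical, so the chances printed here will not match the 30% in the downloadable model.

```python
import random

TRIALS = 10_000
rng = random.Random(2021)
LIMIT = 1_000_000  # penalty threshold from the article

# Hypothetical plan: feet required per job code.
feet = {"Job Code 1": 1_500, "Job Code 2": 2_000, "Job Code 3": 1_000}

# Hypothetical cost-per-foot uncertainty: (low, most likely, high) in dollars.
cost_range = {
    "Job Code 1": (150, 200, 300),
    "Job Code 2": (180, 220, 320),
    "Job Code 3": (140, 180, 250),
}

# One SIP of cost per foot for each job code (assumed independent here).
cost_sips = {
    code: [rng.triangular(low, high, mode) for _ in range(TRIALS)]
    for code, (low, mode, high) in cost_range.items()
}


def total_cost_sip(feet_by_code):
    """SUMPRODUCT of feet and cost per foot, evaluated on every trial."""
    return [
        sum(feet_by_code[code] * cost_sips[code][i] for code in feet_by_code)
        for i in range(TRIALS)
    ]


def chance_exceeding(sip, limit):
    """Fraction of trials in which the total cost is above the limit."""
    return sum(1 for value in sip if value > limit) / len(sip)


# The Flaw of Averages: plugging in average costs gives one comforting number.
average_cost = {code: sum(sip) / TRIALS for code, sip in cost_sips.items()}
total_at_averages = sum(feet[code] * average_cost[code] for code in feet)

base_chance = chance_exceeding(total_cost_sip(feet), LIMIT)
longer_chance = chance_exceeding(total_cost_sip({**feet, "Job Code 1": 1_800}), LIMIT)

print(f"Total at average costs: ${total_at_averages:,.0f}")
print(f"Chance of exceeding the limit at 1,500 feet: {base_chance:.0%}")
print(f"Chance of exceeding the limit at 1,800 feet: {longer_chance:.0%}")
```

In this made-up plan the total at average costs sits just under the limit, yet a large share of the trials exceed it, which is exactly the information a single average hides.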

Chancify Now

Doug, Tom and I have been promoting probabilistic thinking for more than 75 years between us, and it is a hard sell. But so was electricity before the power grid. ProbabilityManagement.org offers educational programs and free tools to help you get started, none of which will electrocute you.

References

  1. Sam Savage, Stefan Scholtes and Daniel Zweidler, 2006, “Probability Management,” OR/MS Today, Vol. 33, No. 1, February.
  2. https://en.wikipedia.org/wiki/Probability_management
  3. Douglas W. Hubbard, 2019, “A multidimensional, counter-based pseudo random number generator as a standard for Monte Carlo simulation,” Proceedings of the 2019 Winter Simulation Conference, INFORMS, https://www.informs-sim.org/wsc19papers/339.pdf.
  4. Thomas W. Keelin, 2016, “The metalog distribution,” Decision Analysis, Vol. 13, No. 4, pp. 243-277.
  5. Thomas W. Keelin, Lonnie Chrisman and Sam Savage, 2019, “The metalog distributions and extremely accurate sums of lognormals in closed form,” Proceedings of the 2019 Winter Simulation Conference, INFORMS, http://www.metalogdistributions.com/images/Metalogs_and_the_Sum_of_Lognormals_in_Closed_Form.pdf.
  6. Sam L. Savage and John Marc Thibault, 2015, “Towards a simulation network or the medium is the Monte Carlo (with apologies to Marshall McLuhan),” Proceedings of the 2015 Winter Simulation Conference, INFORMS, http://www.informs-sim.org/wsc15papers/467.pdf.

The PRECISE Uncertainty Project

Projected Revenue Estimation from Crowdsourced Information on Statistical Errors

By Sam Savage and Shayne Kavanagh

The Problem: Imagine that you are the CFO of a small city. For next year’s budget, you have forecasted tax revenues of $80 million. The city manager asks you to estimate the chance that revenues will meet your target. In this scenario, the city manager wants the CFO to give a certain answer, yet the only thing certain about such a forecast is that it will be wrong. To develop a truly accurate forecast, the CFO needs to first acknowledge and then manage the uncertainty inherent in every budget projection.

The Solution: To help its members develop chance-aware budgets in uncertain times, the Government Finance Officers Association (GFOA), a professional organization of nearly 22,000 financial managers, teamed up with ProbabilityManagement.org to apply chancification to the problem. Crowdsourced information on the statistical errors of past revenue forecasts of GFOA members was run through an R program to generate a correlated SIP library of historical accuracy (Figure 5). Several sizes of cities were surveyed over several distinct economic eras, such as pre-2007 and the Great Recession.

Figure 5. From data entry to SIP Library.

These libraries may be used in decision dashboards in Excel that estimate the chances of achieving revenue targets as shown in Figure 6.

Figure 6. Chance-informed decision dashboard.

In particular, for a prioritized budget, one can quickly assess the chances of specific projects being funded. The libraries may also be accessed through ChanceCalc to create custom Excel dashboards, or in R, Python, or any other analytical environment.
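The prioritized-budget calculation can be sketched as follows. The $80 million forecast comes from the scenario above. The accuracy SIP (a single normal distribution of actual-to-forecast ratios) and the budget items are hypothetical stand-ins for the crowdsourced library and a real budget.

```python
import random

TRIALS = 10_000
rng = random.Random(80)

forecast = 80.0  # forecast revenue in $ millions, from the scenario above

# Hypothetical SIP of forecast accuracy: actual revenue divided by forecast revenue.
accuracy_sip = [rng.gauss(1.0, 0.05) for _ in range(TRIALS)]
revenue_sip = [forecast * ratio for ratio in accuracy_sip]

# Hypothetical prioritized budget: (item, cost in $ millions), highest priority first.
budget = [
    ("Public safety", 40.0),
    ("Streets and utilities", 25.0),
    ("Parks", 10.0),
    ("New library wing", 8.0),
]


def funding_chances(revenue, items):
    """Chance that revenue covers each item and everything ranked above it."""
    chances = []
    cumulative = 0.0
    for name, cost in items:
        cumulative += cost
        covered = sum(1 for r in revenue if r >= cumulative)
        chances.append((name, cumulative, covered / len(revenue)))
    return chances


results = funding_chances(revenue_sip, budget)
for name, cumulative, chance in results:
    print(f"{name:<22} needs ${cumulative:5.1f}M  chance funded: {chance:.0%}")
```

Items near the top of the list are nearly certain to be funded, while the chance falls off for items whose cumulative cost approaches or passes the forecast.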

We call the result the PRECISE Uncertainty Project, not just for the fun of an apparent oxymoron, but because the use of SIP libraries ensures auditability and precisely the same results regardless of analytical platform. Visit PreciseUncertainty.org for more information or email us at [email protected] to explore similar approaches to your own organization’s forecasting.

RSIPlibrary

Creating 3.0 Libraries from Data

By Aaron Brown

The RSIPlibrary package creates 3.0 PM SIPmath Standard Excel libraries from coherent statistical data or the outputs of simulation as shown in Figure 7. The 3.0 standard is based on the HDR random number generator and the metalog distribution, as previously discussed in the article on chancification. The package uses Isaac Faber’s “rmetalog” package to generate the metalog coefficients and Philipp Schauberger’s and Alexander Walker’s “openxlsx” package to write the Excel library. In this example, the input had 1,000 rows of simulation data. The output library had only 53 rows, 25 of which were devoted to a density plot of sparkline data as shown in Figure 7. The library in turn can be read by ChanceCalc or other tools.
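To give a feel for how 1,000 rows of data can shrink to a handful of coefficients, the sketch below fits the simplest two-term metalog by ordinary least squares on the empirical quantiles. It is written in Python with hypothetical data and a hypothetical function name. It does not use or reproduce the RSIPlibrary or rmetalog interfaces, which fit more terms and also write the Excel library.

```python
import math
import random


def fit_two_term_metalog(data):
    """Least-squares fit of x = a1 + a2 * ln(y / (1 - y)) to the empirical quantiles."""
    xs = sorted(data)
    n = len(xs)
    # Plotting positions (i + 0.5) / n stay strictly inside (0, 1).
    probs = [(i + 0.5) / n for i in range(n)]
    logits = [math.log(p / (1 - p)) for p in probs]
    mean_l = sum(logits) / n
    mean_x = sum(xs) / n
    slope_num = sum((l - mean_l) * (x - mean_x) for l, x in zip(logits, xs))
    slope_den = sum((l - mean_l) ** 2 for l in logits)
    a2 = slope_num / slope_den
    a1 = mean_x - a2 * mean_l
    return a1, a2


# Hypothetical input: 1,000 rows of simulation output (logistic, location 50, scale 4).
rng = random.Random(7)
rows = []
for _ in range(1_000):
    u = rng.uniform(1e-9, 1 - 1e-9)
    rows.append(50.0 + 4.0 * math.log(u / (1 - u)))

a1, a2 = fit_two_term_metalog(rows)
print(f"1,000 rows summarized by a1 = {a1:.2f}, a2 = {a2:.2f}")
```

The two fitted numbers, together with a random number seed, are enough to regenerate trials with nearly the same distribution as the original rows.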

Figure 7. RSIPlibrary routine delivers a PM SIPmath 3.0 Library

 

Sam L. Savage
Shayne Kavanagh
Aaron Brown
