Exporting Simulation Results

Exporting Scenario Data For Further Analysis In High-End Business Intelligence Tools

Jupyter Notebooks are a perfect environment to create System Dynamics and Agent-based Models and analyse them in-depth – at least if you are a data scientist or computational modeler.

But what if you want to present your results to people who are not quite so tech savvy or don’t have all the necessary tools installed?

We face such situations quite often with our clients and at one point we asked ourselves:

  • Why not use a high-end business intelligence tool such as Microsofts Power BI Desktop to create a polished dasboards?

  • Why not share the reports using the Power BI service? After all, Power BI was created to create such data intelligence apps!

  • Why not use Jupyter notebooks to create sophisticated simulation models (which is what the Jupyter+Python Ecosystem is good at) and then use Power BI for the fancy UI (which is what Power BI is good at)

To achieve this, all we really need BPTK-Py to do is to export the data generated by the simulation for those scenarios that are relevant for the report.

We can then import that data into Power BI and build the report using Power BI’s WYSIWIG tools.

To achieve this, we’ve added a new method called export_scenarios to BPTK-Py which writes the data for a set of scenarios and interactive settings to an Excel file.

Here is what the method looks like for the customer acquisition model, which is one of the models we’ve provided with the tutorial:

bptk.export_scenarios(
    scenario_manager="smCustomerAcquisition",
    equations=["customers","profit"],
    filename='/path/to/exported/data/customer_aquisition.xlsx',
    interactive_scenario="interactiveScenario",
    interactive_equations=["customers","profit"],
    interactive_settings= {
        "advertisingSuccessPct":(0,0.2,0.01),
        "referralFreeMonths":(0,40,10),
        "referralProgramAdoptionPct":(0,12,1),
        "referrals":(0,12,1)
    }
)

And here is a Power BI report we’ve created from the data, you can access it directly on PowerBI:

A Closer Look At BPTK-Py’s Export Function

This section takes a closer look at how the export function is implemented, just in case you would like to add some features or export the data in some other format.

## Load the BPTK Package
from BPTK_Py.bptk import bptk

bptk = bptk()

The first thing to remember is how scenario data is stored in a pandas dataframe:

df=bptk.plot_scenarios(
    scenario_managers=["smCustomerAcquisition"],
    scenarios=["referSomeonePlease"],
    equations=["customers","profit"],
    return_df=True
)
df[1:10]
customers profit
t
1 800.000000 -1.020000e+06
2 1623.890133 -1.036240e+06
3 2472.377134 -1.048608e+06
4 3346.188234 -1.056987e+06
5 4246.071738 -1.061259e+06
6 5172.797601 -1.061302e+06
7 6127.158034 -1.056989e+06
8 7109.968110 -1.048189e+06
9 8122.066399 -1.034770e+06

Notice that each indicator (aka equation, customers and profit in this example) has its own column and that the time dimension forms the index of the dataframe. Also notice that the name of the scenario referSomeonePlease is not stored in the dataframe itself.

Now when it comes to displaying the data in an interactive report (like the one above) we would like to be able to switch between scenarios. So one thing we need to do is to add a column containing a name of the scenario to the dataframe. We also would like one large dataframe containing the data from all the scenarios.

# get all the scenarios from the smCustomerAquisition scenario manager
scenario_manager="smCustomerAcquisition"
scenarios = bptk.get_scenario_names([scenario_manager])
scenarios
['base',
 'serviceFlop',
 'rethinkAdvertising',
 'referSomeonePlease',
 'hereWeGo',
 'boomButBust']
# List of equations we want to generate data for
equations=["customers","profit"]
# collect the scenarios into a list
scenario_dfs = []
for scenario in scenarios:
    # first create a dataframe for each scenario
    df = bptk.plot_scenarios(
        scenario_managers=[scenario_manager],
        scenarios=[scenario],
        equations=equations,
        return_df=True)
    # add a colum which will contain the name of the scenario
    df["scenario"] = [scenario] * len(df.index)
    # create a new column which will contain the time step (which won't be a unique index anymore, as we are concatenating many scenarios)
    df["time"] = df.index
    scenario_dfs += [df]
scenario_dfs[0:2]
[       customers        profit scenario  time
 t
 0       0.000000 -1.000000e+06     base     0
 1     800.000000 -1.010000e+06     base     1
 2    1599.893333 -1.016000e+06     base     2
 3    2399.680014 -1.018001e+06     base     3
 4    3199.360057 -1.016002e+06     base     4
 ..           ...           ...      ...   ...
 56  44636.126878  4.585242e+06     base    56
 57  45430.175394  4.798423e+06     base    57
 58  46224.118038  5.015574e+06     base    58
 59  47017.954822  5.236694e+06     base    59
 60  47811.685761  5.461784e+06     base    60

 [61 rows x 4 columns],        customers        profit     scenario  time
 t
 0       0.000000 -1.000000e+06  serviceFlop     0
 1      80.000000 -1.020000e+06  serviceFlop     1
 2     162.398901 -1.039624e+06  serviceFlop     2
 3     247.268571 -1.058861e+06  serviceFlop     3
 4     334.683026 -1.077699e+06  serviceFlop     4
 ..           ...           ...          ...   ...
 56  11276.933787 -1.053447e+06  serviceFlop    56
 57  11694.455595 -1.020439e+06  serviceFlop    57
 58  12124.449535 -9.854686e+05  serviceFlop    58
 59  12567.286351 -9.484763e+05  serviceFlop    59
 60  13023.347694 -9.094022e+05  serviceFlop    60

 [61 rows x 4 columns]]
import pandas as pd
# concatenate the data into one table using pandas concat function
scenarios_tab = pd.concat(scenario_dfs, ignore_index=True, sort=False)
scenarios_tab
customers profit scenario time
0 0.000000e+00 -1.000000e+06 base 0
1 8.000000e+02 -1.010000e+06 base 1
2 1.599893e+03 -1.016000e+06 base 2
3 2.399680e+03 -1.018001e+06 base 3
4 3.199360e+03 -1.016002e+06 base 4
5 3.998933e+03 -1.010005e+06 base 5
6 4.798400e+03 -1.000011e+06 base 6
7 5.597760e+03 -9.860187e+05 base 7
8 6.397014e+03 -9.680299e+05 base 8
9 7.196161e+03 -9.460448e+05 base 9
10 7.995202e+03 -9.200640e+05 base 10
11 8.794136e+03 -8.900880e+05 base 11
12 9.592963e+03 -8.561173e+05 base 12
13 1.039168e+04 -8.181525e+05 base 13
14 1.119030e+04 -7.761941e+05 base 14
15 1.198881e+04 -7.302426e+05 base 15
16 1.278721e+04 -6.802985e+05 base 16
17 1.358550e+04 -6.263625e+05 base 17
18 1.438369e+04 -5.684350e+05 base 18
19 1.518177e+04 -5.065165e+05 base 19
20 1.597975e+04 -4.406077e+05 base 20
21 1.677762e+04 -3.707089e+05 base 21
22 1.757538e+04 -2.968208e+05 base 22
23 1.837304e+04 -2.189439e+05 base 23
24 1.917059e+04 -1.370787e+05 base 24
25 1.996803e+04 -5.122577e+04 base 25
26 2.076537e+04 3.861440e+04 base 26
27 2.156260e+04 1.324412e+05 base 27
28 2.235973e+04 2.302543e+05 base 28
29 2.315675e+04 3.320529e+05 base 29
... ... ... ... ...
336 3.982866e+06 -1.789605e+07 boomButBust 31
337 4.384834e+06 -1.005268e+07 boomButBust 32
338 4.739161e+06 1.228133e+06 boomButBust 33
339 5.038095e+06 1.594096e+07 boomButBust 34
340 5.280532e+06 3.384218e+07 boomButBust 35
341 5.470586e+06 5.452607e+07 boomButBust 36
342 5.615467e+06 7.751470e+07 boomButBust 37
343 5.723485e+06 1.023330e+08 boomButBust 38
344 5.802653e+06 1.285565e+08 boomButBust 39
345 5.859936e+06 1.558321e+08 boomButBust 40
346 5.900993e+06 1.838806e+08 boomButBust 41
347 5.930218e+06 2.124892e+08 boomButBust 42
348 5.950919e+06 2.414996e+08 boomButBust 43
349 5.965529e+06 2.707961e+08 boomButBust 44
350 5.975816e+06 3.002952e+08 boomButBust 45
351 5.983045e+06 3.299375e+08 boomButBust 46
352 5.988119e+06 3.596806e+08 boomButBust 47
353 5.991678e+06 3.894945e+08 boomButBust 48
354 5.994172e+06 4.193581e+08 boomButBust 49
355 5.995920e+06 4.492565e+08 boomButBust 50
356 5.997143e+06 4.791794e+08 boomButBust 51
357 5.998000e+06 5.091195e+08 boomButBust 52
358 5.998600e+06 5.390715e+08 boomButBust 53
359 5.999020e+06 5.690319e+08 boomButBust 54
360 5.999314e+06 5.989982e+08 boomButBust 55
361 5.999520e+06 6.289686e+08 boomButBust 56
362 5.999664e+06 6.589419e+08 boomButBust 57
363 5.999765e+06 6.889172e+08 boomButBust 58
364 5.999835e+06 7.188939e+08 boomButBust 59
365 5.999885e+06 7.488716e+08 boomButBust 60

366 rows × 4 columns

We now have all the data for all the scenarios in one large dataframe. Each row is indexed by the scenario it belongs to. The timestamp is only unique within a given scenario.

Generating The Data For Scenario Comparison

The data we have generated so far is a table with a column for each indicator, indexed by scenario.

This is fine if you want to look at data scenario by scenario or plot two different indicators for the same scenario.

But what if you want to compare the same indicator for different scenarios?

In such a case, your data needs to be structured a little differently - essential we then want a table with a column for each scenario, indexed by the indicator.

To achieve this, we need to loop through the scenarios again:

# create a new dataframe with a column for each scenario, indexed by time and indicator
indicator_dfs = []
for scenario_no, scenario in enumerate(scenarios):

    scenario_dfs=[]
    # loop through the equations
    for equation in equations:
        # add a column which will contain the name of the indicator
        df = bptk.plot_scenarios(
            scenario_managers=[scenario_manager],
            scenarios=[scenario],
            equations=[equation],
            return_df=True)
        df.rename(columns={equation:scenario},inplace=True)
        if scenario_no is len(scenarios)-1:
            df["indicator"] = [equation] * len(df.index)
            df["time"] = df.index
        scenario_dfs +=[df]

    # conacate the indicators for the scenario (i.e. along axis 0)
    indicators_scenario_tab = pd.concat(scenario_dfs, axis=0,ignore_index=True,sort=False)

    # create a new column which will contain the time step (which won't be a unique index anymore, as we are concatenating many scenarios)

    indicator_dfs += [indicators_scenario_tab]

#concatenate all the scenario columns (i.e. along axis 1)
indicators_tab=pd.concat(indicator_dfs,axis=1,sort=False)
indicators_tab.index.name="id"
indicators_tab[0:10]
base serviceFlop rethinkAdvertising referSomeonePlease hereWeGo boomButBust indicator time
id
0 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 customers 0
1 800.000000 80.000000 80.000000 800.000000 800.000000 800.000000 customers 1
2 1599.893333 162.398901 183.998613 1623.890133 1839.861333 1839.861333 customers 2
3 2399.680014 247.268571 319.194051 2472.377134 3191.405164 3191.405164 customers 3
4 3199.360057 334.683026 494.942916 3346.188234 4947.891939 4947.891939 customers 4
5 3998.933476 424.718494 723.406944 4246.071738 7230.375720 7230.375720 customers 5
6 4798.400284 517.453484 1020.393216 5172.797601 10195.910470 10195.910470 customers 6
7 5597.760498 612.968850 1406.445515 6127.158034 14048.126326 14048.126326 customers 7
8 6397.014130 711.347864 1908.261512 7109.968110 19050.823648 19050.823648 customers 8
9 7196.161194 812.676285 2560.532449 8122.066399 25545.383938 25545.383938 customers 9

Generating The Data For Interactive Dashboards

In most cases creating an interactive report that just compares predefined scenarios is quite enough. But sometimes you would like to add a little dashboard to allow users to test different settings themselves, like the “Forecast” page in the example above.

The easiest way to achive this in Power BI is to use so called “What If” parameters to select a scenario from a set of pre-computed scenarios. We need to pre-compute them because currently Power BI doesn’t allow you to query data live with different parameters.

In most cases there will be thousands of “interactive” scenarios you need to pre-compute, so it is not feasible to enumerate them as a list. Instead the idea is to start with a base “interactive” scenario and then vary a set of parameters within a given range, much like in a Monte Carlo simulation.

# these are the scenarios initially defined
bptk.list_scenarios(["smCustomerAcquisition"])
* smCustomerAcquisition *
     base
     serviceFlop
     rethinkAdvertising
     referSomeonePlease
     hereWeGo
     boomButBust
# a new scenario which will act as a base for generating the data needed for interactive dashboards
bptk.register_scenarios(scenario_manager="smCustomerAcquisition",scenarios={
                          "interactiveScenario":{
                              "constants":{
                                 "referrals":0,
                                  "advertisingSuccessPct":0.1,
                                  "referralFreeMonths":3,
                                  "referralProgamAdoptionPct":10
                                }
                          }
}

)

Another call to list_scenarios shows that the interactive scenario has now been added:

bptk.list_scenarios(["smCustomerAcquisition"])
* smCustomerAcquisition *
     base
     serviceFlop
     rethinkAdvertising
     referSomeonePlease
     hereWeGo
     boomButBust
     interactiveScenario

Now that we have a scenario, we need to define the “What if” parameters:

# for each parameter we define the range of settings - e.g. referrals ranges from 0 to twelve, with a step of one: 1,2, 3... 12.
# advertisingSuccessPct ranges from 0 to 0.2, with a steop of 0.01: 0, 0.01, 0.02,....
interactive_settings= {
    "advertisingSuccessPct":(0,0.2,0.01),
    "referralFreeMonths":(0,40,10),
    "referralProgramAdoptionPct":(0,12,1),
    "referrals":(0,12,1)
}

Now we need to pre-compute all possible combinations, which is quite a few for the ranges defined above:

import numpy as np # will use the arange function to create ranges with fractional steps
len(np.arange(0,0.2,0.01))*len(np.arange(0,40,10))*len(np.arange(0,12,1))*len(np.arange(0,12,1))
11520
import itertools # will use the product function to iterate through all possible combinations

# generate all combinations of the settings
dimensions = [interactive_settings[key] for key in interactive_settings]
#now generate all possible settings
settings = list(itertools.product(*tuple(itertools.starmap(np.arange, dimensions))))

The last line of code uses some advanved functional programming to generates all possible combinations of the interactive parameters … let’s take a look at what it does using just two interactive parameters:

list(itertools.product(*tuple(itertools.starmap(np.arange,[(1,4,1),(4,7,1)]))))
[(1, 4), (1, 5), (1, 6), (2, 4), (2, 5), (2, 6), (3, 4), (3, 5), (3, 6)]
len(settings)
11520
# IMPORTANT: this code takes quite some time to run (ca. 4min on my machine)
# a variable to store the dataframes
interactive_dfs = []
interactive_scenario="interactiveScenario"
interactive_equations=["customers","profit"]
scenario = bptk.get_scenario(scenario_manager, interactive_scenario)
# now apply the settings to the scenario
for setting in settings:
    for setting_index, key in enumerate(interactive_settings):
        scenario.set_property_value(key, setting[setting_index])
    bptk.reset_simulation_model(
            scenario_manager=scenario_manager,
            scenario=interactive_scenario
    )
    df = bptk.plot_scenarios(
        scenario_managers=[scenario_manager],
        scenarios=[interactive_scenario],
        equations=interactive_equations,
        return_df=True
    )
    # add columns for the settings
    for setting_index, key in enumerate(interactive_settings):
        df[key] = [setting[setting_index]] * len(df.index)
    # explicitly set a time column
    df["time"] = df.index
    interactive_dfs += [df]
# concatenate the interactive scenarios
interactive_tab = pd.concat(interactive_dfs, ignore_index=True, sort=False)
len(interactive_tab)
702720
11520*61 # number of scenarios * number of timesteps
702720

Writing The Dataframes To An Excel File

Now that we have the dataframes containg the data, we can write them to an Excel file using Pandas ExcelWriter function. Note that this relies on the xlsxwriter package:

import xlsxwriter
#filename="/path/to/your/file/"
with pd.ExcelWriter(filename) as writer:
    scenarios_tab.to_excel(writer, sheet_name="scenarios")
    interactive_tab.to_excel(writer, sheet_name="interactive")

Calling The Export Function Directly

Here is how you would call the export_scenarios function directly – it you don’t pass a filename it returns a dictionary containing the dataframes for both the scenarios and the interactive dashboard.

Important: With the given parameters the export function generates over 11.000 interactive scenarios amounting to around 30MB of data. On my machine (a Macbook Pro with 16MB of RAM) the function takes just under four minutes to complete.

%%time

## save the file in the current working directory
import os
filename= os.path.join(os.getcwd(),"customer_acquisition.xlsx")

## Load the BPTK Package
bptk.export_scenarios(
    scenario_manager="smCustomerAcquisition",
    equations=["customers","profit"],
    filename=filename,
    interactive_scenario="interactiveScenario",
    interactive_equations=["customers","profit"],
    interactive_settings= {
        "advertisingSuccessPct":(0,0.2,0.01),
        "referralFreeMonths":(0,40,10),
        "referralProgramAdoptionPct":(0,12,1),
        "referrals":(0,12,1)
    }
)