Agent Based Models#

!pip install simpy

SimPy: What we have learned so far#

  • SimPy provides a simulation environment that manages discrete-event simulations.

  • Use Python generators to model behavior of components of a larger system.

  • Register instances of generators with the simulation environment

  • Append “who, what, value” to a data log for post-processing

What we will learn in this unit:

  • Modeling multiple units

  • Modeling a shared resource

  • Extracting data from a data log

Example: A room full of Roombas#

Let’s imagine a large facility that is cleaned by a collection of Roomba-type robotic cleaning units. Each unit is characterized by the time required to charge and the amount of time it can clean before needing to be recharged. The facility must be cleaned during a 16-hour overnight shift. On average, 3 units must be operating continuously to meet the cleaning requirements, i.e., 3 x 16 = 48 machine-hours of cleaning each night. We would like to determine how many charging stations will be required.

Unit   Charge Time (hrs)   Clean Time (hrs)
A      1.0                 2.5
B      0.5                 1.5
C      0.8                 2.0
D      1.4                 3.5
E      0.5                 1.2


import pandas as pd

roomba_data = [
    ["A", 1.0, 2.5],
    ["B", 0.5, 1.5],
    ["C", 0.8, 2.0],
    ["D", 1.4, 3.5],
    ["E", 0.5, 1.2],
]

roomba_df = pd.DataFrame(roomba_data, columns=["id", "charge_time", "clean_time"])
roomba_df
id charge_time clean_time
0 A 1.0 2.5
1 B 0.5 1.5
2 C 0.8 2.0
3 D 1.4 3.5
4 E 0.5 1.2
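
Before simulating, a rough upper bound on the fleet's cleaning capacity can be estimated from the table alone. The sketch below assumes each unit simply alternates between charging and cleaning and never waits for a charger, so the fraction of time it spends cleaning is clean_time / (charge_time + clean_time). The names `duty_cycle`, `SHIFT_HOURS`, and `estimated_cleaning_hours` are introduced here for illustration only.

```python
# Charge and clean times (hours) copied from the table above.
roomba_data = [
    ["A", 1.0, 2.5],
    ["B", 0.5, 1.5],
    ["C", 0.8, 2.0],
    ["D", 1.4, 3.5],
    ["E", 0.5, 1.2],
]

SHIFT_HOURS = 16  # length of the overnight shift

# Fraction of each charge/clean cycle spent cleaning, assuming a charger
# is always free when a unit needs one.
duty_cycle = {
    name: clean / (charge + clean) for name, charge, clean in roomba_data
}

# Long-run estimate of the cleaning hours delivered by the whole fleet.
estimated_cleaning_hours = SHIFT_HOURS * sum(duty_cycle.values())

for name, fraction in duty_cycle.items():
    print(f"{name}: cleaning {fraction:.1%} of the time")
print(f"estimated cleaning hours per shift: {estimated_cleaning_hours:.1f}")
```

Under these assumptions the fleet averages about 3.6 units cleaning at any moment, or roughly 57.6 hours per shift. That leaves some margin over the 48-hour target before any waiting for chargers is taken into account.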

One Roomba#

The first challenge is to model the performance of a single Roomba. Our first attempt at a model consists of a simple Python generator. The data log records the start and finish of each charging and cleaning cycle. For this first attempt, we’ll assume a charging station is always available when needed, and we’ll create just one instance of a Roomba to get started.

import simpy 
import pandas as pd

# create an empty data log
data_log = []

# roomba model is encapsulated as a Python generator. The id, charge_time, and clean_time
# parameters are the information needed to specify a particular instance of a Roomba.
# The model will log the begin and end of each charge and clean cycle
def roomba_model(id, charge_time, clean_time):
    while True:
        tic = env.now
        yield env.timeout(charge_time)
        toc = env.now
        data_log.append([id, "charging", tic, toc])
   
        tic = env.now
        yield env.timeout(clean_time)
        toc = env.now
        data_log.append([id, "cleaning", tic, toc])

# create the simulation environment
env = simpy.Environment()

# create the processes being simulated
roomba = roomba_model("A", 1.0, 2.5)
env.process(roomba)

# run the simulation
env.run(until=16)

# convert the data_log to a Pandas DataFrame and display
df = pd.DataFrame(data_log, columns=["id", "event", "begin", "end"])
display(df)
id event begin end
0 A charging 0.0 1.0
1 A cleaning 1.0 3.5
2 A charging 3.5 4.5
3 A cleaning 4.5 7.0
4 A charging 7.0 8.0
5 A cleaning 8.0 10.5
6 A charging 10.5 11.5
7 A cleaning 11.5 14.0
8 A charging 14.0 15.0
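
Because every duration in this model is fixed, the log above can be reproduced by hand. The sketch below is a plain-Python check, not part of the SimPy model. It steps through charge and clean cycles for unit A and keeps only the activities that finish before the end of the 16-hour run, mirroring the fact that the model appends an entry only after a timeout completes. The function name `expected_log` is an assumption for illustration.

```python
def expected_log(charge_time, clean_time, until):
    """List (event, begin, end) for activities that finish before `until`."""
    log = []
    t = 0.0
    while True:
        for event, duration in (("charging", charge_time), ("cleaning", clean_time)):
            if t + duration >= until:
                # This activity is treated as still in progress when the run
                # stops, so the model never reaches its data_log.append call.
                return log
            log.append((event, t, t + duration))
            t += duration

for row in expected_log(1.0, 2.5, until=16):
    print(row)
```

The last entry is the charge from 14.0 to 15.0. The cleaning cycle that starts at 15.0 is still running when the shift ends and is therefore absent from the log, so totals computed from the log slightly understate the cleaning actually performed.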

Adding a full complement of Roombas#

The next step is to add all of the available Roombas to the simulation. We do this by looping over the data set that describes the available devices. For each iteration, an instance of the Roomba model is created and an associated process is added to the simulation environment.

import simpy 
import pandas as pd

data_log = []

def roomba_model(id, charge_time, clean_time):
    while True:
        tic = env.now
        yield env.timeout(charge_time)
        toc = env.now
        data_log.append([id, "charging", tic, toc])
   
        tic = env.now
        yield env.timeout(clean_time)
        toc = env.now
        data_log.append([id, "cleaning", tic, toc])
        
env = simpy.Environment()

for r in roomba_df.index:
    env.process(roomba_model(roomba_df["id"][r], roomba_df["charge_time"][r], roomba_df["clean_time"][r]))

env.run(until=16)
df = pd.DataFrame(data_log, columns=["id", "event", "begin", "end"])
display(df)
id event begin end
0 B charging 0.0 0.5
1 E charging 0.0 0.5
2 C charging 0.0 0.8
3 A charging 0.0 1.0
4 D charging 0.0 1.4
5 E cleaning 0.5 1.7
6 B cleaning 0.5 2.0
7 E charging 1.7 2.2
8 B charging 2.0 2.5
9 C cleaning 0.8 2.8
10 E cleaning 2.2 3.4
11 A cleaning 1.0 3.5
12 C charging 2.8 3.6
13 E charging 3.4 3.9
14 B cleaning 2.5 4.0
15 A charging 3.5 4.5
16 B charging 4.0 4.5
17 D cleaning 1.4 4.9
18 E cleaning 3.9 5.1
19 C cleaning 3.6 5.6
20 E charging 5.1 5.6
21 B cleaning 4.5 6.0
22 D charging 4.9 6.3
23 C charging 5.6 6.4
24 B charging 6.0 6.5
25 E cleaning 5.6 6.8
26 A cleaning 4.5 7.0
27 E charging 6.8 7.3
28 B cleaning 6.5 8.0
29 A charging 7.0 8.0
30 C cleaning 6.4 8.4
31 E cleaning 7.3 8.5
32 B charging 8.0 8.5
33 E charging 8.5 9.0
34 C charging 8.4 9.2
35 D cleaning 6.3 9.8
36 B cleaning 8.5 10.0
37 E cleaning 9.0 10.2
38 A cleaning 8.0 10.5
39 B charging 10.0 10.5
40 E charging 10.2 10.7
41 C cleaning 9.2 11.2
42 D charging 9.8 11.2
43 A charging 10.5 11.5
44 E cleaning 10.7 11.9
45 B cleaning 10.5 12.0
46 C charging 11.2 12.0
47 E charging 11.9 12.4
48 B charging 12.0 12.5
49 E cleaning 12.4 13.6
50 A cleaning 11.5 14.0
51 C cleaning 12.0 14.0
52 B cleaning 12.5 14.0
53 E charging 13.6 14.1
54 B charging 14.0 14.5
55 D cleaning 11.2 14.7
56 C charging 14.0 14.8
57 A charging 14.0 15.0
58 E cleaning 14.1 15.3
59 E charging 15.3 15.8

Extracting data from data logs#

As we can see, one property of discrete-event simulations is that they can produce large volumes of data. Direct examination of the data log is generally difficult and not well suited to measuring system performance. In this section we demonstrate several methods for processing the data log to determine whether the performance target of 48 cleaning hours is being achieved.

Writing your own methods#

# get a sorted list of unique roomba names. This is extracted from the data log
roombas = list(set(df["id"]))
roombas.sort()

# total charge time
total_charge_time = 0.0
for r in roombas:
    idx = (df["id"]==r) & (df["event"]=="charging")
    dt = df[idx]["end"] - df[idx]["begin"]
    total_charge_time += sum(dt)

# total clean time
total_clean_time = 0.0
for r in roombas:
    idx = (df["id"]==r) & (df["event"]=="cleaning")
    dt = df[idx]["end"] - df[idx]["begin"]
    total_clean_time  += sum(dt)

print("total charge time =", total_charge_time)
print("total clean time =", total_clean_time)
total charge time = 23.0
total clean time = 51.8
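
The same totals can also be accumulated directly from the raw list of log entries, before any conversion to a DataFrame. The following is a minimal sketch using only the standard library. The helper name `total_time_by_event` is hypothetical, and the sample entries are the first four rows of the single-Roomba log shown earlier.

```python
from collections import defaultdict

def total_time_by_event(data_log):
    """Sum (end - begin) for each event type in a list of
    [id, event, begin, end] entries."""
    totals = defaultdict(float)
    for _id, event, begin, end in data_log:
        totals[event] += end - begin
    return dict(totals)

# First four entries of the single-Roomba log shown earlier.
sample_log = [
    ["A", "charging", 0.0, 1.0],
    ["A", "cleaning", 1.0, 3.5],
    ["A", "charging", 3.5, 4.5],
    ["A", "cleaning", 4.5, 7.0],
]

print(total_time_by_event(sample_log))  # {'charging': 2.0, 'cleaning': 5.0}
```

A single pass over the log is enough here because the totals do not depend on which Roomba produced each entry.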

Cut and Paste into Google Sheets#

Cut and paste the data log into a Google Sheet. Use the pivot table feature to compute the total charge and total clean times.

Using Pandas pivot tables#

The Pandas library includes an extraordinarily useful pivot_table() function that can be used to process many different kinds of data that would normally be displayed as spreadsheets. The following cell demonstrates the use of pivot_table() to compute the total charging and cleaning time found in the simulation.

# import numpy to gain access to a sum function
import numpy as np

# add a new column to the DataFrame computing the desired data
df["time"] = df["end"] - df["begin"]

# create a pivot table that sums up the time spent on each event type.
pd.pivot_table(df, index=["event"], values="time", aggfunc={"time":np.sum} )
time
event
charging 23.0
cleaning 51.8
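
A pivot table can also break the totals down by unit, which helps show which Roombas contribute most to the cleaning target. The sketch below uses a handful of rows copied from the five-Roomba log above rather than the full log, and the variable names `sample` and `by_unit` are assumptions for illustration. The `columns` and `fill_value` arguments are standard `pivot_table` parameters.

```python
import pandas as pd

# A few rows copied from the five-Roomba data log above.
rows = [
    ["B", "charging", 0.0, 0.5],
    ["E", "charging", 0.0, 0.5],
    ["C", "charging", 0.0, 0.8],
    ["E", "cleaning", 0.5, 1.7],
    ["B", "cleaning", 0.5, 2.0],
    ["E", "charging", 1.7, 2.2],
]
sample = pd.DataFrame(rows, columns=["id", "event", "begin", "end"])
sample["time"] = sample["end"] - sample["begin"]

# One row per Roomba and one column per event type. Units with no logged
# entries for an event show 0 rather than NaN.
by_unit = pd.pivot_table(
    sample, index="id", columns="event", values="time",
    aggfunc="sum", fill_value=0,
)
print(by_unit)
```

Applied to the full `df`, the same call would give one row per Roomba with its total charging and cleaning hours.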

Introducing shared resources#

A charging station is an example of a shared resource. There are three types of resources that can be modeled in SimPy:

  • Resources: resources that can be used by only a limited number of processes at a time.

  • Stores: resources that can store or release Python objects.

  • Containers: resources that model the production and consumption of bulk goods.

In this example, charging stations are modeled as Resources.

Charging stations as shared resources#

The next cell shows how a charging station can be incorporated into the simulation as a shared resource. Three changes are required:

  • Create a simpy.Resource() by specifying the simulation environment and resource capacity.

  • The Roomba model must create a request for use of a charger and then yield that request. Execution resumes once a charger becomes available.

  • When finished with a charger, the model must release the charger so that it is available for use by other model instances.

The following cell implements a charger system with a capacity of 1. Examine the data log to verify that only one Roomba is charging at any point in time.

import simpy 
import pandas as pd

data_log = []

def roomba_model(id, charge_time, clean_time):
    while True:
        request = chargers.request()
        yield request
        tic = env.now
        yield env.timeout(charge_time)
        chargers.release(request)
        toc = env.now
        data_log.append([id, "charging", tic, toc])
   
        tic = env.now
        yield env.timeout(clean_time)
        toc = env.now
        data_log.append([id, "cleaning", tic, toc])
        
env = simpy.Environment()
chargers = simpy.Resource(env, capacity=1)

for r in roomba_df.index:
    env.process(roomba_model(roomba_df["id"][r], roomba_df["charge_time"][r], roomba_df["clean_time"][r]))

env.run(until=16)
df = pd.DataFrame(data_log, columns=["id", "event", "begin", "end"])
display(df)
id event begin end
0 A charging 0.0 1.0
1 B charging 1.0 1.5
2 C charging 1.5 2.3
3 B cleaning 1.5 3.0
4 A cleaning 1.0 3.5
5 D charging 2.3 3.7
6 E charging 3.7 4.2
7 C cleaning 2.3 4.3
8 B charging 4.2 4.7
9 E cleaning 4.2 5.4
10 A charging 4.7 5.7
11 B cleaning 4.7 6.2
12 C charging 5.7 6.5
13 E charging 6.5 7.0
14 D cleaning 3.7 7.2
15 B charging 7.0 7.5
16 A cleaning 5.7 8.2
17 E cleaning 7.0 8.2
18 C cleaning 6.5 8.5
19 D charging 7.5 8.9
20 B cleaning 7.5 9.0
21 A charging 8.9 9.9
22 E charging 9.9 10.4
23 C charging 10.4 11.2
24 E cleaning 10.4 11.6
25 B charging 11.2 11.7
26 E charging 11.7 12.2
27 D cleaning 8.9 12.4
28 A cleaning 9.9 12.4
29 C cleaning 11.2 13.2
30 B cleaning 11.7 13.2
31 E cleaning 12.2 13.4
32 D charging 12.4 13.8
33 A charging 13.8 14.8
34 C charging 14.8 15.6

How many charging stations are required?#

The following cell uses the Python with statement to request a charger and release it upon completion of a code block. The advantage of this construction is that it handles the release automatically, thereby avoiding a potential source of coding errors, and it also provides a visual demarcation of the code block where the charger is being used.

The cell also uses the Pandas pivot_table to report the total amount of charging used and cleaning time provided. Find the minimum number of chargers required to produce 48 hours of cleaning time in a 16-hour shift.

import simpy 
import pandas as pd
import numpy as np

data_log = []

def roomba_model(id, charge_time, clean_time):
    while True:
        with chargers.request() as request:
            yield request
            tic = env.now
            yield env.timeout(charge_time)
            toc = env.now
            data_log.append([id, "charging", tic, toc])
            
        tic = env.now
        yield env.timeout(clean_time)
        toc = env.now
        data_log.append([id, "cleaning", tic, toc])
        
env = simpy.Environment()
chargers = simpy.Resource(env, capacity=1)

for r in roomba_df.index:
    env.process(roomba_model(roomba_df["id"][r], roomba_df["charge_time"][r], roomba_df["clean_time"][r]))

env.run(until=16)
df = pd.DataFrame(data_log, columns=["id", "event", "begin", "end"])

df["time"] = df["end"] - df["begin"]
pd.pivot_table(df, index=["event"], values="time", aggfunc={"time":np.sum} )
time
event
charging 15.4
cleaning 31.3

Exercises#

Exercise 1.#

Answer the question posed above: How many charging stations are needed to provide 48 hours of cleaning service in the overnight shift?

Exercise 2.#

Modify the model to assume the Roombas are fully charged at the start of the cleaning shift. Does that reduce the number of chargers required?

Exercise 3.#

Assume each Roomba needs to dispose of waste after every 20 minutes of cleaning, that disposal takes 5 minutes, and that it requires access to a waste disposal station.

Hints:

  • You will need to log a new event called ‘waste disposal’.

  • Model the waste disposal station as a shared resource.

  • You may need to make some decisions on how to handle the waste at the end of a cleaning cycle. Don’t get too bogged down, just make some reasonable assumptions. We’ll address this issue in the next class.

Exercise 4. (optional)#

Create a function that accepts df (the data log after conversion to a pandas DataFrame) and, for every Roomba, shows a timeline of events.

%matplotlib inline

import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

def gantt(df, lw=10):
    
    # create sorted lists of the unique ids and events appearing in the data log
    ids = sorted(list(set(df["id"])))
    events = sorted(list(set(df["event"])))
    
    # create list of unique colors for each event
    colors = [f"C{i}" for i in range(len(events))]
    
    # create plot window
    fig, ax = plt.subplots(1, 1, figsize=(10, 3))
    
    # for each event and id, find entries in the data log and plot the begin and end points
    for i, event in enumerate(events):
        for j, id in enumerate(ids):  
            for k in df[(df["id"]==id) & (df["event"]==event)].index:
                ax.plot([df["begin"][k], df["end"][k]], [j, j], colors[i], solid_capstyle="butt", lw=lw)
                
    # create legend
    lines = [Line2D([0], [0], lw=lw, color=colors[i]) for i in range(len(events))]
    ax.legend(lines, events, bbox_to_anchor=(1.05, 1.0), loc="upper left")
    
    # annotate the axes
    ax.set_yticks(range(len(ids)))
    ax.set_yticklabels(ids)
    ax.grid(True)
    ax.set_xlabel("Time")
    ax.set_title("Gantt Chart")
    for sp in ['top', 'bottom', 'right', 'left']:
        ax.spines[sp].set_visible(False)
        
gantt(df)