Notebook 3.3 - Transportation Problem (Symbolic) using PuLP

   

By now we have established that explicit models are not at all practical, and always good to avoid as much as possible - whether we are formulating a model using pen and paper, or with a modelling framework.

In this notebook we will demonstrate how we can implement a symbolic model using PuLP.

image.png

As before, we are initialising the problem instance.

In [1]:
from pulp import *

prob = LpProblem('problem', LpMinimize)

This is going to be an indexed-symbolic formulation, but we still need to define the contents of our sets and the values of our parameters.

Usually, problem instances will be loaded from a separate data file, but for the purposes of these demonstrations we will be defining them using variable declarations.

We will be starting with our sets of source and depot nodes, implemented as arrays.

In [2]:
S = ['S1', 'S2', 'S3']
D = ['D1', 'D2', 'D3']

We will be now setiing up our demand and supply datasets. These will be stored as dictionaries, with Aand B containing the supply and demand values, respectively.

In [3]:
A = {'S1': 300, 'S2': 300, 'S3': 100}
B = {'D1': 200, 'D2': 200, 'D3': 300}

Note: If you need to refresh your memory on how dictionaries work, have a look on the relevant chapter from the Intermediate Python module.

We will be saving cost values using a list of lists.

In our case, we have arcs that connect every supply node will every demand node.

However, if a link had not existed, we would still have a place for it in the matrix - we would however record an infinite cost for the flow of goods between its endpoints.

The positive infinity value in Python is provided by the following statement:

float('inf')

This does not look particularly nice when embedded within other Python statements, and therefore we will create a variable, which we can conveniently call inf where we can store a positive infinity value.

In [4]:
inf = float('inf')

               # D1   D2   D3
cost_values = [[ 6,   3,   8 ], # S1
               [ 7,   5,   9 ], # S2
               [ 4,   5,   5 ]] # S3

The node IDs around the matrix are comments and not a part of the code.

Note: If you need to refresh your memory on lists of lists, have a look on the relevant chapter from the Introduction to Python module.

Before we pass these values to PuLP, we need to create a look up "table" that our code can refer while populating the PuLP model.

The cost_values list works fine for data entry. However, our model must be able to look up values within $C_{ij}$ using the two indexes $i$ (sources) and $j$ (destinations).

We can facilitate this using a dictionary of dictionaries. We do not need to type our data again - we will be using a nested for loop to achieve this.

In [5]:
C = {site:{company:cost_values[i][j] for j,company in enumerate(D)} for i,site in enumerate(S)}

Hang on a minute, what just happened here?

I know that this might be tricky to understand the fist time. This is a fine example of an inline nested for loop.

Let's see what are the contents of the C variable now:

In [6]:
C
Out[6]:
{'S1': {'D1': 6, 'D2': 3, 'D3': 8},
 'S2': {'D1': 7, 'D2': 5, 'D3': 9},
 'S3': {'D1': 4, 'D2': 5, 'D3': 5}}

The above makes sense - it is what we set out to do originally.

But how does it work? We can test this as follows - let's assume that we want to retrieve the cost of travelling between S1 and D1:

In [18]:
C['S1']['D1']
Out[18]:
6

Effectively, what the above command works as follows:

  • C is our nested dictionary

  • C['S1'] retrieves the "inner" dictionary that contains all costs between S1 and the set of D nodes

  • C['S1']['D1'] looks within the "inner" dictionary and retrieves the cost between S1 and D1

To better understand how the iniline declaration works, we can see the same dictionary could be implemented using a "traditional" for loop.

In [7]:
C1 = dict()
for i,site in enumerate(S):
    C1[site] = dict()
    for j,company in enumerate(D):
        C1[site][company] = cost_values[i][j]

Line 1: We are initialising an empty dictionary.

Line 2: We are using the enumerate function to go through the contents of the S dictionary.

Line 3: Each iteration returns a numeric index (ie. 1, 2, 3) and the side ID (ie. S1, S2, S3). We use the latter to initialise an empty dictionary for each site.

Line 4: With the 'inner' for loop we are iterating through every depot.

Line 5: We can now store the cost values in our "table". We are using the numeric indexes i and j to look up the values within the cost_values list.

In [8]:
C1
Out[8]:
{'S1': {'D1': 6, 'D2': 3, 'D3': 8},
 'S2': {'D1': 7, 'D2': 5, 'D3': 9},
 'S3': {'D1': 4, 'D2': 5, 'D3': 5}}

As you can see, C1 returns the same values as C.

Having now learned how inline loops work, we will be using one to automatically create entries for each arc within the set of arcs $E$. However, we will be skipping any combination of nodes with infinite costs - such nodes do not exist in our model instance, but it is good practice to include it in our model.

We will be using our inf variable that we established earlier for this.

In [9]:
E = [(i,j) for i in S for j in D if C[i][j] < inf]

E
Out[9]:
[('S1', 'D1'),
 ('S1', 'D2'),
 ('S1', 'D3'),
 ('S2', 'D1'),
 ('S2', 'D2'),
 ('S2', 'D3'),
 ('S3', 'D1'),
 ('S3', 'D2'),
 ('S3', 'D3')]

As we are using an indexed-symbolic model, we don't have to create our decision variables one by one.

Instead, we will be using the LpVariable.dicts() command, to create our $x_{ij}$ variables automatically using the contents of the set E.

In [10]:
x = LpVariable.dicts('x', E, lowBound=0)

As we have taken care to create indexed sets of cost and flow variables, we can do this using what looks like an algebraic operation

In [11]:
prob += lpSum([C[i][j]*x[i,j] for (i,j) in E])

The constraints are also easy to define now, and they almost look like an exact transliteration of our mathematical formulation.

In [12]:
for i in S:
    prob += lpSum([x[i,j] for j in D if (i,j) in E]) == A[i]
    
for j in D:
    prob += lpSum([x[i,j] for i in S if (i,j) in E]) == B[j]

Let us now print the model formulation. It will look slightly different to the one that we obtained in earlier notebooks, since the decision variables are autogenerated and PuLP follows its own naming convention.

In [13]:
print(prob)
problem:
MINIMIZE
6*x_('S1',_'D1') + 3*x_('S1',_'D2') + 8*x_('S1',_'D3') + 7*x_('S2',_'D1') + 5*x_('S2',_'D2') + 9*x_('S2',_'D3') + 4*x_('S3',_'D1') + 5*x_('S3',_'D2') + 5*x_('S3',_'D3') + 0
SUBJECT TO
_C1: x_('S1',_'D1') + x_('S1',_'D2') + x_('S1',_'D3') = 300

_C2: x_('S2',_'D1') + x_('S2',_'D2') + x_('S2',_'D3') = 300

_C3: x_('S3',_'D1') + x_('S3',_'D2') + x_('S3',_'D3') = 100

_C4: x_('S1',_'D1') + x_('S2',_'D1') + x_('S3',_'D1') = 200

_C5: x_('S1',_'D2') + x_('S2',_'D2') + x_('S3',_'D2') = 200

_C6: x_('S1',_'D3') + x_('S2',_'D3') + x_('S3',_'D3') = 300

VARIABLES
x_('S1',_'D1') Continuous
x_('S1',_'D2') Continuous
x_('S1',_'D3') Continuous
x_('S2',_'D1') Continuous
x_('S2',_'D2') Continuous
x_('S2',_'D3') Continuous
x_('S3',_'D1') Continuous
x_('S3',_'D2') Continuous
x_('S3',_'D3') Continuous

The above is not exactly easy on the eye, but at least it allows us to verify that it is the same exact formulation as the explicit model that we defined previously.

We can now solve our problem instance:

In [14]:
status = prob.solve()

print(f'STATUS\n{LpStatus[status]}\n')

print('SOLUTION:')
for v in prob.variables():
    print(f'\t\t{v.name} = {v.varValue}')

print('\n') # Prints a blank line
print(f'OBJECTIVE VALUE: {prob.objective.value()}')
STATUS
Optimal

SOLUTION:
		x_('S1',_'D1') = 100.0
		x_('S1',_'D2') = 200.0
		x_('S1',_'D3') = 0.0
		x_('S2',_'D1') = 100.0
		x_('S2',_'D2') = 0.0
		x_('S2',_'D3') = 200.0
		x_('S3',_'D1') = 0.0
		x_('S3',_'D2') = 0.0
		x_('S3',_'D3') = 100.0


OBJECTIVE VALUE: 4200.0

Summary

from pulp import *

inf = float('inf')

prob = LpProblem('problem', LpMinimize)

# INSTANCE DEFINITION
S = ['S1', 'S2', 'S3']
D = ['D1', 'D2', 'D3']

A = {'S1': 300, 'S2': 300, 'S3': 100}
B = {'D1': 200, 'D2': 200, 'D3': 300}

               # D1   D2   D3
cost_values = [[ 6,   3,   8 ], # S1
               [ 7,   5,   9 ], # S2
               [ 4,   5,   5 ]] # S3

# DECISION VARIABLE GENERATION
C = {site:{company:cost_values[i][j] for j,company in enumerate(D)} for i,site in enumerate(S)}

E = [(i,j) for i in S for j in D if C[i][j] < inf]

x = LpVariable.dicts('x', E, lowBound=0)

# PROBLEM FORMULATION
prob += lpSum([C[i][j]*x[i,j] for (i,j) in E])

for i in S:
    prob += lpSum([x[i,j] for j in D if (i,j) in E]) == A[i]

for j in D:
    prob += lpSum([x[i,j] for i in S if (i,j) in E]) == B[j]

# SOLUTION
status = prob.solve()

print(f'STATUS\n{LpStatus[status]}\n')

print('SOLUTION:')
for v in prob.variables():
    print(f'\t\t{v.name} = {v.varValue}')

print('\n') # Prints a blank line
print(f'OBJECTIVE VALUE: {prob.objective.value()}')