The art of not making it an art

Applying proven methodology to model engineering

Ronald van der Velden

Introduction

Can we learn from software engineering?

Five principles from Dave Farley:

  • Iterative: Improve and refine something that works
  • Feedback: Know if you move in the right direction
  • Incremental: Add features while keeping a solid foundation
  • Experimental: Learn from mistakes, instead of always being right
  • Empirical: Emphasize practical impact

Can we learn from software engineering?

But we are different, right?!

  • Iterative: “I first need the full data model, then the full optimization function and only then we can test”
  • Feedback: “There’s no point in testing yet, because constraints are missing”
  • Incremental: “Actually I need to build the full model before I can run anything”
  • Experimental: “If we don’t know the full model right from the start, we definitely need to throw everything away”
  • Empirical: “Models can only really be tested once embedded in full-scale applications”

About this talk

  • Best practices
  • Glimpse into the Gurobi API
  • Useful 3rd-party tools
  • Sample model, step-by-step

About the puzzle

  • Power plants
  • Hourly demand forecast
  • Create production schedule
  • Respect min/max capacity
  • Rules for switching on/off and ramping up/down the production at each plant
  • Minimize costs: health, (fixed) operating, fuel, startup/teardown

1 Identify the core of the model

What is the smallest model that we can implement and solve?

Starting small

  • Most important decision: Production per hour/plant
  • Minimum input data: Demand, capacity
  • Required data: Plants, hourly demand
  • (Easiest objective: Health and fuel cost)

2 Write down your mathematical model

Document the most important decision variable, constraint and objective

Initial model

  • Set \(P (P_N)\): set of (nuclear) power plants
  • Set \(H\): hours of the day
  • Variable \(z_{p,h}\): amount of power generated in power plant \(p\) during hour \(h\)
  • Parameter \(d_h\): power demand for each hour \(h\)
  • Parameter \(c_p\): capacity of each plant \(p\)
  • Constraint \(\sum_{p \in P} z_{p,h} = d_h \: \forall h \in H\) ensures generated power covers demand
  • Constraint \(z_{p,h} \leq c_p \: \forall p \in P, h \in H\) ensures we don’t exceed plant capacity

Tip

The model is the blueprint for your algorithm. Carefully checking this before writing code, is an extra check upfront. Documenting the model in mathematical form also simplifies extensions.

3 Collect at least one realistic dataset

Make sure data is realistic in size and structure

Real power data

We will use data from the State of Georgia (USA), for 99 power plants and 10.8 million people. Data is aggregated into 10 groups of plants.

Tip

The data is essential for proper testing. Mathematical optimization solvers may behave very differently on randomly generated data. Also performance does not scale linearly with the size of your model.

4 Design your model like a tiny black box

One function or class, turning a problem into a solution

Minimalistic template

For any optimization project, I start with:

problem = read_problem('path')
parameters = Parameters()
solver = Solver(parameters)
solution = solver.solve(problem)

Note

The mathematical model will be completely encapsulated in the solver, invisible to the outside world. In fact, the solver may use multiple models, or any other search strategy.

Problem definition

from dataclasses import dataclass, field

@dataclass(frozen=True, repr=False, order=True)
class Plant:
    id: str
    type: str
    fuel_type: str
    capacity: float
    
    def __repr__(self):
        return self.id

@dataclass(frozen=True)
class Problem:
    demand: dict[int, float]
    plants: set[Plant]    

Reading inputs

import pandas as pd
from os import path
def read_problem(folder: str) -> Problem:
  df_load_curves = pd.read_csv(path.join(folder, 'demand.csv'))
  df_subset = df_load_curves[(df_load_curves['YEAR']==2011)&(df_load_curves['MONTH']==7)&(df_load_curves['DAY']==1)] 
  demand = df_subset.set_index(['HOUR']).LOAD.to_dict()
  
  df_plant_info = pd.read_csv(path.join(folder, 'plant_capacities.csv'))
  P = set(df_plant_info['Plant'].unique())
  plant_type = df_plant_info.set_index('Plant').PlantType.to_dict()
  fuel_type = df_plant_info.set_index('Plant').FuelType.to_dict()
  capacity = df_plant_info.set_index('Plant').Capacity.to_dict()

  plants = set(
      Plant(p, plant_type[p], fuel_type[p], capacity[p]) 
      for p in P
  )
  return Problem(demand, plants)

Solution

@dataclass
class Solution:
    problem: Problem
    power: dict[tuple[Plant, int], float]

Solver

import gurobipy as gp
class Solver:
  def __init__(self, parameters: Parameters):
    self.parameters = parameters

  def solve(self, problem: Problem) -> Solution:
    with gp.Model("powergeneration") as model:
      P = problem.plants
      H = set(problem.demand.keys())
      
      z = model.addVars(P, H, name="z", lb=0)
      model.addConstrs((z.sum('*',h) == problem.demand[h] for h in H),
        name='demand')
      model.addConstrs((z[p,h] <= p.capacity for p in P for h in H),
        name='capacity')
      model.optimize()
      
      power = { (p,h): z[p,h].X for (p,h) in z }
      return Solution(problem, power)

Done, right?

Set parameter WLSAccessID
Set parameter WLSSecret
Set parameter LicenseID to value X
WLS license X - registered to X
Gurobi Optimizer version 12.0.0 build v12.0.0rc1 (mac64[arm] - Darwin 23.6.0 23G93)

CPU model: Apple M1
Thread count: 8 physical cores, 8 logical processors, using up to 8 threads

WLS license X - registered to X
Optimize a model with 264 rows, 240 columns and 480 nonzeros
Model fingerprint: 0xd1470513
Coefficient statistics:
  Matrix range     [1e+00, 1e+00]
  Objective range  [0e+00, 0e+00]
  Bounds range     [0e+00, 0e+00]
  RHS range        [8e+01, 1e+04]
Presolve removed 264 rows and 240 columns
Presolve time: 0.00s
Presolve: All rows and columns removed
Iteration    Objective       Primal Inf.    Dual Inf.      Time
       0    0.0000000e+00   0.000000e+00   0.000000e+00      0s

Solved in 0 iterations and 0.00 seconds (0.00 work units)
Optimal objective  0.000000000e+00

Really?

  • What is the solution to the puzzle?
  • Does the solution make practical sense?
  • Can we judge if the solution is optimal or not?
  • If we now make a change and get the same log, are we happy?

5 Visualize your decisions

Focus on the choices made by your model, not just the consequences

Pandas

Pandas is familiar for most Python developers (especially data scientists) and great for analyzing optimization solutions.

from dataclasses import asdict
class Solution:
    def get_dataframe(self) -> pd.DataFrame:                
        plants = pd.DataFrame.from_records(
          [asdict(plant) for plant in self.problem.plants]
        ).rename(columns={'id':'plant'})
        hours = pd.DataFrame({'hour': self.problem.demand.keys()})
        df_problem = plants.merge(hours, how='cross')
        df_problem = df_problem.set_index(['plant', 'hour'])
        df_solution = pd.DataFrame({'power': self.power})
        df_solution.index.names = ['plant', 'hour']
        df_solution = df_solution.rename(index=
          lambda pair: pair.id, level=0)
        result = df_problem.join(df_solution)        
        return result

Pandas output

type fuel_type capacity power
(‘BIOMASS’, 22) BIOMASS BLQ 470.96 470.96
(‘BIOMASS’, 23) BIOMASS BLQ 470.96 470.96
(‘BIOMASS’, 24) BIOMASS BLQ 470.96 0
(‘Bowen’, 1) COAL BIT 2622.37 2622.37
(‘Bowen’, 2) COAL BIT 2622.37 0
(‘Bowen’, 3) COAL BIT 2622.37 0
(‘Bowen’, 4) COAL BIT 2622.37 1494.82

Tip

With the data above, we can easily sum by hour to check power against capacity. We can also do a row-wise check to compare power against capacity.

Model export

Export the full mathematical model from Gurobi:

model.write('power.lp')

Examine the output in a text editor:

Minimize

Subject To
 demand[1]: z[Edwin_I_Hatch,1] + z[OIL,1] + z[Jack_McDonough,1]
   + z[BIOMASS,1] + z[Bowen,1] + z[HYDRO,1] + z[GAS,1] 
   + z[OTHER_COAL,1] + z[Scherer,1] + z[Vogtle,1] = 11212.28914
 demand[2]: z[Edwin_I_Hatch,2] + z[OIL,2] + z[Jack_McDonough,2]
   + z[BIOMASS,2] + z[Bowen,2] + z[HYDRO,2] + z[GAS,2] 
   + z[OTHER_COAL,2] + z[Scherer,2] + z[Vogtle,2] = 10144.43032
   ...
 capacity[Edwin_I_Hatch,1]: z[Edwin_I_Hatch,1] <= 1626.41228
 capacity[Edwin_I_Hatch,2]: z[Edwin_I_Hatch,2] <= 1626.41228
   ...   

6 Implement one aspect at a time

Extend the problem all the way through to the solution

A little bit of everything

  • Read new bits of input data
  • Handle the new data in the solver (your model)
  • Add relevant information to the solution data
  • Make sure your code works again
  • Extend your tests and visualization

Model and problem

Lets add fuel cost and health cost to the problem and minimize these as an objective.

Model formulation

  • Parameter \(f_p\): fuel cost for power plant \(p\)
  • Parameter \(a_{p,h}\): health cost for power plant \(p\) in hour \(h\),
  • Objective \(\sum_{p,h} (f_p + a_{p,h}) z_{p,h}\) to minimize health and fuel cost

Problem representation

@dataclass(frozen=True, repr=False, order=True)
class Plant:
    id: str
    type: str
    fuel_type: str
    capacity: float
    fuel_cost: float    
    health_cost: dict[int, float] = field(hash=False)

Reading inputs

# Fuel cost per fuel type
df_fuel_costs = pd.read_csv(path.join(folder, 'fuel_costs.csv'))
dict_fuel_costs = df_fuel_costs[df_fuel_costs['year']==2011].T.to_dict()[9]

# Health cost for each plant/hour
H = demand.keys()
df_health_costs = pd.read_csv(path.join(folder, 'health_costs.csv'))
dict_health_costs = df_health_costs[(df_health_costs['Year']==2007)&(df_health_costs['Day']==1)].set_index(['Plant','Hour']).to_dict()['Cost'] # Health cost/MWh (plant)
dict_health_costs.update({(p,h): 0 for p in P for h in H if p not in ['Bowen','Jack McDonough','Scherer']})

plants = set(
  Plant(
    (...)
    dict_fuel_costs[fuel_type[p]],
    { h: dict_health_costs[p, h] for h in H }
  ) for p in P
)

Solver

with gp.Model("powergeneration") as model:
  P = problem.plants
  H = set(problem.demand.keys())
  PH = [(p,h) for p in P for h in H]
  z = model.addVars(P, H, name="z", lb=0)
  model.addConstrs((z.sum('*',h) == problem.demand[h] for h in H),
    name='demand')
  model.addConstrs((z[p,h] <= p.capacity for (p,h) in PH),
    name='capacity')
  objective = gp.quicksum(p.fuel_cost*z[p,h] for (p,h) in PH)
  objective += gp.quicksum(p.health_cost[h]*z[p,h] for (p,h) in PH)
  model.setObjective(objective, sense=GRB.MINIMIZE)
  model.optimize()  

Solution

def get_dataframe(self) -> pd.DataFrame:                
    plants = pd.DataFrame.from_records(
      [asdict(plant) for plant in self.problem.plants]).rename(columns={'id':'plant'})
    hours = pd.DataFrame({'hour': self.problem.demand.keys()})
    df_problem = plants.merge(hours, how='cross')
    df_problem = df_problem.set_index(['plant', 'hour'])
    df_problem['health_cost'] = df_problem.apply(
      lambda row: row.health_cost[row.name[1]], axis=1)
    df_solution = pd.DataFrame({'power': self.power})
    df_solution.index.names = ['plant', 'hour']
    df_solution = df_solution.rename(index=
    lambda pair: pair.id, level=0)
    result = df_problem.join(df_solution)        
    result['total_fuel_cost'] = result['fuel_cost'] * result['power']
    result['total_health_cost'] = result['health_cost'] * result['power']
    result['total_cost'] = result['total_fuel_cost'] + result['total_health_cost']
    return result

Validation

Using Pandas

Our updated DataFrame now contains cost components, both per unit of power generated and total. This will be extended later to include the other cost factors.

fuel_cost health_cost power total_fuel_cost total_health_cost total_cost
(‘Scherer’, 6) 18.39 3.56861 2860.11 52597.4 10206.6 62804
(‘Scherer’, 7) 18.39 2.96348 2860.11 52597.4 8475.88 61073.3
(‘Scherer’, 8) 18.39 2.29133 2779.39 51112.9 6368.49 57481.4
(‘Scherer’, 9) 18.39 2.02691 2807.62 51632.2 5690.79 57322.9
(‘Scherer’, 10) 18.39 1.5784 2860.11 52597.4 4514.39 57111.8
(‘Scherer’, 11) 18.39 1.60359 2860.11 52597.4 4586.45 57183.9

Validation

Using Plotly Express

import plotly.express as px
df_melted = df.reset_index().melt(
   id_vars=['plant', 'hour'],
   value_vars=['total_fuel_cost', 'total_health_cost'],
   var_name='component'
)
fig = px.bar(
   df_melted.groupby(['plant', 'component']).sum().reset_index(),
   x='plant', y='value', color='component',
   title='Cost per plant', width=1000, height=400)

7 Make sure your code always runs

Go from one working version to the next

Typical Gurobi support situation

# Commented out because Q is undefined, need to investigate
# d = model.addVars(P, Q, name="d", lb=0)

# TODO: Read demand from input files
model.addConstrs((z.sum('*',h) == demand[h] for h in H))

# --> I think this line doesn't work properly, help!
model.addConstrs((v[p,h] <= u[p,h]) for p in P for h in H)  

model.optimize()

Note

The code we showed earlier is minimal, but works just fine. It is a solid starting point for future development of additional model aspects.

Model

Suppose we want to add a minimum amount of power generated when a plant is on. We also want to consider fixed operating cost per plant.

  • Parameter \(m_p\): minimum % power that must be generated from power plant \(p\)
  • Parameter \(o_p\): fixed, hourly operating cost for power plant \(p\)
  • Variable \(u_{p,h}\): binary indicator of whether plant \(p\) is on during hour \(h\)
  • Constraint \(z_{p,h} \leq c_p u_{p,h} \: \forall p \in P\): respect maximum plant capacity when used
  • Constraint \(z_{p,h} \geq m_p c_p u_{p,h} \: \forall p \in P\): respect minimum plant power when on
  • Objective \(\sum_{p,h} o_p u_{p,h}\) to minimize fixed operating cost

Input data

@dataclass(frozen=True, repr=False)
class Plant:
  id: str
  type: str
  fuel_type: str
  capacity: float
  fuel_cost: float
  health_cost: dict[int, float] = field(hash=False)
  operating_cost: float
      
  def is_nuclear(self):
    return self.type == 'NUCLEAR'
  
  def get_min_generation_fraction(self):
    return 0.8 if self.is_nuclear() else 0.01

Reading inputs

# Operating cost (per MWh) per fuel type
df_oper_costs = pd.read_csv(path.join(folder, 'operating_costs.csv'))
dict_oper_costs = df_oper_costs[df_oper_costs['year']==2011]
  .T.to_dict()[9]

plants = set(
  Plant(
    (...)
    dict_oper_costs[fuel_type[p]]
  ) for p in P
)
return Problem(demand, plants)

8 Add automated tests using a test framework

Construct tiny problems manually that challenge individual constraints.

Designing a test

  • Nuclear plants produce at least 80% of their capacity
  • Need a scenario where it’s attractive but forbidden to use the nuclear plant
    • Attractive: Give it a lower cost than a second plant
    • Forbidden: Create demand that is less than 80% of the nuclear capacity
  • So we need a scenario with (at least) one hour of demand and two plants
    • Say demand is 20, nuclear capacity 50 - nuclear should not be chosen
    • Add a second hour with demand 40 - nuclear should be preferred

Automated test

import unittest
from ucp import Parameters, Plant, Problem, Solver

class TestCapacityConstraints(unittest.TestCase):
  def test_min_power(self):
    demand = { 0: 20, 1: 40 }
    health_cost = { 0: 0, 1: 0 }
    plants = [
        Plant('Free', 'COAL', '', 100, 3, health_cost, 0),
        Plant('Limited', 'NUCLEAR', '', 50, 1, health_cost, 0)
    ]
    problem = Problem(demand, plants)
    parameters = Parameters()
    solver = Solver(parameters)
    solution = solver.solve(problem)
    self.assertEqual(20, solution.power[plants[0], 0])
    self.assertEqual(40, solution.power[plants[1], 1])

if __name__ == '__main__':
    unittest.main()

Solver

z = model.addVars(P, H, name="z", lb=0)             
u = model.addVars(P, H, name="u", vtype=GRB.BINARY)

model.addConstrs((z.sum('*', h) == problem.demand[h]) for h in H,
  name='demand')  
model.addConstrs((z[p,h] <= p.capacity*u[p,h] for (p,h) in PH),
  name='max_capacity')
model.addConstrs((z[p,h] >= 
  p.get_min_generation_fraction()*p.capacity*u[p,h] for (p,h) in PH),
  name='min_capacity')  

objective = gp.quicksum(p.fuel_cost*z[p,h] for (p,h) in PH)
objective += gp.quicksum(p.health_cost[h]*z[p,h] for (p,h) in PH)
objective += gp.quicksum(p.operating_cost*u[p,h] for (p,h) in PH)
model.setObjective(objective, sense=GRB.MINIMIZE)

model.optimize()
power = { (p,h): z[p,h].X for (p,h) in z }
active = { (p,h): u[p,h].X>.5 for (p,h) in u }
return Solution(problem, power, active)

Solution

@dataclass
class Solution:
    problem: Problem
    power: dict[tuple[Plant, int], float]
    active: dict[tuple[Plant, int], float]

Pandas

def get_dataframe(self) -> pd.DataFrame:                
  (...)
  df_solution = pd.DataFrame({
    'power': self.power, 
    'active':self.active
  })
  (...)
  result = df_problem.join(df_solution)
  result['total_fuel_cost'] = result['fuel_cost'] * result['power']
  result['total_health_cost'] = 
    result['health_cost'] * result['power']
  result['total_operating_cost'] = 
    result['operating_cost'] * result['active']
  result['total_cost'] = result['total_fuel_cost'] 
    + result['total_health_cost'] + result['total_operating_cost']
  return result

Pandas output

Note how the amount of power changes quite rapidly in hours 1-3 and 13-15 for plants of type OTHER COAL:

hour operating_cost active power total_operating_cost capacity
1 7.3845 True 2060.91 7.3845 4396.83
2 7.3845 True 993.053 7.3845 4396.83
3 7.3845 False 0 0 4396.83
4 7.3845 False 0 0 4396.83
12 7.3845 False 0 0 4396.83
13 7.3845 False 0 0 4396.83
14 7.3845 True 991.383 7.3845 4396.83
15 7.3845 True 2180.32 7.3845 4396.83
16 7.3845 True 2638.4 7.3845 4396.83

Ramp up/down

Model formulation

  • Parameter \(r_p\): ramp up/ramp down speed for power plant \(p\)
  • Constraint \(-r_p c_p \leq z_{p,h} - z_{p,h-1} \leq r_p c_p \: \forall p \in P \: \forall h \in H\): respect maximum ramp-up and ramp-down
  • Bonus constraint \(z_{p,h} \geq m_p c_p \: \forall p \in P_N\): nuclear plants must be always on

Problem data model

@dataclass(frozen=True, repr=False, order=True)
class Plant:
  def get_ramp_factor(self):
    if self.type in ['BIOMASS','GAS','HYDRO','OIL']:
      return 1
    if self.is_nuclear():
      return .2
    return .25

Ramp up/down

Solver

with gp.Model("powergeneration") as model:
  (...)
  model.addConstrs((z[p,h]-z[p,h-1] >= -p.get_ramp_factor()*p.capacity) 
    for (p,h) in PH if h>1)  
  model.addConstrs((z[p,h]-z[p,h-1] <= p.get_ramp_factor()*p.capacity) 
    for (p,h) in PH if h>1)  
  model.addConstrs((z[p,h] >= 
    p.get_min_generation_fraction()*p.capacity) 
    for p in P if p.is_nuclear() for h in H)     

Ramp up/down

Note how the OTHER COAL ramp up/down is now correct (steps are never more than 25% of capacity), but we shutdown the GAS plants for only two hours:

hour GAS OTHER COAL
1 1761.31 2060.91
2 1761.31 993.053
3 1590.96 0
4 902.57 0
5 598.1 0
6 453.394 0
7 86.3351 0
8 0 0
9 0 0
10 292.806 0
hour GAS OTHER COAL
11 543.982 0
12 786.906 0
13 1472.95 0
14 1671.57 1081.12
15 1761.31 2180.32
16 1761.31 2638.4

Startup and teardown

Model formulation

  • Parameters \(s_p, t_p\): startup and teardown costs for power plant \(p\)
  • Variables \(v_{p,h}, w_{p,h}\): indicators of whether plant \(p\) is started up or shut down, respectively, for hour \(h\)
  • Constraint \(v_{p,h} \leq u_{p,h} \: \forall p \in P \: \forall h \in H\): plant is on when switched on
  • Constraint \(w_{p,h} \leq 1 - u_{p,h} \: \forall p \in P \: \forall h \in H\): plant is off when switched off
  • Constraint \(v_{p,h} - w_{p,h} = u_{p,h} - u_{p,h-1} \: \forall p \in P \: \forall h \in H\): link startup/teardown variables to on/off variables
  • Objective \(\sum_{p,h} (s_p v_{p,h} + t_p w_{p,h})\) to include startup/teardown cost

Startup and teardown

Problem data model

@dataclass(frozen=True, repr=False, order=True)
class Plant:
  (...)
  startup_cost: float
  teardown_cost: float

Input file reader

# Startup/teardown cost per fuel type
df_startup_costs = pd.read_csv(path.join(folder, 'startup_costs.csv'))
dict_startup_costs = df_startup_costs[df_startup_costs['year']==2011].T.to_dict()[9]

plants = set(
  Plant(
    (...)
    dict_startup_costs[fuel_type[p]],
    dict_startup_costs[fuel_type[p]],
  ) for p in P
)

Startup and teardown

Solver

with gp.Model("powergeneration") as model:
  (...)
  v = model.addVars(P, H, name="v", vtype=GRB.BINARY) # start up
  w = model.addVars(P, H, name="w", vtype=GRB.BINARY) # tear down
  (...)
  model.addConstrs((v[p,h] <= u[p,h]) for (p,h) in PH )  
  model.addConstrs((w[p,h] <= 1-u[p,h]) for (p,h) in PH )
  model.addConstrs((v[p,h] - w[p,h] == u[p,h] - u[p,h-1]) 
    for (p,h) in PH if h > 1) 
  (...)
  objective += gp.quicksum(p.startup_cost*v[p,h] for (p,h) in PH)
  objective += gp.quicksum(p.teardown_cost*w[p,h] for (p,h) in PH)
  (...)
  startup = { (p,h): v[p,h].X>.5 for (p,h) in v }
  teardown = { (p,h): w[p,h].X>.5 for (p,h) in w }
  return Solution(problem, power, active, startup, teardown)

Startup and teardown

Solution

@dataclass
class Solution:
  (...)
  startup: dict[tuple[Plant, int], bool]
  teardown: dict[tuple[Plant, int], bool]

  def get_dataframe(self) -> pd.DataFrame:                
    (...)
    df_solution = pd.DataFrame({
      ...
      'startup':self.startup,
      'teardown':self.teardown,
    })
    (...)
    result['total_startup_cost'] = result['startup_cost'] * result['startup']
    result['total_teardown_cost'] = result['teardown_cost'] * result['teardown']
    result['total_cost'] = result['total_fuel_cost'] 
      + result['total_health_cost'] + result['total_operating_cost'] 
      + result['total_startup_cost'] + result['total_teardown_cost']
    return result

Validation with Streamlit

Streamlit is nice to add a little user interface on top of your algorithm

import streamlit as st
import pandas as pd
import plotly.express as px

from final import read_dataset, Parameters, Solver
problem = read_dataset('data/small')
parameters = Parameters()
solver = Solver(parameters)
solution = solver.solve(problem)
df = solution.get_dataframe()

st.set_page_config(layout="wide")

col_demand, col_cost = st.columns(2)

with col_demand:
    st.header('Demand')
    df_demand = pd.DataFrame({'demand': pd.Series(problem.demand) })    
    st.line_chart(df_demand, width=500)

with col_cost:
    st.header('Cost summary')
    cost_columns = [col for col in df.columns if col.startswith('total_')]
    df_melted = df.reset_index().melt(
        id_vars=['plant', 'hour'],
        value_vars=cost_columns,
        var_name='component'
    )
    fig = px.bar(
        df_melted.groupby(['plant', 'component']).sum().reset_index(),
        x='plant', y='value', color='component', width=500, height=350
    )
    st.plotly_chart(fig)

tab_schedule, tab_details = st.tabs(['Schedule', 'Details'])
with tab_schedule:
    st.header('Schedule')
    schedule = pd.pivot_table(df.reset_index(), index='hour', columns='plant', values='power', aggfunc='sum')
    st.dataframe(schedule)
with tab_details:
    st.header('Details')
    st.write(df.reset_index().sort_values(['plant', 'hour']))from Pandas

Validation with Streamlit

9 Focus on correctness over performance

“Premature optimization is the root of all evil”

Correctness over performance

Typical questions:

  • Do we really need the \(v_{p,h}\) and \(w_{p,h}\) variables if we have \(u_{p,h}\)?
  • Shall we test some Gurobi parameters after the first model version?
  • Should we use column generation for the schedule per plant?
  • Can we provide a starting solution to speed up the process?

However:

  • Have you defined and implemented all requirements already?
  • Did you try Gurobi on the full, intuitive formulation first?
  • Did you discuss performance requirements with your key users?

Closing

  • Optimization work can be broken down into smaller pieces
  • Individual pieces can be implemented and tested one by one
  • Code structure helps you organize and isolate optimization logic
  • Proven tools for analysis, visualization and testing can be reused