Prescriptive Analytics in the Python Ecosystem with Gurobi

PyConDE & PyData Berlin 2024

Robert Luce

Principal Developer, Gurobi Optimization

Agenda

  • What is Gurobi?
  • Prescriptive Analytics use cases
  • What is a mathematical optimization problem?
  • Project-Team assignment with gurobipy-pandas
  • Data-driven optimization examples: Gurobi OptiMods
  • Adversarial ML

What is Gurobi?

  • Commercial solver engine for mathematical optimization problems
  • Founded in 2008 by Bob Bixby, Zonghao Gu, Ed Rothberg
  • Deploy on laptop, on-prem DC, cloud, or use our SaaS product
  • Free for academia and non-profit
  • Descriptive and Predictive Analytics: Based on what happened in the past, forecast the future!
  • Prescriptive Analytics: Based on the forecast, decide what to do!
  • Gurobi is a prescriptive analytics tool

Who uses Gurobi?

  • Used by 3000+ global customers across 40+ industries
  • 60+ customer case studies on our website

Gurobi is the Engine

  • Gurobi is a generic math optimization solver
  • Core libraries implemented in C
  • No application- or domain-specific API, no GUI
  • Customers integrate it into their business applications (or have partners do so)

How to use Gurobi?

  • pip install gurobipy
  • APIs for Python, C, C++, Java, C#, R, MATLAB
  • Links to standard mathematical modeling languages: AIMMS, AMPL, GAMS, MPL
  • Third-party frameworks (not officially supported by Gurobi): cvxpy, Pyomo, PuLP, JuMP (Julia), Google OR-Tools, etc.

Prescriptive Analytics Use Cases

National Football League (NFL)

  • Create game schedule for the whole season
  • From boards to computers
  • Many complicated constraints:
    • broadcast time slots and rights
    • free agents
    • away-game limitations
  • Decomposition and parallelization approach

Air France

  • Tail assignment problem:
    • assigning sequence of flights to each individual aircraft
    • while respecting operational constraints
    • multiple objectives: fleet utilization, on-time performance, fuel consumption, operational costs, preferential assignments
  • Schedules are re-built every day (disruptions, etc.)
  • Short-haul and medium-haul fleets are more difficult to schedule than long-haul fleet

Verge

  • Optimizing seasonal farm field operations
  • Path Planner to find route covering the whole field
  • Minimize time, distance, fuel consumption
  • Avoid overlaps and obstacles
  • Headland management: boundaries, turnrow

What is a mathematical optimization problem?

Key components of optimization problems


Decision variables

  • Should I stock cat litter boxes in fulfilment center Ludwigsfelde from hub Schkeuditz? (shipping)
  • How much should I invest into asset NVDA? (portfolio optimization)
  • Should I assign Robert’s talk to room B09 in the Wed 10:30 time slot? (scheduling)

An objective function measuring the KPIs of interest

  • Minimize total transportation cost (shipping)
  • Minimize total risk for a portfolio of investments (portfolio optimization)
  • Maximize sum of speaker time slot preferences across the conference schedule (scheduling)

Constraints involving the decision variables

  • Predicted demand for litter boxes at the fulfilment center must be met (shipping)
  • Transaction should not open more than 20 new positions (portfolio optimization)
  • Each room can only host at most one session at a time (scheduling)

Which problem types can be solved by Gurobi?

  • Linear optimization problem \[\begin{align*} \min_x \quad & c^T x \\ \mbox{s.t.}\quad & A x = b\\ & x \ge 0 \end{align*}\]

  • Mixed-integer linear optimization problem: \[\begin{align*} \min_x \quad & c^T x \\ \mbox{s.t.}\quad & A x = b\\ & x \ge 0\\ & x_i \in \mathbf{Z}, i \in I \end{align*}\]

  • Mixed-integer quadratically constrained optimization problems

  • Mixed-integer nonlinear, nonconvex optimization problems

  • Many other variations…
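
The equality form \(Ax = b\) used above is not restrictive: an inequality constraint can be brought into it by adding a nonnegative slack variable. As a small illustration (the numbers are made up for this sketch), the problem

\[\begin{align*} \min_{x} \quad & -x_1 - 2 x_2 \\ \mbox{s.t.}\quad & x_1 + x_2 \le 4\\ & x_1, x_2 \ge 0 \end{align*}\]

is equivalent to the following problem with slack variable \(s\):

\[\begin{align*} \min_{x, s} \quad & -x_1 - 2 x_2 \\ \mbox{s.t.}\quad & x_1 + x_2 + s = 4\\ & x_1, x_2, s \ge 0 \end{align*}\]

Both have the optimal solution \(x_1 = 0\), \(x_2 = 4\) (with \(s = 0\)) and objective value \(-8\).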

Example: Project-Team assignment

We’re running a fantasy consulting company that is organized into a fixed set of teams \(J\), and we have a set of projects \(I\) waiting to be worked on. Which team should be assigned to which project?

Data

  • Profit of completing project \(i \in I\): \(p_i\)
  • Resource requirement for project \(i \in I\): \(w_i\)
  • Capacity of team \(j \in J\): \(c_j\)

Decision variables

\(x_{ij} \in \{0,1\}\): assign project \(i\) to team \(j\)?

Objective function

Maximize profit from completed projects

Constraints

  • Don’t oversubscribe the teams
  • At most one team works on each project

\[\begin{align*} \max_{x} \quad & \sum_{i,j} p_i x_{ij}\\ \mbox{s.t.}\quad & \sum_i w_i x_{ij} \le c_j \quad \mbox{for all $j$}\\ & \sum_j x_{ij} \le 1 \quad \mbox{for all $i$}\\ & x_{ij} \in \{0,1\} \quad \mbox{for all $i, j$} \end{align*}\]
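
To make the objective and the two constraint families concrete, here is a minimal sketch that solves a tiny instance of this model by plain enumeration, without any solver. The data (three projects, two teams) is hypothetical and not taken from the talk, and enumeration only works for toy sizes; it is shown purely to illustrate what the model asks for.

```python
from itertools import product

# Hypothetical toy data (not from the talk): 3 projects, 2 teams.
profit = {"p0": 4.0, "p1": 3.0, "p2": 5.0}    # p_i
resource = {"p0": 2.0, "p1": 1.0, "p2": 3.0}  # w_i
capacity = {"t0": 3.0, "t1": 2.0}             # c_j

projects = list(profit)
teams = list(capacity)


def solve_by_enumeration():
    """Try every assignment and return (best_profit, best_assignment)."""
    best_profit, best_assignment = 0.0, {}
    # Each project goes to exactly one team or stays unassigned (None), so
    # "at most one team works on each project" holds by construction.
    for choice in product([None] + teams, repeat=len(projects)):
        load = {t: 0.0 for t in teams}
        total = 0.0
        for proj, team in zip(projects, choice):
            if team is not None:
                load[team] += resource[proj]
                total += profit[proj]
        # Capacity constraint: sum_i w_i x_ij <= c_j for every team j.
        feasible = all(load[t] <= capacity[t] for t in teams)
        if feasible and total > best_profit:
            best_profit = total
            best_assignment = {
                p: t for p, t in zip(projects, choice) if t is not None
            }
    return best_profit, best_assignment


best_profit, best_assignment = solve_by_enumeration()
print(best_profit, best_assignment)
```

In this instance the capacities do not allow all three projects to be completed, so one project stays unassigned. The number of candidate assignments grows as \((|J|+1)^{|I|}\), which is why realistic instances are handed to a solver instead.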

Project-Team assignment with gurobipy-pandas

Input data

>>> projects.head(3)  # w_i
         resource
project          
p0            1.1
p1            1.4
p2            1.2
>>> teams.head(3)  # c_j
      capacity
team          
t0         2.4
t1         1.8
t2         1.1


>>> project_values.head(5)
              profit
project team        
p0      t4       0.4
p1      t4       1.3
p2      t0       1.7
        t1       1.7
        t2       1.7

Attach decision variables to data frame

import gurobipy as gp
import gurobipy_pandas as gppd
from gurobipy import GRB

model = gp.Model()
model.ModelSense = GRB.MAXIMIZE
# One binary variable per (project, team) row; "profit" is its objective coefficient
assignments = project_values.gppd.add_vars(
    model, vtype=GRB.BINARY, obj="profit", name="x"
)
assignments.head()  # p_ij & x_ij


              profit                      x
project team                               
p0      t4       0.4  <gurobi.Var x[p0,t4]>
p1      t4       1.3  <gurobi.Var x[p1,t4]>
p2      t0       1.7  <gurobi.Var x[p2,t0]>
        t1       1.7  <gurobi.Var x[p2,t1]>
        t2       1.7  <gurobi.Var x[p2,t2]>

Resource constraint

capacity_constraints = gppd.add_constrs(
    model,
    # pandas aligns w_i with x_ij on the "project" index level before summing per team
    (projects["resource"] * assignments["x"]).groupby("team").sum(),
    GRB.LESS_EQUAL,
    teams["capacity"],
    name="capacity",
)
capacity_constraints.apply(model.getRow).head()


team
t0    1.2 x[p2,t0] + 0.9 x[p4,t0] + 1.3 x[p5,t0] + x...
t1    1.2 x[p2,t1] + 0.9 x[p4,t1] + 1.3 x[p5,t1] + x...
t2    1.2 x[p2,t2] + 0.9 x[p4,t2] + 1.3 x[p5,t2] + x...
t3    1.2 x[p2,t3] + 0.9 x[p4,t3] + 1.3 x[p5,t3] + x...
t4    1.1 x[p0,t4] + 1.4 x[p1,t4] + 1.2 x[p2,t4] + 1...
Name: capacity, dtype: object

Assign each project at most once

allocate_once = gppd.add_constrs(
    model,
    assignments['x'].groupby('project').sum(),
    GRB.LESS_EQUAL,
    1.0,
    name="allocate_once",
)
allocate_once.apply(model.getRow).head()


project
p0                                              x[p0,t4]
p1                                              x[p1,t4]
p10    x[p10,t0] + x[p10,t1] + x[p10,t2] + x[p10,t3] ...
p11    x[p11,t0] + x[p11,t1] + x[p11,t2] + x[p11,t3] ...
p12                                x[p12,t3] + x[p12,t4]
Name: allocate_once, dtype: object

Solve and query solution

model.optimize()
(
    # Solution values can deviate slightly from 0/1, so select "x is 1" with a threshold
    assignments["x"].gppd.X.to_frame()
    .query("x >= 0.9").reset_index()
    .groupby("team").agg({"project": list})
)
              project
team                 
t0           [p4, p5]
t1               [p2]
t2              [p11]
t3          [p6, p29]
t4    [p14, p15, p26]



https://github.com/Gurobi/gurobipy-pandas

Data-driven optimization examples: Gurobi OptiMods

An OptiMod …

  • Is a tool to solve a specific, practical problem
  • Has a data-driven API for a common optimization problem
  • Takes data in “natural” form, returns a solution in “natural” form
  • Solves a mathematical optimization problem using Gurobi technology

Least absolute deviation (LAD) regression

from sklearn import datasets
from sklearn.model_selection import train_test_split

from gurobi_optimods.regression import LADRegression

# Load the diabetes dataset
diabetes = datasets.load_diabetes()

# Split data for fit assessment
X_train, X_test, y_train, y_test = train_test_split(
    diabetes["data"], diabetes["target"], random_state=42
)

# Fit model and obtain predictions
lad = LADRegression()
lad.fit(X_train, y_train)
y_pred = lad.predict(X_test)

(LAD is more robust to outliers than ordinary least-squares regression.)
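
The robustness remark can be seen in the simplest special case, a model with only an intercept: there the least-squares fit is the mean of the observations and the LAD fit is their median. The sketch below uses made-up numbers and only the standard library; it does not call `LADRegression` and is meant as an illustration of the principle, not of the OptiMod's implementation.

```python
from statistics import mean, median

# Hypothetical observations; the last value is a deliberate outlier.
y = [1.0, 2.0, 3.0, 4.0, 100.0]


def sum_abs_dev(c):
    """LAD loss of the constant prediction c."""
    return sum(abs(v - c) for v in y)


def sum_sq_dev(c):
    """Least-squares loss of the constant prediction c."""
    return sum((v - c) ** 2 for v in y)


ls_fit = mean(y)     # minimizer of the least-squares loss
lad_fit = median(y)  # a minimizer of the LAD loss

print("least squares:", ls_fit, "LAD:", lad_fit)
```

The single outlier drags the least-squares fit far away from the bulk of the data, while the LAD fit stays with the typical values.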

Use trained regressors as constraints: Gurobi ML

Gurobi ML

A package to use trained regression models in mathematical optimization models.

  • Input variable vector \(x\)
  • Output variable vector \(y\)
  • Trained regressor \(f\)
  • Use like a constraint \(y = f(x)\)
  • Optimize the model over the input and output variables, ensuring that each input is mapped to its output.


Supported APIs: sklearn, Keras, PyTorch, XGBoost, LightGBM
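
How can a trained regressor become a constraint at all? For a neural network with ReLU activations, one textbook approach encodes each neuron \(y = \max(0, a)\) with a binary variable \(z\) and known bounds \(L \le a \le U\) ("big-M" constraints). The sketch below checks this encoding on integer values; it is an illustration under these assumptions and not a description of what Gurobi ML does internally. The function names are hypothetical.

```python
LOWER, UPPER = -3, 3  # assumed bounds on the pre-activation value a


def relu_bigm_feasible(a, y, z, lower=LOWER, upper=UPPER):
    """Check the big-M constraints linking a, y and the binary z."""
    return (
        y >= a                        # y is at least the pre-activation
        and y >= 0                    # y is nonnegative
        and y <= a - lower * (1 - z)  # z = 1 forces y <= a
        and y <= upper * z            # z = 0 forces y <= 0
    )


def feasible_outputs(a):
    """All integer outputs y that satisfy the constraints for some z."""
    return {
        y
        for y in range(LOWER - 1, UPPER + 2)
        for z in (0, 1)
        if relu_bigm_feasible(a, y, z)
    }


for a in range(LOWER, UPPER + 1):
    print(a, feasible_outputs(a))
```

For every pre-activation value in the bounds, the only output that satisfies the linear constraints is \(\max(0, a)\), so the nonlinear function is represented exactly by linear constraints plus one binary variable.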

Example: Adversarial ML

Given a trained network, how robust is the classification w.r.t. noise?


Classified as “4”

Classified as “9”
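
One common way to pose this question as an optimization problem is sketched below. The notation is an assumption made for this illustration: \(\bar{x}\) is the given image, \(f\) the trained network, \(y\) its vector of output scores, \(k\) the correct label, \(l\) a different label, and \(\delta\) the size of the neighborhood; the choice of the 1-norm is likewise an assumption.

\[\begin{align*} \max_{x, y} \quad & y_l - y_k \\ \mbox{s.t.}\quad & y = f(x)\\ & \|x - \bar{x}\|_1 \le \delta \end{align*}\]

If the optimal value is positive, there is an image within distance \(\delta\) of \(\bar{x}\) for which the network scores label \(l\) higher than the correct label \(k\). If it is not positive, no such image exists in that neighborhood for this pair of labels.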

Summary & Takeaways

  • Gurobi is a mathematical optimization solver
  • Mathematical optimization is a powerful, industry-agnostic tool
  • pip install gurobipy-pandas

Thanks!