Prescriptive Analytics in the Python Ecosystem with Gurobi

PyConDE & PyData Berlin 2024

Robert Luce

Principal Developer, Gurobi Optimization

Agenda

What is Gurobi?
Prescriptive Analytics use cases
What is a mathematical optimization problem?
Project-Team assignment with gurobipy-pandas
Data-driven optimization examples: Gurobi OptiMods
Adversarial ML

What is Gurobi?

Commercial solver engine for mathematical optimization problems
Founded in 2008 by Bob Bixby, Zonghao Gu, Ed Rothberg
Deploy on laptop, on-prem DC, cloud, or use our SaaS product
Free for academia and non-profit
Descriptive and Predictive Analytics: Based on what happened in the past, forecast the future!
Prescriptive Analytics: Based on the forecast, decide what to do!
Gurobi is a prescriptive analytics tool

Who uses Gurobi?

Used by 3000+ global customers across 40+ industries

60+ customer case studies on our website

Gurobi is the Engine

Gurobi is a generic math optimization solver
Core libraries implemented in C
No application- or domain-specific API, no GUI
Customers integrate it into their business applications (or have partners do so)

How to use Gurobi?

pip install gurobipy

APIs for Python, C, C++, Java, C#, R, Matlab

Additional Python packages:
- gurobipy-pandas: Integration with Pandas
- Gurobi OptiMods: Examples of data-driven optimization interfaces
- gurobi-machinelearning: Integrate trained regressors into optimization models

Links to standard mathematical modeling languages: AIMMS, AMPL, GAMS, MPL
Third-party frameworks (not officially supported by Gurobi): cvxpy, Pyomo, PuLP, JuMP (Julia), Google OR-Tools, etc.

Prescriptive Analytics Use Cases

National Football League (NFL)

Create game schedule for the whole season
From boards to computers
Many complicated constraints:
- broadcast time slots and rights
- free-agents
- away-game limitations
Decomposition and parallelization approach

Air France

Tail assignment problem:
- assigning sequence of flights to each individual aircraft
- while respecting operational constraints
- multiple objectives: fleet utilization, on-time performance, fuel consumption, operational costs, preferential assignments
Schedules are re-built every day (disruptions, etc.)
Short-haul and medium-haul fleets are more difficult to schedule than long-haul fleet

Verge

Optimizing seasonal farm field operations
Path Planner to find route covering the whole field
Minimize time, distance, fuel consumption
Avoid overlaps and obstacles
Headland management: boundaries, turnrow

What is a mathematical optimization problem?

Key components of optimization problems

Decision variables

Should I stock cat litter boxes in fulfilment center Ludwigsfelde from hub Schkeuditz? (shipping)
How much should I invest into asset NVDA? (portfolio optimization)
Should I assign Robert’s talk to room B09 in the Wed 10:30 time slot? (scheduling)

Key components of optimization problems

An objective function measuring the KPIs of interest

Minimize total transportation cost (shipping)
Minimize total risk for a portfolio of investments (portfolio optimization)
Maximize sum of speaker time slot preferences across the conference schedule (scheduling)

Key components of optimization problems

Constraints involving the decision variables

Predicted demand of litter boxes at fulfilment center must be met (shipping)
Transaction should not open more than 20 new positions (portfolio optimization)
Each room can only host at most one session at a time (scheduling)

Which problem types can be solved by Gurobi?

Linear optimization problem \[\begin{align*} \min_x \quad & c^T x \\ \mbox{s.t.}\quad & A x = b\\ & x \ge 0 \end{align*}\]
Mixed-integer linear optimization problem: \[\begin{align*} \min_x \quad & c^T x \\ \mbox{s.t.}\quad & A x = b\\ & x \ge 0\\ & x_i \in \mathbf{Z}, i \in I \end{align*}\]
Mixed-integer quadratically constrained optimization problems
Mixed-integer nonlinear, nonconvex optimization problems
Many other variations…
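
To make the difference between the first two problem classes concrete, here is a tiny illustrative instance in the equality form used above. The numbers are made up for this sketch and do not come from the talk; $s$ is a slack variable introduced to turn an inequality into an equation.

\[\begin{align*} \min_{x,s} \quad & -x_1 - x_2 \\ \mbox{s.t.}\quad & 2 x_1 + 2 x_2 + s = 3\\ & x_1, x_2, s \ge 0 \end{align*}\]

As a linear problem the optimal value is $-3/2$, attained for example at $x = (3/2, 0)$, $s = 0$. If we additionally require $x_1, x_2 \in \mathbf{Z}$, the constraint implies $x_1 + x_2 \le 1$, so the optimal value becomes $-1$, attained for example at $x = (1, 0)$, $s = 1$. Rounding the linear solution is not guaranteed to give the integer optimum, which is why mixed-integer problems need dedicated algorithms.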

Example: Project-Team assignment

We’re running a fictional consulting company that is organized into a fixed set of teams $J$, and we have a set of projects $I$ waiting to be worked on. Which team should be assigned to which project?

Data

Profit of completing project $i \in I$: $p_i$
Resource requirement for project $i \in I$: $w_i$
Capacity of team $j \in J$: $c_j$

Decision variables

$x_{ij} \in \{0,1\}$: assign project $i$ to team $j$?

Objective function

Maximize profit from completed projects

Constraints

Don’t oversubscribe the teams
At most one team works on each project

\[\begin{align*} \max_{x} \quad & \sum_{i,j} p_i x_{ij}\\ \mbox{s.t.}\quad & \sum_i w_i x_{ij} \le c_j \quad \mbox{for all $j$}\\ & \sum_j x_{ij} \le 1 \quad \mbox{for all $i$}\\ & x_{ij} \in \{0,1\} \end{align*}\]
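
Before handing the model above to a solver, it can help to see it evaluated on a toy instance. The following is a minimal sketch that enumerates every assignment by brute force in plain Python. The data (three projects, two teams) and the helper name `best_assignment` are hypothetical and chosen only for illustration. Enumeration is only feasible for tiny instances; a solver such as Gurobi is what makes realistic sizes tractable.

```python
from itertools import product

# Hypothetical toy data (not from the talk)
profit = {"p0": 3.0, "p1": 2.0, "p2": 2.5}      # p_i
resource = {"p0": 2.0, "p1": 1.0, "p2": 1.5}    # w_i
capacity = {"t0": 2.0, "t1": 1.5}               # c_j

projects = list(profit)
teams = list(capacity)


def best_assignment():
    """Enumerate all assignments and return (best profit, {project: team})."""
    best_value, best_plan = 0.0, {}
    # Each project is either left unassigned (None) or given to exactly one
    # team, so the constraint sum_j x_ij <= 1 holds by construction.
    for choice in product([None, *teams], repeat=len(projects)):
        # Capacity constraint: sum_i w_i x_ij <= c_j for every team j.
        load = {t: 0.0 for t in teams}
        for p, t in zip(projects, choice):
            if t is not None:
                load[t] += resource[p]
        if any(load[t] > capacity[t] + 1e-9 for t in teams):
            continue
        # Objective: total profit of the assigned projects.
        value = sum(profit[p] for p, t in zip(projects, choice) if t is not None)
        if value > best_value:
            best_value = value
            best_plan = {p: t for p, t in zip(projects, choice) if t is not None}
    return best_value, best_plan


value, plan = best_assignment()
print(value, plan)  # 5.5 {'p0': 't0', 'p2': 't1'}
```

In this instance project p1 stays unassigned: all three projects together need 4.5 units of resource, but the teams offer only 3.5.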

Project-Team assignment with gurobipy-pandas

Input data

>>> projects.head(3)  # w_i
         resource
project
p0            1.1
p1            1.4
p2            1.2

>>> teams.head(3)  # c_j
      capacity
team
t0         2.4
t1         1.8
t2         1.1

>>> project_values.head(5)
              profit
project team
p0      t4       0.4
p1      t4       1.3
p2      t0       1.7
        t1       1.7
        t2       1.7

Attach decision variables to data frame

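import gurobipy as gp
from gurobipy import GRB
import gurobipy_pandas as gppd

model = gp.Model()
model.ModelSense = GRB.MAXIMIZE
# One binary variable per (project, team) row; the "profit" column
# supplies the objective coefficients.
assignments = project_values.gppd.add_vars(
    model, vtype=GRB.BINARY, obj="profit", name="x"
)
assignments.head()  # p_ij & x_ij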

              profit                      x
project team                               
p0      t4       0.4  <gurobi.Var x[p0,t4]>
p1      t4       1.3  <gurobi.Var x[p1,t4]>
p2      t0       1.7  <gurobi.Var x[p2,t0]>
        t1       1.7  <gurobi.Var x[p2,t1]>
        t2       1.7  <gurobi.Var x[p2,t2]>

Resource constraint

capacity_constraints = gppd.add_constrs(
    model,
    (projects["resource"] * assignments["x"]).groupby("team").sum(),
    GRB.LESS_EQUAL,
    teams["capacity"],
    name='capacity',
)
capacity_constraints.apply(model.getRow).head()

team
t0    1.2 x[p2,t0] + 0.9 x[p4,t0] + 1.3 x[p5,t0] + x...
t1    1.2 x[p2,t1] + 0.9 x[p4,t1] + 1.3 x[p5,t1] + x...
t2    1.2 x[p2,t2] + 0.9 x[p4,t2] + 1.3 x[p5,t2] + x...
t3    1.2 x[p2,t3] + 0.9 x[p4,t3] + 1.3 x[p5,t3] + x...
t4    1.1 x[p0,t4] + 1.4 x[p1,t4] + 1.2 x[p2,t4] + 1...
Name: capacity, dtype: object

Assign each project at most once

allocate_once = gppd.add_constrs(
    model,
    assignments['x'].groupby('project').sum(),
    GRB.LESS_EQUAL,
    1.0,
    name="allocate_once",
)
allocate_once.apply(model.getRow).head()

project
p0                                              x[p0,t4]
p1                                              x[p1,t4]
p10    x[p10,t0] + x[p10,t1] + x[p10,t2] + x[p10,t3] ...
p11    x[p11,t0] + x[p11,t1] + x[p11,t2] + x[p11,t3] ...
p12                                x[p12,t3] + x[p12,t4]
Name: allocate_once, dtype: object

Solve and query solution

model.optimize()
(
    assignments["x"].gppd.X.to_frame()
    .query("x >= 0.9").reset_index()
    .groupby("team").agg({"project": list})
)

              project
team                 
t0           [p4, p5]
t1               [p2]
t2              [p11]
t3          [p6, p29]
t4    [p14, p15, p26]

https://github.com/Gurobi/gurobipy-pandas

Data-driven optimization examples: Gurobi OptiMods

An OptiMod …

Is a tool to solve a specific, practical problem
Has a data-driven API for a common optimization problem
Takes data in “natural” form, returns a solution in “natural” form
Solves a mathematical optimization problem using Gurobi technology

Least absolute deviation (LAD) regression

from sklearn import datasets
from sklearn.model_selection import train_test_split

from gurobi_optimods.regression import LADRegression

# Load the diabetes dataset
diabetes = datasets.load_diabetes()

# Split data for fit assessment
X_train, X_test, y_train, y_test = train_test_split(
    diabetes["data"], diabetes["target"], random_state=42
)

# Fit model and obtain predictions
lad = LADRegression()
lad.fit(X_train, y_train)
y_pred = lad.predict(X_test)

(LAD is more robust to outliers than ordinary least-squares regression.)
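
The robustness claim can be illustrated with a minimal sketch in the simplest possible setting: an intercept-only model, where the LAD fit reduces to the median and the least-squares fit reduces to the mean. The data and the helper names `sum_abs_dev` and `sum_sq_dev` are made up for this illustration. The sketch does not use the OptiMod itself.

```python
from statistics import mean, median


def sum_abs_dev(values, c):
    """LAD loss of the constant prediction c."""
    return sum(abs(v - c) for v in values)


def sum_sq_dev(values, c):
    """Least-squares loss of the constant prediction c."""
    return sum((v - c) ** 2 for v in values)


# Hypothetical data: identical except for one outlier in the last entry.
clean = [1.0, 2.0, 3.0, 4.0, 5.0]
dirty = [1.0, 2.0, 3.0, 4.0, 50.0]

# For a constant model, the median minimizes the LAD loss and the mean
# minimizes the least-squares loss.
lad_fit_clean, ols_fit_clean = median(clean), mean(clean)
lad_fit_dirty, ols_fit_dirty = median(dirty), mean(dirty)

print(lad_fit_clean, ols_fit_clean)  # 3.0 3.0
print(lad_fit_dirty, ols_fit_dirty)  # 3.0 12.0
```

A single outlier moves the least-squares fit from 3.0 to 12.0, while the LAD fit stays at 3.0.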

Use trained regressors as constraints: Gurobi ML

Gurobi ML

A package to use trained regression models in mathematical optimization models.

Input variable vector $x$
Output variable vector $y$
Trained regressor $f$
Use like a constraint $y = f(x)$
Optimize model over input and output variables ensuring that input is mapped to output.

Supported APIs: sklearn, Keras, PyTorch, XGBoost, LightGBM
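
To see why a relation $y = f(x)$ can live inside a mixed-integer model at all, consider a single ReLU neuron $y = \max(0, z)$ with pre-activation $z = w^T x + b$. The following is a textbook big-M formulation, given here as a sketch. It assumes known bounds $l \le z \le u$ with $l < 0 < u$ and introduces a binary variable $\delta$. Whether Gurobi ML generates exactly these constraints internally is an assumption and not stated in the talk.

\[\begin{align*} & z = w^T x + b\\ & y \ge z, \quad y \ge 0\\ & y \le z - l\,(1 - \delta)\\ & y \le u\,\delta\\ & \delta \in \{0,1\} \end{align*}\]

For $\delta = 1$ the constraints force $y = z \ge 0$, and for $\delta = 0$ they force $y = 0 \ge z$, so every feasible point satisfies $y = \max(0, z)$. A linear regressor needs only the first line and no binary variable.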

Example: Adversarial ML

Given a trained network, how robust is the classification w.r.t. noise?

Classified as “4”

Classified as “9”
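
One way to pose the robustness question as an optimization problem is sketched below. Let $\bar{x}$ be an image that the trained network $f$ classifies as “4”, let $y$ be the vector of class scores, and let $\delta$ be the amount of noise we allow. The choice of the $\ell_1$ norm, the pixel range $[0,1]$, and the symbols $\bar{x}$ and $\delta$ are assumptions made for this sketch.

\[\begin{align*} \max_{x,y} \quad & y_9 - y_4 \\ \mbox{s.t.}\quad & y = f(x)\\ & \| x - \bar{x} \|_1 \le \delta\\ & 0 \le x \le 1 \end{align*}\]

If the optimal value is positive, the optimal $x$ is an image within distance $\delta$ of $\bar{x}$ for which the score of “9” exceeds the score of “4”. If it is negative, no perturbation of that size can make “9” score higher than “4”.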

Summary & Takeaways

Gurobi is a mathematical optimization solver
Mathematical optimization is a powerful, industry-agnostic tool
pip install gurobipy-pandas

Thanks!
