Gurobi in the Python Ecosystem

Robert Luce

Principal Developer

The state of Python

After 30+ years, why is Python so successful?

  • Productive language
    • Clean syntax
    • Dynamic (but can be peppered with type hints…)
  • Expansive ecosystem and community
    • PyPI in 2023: 1.2+ Trillion requests, 607+ Petabytes served
    • PyCon conference tour
  • Embraced by schools and academia
    • The next generation of Python developers is getting industry-ready as we speak
  • Domain agnostic
    • “Goodbye!” to domain-specific languages

import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

From GitHub Octoverse 2024…


Source: https://github.blog/news-insights/octoverse/octoverse-2024/

PYPL – PopularitY of Programming Language


Source https://pypl.github.io/PYPL.html

TIOBE


Source https://www.tiobe.com/tiobe-index/

Gurobi has a heart for Python

  • gurobipy: Your pythonic and fast Gurobi API

  • gurobipy-pandas: Build optimization models directly out of data frames

  • gurobi-machinelearning: Incorporate trained regressors in optimization models

  • gurobi-modelanalyzer: Analyze and understand numerical aspects of your optimization model

  • gurobi-optimods: Data-driven APIs for common optimization tasks

Many projects hosted at https://github.com/Gurobi

gurobipy: Your pythonic and fast Gurobi API

How to get started?

Type: pip install gurobipy

  • Fetches and installs gurobipy including optimizer libraries
  • All Gurobi features supported, no restrictions
  • Add your license information as usual
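
A quick smoke test after installation (a minimal sketch; the trivial model is arbitrary):

import gurobipy as gp

# Build and solve a tiny model to confirm that the installation and
# license are working; the optimal objective value should be 1.0.
with gp.Env() as env, gp.Model(env=env) as m:
    x = m.addVar(ub=1.0)
    m.setObjective(x, gp.GRB.MAXIMIZE)
    m.optimize()
    print(m.ObjVal)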

Term-based API

\[ \begin{align*} \max \quad & x + y + 2z\\ \mbox{s.t.} \quad & x + 2y + 3z \le 4\\ & x + y \ge 1\\ & x, y, z \; \text{binary} \end{align*} \]

import gurobipy as gp
from gurobipy import GRB

m = gp.Model("mip1")

# Three binary decision variables
x = m.addVar(vtype=GRB.BINARY)
y = m.addVar(vtype=GRB.BINARY)
z = m.addVar(vtype=GRB.BINARY)

# Objective and constraints, written term by term
m.setObjective(x + y + 2 * z, GRB.MAXIMIZE)
m.addConstr(x + 2 * y + 3 * z <= 4)
m.addConstr(x + y >= 1)
  • Write model entities term-by-term
  • Use dicts for sparse multidimensional index sets
  • Extends nicely to iteration via dictionary comprehension
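
As a rough illustration of the last two bullets (the arc data here is made up for the sketch): addVars over a sparse index set returns a dict-like container, and objectives and constraints extend naturally via comprehensions.

import gurobipy as gp
from gurobipy import GRB

# Hypothetical sparse index set with per-arc costs
arcs, cost = gp.multidict({
    ("A", "B"): 4,
    ("A", "C"): 2,
    ("B", "C"): 1,
})

m = gp.Model("dict-style")
flow = m.addVars(arcs, name="flow")  # dict-like: one variable per arc

# Build the objective by iterating over the sparse index set
m.setObjective(gp.quicksum(cost[a] * flow[a] for a in arcs), GRB.MINIMIZE)

# One constraint per source node, generated from a comprehension
m.addConstrs((flow.sum(v, "*") <= 1 for v in ["A", "B"]), name="out_cap")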

Matrix-Friendly API

Build model entities from ndarray-like data, using matrix variables (MVar):

x = model.addMVar(shape)
  • A class representing an ndarray of optimization variables
  • Understands Python’s matmul operator @
  • Very efficient if data comes in naturally
    • as an ndarray or
    • a SciPy sparse matrix
  • Indexing like in NumPy
  • Slicing like in NumPy
  • Broadcasting like in NumPy
  • … like in NumPy!
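
For instance, a minimal sketch (the shape and bounds are made up) of NumPy-style slicing and broadcasting on an MVar:

import gurobipy as gp

m = gp.Model()
x = m.addMVar((4, 3), ub=1.0)  # a 4x3 array of variables

# Slicing like in NumPy: constrain the first column elementwise
m.addConstr(x[:, 0] <= 0.5)

# Broadcasting like in NumPy: each row sums to at most 2
m.addConstr(x.sum(axis=1) <= 2.0)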

Matrix-friendly example

\[ \begin{align*} \min \; & x^T Q x \\ \mbox{s.t.} \; & Ax \geq b \\ & 0 \leq x \leq 1 \end{align*} \]

import gurobipy as gp
import numpy as np
import scipy.sparse as sp

m = gp.Model()

# Objective matrix Q and constraint matrix A; the right-hand side b is a vector of ones
Q = sp.diags([1, 2, 3])
A = np.array([[1, 2, 3], [1, 1, 0]])

x = m.addMVar(3, ub=1.0)   # 0 <= x <= 1 (lower bound 0 by default)
m.setObjective(x @ Q @ x)  # minimize x^T Q x (minimization is the default sense)
m.addConstr(A @ x >= 1)    # Ax >= b, with the scalar 1 broadcast to a vector

Modeling performance

  • A lot of effort is spent on making gurobipy fast
  • Automatic performance monitoring of common modeling patterns

gurobipy-pandas

Project-Team assignment

We’re running a fantasy consulting company organized into a set of fixed teams \(J\), and we have a set of projects \(I\) waiting to be worked on. Which team should be assigned to which project?

Data

  • Profit of completing project \(i \in I\): \(p_i\)
  • Resource requirement for project \(i \in I\): \(w_i\)
  • Capacity of team \(j \in J\): \(c_j\)

Decision variables

\(x_{ij} \in \{0,1\}\): assign project \(i\) to team \(j\)?

Objective function

Maximize profit from completed projects

Constraints

  • Don’t oversubscribe the teams
  • At most one team works on each project

\[\begin{align*} \max_{x} \quad & \sum_{i,j} p_i x_{ij}\\ & \sum_i w_i x_{ij} \le c_j \quad \mbox{for all $j$}\\ & \sum_j x_{ij} \le 1 \quad \mbox{for all $i$}\\ & x_{ij} \in \{0,1\} \end{align*}\]

Input data

>>> projects.head(3)  # w_i
         resource
project
p0            1.1
p1            1.4
p2            1.2
>>> teams.head(3)  # c_j
      capacity
team
t0         2.4
t1         1.8
t2         1.1


>>> project_values.head(5)
              profit
project team
p0      t4       0.4
p1      t4       1.3
p2      t0       1.7
        t1       1.7
        t2       1.7

Attach decision variables to data frame

import gurobipy as gp
from gurobipy import GRB
import gurobipy_pandas as gppd

model = gp.Model()
model.ModelSense = GRB.MAXIMIZE

# One binary variable per (project, team) pair; objective coefficients
# come from the "profit" column
assignments = project_values.gppd.add_vars(
    model, vtype=GRB.BINARY, obj="profit", name="x"
)
assignments.head()  # p_ij & x_ij


              profit                      x
project team                               
p0      t4       0.4  <gurobi.Var x[p0,t4]>
p1      t4       1.3  <gurobi.Var x[p1,t4]>
p2      t0       1.7  <gurobi.Var x[p2,t0]>
        t1       1.7  <gurobi.Var x[p2,t1]>
        t2       1.7  <gurobi.Var x[p2,t2]>

Resource constraint

capacity_constraints = gppd.add_constrs(
    model,
    (projects["resource"] * assignments["x"]).groupby("team").sum(),
    GRB.LESS_EQUAL,
    teams["capacity"],
    name="capacity",
)
capacity_constraints.apply(model.getRow).head()


team
t0    1.2 x[p2,t0] + 0.9 x[p4,t0] + 1.3 x[p5,t0] + x...
t1    1.2 x[p2,t1] + 0.9 x[p4,t1] + 1.3 x[p5,t1] + x...
t2    1.2 x[p2,t2] + 0.9 x[p4,t2] + 1.3 x[p5,t2] + x...
t3    1.2 x[p2,t3] + 0.9 x[p4,t3] + 1.3 x[p5,t3] + x...
t4    1.1 x[p0,t4] + 1.4 x[p1,t4] + 1.2 x[p2,t4] + 1...
Name: capacity, dtype: object

Assign each project at most once

allocate_once = gppd.add_constrs(
    model,
    assignments["x"].groupby("project").sum(),
    GRB.LESS_EQUAL,
    1.0,
    name="allocate_once",
)
allocate_once.apply(model.getRow).head()


project
p0                                              x[p0,t4]
p1                                              x[p1,t4]
p10    x[p10,t0] + x[p10,t1] + x[p10,t2] + x[p10,t3] ...
p11    x[p11,t0] + x[p11,t1] + x[p11,t2] + x[p11,t3] ...
p12                                x[p12,t3] + x[p12,t4]
Name: allocate_once, dtype: object

Solve and query solution

model.optimize()
(
    assignments["x"].gppd.X.to_frame()
    .query("x >= 0.5").reset_index()
    .groupby("team").agg({"project": list})
)
              project
team                 
t0           [p4, p5]
t1               [p2]
t2              [p11]
t3          [p6, p29]
t4    [p14, p15, p26]

Deployment Architectures

Typical Databricks workflow

Data-driven optimization models can (and should!) be integrated directly in your ETL pipelines

Solver deployment

  • Run locally (size-limited license)
env = gp.Env()
  • Run locally on the Databricks cluster
env = gp.Env(params={
    "WLSAccessID": "xxx",
    ...
})
  • Run remotely using Gurobi Instant Cloud
env = gp.Env(params={
    "CloudAccessID": "xxx",
    ...
})
  • Run remotely on your own servers
env = gp.Env(params={
    "CSManager": "xxx",
    ...
})

Solving with Instant Cloud

Use trained regressors as constraints:
gurobi-machinelearning

Gurobi ML

A package to use trained regression models in mathematical optimization models.

  • Input variable vector \(x\)
  • Output variable vector \(y\)
  • Trained regressor \(f\)
  • Use like a constraint \(y = f(x)\)
  • Optimize the model over the input and output variables, ensuring that the input is mapped to the output by \(f\).


Supported APIs: sklearn, Keras, PyTorch, XGBoost, LightGBM

https://github.com/Gurobi/gurobi-machinelearning
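
A minimal sketch of the pattern (the synthetic training data and variable shapes are assumptions made for this example; add_predictor_constr is the package's documented entry point):

import numpy as np
from sklearn.linear_model import LinearRegression
import gurobipy as gp
from gurobipy import GRB
from gurobi_ml import add_predictor_constr

# Train a small regressor f on synthetic data (illustrative only)
rng = np.random.default_rng(0)
X = rng.random((50, 3))
z = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(50)
reg = LinearRegression().fit(X, z)

m = gp.Model()
x = m.addMVar(3, lb=0.0, ub=1.0, name="x")    # input variable vector
y = m.addMVar(1, lb=-GRB.INFINITY, name="y")  # output variable vector

# Use the trained regressor like a constraint: y = f(x)
add_predictor_constr(m, reg, x, y)

m.setObjective(y.sum(), GRB.MAXIMIZE)  # e.g. maximize the predicted output
m.optimize()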

Example: Adversarial ML

Given a trained network, how robust is the classification w.r.t. noise?


Classified as “4”

Classified as “9”

gurobi-modelanalyzer

A numerical model analysis tool and solution checker

Consists of two main modules:

1. Explainer for ill-conditioned LPs:

  • kappa-explain: Provides a row- or column-based ill-conditioning explanation
  • angle-explain: Finds pairwise explanations for ill-conditioning

2. Feasibility checks for solution files:

  • Why is a given solution infeasible?
  • How far is a solution away from feasibility or optimality?

https://github.com/Gurobi/gurobi-modelanalyzer
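
A rough sketch of invoking the ill-conditioning explainers (entry points follow the project's documentation and should be treated as an assumption; the model file name is a placeholder):

import gurobipy as gp
import gurobi_modelanalyzer as gma

# Read and solve an LP that is suspected to be ill-conditioned
model = gp.read("my_ill_conditioned_model.mps")
model.optimize()

# Row-based explanation of ill-conditioning in the optimal basis
gma.kappa_explain(model)

# Pairwise (angle-based) explanation
gma.angle_explain(model)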

Data-driven optimization examples:
Gurobi OptiMods

An OptiMod …

  • Is a tool to solve a specific, practical problem
  • Has a data-driven API for a common optimization problem
  • Takes data in “natural” form, returns a solution in “natural” form
  • Solves a mathematical optimization problem using Gurobi technology

Least absolute deviations (LAD) regression

from sklearn import datasets
from sklearn.model_selection import train_test_split

from gurobi_optimods.regression import LADRegression

# Load the diabetes dataset
diabetes = datasets.load_diabetes()

# Split data for fit assessment
X_train, X_test, y_train, y_test = train_test_split(
    diabetes["data"], diabetes["target"], random_state=42
)

# Fit model and obtain predictions
lad = LADRegression()
lad.fit(X_train, y_train)
y_pred = lad.predict(X_test)

LAD regression is more robust to outliers than ordinary least-squares regression
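
To illustrate the point, a small synthetic sketch (data, seed, and outlier injection are made up for this example): a handful of gross outliers typically pulls the least-squares fit far more than the LAD fit.

import numpy as np
from sklearn.linear_model import LinearRegression
from gurobi_optimods.regression import LADRegression

rng = np.random.default_rng(0)
X = rng.random((100, 1))
y = 3 * X[:, 0] + 0.1 * rng.standard_normal(100)
y[:5] += 50  # inject a few gross outliers

ols = LinearRegression().fit(X, y)
lad = LADRegression()
lad.fit(X, y)

# Compare predictions at x = 1 (the noise-free value is 3)
X_query = np.array([[1.0]])
print(ols.predict(X_query), lad.predict(X_query))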

Takeaways