Gurobi in the Python Ecosystem

Robert Luce

Principal Developer

The state of Python

After 30+ years, why is Python so successful?

  • Productive language
    • Clean syntax
    • Dynamic (but can be peppered with type hints…)
  • Expansive ecosystem and community
    • PyPI in 2023: 1.2+ Trillion requests, 607+ Petabytes served
    • PyCon conference tour
  • Embraced by schools and academia
    • The next generation of Python developers is getting industry-ready as we speak
  • Domain agnostic
    • “Goodbye!” to domain-specific languages

import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

From GitHub Octoverse 2024…


Source: https://github.blog/news-insights/octoverse/octoverse-2024/

PYPL – PopularitY of Programming Language


Source https://pypl.github.io/PYPL.html

TIOBE


Source https://www.tiobe.com/tiobe-index/

Gurobi has a heart for Python

  • gurobipy: Your pythonic and fast Gurobi API

  • gurobipy-pandas: Build optimization models directly out of data frames

  • gurobi-machinelearning: Incorporate trained regressors in optimization models

  • gurobi-modelanalyzer: Analyze and understand numerical aspects of your optimization model

  • gurobi-optimods: Data-driven APIs for common optimization tasks

Many projects hosted at https://github.com/Gurobi

gurobipy: Your pythonic and fast Gurobi API

How to get started?

Type: pip install gurobipy

  • Fetches and installs gurobipy including optimizer libraries
  • All Gurobi features supported, no restrictions
  • Add your license information as usual
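
A quick smoke test after installation (a minimal sketch; the trivial model is arbitrary):

import gurobipy as gp

# Build and solve a tiny model to confirm that the installation and
# license are working; the optimal objective value should be 1.0.
with gp.Env() as env, gp.Model(env=env) as m:
    x = m.addVar(ub=1.0)
    m.setObjective(x, gp.GRB.MAXIMIZE)
    m.optimize()
    print(m.ObjVal)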

Term-based API

\[ \begin{align*} \max \quad & x + y + 2z\\ \mbox{s.t.} \quad & x + 2y + 3z \le 4\\ & x + y \ge 1\\ & x, y, z \; \text{binary} \end{align*} \]

import gurobipy as gp
from gurobipy import GRB

m = gp.Model("mip1")

# Three binary decision variables
x = m.addVar(vtype=GRB.BINARY)
y = m.addVar(vtype=GRB.BINARY)
z = m.addVar(vtype=GRB.BINARY)

# Objective and constraints, written term by term
m.setObjective(x + y + 2 * z, GRB.MAXIMIZE)
m.addConstr(x + 2 * y + 3 * z <= 4)
m.addConstr(x + y >= 1)
  • Write model entities term-by-term
  • Use dicts for sparse multidimensional index sets
  • Extends nicely to iteration via dictionary comprehension
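
As a rough illustration of the last two bullets (the arc data here is made up for the sketch): addVars over a sparse index set returns a dict-like container, and objectives and constraints extend naturally via comprehensions.

import gurobipy as gp
from gurobipy import GRB

# Hypothetical sparse index set with per-arc costs
arcs, cost = gp.multidict({
    ("A", "B"): 4,
    ("A", "C"): 2,
    ("B", "C"): 1,
})

m = gp.Model("dict-style")
flow = m.addVars(arcs, name="flow")  # dict-like: one variable per arc

# Build the objective by iterating over the sparse index set
m.setObjective(gp.quicksum(cost[a] * flow[a] for a in arcs), GRB.MINIMIZE)

# One constraint per source node, generated from a comprehension
m.addConstrs((flow.sum(v, "*") <= 1 for v in ["A", "B"]), name="out_cap")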

Matrix-Friendly API

Build model entities from ndarray-like data, using matrix variables (MVar):

x = model.addMVar(shape)
  • A class representing an ndarray of optimization variables
  • Understands Python’s matmul operator @
  • Very efficient if data comes in naturally
    • as an ndarray or
    • a SciPy sparse matrix
  • Indexing like in NumPy
  • Slicing like in NumPy
  • Broadcasting like in NumPy
  • … like in NumPy!
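
For instance, a minimal sketch (the shape and bounds are made up) of NumPy-style slicing and broadcasting on an MVar:

import gurobipy as gp

m = gp.Model()
x = m.addMVar((4, 3), ub=1.0)  # a 4x3 array of variables

# Slicing like in NumPy: constrain the first column elementwise
m.addConstr(x[:, 0] <= 0.5)

# Broadcasting like in NumPy: each row sums to at most 2
m.addConstr(x.sum(axis=1) <= 2.0)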

Matrix-friendly example

\[ \begin{align*} \min \; & x^T Q x \\ \mbox{s.t.} \; & Ax \geq b \\ & 0 \leq x \leq 1 \end{align*} \]

import gurobipy as gp
import numpy as np
import scipy.sparse as sp

m = gp.Model()

# Objective matrix Q and constraint matrix A; the right-hand side b is a vector of ones
Q = sp.diags([1, 2, 3])
A = np.array([[1, 2, 3], [1, 1, 0]])

x = m.addMVar(3, ub=1.0)   # 0 <= x <= 1 (lower bound 0 by default)
m.setObjective(x @ Q @ x)  # minimize x^T Q x (minimization is the default sense)
m.addConstr(A @ x >= 1)    # Ax >= b, with the scalar 1 broadcast to a vector

Modeling performance

  • A lot of effort is spent on making gurobipy fast
  • Automatic performance monitoring of common modeling patterns

gurobipy-pandas

Project-Team assignment

We’re running a fantasy consulting company organized into a set of fixed teams \(J\), and we have a set of projects \(I\) waiting to be worked on. Which team should be assigned to which project?

Data

  • Profit of completing project \(i \in I\): \(p_i\)
  • Resource requirement for project \(i \in I\): \(w_i\)
  • Capacity of team \(j \in J\): \(c_j\)

Decision variables

\(x_{ij} \in \{0,1\}\): assign project \(i\) to team \(j\)?

Objective function

Maximize profit from completed projects

Constraints

  • Don’t oversubscribe the teams
  • At most one team works on each project

\[\begin{align*} \max_{x} \quad & \sum_{i,j} p_i x_{ij}\\ & \sum_i w_i x_{ij} \le c_j \quad \mbox{for all $j$}\\ & \sum_j x_{ij} \le 1 \quad \mbox{for all $i$}\\ & x_{ij} \in \{0,1\} \end{align*}\]

Input data

>>> projects.head(3)  # w_i
         resource
project
p0            1.1
p1            1.4
p2            1.2
>>> teams.head(3)  # c_j
      capacity
team
t0         2.4
t1         1.8
t2         1.1


>>> project_values.head(5)
              profit
project team
p0      t4       0.4
p1      t4       1.3
p2      t0       1.7
        t1       1.7
        t2       1.7

Attach decision variables to data frame

import gurobipy as gp
from gurobipy import GRB
import gurobipy_pandas as gppd

model = gp.Model()
model.ModelSense = GRB.MAXIMIZE

# One binary variable per (project, team) pair; objective coefficients
# come from the "profit" column
assignments = project_values.gppd.add_vars(
    model, vtype=GRB.BINARY, obj="profit", name="x"
)
assignments.head()  # p_ij & x_ij


              profit                      x
project team                               
p0      t4       0.4  <gurobi.Var x[p0,t4]>
p1      t4       1.3  <gurobi.Var x[p1,t4]>
p2      t0       1.7  <gurobi.Var x[p2,t0]>
        t1       1.7  <gurobi.Var x[p2,t1]>
        t2       1.7  <gurobi.Var x[p2,t2]>

Resource constraint

capacity_constraints = gppd.add_constrs(
    model,
    (projects["resource"] * assignments["x"]).groupby("team").sum(),
    GRB.LESS_EQUAL,
    teams["capacity"],
    name="capacity",
)
capacity_constraints.apply(model.getRow).head()


team
t0    1.2 x[p2,t0] + 0.9 x[p4,t0] + 1.3 x[p5,t0] + x...
t1    1.2 x[p2,t1] + 0.9 x[p4,t1] + 1.3 x[p5,t1] + x...
t2    1.2 x[p2,t2] + 0.9 x[p4,t2] + 1.3 x[p5,t2] + x...
t3    1.2 x[p2,t3] + 0.9 x[p4,t3] + 1.3 x[p5,t3] + x...
t4    1.1 x[p0,t4] + 1.4 x[p1,t4] + 1.2 x[p2,t4] + 1...
Name: capacity, dtype: object

Assign each project at most once

allocate_once = gppd.add_constrs(
    model,
    assignments["x"].groupby("project").sum(),
    GRB.LESS_EQUAL,
    1.0,
    name="allocate_once",
)
allocate_once.apply(model.getRow).head()


project
p0                                              x[p0,t4]
p1                                              x[p1,t4]
p10    x[p10,t0] + x[p10,t1] + x[p10,t2] + x[p10,t3] ...
p11    x[p11,t0] + x[p11,t1] + x[p11,t2] + x[p11,t3] ...
p12                                x[p12,t3] + x[p12,t4]
Name: allocate_once, dtype: object

Solve and query solution

model.optimize()
(
    assignments["x"].gppd.X.to_frame()
    .query("x >= 0.5").reset_index()
    .groupby("team").agg({"project": list})
)
              project
team                 
t0           [p4, p5]
t1               [p2]
t2              [p11]
t3          [p6, p29]
t4    [p14, p15, p26]

Deployment Architectures

Typical Databricks workflow

Data-driven optimization models can (and should!) be integrated directly in your ETL pipelines

Solver deployment

  • Run locally (size-limited license)
env = gp.Env()
  • Run locally on the Databricks cluster
env = gp.Env(params={
    "WLSAccessID": "xxx",
    ...
})
  • Run remotely using Gurobi Instant Cloud
env = gp.Env(params={
    "CloudAccessID": "xxx",
    ...
})
  • Run remotely on your own servers
env = gp.Env(params={
    "CSManager": "xxx",
    ...
})

Solving with Instant Cloud

Use trained regressors as constraints:
gurobi-machinelearning

Gurobi ML

A package to use trained regression models in mathematical optimization models.

  • Input variable vector \(x\)
  • Output variable vector \(y\)
  • Trained regressor \(f\)
  • Use like a constraint \(y = f(x)\)
  • Optimize the model over the input and output variables, ensuring that the input is mapped to the output by \(f\).


Supported APIs: sklearn, Keras, PyTorch, XGBoost, LightGBM

https://github.com/Gurobi/gurobi-machinelearning
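
A minimal sketch of the pattern (the synthetic training data and variable shapes are assumptions made for this example; add_predictor_constr is the package's documented entry point):

import numpy as np
from sklearn.linear_model import LinearRegression
import gurobipy as gp
from gurobipy import GRB
from gurobi_ml import add_predictor_constr

# Train a small regressor f on synthetic data (illustrative only)
rng = np.random.default_rng(0)
X = rng.random((50, 3))
z = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(50)
reg = LinearRegression().fit(X, z)

m = gp.Model()
x = m.addMVar(3, lb=0.0, ub=1.0, name="x")    # input variable vector
y = m.addMVar(1, lb=-GRB.INFINITY, name="y")  # output variable vector

# Use the trained regressor like a constraint: y = f(x)
add_predictor_constr(m, reg, x, y)

m.setObjective(y.sum(), GRB.MAXIMIZE)  # e.g. maximize the predicted output
m.optimize()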

Example: Adversarial ML

Given a trained network, how robust is the classification w.r.t. noise?


Classified as “4”

Classified as “9”

gurobi-modelanalyzer

A numerical model analysis tool and solution checker

Consists of two main modules:

1. Explainer for ill-conditioned LPs:

  • kappa-explain: Provides a row- or column-based ill-conditioning explanation
  • angle-explain: Finds pairwise explanations for ill-conditioning

2. Feasibility checks for solution files:

  • Why is a given solution infeasible?
  • How far is a solution away from feasibility or optimality?

https://github.com/Gurobi/gurobi-modelanalyzer
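
A rough sketch of invoking the ill-conditioning explainers (entry points follow the project's documentation and should be treated as an assumption; the model file name is a placeholder):

import gurobipy as gp
import gurobi_modelanalyzer as gma

# Read and solve an LP that is suspected to be ill-conditioned
model = gp.read("my_ill_conditioned_model.mps")
model.optimize()

# Row-based explanation of ill-conditioning in the optimal basis
gma.kappa_explain(model)

# Pairwise (angle-based) explanation
gma.angle_explain(model)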

Data-driven optimization examples:
Gurobi OptiMods

An OptiMod …

  • Is a tool to solve a specific, practical problem
  • Has a data-driven API for a common optimization problem
  • Takes data in “natural” form, returns a solution in “natural” form
  • Solves a mathematical optimization problem using Gurobi technology

Least absolute deviations (LAD) regression

from sklearn import datasets
from sklearn.model_selection import train_test_split

from gurobi_optimods.regression import LADRegression

# Load the diabetes dataset
diabetes = datasets.load_diabetes()

# Split data for fit assessment
X_train, X_test, y_train, y_test = train_test_split(
    diabetes["data"], diabetes["target"], random_state=42
)

# Fit model and obtain predictions
lad = LADRegression()
lad.fit(X_train, y_train)
y_pred = lad.predict(X_test)

LAD regression is more robust to outliers than ordinary least-squares regression
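
To illustrate the point, a small synthetic sketch (data, seed, and outlier injection are made up for this example): a handful of gross outliers typically pulls the least-squares fit far more than the LAD fit.

import numpy as np
from sklearn.linear_model import LinearRegression
from gurobi_optimods.regression import LADRegression

rng = np.random.default_rng(0)
X = rng.random((100, 1))
y = 3 * X[:, 0] + 0.1 * rng.standard_normal(100)
y[:5] += 50  # inject a few gross outliers

ols = LinearRegression().fit(X, y)
lad = LADRegression()
lad.fit(X, y)

# Compare predictions at x = 1 (the noise-free value is 3)
X_query = np.array([[1.0]])
print(ols.predict(X_query), lad.predict(X_query))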

Takeaways