Gurobi: Problem Types

Local vs Global Optima

Global Optimum

  • Best solution over entire feasible region
  • The default in Gurobi for all problem types

Local Optimum

  • Best feasible solution in a neighborhood
  • What Gurobi’s NL barrier algorithm provides

Consider: \(\min f(x) = x^4 - 5x^2 + 4 + 1.5x\)

Local Nonlinear Optimization in Gurobi 13.0

Still build models using algebraic expressions:

model.addConstr(z == gp.nlfunc.sin(x) * gp.nlfunc.exp(y))

Just change one parameter to use NL barrier:

model.Params.OptimalityTarget = 1

What we’ll cover today:

  1. How the NL barrier algorithm works
  2. New parameters & status codes
  3. Starting points & solution quality
  4. Walk through several examples

Newton’s Method (Step Computation)

(Animation: second-order Taylor approximations)

  • Minimize quadratic Taylor model in each iteration to compute next iterate
  • Results in fast local convergence
  • Can be attracted to different local optima
  • Requires a safeguard (step-size reduction) to ensure convergence
  • Constraints handled by linearization
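As a rough illustration of these ideas (a sketch, not Gurobi's implementation), a damped Newton iteration applied to the quartic \(f(x) = x^4 - 5x^2 + 4 + 1.5x\) from above shows both the fast local convergence and the dependence on the starting point:

```python
def newton_minimize(f, fprime, fsecond, x0, tol=1e-10, max_iter=100):
    """Damped Newton's method: minimize the quadratic Taylor model,
    halving the step until the objective decreases (the safeguard)."""
    x = x0
    for _ in range(max_iter):
        g = fprime(x)
        if abs(g) < tol:
            break
        h = fsecond(x)
        # Newton step; fall back to steepest descent if curvature is not positive
        d = -g / h if h > 0 else -g
        alpha = 1.0
        while f(x + alpha * d) > f(x) and alpha > 1e-12:
            alpha *= 0.5  # reduce step size to ensure convergence
        x += alpha * d
    return x

# the quartic from the earlier slide: different starts, different local optima
f = lambda x: x**4 - 5 * x**2 + 4 + 1.5 * x
fp = lambda x: 4 * x**3 - 10 * x + 1.5
fpp = lambda x: 12 * x**2 - 10
print(newton_minimize(f, fp, fpp, 2.0), newton_minimize(f, fp, fpp, -2.0))
```

Starting from \(x_0 = 2\) converges to the local minimum at \(x = 1.5\); starting from \(x_0 = -2\) finds the global minimum near \(x \approx -1.651\).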

Barrier Function (Handling Inequality Constraints)

\[ \begin{aligned} \min_{x\in [0,10]} x, \ \text{s.t.} \ \ \color{red}{0\leq x \leq 10} \end{aligned} \]

\(\quad\longrightarrow\quad\)

\[ \begin{aligned} \min_{x\in\mathbb{R}} x - \color{red}{\mu \log(x)} - \color{red}{\mu \log(10 - x)} \end{aligned} \]

(Animation: barrier function)

  • Replace inequality constraints by logarithmic barrier terms (\(\color{red}{\mu}>0\): barrier parameter)
    • This avoids the combinatorial complexity of identifying the binding constraints
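For this one-variable example the barrier minimizer even has a closed form: setting the derivative \(1 - \mu/x + \mu/(10-x)\) to zero and multiplying through by \(x(10-x)\) gives the quadratic \(x^2 - (10+2\mu)x + 10\mu = 0\). A small sketch (assuming only the formula above) verifies that the minimizer approaches the constrained optimum \(x = 0\) as \(\mu \to 0\):

```python
import math

def barrier_argmin(mu):
    """Minimizer of x - mu*log(x) - mu*log(10 - x) on (0, 10).
    Setting the derivative to zero gives x^2 - (10 + 2*mu)*x + 10*mu = 0;
    the root inside (0, 10) is the smaller one."""
    b = 10.0 + 2.0 * mu
    return (b - math.sqrt(b * b - 40.0 * mu)) / 2.0

for mu in (1.0, 0.1, 0.01, 0.001):
    print(f"mu = {mu:7.3f}  ->  x* = {barrier_argmin(mu):.6f}")
# the minimizer tends to the constrained optimum x = 0 as mu -> 0
```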

Barrier Algorithm

\[ \begin{aligned} \min_{x\in\mathbb{R}^n} \ & f(x) &&& \min_{x\in\mathbb{R}^n} \ & f(x) \color{red}{- \mu\sum_{i=1}^n\log(x_i)}\\ \text{s.t.} \ & c(x) = 0 & \longrightarrow && \text{s.t.} \ & c(x) = 0 & \text{(BP$_\mu$)}\\ & \color{red}{{x\geq 0}} \end{aligned} \]

Idea

  • Solve a sequence of barrier problems as \(\mu\to0\)

Primal-dual interior-point algorithm with line search:

  1. Choose \(x_0\in\mathbb{R}^n\). Set \(k\gets 0\).
  2. Choose barrier parameter \(\mu_k\).
  3. Solve a linear system to compute the Newton step \(d_k\) for (BP\(_{\mu_k}\)).
  4. Choose step size \(\alpha_k\in (0,1]\) and take a step to get new iterate \(x_{k+1}= x_k + \alpha_kd_k\).
  5. Set \(k\gets k+1\); go to 2.
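This is not Gurobi's primal-dual implementation, but the overall scheme — Newton's method on each barrier subproblem, with \(\mu\) driven to zero — can be sketched for the earlier one-variable example (\(\min x\) s.t. \(0 \le x \le 10\)):

```python
def solve_subproblem(x, mu, tol=1e-10):
    """Newton's method on phi(x) = x - mu*log(x) - mu*log(10 - x)."""
    for _ in range(100):
        g = 1.0 - mu / x + mu / (10.0 - x)    # phi'(x)
        if abs(g) < tol:
            break
        h = mu / x**2 + mu / (10.0 - x)**2    # phi''(x) > 0 on (0, 10)
        d = -g / h
        alpha = 1.0
        while not 0.0 < x + alpha * d < 10.0:  # step size: stay strictly interior
            alpha *= 0.5
        x += alpha * d
    return x

x, mu = 5.0, 1.0
while mu > 1e-9:          # solve a sequence of barrier problems as mu -> 0
    x = solve_subproblem(x, mu)   # warm-start from the previous solution
    mu *= 0.1
print(x)  # approaches the constrained optimum x = 0
```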

Turn on NL Barrier: OptimalityTarget=1

  • Possible parameter values:
    • -1: default (automatic; currently behaves like 0)
    • 0: use global MINLP solver
    • 1: use NL barrier
  • “Preview Feature” in Gurobi 13
    • Fully tested and supported
    • No dual information returned
    • Not all solution quality attributes available
    • May change API behavior in future
  • Only used for nonlinear continuous problems
    • Error if there are discrete variables, SOS constraints, or non-differentiable functions
    • Ignored if problem is LP or convex QP/QCP

New Parameters and Attributes

Parameters:

  • NLBarIterLimit: Iteration limit
  • NLBarPFeasTol: Primal residual tolerance
  • NLBarDFeasTol: Dual residual tolerance
  • NLBarCFeasTol: Complementarity residual tolerance

New Attribute:

  • NLBarIterCount: Iterations performed by NL barrier

Quality attributes:

  • Available: MaxVio, ConstrVioIndex, etc.
  • Not yet available: DualVio, MaxSvio, ConstrResidualIndex, etc.

Status codes:

  • LOCALLY_OPTIMAL and LOCALLY_INFEASIBLE

Find Minima of a Quartic Function

\[\min\ x^4 - 2x^2 + 0.5x + 1\]

import gurobipy as gp
from gurobipy import GRB

env = gp.Env()
model = gp.Model(env=env)
x = model.addVar(lb=-GRB.INFINITY)
z = model.addVar(lb=-GRB.INFINITY)
model.addConstr(z == x**4 - 2 * x**2 + 0.5 * x + 1)
model.setObjective(z, sense=GRB.MINIMIZE)

OptimalityTarget=1 selects NL barrier

model.Params.OptimalityTarget = 1
model.optimize()
Set parameter OptimalityTarget to value 1
                           Residuals
Iter     Objective      Primal     Dual      Compl     Step     Time
   0   8.88631776e-01  9.94e-02  1.00e+00  0.00e+00  0.00e+00     0s
   1  -3.66317993e-01  1.26e+01  2.30e+01  0.00e+00  4.88e-04     0s
   2  -1.26827514e+01  1.27e+01  3.96e+00  0.00e+00  1.00e+00     0s
   3  -8.70429470e-01  3.82e-01  7.50e-01  0.00e+00  1.00e+00     0s
   4  -5.38133569e-01  2.36e-02  5.84e-02  0.00e+00  1.00e+00     0s
   5  -5.14930637e-01  1.77e-04  4.74e-04  0.00e+00  1.00e+00     0s
   6  -5.14753653e-01  1.19e-08  3.21e-08  0.00e+00  1.00e+00     0s
   7  -5.14753641e-01  1.11e-16  8.88e-16  0.00e+00  1.00e+00     0s
First-order optimal solution

Optimality Status:

print(f"status.............: {model.status}") 
print(f"LOCALLY_OPTIMAL....: {GRB.LOCALLY_OPTIMAL}") 
status.............: 18
LOCALLY_OPTIMAL....: 18

Solution Visualization (Default Starting Point)

Starting from x = 0

model.reset()
x.PStart = 0
model.optimize()
Discarded solution information
Loaded warm-start solution
                           Residuals
Iter     Objective      Primal     Dual      Compl     Step     Time
   0   1.00000000e+00  0.00e+00  1.00e+00  0.00e+00  0.00e+00     0s
   1   2.34375000e-02  6.92e+00  1.72e+01  0.00e+00  4.88e-04     0s
   2  -6.87203560e+00  6.73e+00  3.25e+00  0.00e+00  1.00e+00     0s
   3  -7.79935113e-01  2.81e-01  5.74e-01  0.00e+00  1.00e+00     0s
   4  -5.29177542e-01  1.45e-02  3.66e-02  0.00e+00  1.00e+00     0s
   5  -5.14823680e-01  7.00e-05  1.88e-04  0.00e+00  1.00e+00     0s
   6  -5.14753643e-01  1.87e-09  5.05e-09  0.00e+00  1.00e+00     0s
First-order optimal solution

Optimality Status:

print(f"status.............: {model.status}") 
print(f"LOCALLY_OPTIMAL....: {GRB.LOCALLY_OPTIMAL}") 
status.............: 18
LOCALLY_OPTIMAL....: 18

Solution Visualization

Starting from x = 0.5

model.reset()
x.PStart = 0.5
model.optimize()
Discarded solution information
Loaded warm-start solution
                           Residuals
Iter     Objective      Primal     Dual      Compl     Step     Time
   0   8.12500000e-01  0.00e+00  1.00e+00  0.00e+00  0.00e+00     0s
   1  -1.62890625e+00  6.17e+01  4.53e+01  0.00e+00  4.88e-04     0s
   2  -1.04179861e+02  1.05e+02  1.72e+00  0.00e+00  1.00e+00     0s
   3   3.73335585e-01  1.16e-01  3.03e-01  0.00e+00  1.00e+00     0s
   4   4.77332212e-01  5.95e-03  1.94e-02  0.00e+00  1.00e+00     0s
   5   4.83222504e-01  2.90e-05  1.01e-04  0.00e+00  1.00e+00     0s
   6   4.83251491e-01  7.96e-10  2.78e-09  0.00e+00  1.00e+00     0s
First-order optimal solution

Optimality Status:

print(f"status.............: {model.status}") 
print(f"LOCALLY_OPTIMAL....: {GRB.LOCALLY_OPTIMAL}") 
status.............: 18
LOCALLY_OPTIMAL....: 18

Solution Visualization

Starting from x = 1.2

model.reset()
x.PStart = 1.2
model.optimize()
Discarded solution information
Loaded warm-start solution
                           Residuals
Iter     Objective      Primal     Dual      Compl     Step     Time
   0   7.93600000e-01  1.11e-16  1.00e+00  0.00e+00  0.00e+00     0s
   1  -7.72362998e+00  1.72e+01  3.36e+00  0.00e+00  9.77e-04     0s
   2  -1.05396252e+02  1.26e+02  4.08e+01  0.00e+00  1.00e+00     0s
   3  -7.51723070e+00  1.12e+01  1.15e+01  0.00e+00  1.00e+00     0s
  ...
   7   4.82916826e-01  3.35e-04  1.15e-03  0.00e+00  1.00e+00     0s
   8   4.83251388e-01  1.04e-07  3.62e-07  0.00e+00  1.00e+00     0s
   9   4.83251492e-01  9.88e-15  3.60e-14  0.00e+00  1.00e+00     0s
First-order optimal solution

Optimality Status:

print(f"status.............: {model.status}") 
print(f"LOCALLY_OPTIMAL....: {GRB.LOCALLY_OPTIMAL}") 
status.............: 18
LOCALLY_OPTIMAL....: 18

Solution Visualization

Starting from x = 1.5

model.reset()
x.PStart = 1.5
model.optimize()
Discarded solution information
Loaded warm-start solution
                           Residuals
Iter     Objective      Primal     Dual      Compl     Step     Time
   0   2.31250000e+00  0.00e+00  1.00e+00  0.00e+00  0.00e+00     0s
   1  -1.69182692e+01  1.65e+01  9.85e-01  0.00e+00  1.95e-03     0s
   2  -1.68966995e+01  1.64e+01  5.00e-01  0.00e+00  1.56e-02     0s
   3  -5.25944987e-01  1.12e-02  2.86e-02  0.00e+00  1.00e+00     0s
   4  -5.14796516e-01  4.29e-05  1.15e-04  0.00e+00  1.00e+00     0s
   5  -5.14753642e-01  7.04e-10  1.90e-09  0.00e+00  1.00e+00     0s
First-order optimal solution

Optimality Status:

print(f"status.............: {model.status}") 
print(f"LOCALLY_OPTIMAL....: {GRB.LOCALLY_OPTIMAL}") 
status.............: 18
LOCALLY_OPTIMAL....: 18

Solution Visualization

Comparison of All Cases

Each run finds a local minimum (not necessarily in the vicinity of the starting point)
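The two objective values seen in the runs above can be cross-checked analytically: the stationary points solve \(f'(x) = 4x^3 - 4x + 0.5 = 0\), and the ones with \(f''(x) > 0\) are the local minima. A quick check with NumPy (an addition here, assuming NumPy is available):

```python
import numpy as np

f = lambda x: x**4 - 2 * x**2 + 0.5 * x + 1

# stationary points: f'(x) = 4x^3 - 4x + 0.5 = 0
roots = np.roots([4.0, 0.0, -4.0, 0.5]).real
minima = sorted(x for x in roots if 12 * x**2 - 4 > 0)  # keep f''(x) > 0
for x in minima:
    print(f"x = {x:+.6f}, f(x) = {f(x):+.9f}")
# f values match the two objectives reported in the logs:
# -0.514753641 (the global minimum) and +0.483251492 (a local one)
```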

Constrained Problem

Minimize \(x\) subject to \(\exp(\color{red}{x^4 - 2 x^2 + \frac{x}{2} + 1}) \le 1\)

x = model.addVar(lb=-GRB.INFINITY)
z = model.addVar(lb=-GRB.INFINITY, ub=1)
model.addConstr(z == gp.nlfunc.exp(x**4 - 2 * x**2 + 0.5 * x + 1))
model.setObjective(x, sense=GRB.MINIMIZE)
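Since \(\exp\) is increasing, the constraint is equivalent to \(x^4 - 2x^2 + 0.5x + 1 \le 0\), so the feasible region is the interval between the quartic's two real roots, and minimizing \(x\) drives the solution to the left endpoint. A quick cross-check with NumPy (an addition here, not part of the model code):

```python
import numpy as np

# exp(q(x)) <= 1  <=>  q(x) <= 0, since exp is increasing
q = np.poly1d([1.0, 0.0, -2.0, 0.5, 1.0])   # q(x) = x^4 - 2x^2 + 0.5x + 1
real_roots = sorted(r.real for r in q.roots if abs(r.imag) < 1e-9)
print(real_roots)  # the feasible set is the interval between these two roots
# minimizing x over that interval gives its left endpoint, about -1.3496
```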

Default starting point

model.Params.OptimalityTarget = 1
model.optimize()
Set parameter OptimalityTarget to value 1
                           Residuals
Iter     Objective      Primal     Dual      Compl     Step     Time
   0  -2.20509451e-02  2.14e+00  1.00e+00  5.00e+00  0.00e+00     0s
   1  -1.39319205e+00  6.88e-01  4.63e+00  9.92e-01  1.00e+00     0s
   2  -1.25651502e+00  3.21e-01  2.06e+00  1.00e+00  1.00e+00     0s
   3  -1.10374445e+00  1.33e-01  1.06e+00  9.91e-01  1.00e+00     0s
  ...
  19  -1.34960322e+00  1.33e-11  9.39e-13  1.43e-05  1.00e+00     0s
  20  -1.34961735e+00  3.33e-09  6.62e-09  1.66e-07  1.00e+00     0s
  21  -1.34961752e+00  4.58e-13  9.00e-13  1.00e-09  1.00e+00     0s
First-order optimal solution

Optimality Status:

print(f"status.............: {model.status}") 
print(f"LOCALLY_OPTIMAL....: {GRB.LOCALLY_OPTIMAL}") 
status.............: 18
LOCALLY_OPTIMAL....: 18

Solution Visualization

Starting from x = 0.5

model.reset()
x.PStart = 0.5
model.optimize()
Discarded solution information
Loaded warm-start solution
                           Residuals
Iter     Objective      Primal     Dual      Compl     Step     Time
   0   5.00000000e-01  1.26e+00  1.00e+00  5.00e+00  0.00e+00     0s
   1   1.10398246e+00  9.29e-01  2.19e+00  1.02e+00  1.00e+00     0s
   2   6.53690292e-01  1.31e+00  1.62e+00  1.13e+00  1.00e+00     0s
   3   9.62585702e-01  9.48e-01  1.82e+00  1.01e+00  5.00e-01     0s
  ...
  24*  9.30402927e-01  6.21e-01  5.12e-09  1.43e-05  1.00e+00     0s
  25*  9.30402927e-01  6.21e-01  4.39e-11  1.65e-07  1.00e+00     0s
  26*  9.30402927e-01  6.21e-01  1.77e-12  1.00e-09  1.00e+00     0s
NL barrier performed 26 iterations in 0.00 seconds (0.00 work units)
First-order infeasible model

Optimality Status:

print(f"status.............: {model.status}") 
print(f"LOCALLY_INFEASIBLE.: {GRB.LOCALLY_INFEASIBLE}") 
status.............: 19
LOCALLY_INFEASIBLE.: 19

Local Infeasibility (Minimize Constraint Violation)

Summary

Usage:

  • Build models “as usual”, with algebraic expressions
  • Set OptimalityTarget = 1 to invoke NL barrier

Important considerations:

  • Local vs global optimality:
    • The algorithm finds local optima, which may not be globally optimal
    • For convex problems, every local optimum is a global optimum
      • In that case, NL barrier finds the global optimum
  • Starting point matters:
    • Different initial points can lead to different local optima for nonconvex problems
  • Local infeasibility:
    • The algorithm found a local minimizer of the constraint violation
    • Some starting points may lead to local infeasibility even when the problem is feasible

Problem Description

Given training data:

  • Features: \(X \in \mathbb{R}^{n \times d}\) (e.g., income, credit score)
  • Labels: \(y \in \{0, 1\}^n\) (e.g., repaid/defaulted)
  • Protected groups: \(g \in \{0, 1\}^n\) (demographic group)

Learn model weights:

  • Weights: \(w \in \mathbb{R}^d\)
  • Bias: \(b \in \mathbb{R}\)

To predict label probabilities:

  • Probability of \(y=1\):
    • \(p = \sigma(w^Tx + b) = \frac{1}{1 + e^{-(w^T x + b)}}\)

Important

Fairness constraint:

  • Limit the difference in average predicted probabilities between groups
  • e.g. demographic parity across protected groups

Loss Function

Minimize cross-entropy loss + L2 regularization

\[ \begin{align} \min_{w, b} \quad & \frac{1}{n}\sum_{i=1}^n \left[-y_i \log(p_i) - (1-y_i)\log(1-p_i)\right] + \lambda \|w\|^2 \\ \end{align} \]

Where probabilities are defined using a logistic function: \[ \begin{align} p_i = \frac{1}{1 + e^{-(w^T x_i + b)}} \\ \end{align} \]

Subject to demographic parity constraint

\[ \begin{align} \left| \frac{1}{|G_0|}\sum_{i \in G_0} p_i - \frac{1}{|G_1|}\sum_{i \in G_1} p_i \right| \leq \epsilon \end{align} \]
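Once \(w\) and \(b\) are fixed, the objective and the parity gap are plain NumPy; a helper like the following (hypothetical, useful for checking candidate solutions) evaluates both:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_and_gap(w, b, X, y, groups, lam):
    """Regularized cross-entropy loss and the demographic-parity gap."""
    p = sigmoid(X @ w + b)
    loss = np.mean(-y * np.log(p) - (1 - y) * np.log(1 - p)) + lam * np.dot(w, w)
    gap = abs(p[groups == 0].mean() - p[groups == 1].mean())
    return loss, gap
```

At \(w = 0, b = 0\) every prediction is \(0.5\), so the loss is \(\log 2\) and the gap is zero.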

Fair Logistic Regression: Input Data

n_samples = 30
n_features = 5

Synthetic data:

X[:3, :]
array([[ 0.49671415, -0.1382643 ,  0.64768854,  1.52302986, -0.23415337],
       [-0.23413696,  1.57921282,  0.76743473, -0.46947439,  0.54256004],
       [-0.46341769, -0.46572975,  0.24196227, -1.91328024, -1.72491783]])
y
array([0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1,
       1, 1, 1, 0, 1, 0, 1, 0])
protected_groups
array([1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1,
       1, 1, 0, 0, 0, 0, 1, 0])

Model parameters:

fairness_tol = 0.10
lambda_reg = 0.0001

Decision Variables for Learned Weights

Create a new model:

env = gp.Env()
model = gp.Model("FairLogisticRegression", env=env)

Add variables for learned parameters:

w = model.addVars(n_features, lb=-GRB.INFINITY, name="weight")
{0: <gurobi.Var weight[0]>,
 1: <gurobi.Var weight[1]>,
 2: <gurobi.Var weight[2]>,
 3: <gurobi.Var weight[3]>,
 4: <gurobi.Var weight[4]>}
b = model.addVar(lb=-GRB.INFINITY, name="bias")
<gurobi.Var bias>

We need to represent, for each sample:

\[ p = \sigma(w^Tx + b) = \frac{1}{1 + e^{-(w^T x + b)}} \]

where features \(x\) are known.

Expressions for Output Probabilities

Linear score: \(z_i = w^T x_i + b\)

i = 0  # First sample

z_0 = gp.quicksum(w[j] * X[i,j] for j in range(n_features)) + b
z_0
<gurobi.LinExpr: 0.4967141530112327 weight[0] + -0.13826430117118466 weight[1] + 0.6476885381006925 weight[2] + 1.5230298564080254 weight[3] + -0.23415337472333597 weight[4] + bias>

Logistic function: \(p_i = \frac{1}{1 + \exp(-z_i)}\)

from gurobipy import nlfunc

p_0 = nlfunc.logistic(z_0)
p_0
<gurobi.NLExpr: logistic(0.4967141530112327 * weight[0] + -0.13826430117118466 * weight[1] + 0.6476885381006925 * weight[2] + 1.5230298564080254 * weight[3] + -0.23415337472333597 * weight[4] + bias)>

Loss Function

Create and store probability expressions:

p = [
    nlfunc.logistic(gp.quicksum(w[j] * X[i,j] for j in range(n_features)) + b)
    for i in range(n_samples)
]

Build loss function expression for training:

\[ \sum_{i=1}^n \left[-y_i \log(p_i) - (1-y_i)\log(1-p_i)\right] \]

sum_loss = 0
for i in range(n_samples):
    log_p = nlfunc.log(p[i])
    log_1mp = nlfunc.log(1.0 - p[i])
    loss = -(y[i] * log_p + (1-y[i]) * log_1mp)
    sum_loss += loss
<gurobi.NLExpr: 0.0 + -(0.0 * log(logistic(0.4967141530112327 * weight[0] + -0.13826430117118466 * weight[1] + 0.6476885381006925 * weight[2] + 1.5230298564080254 * weight[3] + -0.23415337472333597 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(0.4967141530112327 * weight[0] + -0.13826430117118466 * weight[1] + 0.6476885381006925 * weight[2] + 1.5230298564080254 * weight[3] + -0.23415337472333597 * weight[4] + bias))) + -(1.0 * log(logistic(-0.23413695694918055 * weight[0] + 1.5792128155073915 * weight[1] + 0.7674347291529088 * weight[2] + -0.4694743859349521 * weight[3] + 0.5425600435859647 * weight[4] + bias)) + 0.0 * log(1.0 - logistic(-0.23413695694918055 * weight[0] + 1.5792128155073915 * weight[1] + 0.7674347291529088 * weight[2] + -0.4694743859349521 * weight[3] + 0.5425600435859647 * weight[4] + bias))) + -(0.0 * log(logistic(-0.46341769281246226 * weight[0] + -0.46572975357025687 * weight[1] + 0.24196227156603412 * weight[2] + -1.913280244657798 * weight[3] + -1.7249178325130328 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(-0.46341769281246226 * weight[0] + -0.46572975357025687 * weight[1] + 0.24196227156603412 * weight[2] + -1.913280244657798 * weight[3] + -1.7249178325130328 * weight[4] + bias))) + -(0.0 * log(logistic(-0.5622875292409727 * weight[0] + -1.0128311203344238 * weight[1] + 0.3142473325952739 * weight[2] + -0.9080240755212111 * weight[3] + -1.4123037013352917 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(-0.5622875292409727 * weight[0] + -1.0128311203344238 * weight[1] + 0.3142473325952739 * weight[2] + -0.9080240755212111 * weight[3] + -1.4123037013352917 * weight[4] + bias))) + -(0.0 * log(logistic(1.465648768921554 * weight[0] + -0.22577630048653566 * weight[1] + 0.06752820468792384 * weight[2] + -1.4247481862134568 * weight[3] + -0.5443827245251827 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(1.465648768921554 * weight[0] + -0.22577630048653566 * weight[1] + 0.06752820468792384 * weight[2] + -1.4247481862134568 * weight[3] + 
-0.5443827245251827 * weight[4] + bias))) + -(0.0 * log(logistic(0.11092258970986608 * weight[0] + -1.1509935774223028 * weight[1] + 0.37569801834567196 * weight[2] + -0.600638689918805 * weight[3] + -0.2916937497932768 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(0.11092258970986608 * weight[0] + -1.1509935774223028 * weight[1] + 0.37569801834567196 * weight[2] + -0.600638689918805 * weight[3] + -0.2916937497932768 * weight[4] + bias))) + -(0.0 * log(logistic(-0.6017066122293969 * weight[0] + 1.8522781845089378 * weight[1] + -0.013497224737933914 * weight[2] + -1.0577109289559 * weight[3] + 0.822544912103189 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(-0.6017066122293969 * weight[0] + 1.8522781845089378 * weight[1] + -0.013497224737933914 * weight[2] + -1.0577109289559 * weight[3] + 0.822544912103189 * weight[4] + bias))) + -(0.0 * log(logistic(-1.2208436499710222 * weight[0] + 0.2088635950047554 * weight[1] + -1.9596701238797756 * weight[2] + -1.3281860488984305 * weight[3] + 0.19686123586912352 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(-1.2208436499710222 * weight[0] + 0.2088635950047554 * weight[1] + -1.9596701238797756 * weight[2] + -1.3281860488984305 * weight[3] + 0.19686123586912352 * weight[4] + bias))) + -(0.0 * log(logistic(0.7384665799954104 * weight[0] + 0.1713682811899705 * weight[1] + -0.11564828238824053 * weight[2] + -0.3011036955892888 * weight[3] + -1.4785219903674274 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(0.7384665799954104 * weight[0] + 0.1713682811899705 * weight[1] + -0.11564828238824053 * weight[2] + -0.3011036955892888 * weight[3] + -1.4785219903674274 * weight[4] + bias))) + -(0.0 * log(logistic(-0.7198442083947086 * weight[0] + -0.4606387709597875 * weight[1] + 1.0571222262189157 * weight[2] + 0.3436182895684614 * weight[3] + -1.763040155362734 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(-0.7198442083947086 * weight[0] + -0.4606387709597875 * weight[1] + 1.0571222262189157 * weight[2] + 0.3436182895684614 * 
weight[3] + -1.763040155362734 * weight[4] + bias))) + -(1.0 * log(logistic(0.324083969394795 * weight[0] + -0.38508228041631654 * weight[1] + -0.6769220003059587 * weight[2] + 0.6116762888408679 * weight[3] + 1.030999522495951 * weight[4] + bias)) + 0.0 * log(1.0 - logistic(0.324083969394795 * weight[0] + -0.38508228041631654 * weight[1] + -0.6769220003059587 * weight[2] + 0.6116762888408679 * weight[3] + 1.030999522495951 * weight[4] + bias))) + -(1.0 * log(logistic(0.9312801191161986 * weight[0] + -0.8392175232226385 * weight[1] + -0.3092123758512146 * weight[2] + 0.33126343140356396 * weight[3] + 0.9755451271223592 * weight[4] + bias)) + 0.0 * log(1.0 - logistic(0.9312801191161986 * weight[0] + -0.8392175232226385 * weight[1] + -0.3092123758512146 * weight[2] + 0.33126343140356396 * weight[3] + 0.9755451271223592 * weight[4] + bias))) + -(0.0 * log(logistic(-0.47917423784528995 * weight[0] + -0.18565897666381712 * weight[1] + -1.1063349740060282 * weight[2] + -1.1962066240806708 * weight[3] + 0.812525822394198 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(-0.47917423784528995 * weight[0] + -0.18565897666381712 * weight[1] + -1.1063349740060282 * weight[2] + -1.1962066240806708 * weight[3] + 0.812525822394198 * weight[4] + bias))) + -(1.0 * log(logistic(1.356240028570823 * weight[0] + -0.07201012158033385 * weight[1] + 1.0035328978920242 * weight[2] + 0.36163602504763415 * weight[3] + -0.6451197546051243 * weight[4] + bias)) + 0.0 * log(1.0 - logistic(1.356240028570823 * weight[0] + -0.07201012158033385 * weight[1] + 1.0035328978920242 * weight[2] + 0.36163602504763415 * weight[3] + -0.6451197546051243 * weight[4] + bias))) + -(0.0 * log(logistic(0.36139560550841393 * weight[0] + 1.5380365664659692 * weight[1] + -0.03582603910995153 * weight[2] + 1.564643655814006 * weight[3] + -2.6197451040897444 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(0.36139560550841393 * weight[0] + 1.5380365664659692 * weight[1] + -0.03582603910995153 * weight[2] + 
1.564643655814006 * weight[3] + -2.6197451040897444 * weight[4] + bias))) + -(0.0 * log(logistic(0.8219025043752238 * weight[0] + 0.08704706823817134 * weight[1] + -0.29900735046586785 * weight[2] + 0.0917607765355023 * weight[3] + -1.9875689146008928 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(0.8219025043752238 * weight[0] + 0.08704706823817134 * weight[1] + -0.29900735046586785 * weight[2] + 0.0917607765355023 * weight[3] + -1.9875689146008928 * weight[4] + bias))) + -(0.0 * log(logistic(-0.21967188783751193 * weight[0] + 0.3571125715117464 * weight[1] + 1.477894044741516 * weight[2] + -0.5182702182736474 * weight[3] + -0.8084936028931876 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(-0.21967188783751193 * weight[0] + 0.3571125715117464 * weight[1] + 1.477894044741516 * weight[2] + -0.5182702182736474 * weight[3] + -0.8084936028931876 * weight[4] + bias))) + -(0.0 * log(logistic(-0.5017570435845365 * weight[0] + 0.9154021177020741 * weight[1] + 0.32875110965968446 * weight[2] + -0.5297602037670388 * weight[3] + 0.5132674331133561 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(-0.5017570435845365 * weight[0] + 0.9154021177020741 * weight[1] + 0.32875110965968446 * weight[2] + -0.5297602037670388 * weight[3] + 0.5132674331133561 * weight[4] + bias))) + -(0.0 * log(logistic(0.09707754934804039 * weight[0] + 0.9686449905328892 * weight[1] + -0.7020530938773524 * weight[2] + -0.3276621465977682 * weight[3] + -0.39210815313215763 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(0.09707754934804039 * weight[0] + 0.9686449905328892 * weight[1] + -0.7020530938773524 * weight[2] + -0.3276621465977682 * weight[3] + -0.39210815313215763 * weight[4] + bias))) + -(0.0 * log(logistic(-1.4635149481321186 * weight[0] + 0.29612027706457605 * weight[1] + 0.26105527217988933 * weight[2] + 0.0051134566424609 * weight[3] + -0.23458713337514742 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(-1.4635149481321186 * weight[0] + 0.29612027706457605 * weight[1] + 0.26105527217988933 
* weight[2] + 0.0051134566424609 * weight[3] + -0.23458713337514742 * weight[4] + bias))) + -(0.0 * log(logistic(-1.4153707420504142 * weight[0] + -0.42064532276535904 * weight[1] + -0.3427145165267695 * weight[2] + -0.8022772692216189 * weight[3] + -0.16128571166600914 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(-1.4153707420504142 * weight[0] + -0.42064532276535904 * weight[1] + -0.3427145165267695 * weight[2] + -0.8022772692216189 * weight[3] + -0.16128571166600914 * weight[4] + bias))) + -(1.0 * log(logistic(0.4040508568145384 * weight[0] + 1.8861859012105302 * weight[1] + 0.17457781283183896 * weight[2] + 0.257550390722764 * weight[3] + -0.0744459157661671 * weight[4] + bias)) + 0.0 * log(1.0 - logistic(0.4040508568145384 * weight[0] + 1.8861859012105302 * weight[1] + 0.17457781283183896 * weight[2] + 0.257550390722764 * weight[3] + -0.0744459157661671 * weight[4] + bias))) + -(1.0 * log(logistic(-1.9187712152990415 * weight[0] + -0.026513875449216878 * weight[1] + 0.06023020994102644 * weight[2] + 2.463242112485286 * weight[3] + -0.19236096478112252 * weight[4] + bias)) + 0.0 * log(1.0 - logistic(-1.9187712152990415 * weight[0] + -0.026513875449216878 * weight[1] + 0.06023020994102644 * weight[2] + 2.463242112485286 * weight[3] + -0.19236096478112252 * weight[4] + bias))) + -(1.0 * log(logistic(0.30154734233361247 * weight[0] + -0.03471176970524331 * weight[1] + -1.168678037619532 * weight[2] + 1.1428228145150205 * weight[3] + 0.7519330326867741 * weight[4] + bias)) + 0.0 * log(1.0 - logistic(0.30154734233361247 * weight[0] + -0.03471176970524331 * weight[1] + -1.168678037619532 * weight[2] + 1.1428228145150205 * weight[3] + 0.7519330326867741 * weight[4] + bias))) + -(1.0 * log(logistic(0.7910319470430469 * weight[0] + -0.9093874547947389 * weight[1] + 1.4027943109360992 * weight[2] + -1.4018510627922809 * weight[3] + 0.5868570938002703 * weight[4] + bias)) + 0.0 * log(1.0 - logistic(0.7910319470430469 * weight[0] + -0.9093874547947389 * weight[1] + 
1.4027943109360992 * weight[2] + -1.4018510627922809 * weight[3] + 0.5868570938002703 * weight[4] + bias))) + -(0.0 * log(logistic(2.1904556258099785 * weight[0] + -0.9905363251306883 * weight[1] + -0.5662977296027719 * weight[2] + 0.09965136508764122 * weight[3] + -0.5034756541161992 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(2.1904556258099785 * weight[0] + -0.9905363251306883 * weight[1] + -0.5662977296027719 * weight[2] + 0.09965136508764122 * weight[3] + -0.5034756541161992 * weight[4] + bias))) + -(1.0 * log(logistic(-1.5506634310661327 * weight[0] + 0.06856297480602733 * weight[1] + -1.0623037137261049 * weight[2] + 0.4735924306351816 * weight[3] + -0.9194242342338034 * weight[4] + bias)) + 0.0 * log(1.0 - logistic(-1.5506634310661327 * weight[0] + 0.06856297480602733 * weight[1] + -1.0623037137261049 * weight[2] + 0.4735924306351816 * weight[3] + -0.9194242342338034 * weight[4] + bias))) + -(0.0 * log(logistic(1.5499344050175399 * weight[0] + -0.7832532923362371 * weight[1] + -0.3220615162056756 * weight[2] + 0.8135172173696698 * weight[3] + -1.2308643164339552 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(1.5499344050175399 * weight[0] + -0.7832532923362371 * weight[1] + -0.3220615162056756 * weight[2] + 0.8135172173696698 * weight[3] + -1.2308643164339552 * weight[4] + bias))) + -(1.0 * log(logistic(0.22745993460412942 * weight[0] + 1.307142754282428 * weight[1] + -1.6074832345612275 * weight[2] + 0.1846338585323042 * weight[3] + 0.2598827942484235 * weight[4] + bias)) + 0.0 * log(1.0 - logistic(0.22745993460412942 * weight[0] + 1.307142754282428 * weight[1] + -1.6074832345612275 * weight[2] + 0.1846338585323042 * weight[3] + 0.2598827942484235 * weight[4] + bias))) + -(0.0 * log(logistic(0.78182287177731 * weight[0] + -1.236950710878082 * weight[1] + -1.3204566130842763 * weight[2] + 0.5219415656168976 * weight[3] + 0.29698467323318606 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(0.78182287177731 * weight[0] + -1.236950710878082 * weight[1] + 
-1.3204566130842763 * weight[2] + 0.5219415656168976 * weight[3] + 0.29698467323318606 * weight[4] + bias)))>

Regularized Objective

L2 regularization component:

l2_reg = gp.quicksum(w[j] * w[j] for j in range(n_features))
l2_reg
<gurobi.QuadExpr: 0.0 + [ weight[0] ^ 2 + weight[1] ^ 2 + weight[2] ^ 2 + weight[3] ^ 2 + weight[4] ^ 2 ]>

\[ \begin{align} \min_{w, b} \quad & \frac{1}{n}\sum_{i=1}^n \left[-y_i \log(p_i) - (1-y_i)\log(1-p_i)\right] + \lambda \|w\|^2 \\ \end{align} \]

Set the objective using a resultant variable:

objvar = model.addVar(lb=-GRB.INFINITY, name="objective")
model.addConstr(objvar == sum_loss / n_samples + lambda_reg * l2_reg)

model.setObjective(objvar, GRB.MINIMIZE)

Fairness Constraint

Compute average predictions per group:

group0_idx = [i for i in range(n_samples) if protected_groups[i] == 0]
group1_idx = [i for i in range(n_samples) if protected_groups[i] == 1]

avg_p0 = gp.quicksum(p[i] for i in group0_idx) / len(group0_idx)
avg_p1 = gp.quicksum(p[i] for i in group1_idx) / len(group1_idx)

Enforce maximum difference over the training set:

gap = model.addVar(lb=-fairness_tol, ub=fairness_tol, name="fairness_gap")
model.addConstr(gap == avg_p0 - avg_p1)

\[ \begin{align} \left| \frac{1}{|G_0|}\sum_{i \in G_0} p_i - \frac{1}{|G_1|}\sum_{i \in G_1} p_i \right| \leq \epsilon \end{align} \]

Run Local Optimization

model.Params.OptimalityTarget = 1
model.optimize()
                           Residuals
Iter     Objective      Primal     Dual      Compl     Step     Time
   0   9.29389150e-01  1.99e-02  9.00e+00  3.76e+00  0.00e+00     0s
   1   9.36440941e-01  1.82e-03  2.33e-16  9.79e-01  1.00e+00     0s
   2   1.50531648e-01  3.42e-01  7.86e-05  1.50e-01  1.00e+00     0s
   3   1.50235162e-01  4.88e-01  2.23e-04  1.50e-01  1.00e+00     0s
   4   1.50164053e-01  7.22e-01  2.69e-04  1.50e-01  1.00e+00     0s
   5   1.50098036e-01  3.38e-01  6.85e-05  1.50e-01  1.00e+00     0s
  ...
  21   4.11209993e-01  7.75e-06  2.76e-06  1.90e-05  1.00e+00     0s
  22   4.11200317e-01  2.28e-08  6.92e-09  1.43e-05  1.00e+00     0s
  23   4.11186199e-01  1.52e-08  5.45e-09  1.74e-07  1.00e+00     0s
  24   4.11186180e-01  8.87e-14  2.70e-14  1.65e-07  1.00e+00     0s
  25   4.11186016e-01  2.06e-12  7.37e-13  1.00e-09  1.00e+00     0s
First-order optimal solution

Important

Running the global solver afterwards confirms global optimality in 12 seconds

Extract Learned Parameters

import numpy as np

if model.status in [GRB.OPTIMAL, GRB.LOCALLY_OPTIMAL]:
    learned_w = np.array([w[j].X for j in range(n_features)])
    learned_b = b.X

def predict_proba(X_new):
    """ For given features, return output probabilities """
    linear = X_new @ learned_w + learned_b
    return 1.0 / (1.0 + np.exp(-linear))

Check against the training data:

probs_pred = predict_proba(X)
preds = (probs_pred >= 0.5).astype(int)
accuracy = np.mean(preds == y)
print(f"Accuracy: {accuracy:.1%}")

avg_pred_g0 = np.mean(probs_pred[protected_groups == 0])
avg_pred_g1 = np.mean(probs_pred[protected_groups == 1])
fairness_gap = abs(avg_pred_g0 - avg_pred_g1)
print(f"Fairness gap: {fairness_gap:.4f}")
Accuracy: 83.3%
Fairness gap: 0.1000

Fairness Gap

Fair Logistic Regression: Takeaways

ML models can be trained via local optimization:

  • Directly model nonlinear loss functions
  • Easily incorporate constraints

Complex nonlinear expressions are handled natively:

  • Preferable to carry complex expressions right through
    • (Here we did not introduce variables for \(p_i\))
  • NL barrier often converges better if we avoid auxiliary variables
sum_loss / n_samples + lambda_reg * l2_reg
<gurobi.NLExpr: (0.0 + -(0.0 * log(logistic(0.4967141530112327 * weight[0] + -0.13826430117118466 * weight[1] + 0.6476885381006925 * weight[2] + 1.5230298564080254 * weight[3] + -0.23415337472333597 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(0.4967141530112327 * weight[0] + -0.13826430117118466 * weight[1] + 0.6476885381006925 * weight[2] + 1.5230298564080254 * weight[3] + -0.23415337472333597 * weight[4] + bias))) + ... ) / 30.0 + 0.0001 * weight[0]**2 + 0.0001 * weight[1]**2 + 0.0001 * weight[2]**2 + 0.0001 * weight[3]**2 + 0.0001 * weight[4]**2>

(output abridged: one such term per training sample, 30 in total)
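Printed in full, the expression is unwieldy, but it is simply the regularized cross-entropy loss averaged over the samples. A plain-Python sketch of the same formula (function and variable names here are illustrative, not part of the model):

```python
import math

def logistic(z):
    # Numerically stable sigmoid: avoid overflow in exp for large |z|
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    return math.exp(z) / (1.0 + math.exp(z))

def regularized_loss(weights, bias, X, y, lam):
    # Mean cross-entropy over the samples plus an L2 penalty on the weights
    n = len(X)
    total = 0.0
    for xi, yi in zip(X, y):
        p = logistic(sum(w * f for w, f in zip(weights, xi)) + bias)
        total -= yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return total / n + lam * sum(w * w for w in weights)
```

At weights and bias all zero, every sample contributes log 2 to the mean, so the loss starts at about 0.693; the Gurobi model minimizes the symbolic version of this same quantity.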

Problem Description

Pack \(N\) circles in a unit square to maximize the sum of their radii

Circle Packing: Problem Formulation

Problem data is just the number of circles to place:

circles = range(50)

Decision variables (\(x_i\), \(y_i\), \(r_i\)) are simple to state:

import gurobipy as gp
from gurobipy import GRB

model = gp.Model(env=env)
x = model.addVars(circles, ub=1.0, name="x") # x-coordinate of centers
y = model.addVars(circles, ub=1.0, name="y") # y-coordinate of centers
r = model.addVars(circles, ub=0.5, name="r") # Radii (≤0.5 to fit in square)

Circle Packing: Constraints

Pairs of circles cannot overlap (nonconvex constraint!):

from itertools import combinations

for i, j in combinations(circles, r=2):
    model.addConstr(
        (x[i] - x[j])**2 + (y[i] - y[j])**2 >= (r[i] + r[j])**2
    )

Each circle must stay inside unit square:

for i in circles:
    model.addConstr(x[i] - r[i] >= 0)
    model.addConstr(x[i] + r[i] <= 1)
    model.addConstr(y[i] - r[i] >= 0)
    model.addConstr(y[i] + r[i] <= 1)

Maximize the total radius:

model.setObjective(r.sum(), GRB.MAXIMIZE)

Run Local Optimization

model.Params.OptimalityTarget = 1
model.optimize()
                           Residuals
Iter     Objective      Primal     Dual      Compl     Step     Time
   0   1.25058900e+01  1.03e+01  5.53e+02  8.33e+01  0.00e+00     0s
   1   6.33643586e+00  1.81e-01  2.97e+02  1.60e+00  9.89e-01     0s
   2   4.04667381e+00  3.70e-02  3.25e+02  6.20e-01  8.54e-01     0s
   3   9.10471390e-01  4.22e-02  2.25e+02  9.03e-01  9.27e-01     0s
   4   7.98674534e-01  1.05e-02  2.91e+02  9.80e-01  9.25e-01     0s
   5   7.08424644e-01  2.13e-02  3.24e+02  9.11e-01  6.38e-01     0s
  ...
 219   3.55966588e+00  3.10e-09  6.99e-08  1.43e-05  1.00e+00     0s
 220   3.56145171e+00  6.07e-06  1.71e-03  7.33e-07  8.37e-01     0s
 221   3.56179122e+00  2.69e-09  2.08e-07  1.65e-07  1.00e+00     0s
 222   3.56179121e+00  6.55e-11  9.64e-10  1.65e-07  1.00e+00     0s
 223   3.56181582e+00  5.98e-10  7.97e-09  9.99e-10  1.00e+00     0s
First-order optimal solution

Solutions For Different Starting Points (\(N=50\))

# Extract solution
xsol = [x[i].X for i in circles]
ysol = [y[i].X for i in circles]
rsol = [r[i].X for i in circles]
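Since the barrier algorithm only guarantees a first-order optimal point, it is worth verifying feasibility of the extracted solution directly. A small checker (illustrative helper, not part of the Gurobi API):

```python
from itertools import combinations

def check_packing(xs, ys, rs, tol=1e-6):
    """Return True if all circles fit in the unit square and no pair overlaps."""
    # Containment: each circle must lie inside [0, 1] x [0, 1]
    for x, y, r in zip(xs, ys, rs):
        if x - r < -tol or x + r > 1 + tol or y - r < -tol or y + r > 1 + tol:
            return False
    # Non-overlap: center distance at least the sum of radii
    for i, j in combinations(range(len(rs)), 2):
        if (xs[i] - xs[j])**2 + (ys[i] - ys[j])**2 < (rs[i] + rs[j])**2 - tol:
            return False
    return True
```

Calling `check_packing(xsol, ysol, rsol)` on the extracted arrays confirms that the reported point satisfies all constraints to tolerance.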

Problem Description

  • Actuated robot arm
    • Fixed geometry
    • Known weight / inertia
  • Actuator forces are limited
  • Goal: Take the fastest path between initial and final positions
  • Model using time-discretised equations of motion
  • Nonlinear constraints, complex expressions

Robot Arm: Problem Formulation

State Variables (at each time step \(i = 0, 1, \ldots, n_h\)):

  • \(ρ_i, \dot{ρ}_i\): radial position and velocity
  • \(θ_i, \dot{θ}_i\): azimuthal angle and angular velocity
  • \(φ_i, \dot{φ}_i\): polar angle and angular velocity

Control Variables (at each time step \(i = 0, 1, \ldots, n_h\)):

  • \(u_ρ^i\): control input for radial motion, \(|u_ρ^i| \leq u_ρ^{\max}\)
  • \(u_θ^i\): control input for azimuthal rotation, \(|u_θ^i| \leq u_θ^{\max}\)
  • \(u_φ^i\): control input for polar rotation, \(|u_φ^i| \leq u_φ^{\max}\)

Time Variable:

  • \(t_f\): final time (to be minimized)
  • \(h = \frac{t_f}{n_h}\): variable time step size

Define Variables

State variables:

# Assumes problem data (arm length L, control limits max_u_*, boundary
# values rho_0, ..., phi_n) is defined earlier
from math import pi

import gurobipy as gp
from gurobipy import GRB, nlfunc

m = gp.Model(env=env)

# Number of time intervals
nh = 200

# Position
rho = m.addVars(nh+1, lb=0, ub=L, name="rho")
the = m.addVars(nh+1, lb=-pi, ub=pi, name="the")
phi = m.addVars(nh+1, lb=0, ub=pi, name="phi")

# Velocities
rho_dot = m.addVars(nh+1, lb=-GRB.INFINITY, name="rho_dot")
the_dot = m.addVars(nh+1, lb=-GRB.INFINITY, name="the_dot")
phi_dot = m.addVars(nh+1, lb=-GRB.INFINITY, name="phi_dot")

Control Inputs (with bounds as limits):

u_rho = m.addVars(nh+1, lb=-max_u_rho, ub=max_u_rho, name="u_rho")
u_the = m.addVars(nh+1, lb=-max_u_the, ub=max_u_the, name="u_the")
u_phi = m.addVars(nh+1, lb=-max_u_phi, ub=max_u_phi, name="u_phi")
tf = m.addVar(lb=0, name="tf")

Objective is to minimize final time:

m.setObjective(tf, GRB.MINIMIZE)

Dynamics Constraints: Position Update

\[ \begin{align} ρ_j &= ρ_{j-1} + \frac{h}{2}(\dot{ρ}_j + \dot{ρ}_{j-1}), \quad j = 1, \ldots, n_h \\ θ_j &= θ_{j-1} + \frac{h}{2}(\dot{θ}_j + \dot{θ}_{j-1}), \quad j = 1, \ldots, n_h \\ φ_j &= φ_{j-1} + \frac{h}{2}(\dot{φ}_j + \dot{φ}_{j-1}), \quad j = 1, \ldots, n_h \end{align} \]

for j in range(1, nh+1):
    m.addConstr(rho[j] == rho[j-1] + 0.5*(tf/nh)*(rho_dot[j] + rho_dot[j-1]))
    m.addConstr(the[j] == the[j-1] + 0.5*(tf/nh)*(the_dot[j] + the_dot[j-1]))
    m.addConstr(phi[j] == phi[j-1] + 0.5*(tf/nh)*(phi_dot[j] + phi_dot[j-1]))

Boundary conditions:

rho[0].lb = rho[0].ub = rho_0
the[0].lb = the[0].ub = the_0
phi[0].lb = phi[0].ub = phi_0
rho_dot[0].lb = rho_dot[0].ub = 0
the_dot[0].lb = the_dot[0].ub = 0
phi_dot[0].lb = phi_dot[0].ub = 0
rho[nh].lb = rho[nh].ub = rho_n
the[nh].lb = the[nh].ub = the_n
phi[nh].lb = phi[nh].ub = phi_n
rho_dot[nh].lb = rho_dot[nh].ub = 0
the_dot[nh].lb = the_dot[nh].ub = 0
phi_dot[nh].lb = phi_dot[nh].ub = 0

Dynamics Constraints: Velocity Update

\[ \begin{align} I_θ^j &= \frac{(L-ρ_j)^3 + ρ_j^3}{3} \sin^2(φ_j) \\ I_φ^j &= \frac{(L-ρ_j)^3 + ρ_j^3}{3} \end{align} \]

\[ \begin{align} \dot{ρ}_j &= \dot{ρ}_{j-1} + \frac{h}{2L}(u_ρ^j + u_ρ^{j-1}), \quad j = 1, \ldots, n_h \\ \dot{θ}_j &= \dot{θ}_{j-1} + \frac{h}{2}\left(\frac{u_θ^j}{I_θ^j} + \frac{u_θ^{j-1}}{I_θ^{j-1}}\right), \quad j = 1, \ldots, n_h \\ \dot{φ}_j &= \dot{φ}_{j-1} + \frac{h}{2}\left(\frac{u_φ^j}{I_φ^j} + \frac{u_φ^{j-1}}{I_φ^{j-1}}\right), \quad j = 1, \ldots, n_h \end{align} \]

for j in range(1, nh+1):
    m.addConstr(rho_dot[j] == rho_dot[j-1] + 0.5*(tf/nh)*(u_rho[j] + u_rho[j-1])/L)
    m.addConstr(the_dot[j] == the_dot[j-1] + 0.5*(tf/nh) * (
        u_the[j] / (((L-rho[j])**3 + rho[j]**3) * nlfunc.sin(phi[j])**2 / 3.0) + 
        u_the[j-1] / (((L-rho[j-1])**3 + rho[j-1]**3) * nlfunc.sin(phi[j-1])**2 / 3.0)
    ))
    m.addConstr(phi_dot[j] == phi_dot[j-1] + 0.5*(tf/nh) * (
        u_phi[j] / (((L-rho[j])**3 + rho[j]**3) / 3.0) + 
        u_phi[j-1] / (((L-rho[j-1])**3 + rho[j-1]**3) / 3.0)
    ))

Run Local Optimization

m.Params.OptimalityTarget = 1
m.optimize()
                           Residuals
Iter     Objective      Primal     Dual      Compl     Step     Time
   0   9.76273934e-01  2.37e+00  9.00e+00  1.70e+01  0.00e+00     0s
   1   1.77239676e+00  2.14e+00  3.65e+03  1.50e+01  9.72e-02     0s
   2   1.77239676e-02  2.08e+00  1.76e+03  1.45e+01  3.08e-02     0s
   3   3.89952943e-02  1.99e+00  2.32e+03  1.30e+01  4.40e-02     0s
   4   1.58136087e+00  2.90e-02  3.97e+04  5.00e+00  1.00e+00     0s
   5   1.56389201e+00  6.31e-02  3.95e+04  4.85e+00  1.46e-02     0s
  ...
  60   9.14991648e+00  1.93e-11  1.32e-08  1.43e-05  1.00e+00     0s
  61   9.14152788e+00  7.07e-07  2.34e-04  1.46e-06  1.00e+00     0s
  62   9.14149417e+00  1.43e-10  2.15e-08  1.65e-07  1.00e+00     0s
  63   9.14149419e+00  5.05e-15  7.94e-12  1.65e-07  1.00e+00     0s
  64   9.14139610e+00  1.04e-10  1.79e-10  1.00e-09  1.00e+00     0s
First-order optimal solution

Important

  • The same solution was confirmed optimal by the global solver in 1200 seconds
  • No starting point was provided

Visualization

Robot arm animation
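To render frames like the animation above, the solved trajectory can be converted from the model's spherical coordinates to Cartesian points. A minimal sketch, assuming \(φ\) is the polar angle measured from the vertical axis (a plotting convention, not stated on the slides):

```python
from math import sin, cos, pi

def to_cartesian(rho, theta, phi):
    """Spherical (radius, azimuthal angle, polar angle) -> Cartesian (x, y, z)."""
    return (
        rho * sin(phi) * cos(theta),
        rho * sin(phi) * sin(theta),
        rho * cos(phi),
    )

# Applied to a solved model (attribute .X assumes optimize() has run):
# points = [to_cartesian(rho[j].X, the[j].X, phi[j].X) for j in range(nh + 1)]
```

Plotting the resulting points over the `nh + 1` time steps reproduces the arm-tip path shown in the animation.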