Gurobi: Problem Types

Local vs Global Optima

Global Optimum

  • Best solution over entire feasible region
  • The default in Gurobi for all problem types

Local Optimum

  • Best feasible solution in a neighborhood
  • What Gurobi’s NL barrier algorithm provides

Consider: \(\min f(x) = x^4 - 5x^2 + 4 + 1.5x\)

Local Nonlinear Optimization in Gurobi 13.0

Still build models using algebraic expressions:

model.addConstr(z == gp.nlfunc.sin(x) * gp.nlfunc.exp(y))

Just change one parameter to use NL barrier:

model.Params.OptimalityTarget = 1

What we’ll cover today:

  1. How the NL barrier algorithm works
  2. New parameters & status codes
  3. Starting points & solution quality
  4. Walk through several examples

Newton’s Method (Step Computation)

(Animation: second-order Taylor approximations)

  • Minimize quadratic Taylor model in each iteration to compute next iterate
  • Results in fast local convergence
  • Can be attracted to different local optima
  • Requires a safeguard (step-size reduction) to ensure convergence
  • Constraints handled by linearization
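As a rough illustration of these ideas (a sketch, not Gurobi's implementation), a damped Newton iteration applied to the quartic \(f(x) = x^4 - 5x^2 + 4 + 1.5x\) from above shows both the fast local convergence and the dependence on the starting point:

```python
def newton_minimize(f, fprime, fsecond, x0, tol=1e-10, max_iter=100):
    """Damped Newton's method: minimize the quadratic Taylor model,
    halving the step until the objective decreases (the safeguard)."""
    x = x0
    for _ in range(max_iter):
        g = fprime(x)
        if abs(g) < tol:
            break
        h = fsecond(x)
        # Newton step; fall back to steepest descent if curvature is not positive
        d = -g / h if h > 0 else -g
        alpha = 1.0
        while f(x + alpha * d) > f(x) and alpha > 1e-12:
            alpha *= 0.5  # reduce step size to ensure convergence
        x += alpha * d
    return x

# the quartic from the earlier slide: different starts, different local optima
f = lambda x: x**4 - 5 * x**2 + 4 + 1.5 * x
fp = lambda x: 4 * x**3 - 10 * x + 1.5
fpp = lambda x: 12 * x**2 - 10
print(newton_minimize(f, fp, fpp, 2.0), newton_minimize(f, fp, fpp, -2.0))
```

Starting from \(x_0 = 2\) converges to the local minimum at \(x = 1.5\); starting from \(x_0 = -2\) finds the global minimum near \(x \approx -1.651\).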

Barrier Function (Handling Inequality Constraints)

\[ \begin{aligned} \min_{x\in [0,10]} x, \ \text{s.t.} \ \ \color{red}{0\leq x \leq 10} \end{aligned} \]

\(\quad\longrightarrow\quad\)

\[ \begin{aligned} \min_{x\in\mathbb{R}} x - \color{red}{\mu \log(x)} - \color{red}{\mu \log(10 - x)} \end{aligned} \]

(Animation: barrier function)

  • Replace inequality constraints by logarithmic barrier terms (\(\color{red}{\mu}>0\): barrier parameter)
    • This avoids the combinatorial complexity of identifying the binding constraints
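For this one-variable example the barrier minimizer even has a closed form: setting the derivative \(1 - \mu/x + \mu/(10-x)\) to zero and multiplying through by \(x(10-x)\) gives the quadratic \(x^2 - (10+2\mu)x + 10\mu = 0\). A small sketch (assuming only the formula above) verifies that the minimizer approaches the constrained optimum \(x = 0\) as \(\mu \to 0\):

```python
import math

def barrier_argmin(mu):
    """Minimizer of x - mu*log(x) - mu*log(10 - x) on (0, 10).
    Setting the derivative to zero gives x^2 - (10 + 2*mu)*x + 10*mu = 0;
    the root inside (0, 10) is the smaller one."""
    b = 10.0 + 2.0 * mu
    return (b - math.sqrt(b * b - 40.0 * mu)) / 2.0

for mu in (1.0, 0.1, 0.01, 0.001):
    print(f"mu = {mu:7.3f}  ->  x* = {barrier_argmin(mu):.6f}")
# the minimizer tends to the constrained optimum x = 0 as mu -> 0
```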

Barrier Algorithm

\[ \begin{aligned} \min_{x\in\mathbb{R}^n} \ & f(x) &&& \min_{x\in\mathbb{R}^n} \ & f(x) \color{red}{- \mu\sum_{i=1}^n\log(x_i)}\\ \text{s.t.} \ & c(x) = 0 & \longrightarrow && \text{s.t.} \ & c(x) = 0 & \text{(BP$_\mu$)}\\ & \color{red}{{x\geq 0}} \end{aligned} \]

Idea

  • Solve a sequence of barrier problems as \(\mu\to0\)

Primal-dual interior-point algorithm with line search:

  1. Choose \(x_0\in\mathbb{R}^n\). Set \(k\gets 0\).
  2. Choose barrier parameter \(\mu_k\).
  3. Solve a linear system to compute the Newton step \(d_k\) for (BP\(_{\mu_k}\)).
  4. Choose step size \(\alpha_k\in (0,1]\) and take a step to get new iterate \(x_{k+1}= x_k + \alpha_kd_k\).
  5. Set \(k\gets k+1\); go to 2.
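This is not Gurobi's primal-dual implementation, but the overall scheme — Newton's method on each barrier subproblem, with \(\mu\) driven to zero — can be sketched for the earlier one-variable example (\(\min x\) s.t. \(0 \le x \le 10\)):

```python
def solve_subproblem(x, mu, tol=1e-10):
    """Newton's method on phi(x) = x - mu*log(x) - mu*log(10 - x)."""
    for _ in range(100):
        g = 1.0 - mu / x + mu / (10.0 - x)    # phi'(x)
        if abs(g) < tol:
            break
        h = mu / x**2 + mu / (10.0 - x)**2    # phi''(x) > 0 on (0, 10)
        d = -g / h
        alpha = 1.0
        while not 0.0 < x + alpha * d < 10.0:  # step size: stay strictly interior
            alpha *= 0.5
        x += alpha * d
    return x

x, mu = 5.0, 1.0
while mu > 1e-9:          # solve a sequence of barrier problems as mu -> 0
    x = solve_subproblem(x, mu)   # warm-start from the previous solution
    mu *= 0.1
print(x)  # approaches the constrained optimum x = 0
```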

Turn on NL Barrier: OptimalityTarget=1

  • Possible parameter values:
    • -1: default (automatic; currently behaves like 0)
    • 0: use global MINLP solver
    • 1: use NL barrier
  • “Preview Feature” in Gurobi 13
    • Fully tested and supported
    • No dual information returned
    • Not all solution quality attributes available
    • May change API behavior in future
  • Only used for nonlinear continuous problems
    • Error if there are discrete variables, SOS constraints, or non-differentiable functions
    • Ignored if problem is LP or convex QP/QCP

New Parameters and Attributes

Parameters:

  • NLBarIterLimit: Iteration limit
  • NLBarPFeasTol: Primal residual tolerance
  • NLBarDFeasTol: Dual residual tolerance
  • NLBarCFeasTol: Complementarity residual tolerance

New Attribute:

  • NLBarIterCount: Iterations performed by NL barrier

Quality attributes:

  • Available: MaxVio, ConstrVioIndex, etc.
  • Not yet available: DualVio, MaxSvio, ConstrResidualIndex, etc.

Status codes:

  • LOCALLY_OPTIMAL and LOCALLY_INFEASIBLE

Find Minima of a Quartic Function

\[\min\ x^4 - 2x^2 + 0.5x + 1\]

import gurobipy as gp
from gurobipy import GRB

env = gp.Env()
model = gp.Model(env=env)
x = model.addVar(lb=-GRB.INFINITY)
z = model.addVar(lb=-GRB.INFINITY)
model.addConstr(z == x**4 - 2 * x**2 + 0.5 * x + 1)
model.setObjective(z, sense=GRB.MINIMIZE)

OptimalityTarget=1 selects NL barrier

model.Params.OptimalityTarget = 1
model.optimize()
Set parameter OptimalityTarget to value 1
                           Residuals
Iter     Objective      Primal     Dual      Compl     Step     Time
   0   8.88631776e-01  9.94e-02  1.00e+00  0.00e+00  0.00e+00     0s
   1  -3.66317993e-01  1.26e+01  2.30e+01  0.00e+00  4.88e-04     0s
   2  -1.26827514e+01  1.27e+01  3.96e+00  0.00e+00  1.00e+00     0s
   3  -8.70429470e-01  3.82e-01  7.50e-01  0.00e+00  1.00e+00     0s
   4  -5.38133569e-01  2.36e-02  5.84e-02  0.00e+00  1.00e+00     0s
   5  -5.14930637e-01  1.77e-04  4.74e-04  0.00e+00  1.00e+00     0s
   6  -5.14753653e-01  1.19e-08  3.21e-08  0.00e+00  1.00e+00     0s
   7  -5.14753641e-01  1.11e-16  8.88e-16  0.00e+00  1.00e+00     0s
First-order optimal solution

Optimality Status:

print(f"status.............: {model.status}") 
print(f"LOCALLY_OPTIMAL....: {GRB.LOCALLY_OPTIMAL}") 
status.............: 18
LOCALLY_OPTIMAL....: 18

Solution Visualization (Default Starting Point)

Starting from x = 0

model.reset()
x.PStart = 0
model.optimize()
Discarded solution information
Loaded warm-start solution
                           Residuals
Iter     Objective      Primal     Dual      Compl     Step     Time
   0   1.00000000e+00  0.00e+00  1.00e+00  0.00e+00  0.00e+00     0s
   1   2.34375000e-02  6.92e+00  1.72e+01  0.00e+00  4.88e-04     0s
   2  -6.87203560e+00  6.73e+00  3.25e+00  0.00e+00  1.00e+00     0s
   3  -7.79935113e-01  2.81e-01  5.74e-01  0.00e+00  1.00e+00     0s
   4  -5.29177542e-01  1.45e-02  3.66e-02  0.00e+00  1.00e+00     0s
   5  -5.14823680e-01  7.00e-05  1.88e-04  0.00e+00  1.00e+00     0s
   6  -5.14753643e-01  1.87e-09  5.05e-09  0.00e+00  1.00e+00     0s
First-order optimal solution

Optimality Status:

print(f"status.............: {model.status}") 
print(f"LOCALLY_OPTIMAL....: {GRB.LOCALLY_OPTIMAL}") 
status.............: 18
LOCALLY_OPTIMAL....: 18

Solution Visualization

Starting from x = 0.5

model.reset()
x.PStart = 0.5
model.optimize()
Discarded solution information
Loaded warm-start solution
                           Residuals
Iter     Objective      Primal     Dual      Compl     Step     Time
   0   8.12500000e-01  0.00e+00  1.00e+00  0.00e+00  0.00e+00     0s
   1  -1.62890625e+00  6.17e+01  4.53e+01  0.00e+00  4.88e-04     0s
   2  -1.04179861e+02  1.05e+02  1.72e+00  0.00e+00  1.00e+00     0s
   3   3.73335585e-01  1.16e-01  3.03e-01  0.00e+00  1.00e+00     0s
   4   4.77332212e-01  5.95e-03  1.94e-02  0.00e+00  1.00e+00     0s
   5   4.83222504e-01  2.90e-05  1.01e-04  0.00e+00  1.00e+00     0s
   6   4.83251491e-01  7.96e-10  2.78e-09  0.00e+00  1.00e+00     0s
First-order optimal solution

Optimality Status:

print(f"status.............: {model.status}") 
print(f"LOCALLY_OPTIMAL....: {GRB.LOCALLY_OPTIMAL}") 
status.............: 18
LOCALLY_OPTIMAL....: 18

Solution Visualization

Starting from x = 1.2

model.reset()
x.PStart = 1.2
model.optimize()
Discarded solution information
Loaded warm-start solution
                           Residuals
Iter     Objective      Primal     Dual      Compl     Step     Time
   0   7.93600000e-01  1.11e-16  1.00e+00  0.00e+00  0.00e+00     0s
   1  -7.72362998e+00  1.72e+01  3.36e+00  0.00e+00  9.77e-04     0s
   2  -1.05396252e+02  1.26e+02  4.08e+01  0.00e+00  1.00e+00     0s
   3  -7.51723070e+00  1.12e+01  1.15e+01  0.00e+00  1.00e+00     0s
  ...
   7   4.82916826e-01  3.35e-04  1.15e-03  0.00e+00  1.00e+00     0s
   8   4.83251388e-01  1.04e-07  3.62e-07  0.00e+00  1.00e+00     0s
   9   4.83251492e-01  9.88e-15  3.60e-14  0.00e+00  1.00e+00     0s
First-order optimal solution

Optimality Status:

print(f"status.............: {model.status}") 
print(f"LOCALLY_OPTIMAL....: {GRB.LOCALLY_OPTIMAL}") 
status.............: 18
LOCALLY_OPTIMAL....: 18

Solution Visualization

Starting from x = 1.5

model.reset()
x.PStart = 1.5
model.optimize()
Discarded solution information
Loaded warm-start solution
                           Residuals
Iter     Objective      Primal     Dual      Compl     Step     Time
   0   2.31250000e+00  0.00e+00  1.00e+00  0.00e+00  0.00e+00     0s
   1  -1.69182692e+01  1.65e+01  9.85e-01  0.00e+00  1.95e-03     0s
   2  -1.68966995e+01  1.64e+01  5.00e-01  0.00e+00  1.56e-02     0s
   3  -5.25944987e-01  1.12e-02  2.86e-02  0.00e+00  1.00e+00     0s
   4  -5.14796516e-01  4.29e-05  1.15e-04  0.00e+00  1.00e+00     0s
   5  -5.14753642e-01  7.04e-10  1.90e-09  0.00e+00  1.00e+00     0s
First-order optimal solution

Optimality Status:

print(f"status.............: {model.status}") 
print(f"LOCALLY_OPTIMAL....: {GRB.LOCALLY_OPTIMAL}") 
status.............: 18
LOCALLY_OPTIMAL....: 18

Solution Visualization

Comparison of All Cases

Each run finds a local minimum (not necessarily in the vicinity of the starting point)
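The two objective values seen in the runs above can be cross-checked analytically: the stationary points solve \(f'(x) = 4x^3 - 4x + 0.5 = 0\), and the ones with \(f''(x) > 0\) are the local minima. A quick check with NumPy (an addition here, assuming NumPy is available):

```python
import numpy as np

f = lambda x: x**4 - 2 * x**2 + 0.5 * x + 1

# stationary points: f'(x) = 4x^3 - 4x + 0.5 = 0
roots = np.roots([4.0, 0.0, -4.0, 0.5]).real
minima = sorted(x for x in roots if 12 * x**2 - 4 > 0)  # keep f''(x) > 0
for x in minima:
    print(f"x = {x:+.6f}, f(x) = {f(x):+.9f}")
# f values match the two objectives reported in the logs:
# -0.514753641 (the global minimum) and +0.483251492 (a local one)
```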

Constrained Problem

Minimize \(x\) subject to \(\exp(\color{red}{x^4 - 2 x^2 + \frac{x}{2} + 1}) \le 1\)

x = model.addVar(lb=-GRB.INFINITY)
z = model.addVar(lb=-GRB.INFINITY, ub=1)
model.addConstr(z == gp.nlfunc.exp(x**4 - 2 * x**2 + 0.5 * x + 1))
model.setObjective(x, sense=GRB.MINIMIZE)
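Since \(\exp\) is increasing, the constraint is equivalent to \(x^4 - 2x^2 + 0.5x + 1 \le 0\), so the feasible region is the interval between the quartic's two real roots, and minimizing \(x\) drives the solution to the left endpoint. A quick cross-check with NumPy (an addition here, not part of the model code):

```python
import numpy as np

# exp(q(x)) <= 1  <=>  q(x) <= 0, since exp is increasing
q = np.poly1d([1.0, 0.0, -2.0, 0.5, 1.0])   # q(x) = x^4 - 2x^2 + 0.5x + 1
real_roots = sorted(r.real for r in q.roots if abs(r.imag) < 1e-9)
print(real_roots)  # the feasible set is the interval between these two roots
# minimizing x over that interval gives its left endpoint, about -1.3496
```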

Default starting point

model.Params.OptimalityTarget = 1
model.optimize()
Set parameter OptimalityTarget to value 1
                           Residuals
Iter     Objective      Primal     Dual      Compl     Step     Time
   0  -2.20509451e-02  2.14e+00  1.00e+00  5.00e+00  0.00e+00     0s
   1  -1.39319205e+00  6.88e-01  4.63e+00  9.92e-01  1.00e+00     0s
   2  -1.25651502e+00  3.21e-01  2.06e+00  1.00e+00  1.00e+00     0s
   3  -1.10374445e+00  1.33e-01  1.06e+00  9.91e-01  1.00e+00     0s
  ...
  19  -1.34960322e+00  1.33e-11  9.39e-13  1.43e-05  1.00e+00     0s
  20  -1.34961735e+00  3.33e-09  6.62e-09  1.66e-07  1.00e+00     0s
  21  -1.34961752e+00  4.58e-13  9.00e-13  1.00e-09  1.00e+00     0s
First-order optimal solution

Optimality Status:

print(f"status.............: {model.status}") 
print(f"LOCALLY_OPTIMAL....: {GRB.LOCALLY_OPTIMAL}") 
status.............: 18
LOCALLY_OPTIMAL....: 18

Solution Visualization

Starting from x = 0.5

model.reset()
x.PStart = 0.5
model.optimize()
Discarded solution information
Loaded warm-start solution
                           Residuals
Iter     Objective      Primal     Dual      Compl     Step     Time
   0   5.00000000e-01  1.26e+00  1.00e+00  5.00e+00  0.00e+00     0s
   1   1.10398246e+00  9.29e-01  2.19e+00  1.02e+00  1.00e+00     0s
   2   6.53690292e-01  1.31e+00  1.62e+00  1.13e+00  1.00e+00     0s
   3   9.62585702e-01  9.48e-01  1.82e+00  1.01e+00  5.00e-01     0s
  ...
  24*  9.30402927e-01  6.21e-01  5.12e-09  1.43e-05  1.00e+00     0s
  25*  9.30402927e-01  6.21e-01  4.39e-11  1.65e-07  1.00e+00     0s
  26*  9.30402927e-01  6.21e-01  1.77e-12  1.00e-09  1.00e+00     0s
NL barrier performed 26 iterations in 0.00 seconds (0.00 work units)
First-order infeasible model

Optimality Status:

print(f"status.............: {model.status}") 
print(f"LOCALLY_INFEASIBLE.: {GRB.LOCALLY_INFEASIBLE}") 
status.............: 19
LOCALLY_INFEASIBLE.: 19

Local Infeasibility (Minimize Constraint Violation)

Summary

Usage:

  • Build models “as usual”, with algebraic expressions
  • Set OptimalityTarget = 1 to invoke NL barrier

Important considerations:

  • Local vs global optimality:
    • The algorithm finds local optima, which may not be globally optimal
    • For convex problems, every local optimum is a global optimum
      • In that case, NL barrier finds the global optimum
  • Starting point matters:
    • Different initial points can lead to different local optima for nonconvex problems
  • Local infeasibility:
    • The algorithm found a local minimizer of the constraint violation
    • Some starting points may lead to local infeasibility even when the problem is feasible

Problem Description

Given training data:

  • Features: \(X \in \mathbb{R}^{n \times d}\) (e.g., income, credit score)
  • Labels: \(y \in \{0, 1\}^n\) (e.g., repaid/defaulted)
  • Protected groups: \(g \in \{0, 1\}^n\) (demographic group)

Learn model weights:

  • Weights: \(w \in \mathbb{R}^d\)
  • Bias: \(b \in \mathbb{R}\)

To predict label probabilities:

  • Probability of \(y=1\):
    • \(p = \sigma(w^Tx + b) = \frac{1}{1 + e^{-(w^T x + b)}}\)

Important

Fairness constraint:

  • Limit the difference in average predicted probabilities between groups
  • e.g. demographic parity across protected groups

Loss Function

Minimize cross-entropy loss + L2 regularization

\[ \begin{align} \min_{w, b} \quad & \frac{1}{n}\sum_{i=1}^n \left[-y_i \log(p_i) - (1-y_i)\log(1-p_i)\right] + \lambda \|w\|^2 \\ \end{align} \]

Where probabilities are defined using a logistic function: \[ \begin{align} p_i = \frac{1}{1 + e^{-(w^T x_i + b)}} \\ \end{align} \]

Subject to demographic parity constraint

\[ \begin{align} \left| \frac{1}{|G_0|}\sum_{i \in G_0} p_i - \frac{1}{|G_1|}\sum_{i \in G_1} p_i \right| \leq \epsilon \end{align} \]
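Once \(w\) and \(b\) are fixed, the objective and the parity gap are plain NumPy; a helper like the following (hypothetical, useful for checking candidate solutions) evaluates both:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_and_gap(w, b, X, y, groups, lam):
    """Regularized cross-entropy loss and the demographic-parity gap."""
    p = sigmoid(X @ w + b)
    loss = np.mean(-y * np.log(p) - (1 - y) * np.log(1 - p)) + lam * np.dot(w, w)
    gap = abs(p[groups == 0].mean() - p[groups == 1].mean())
    return loss, gap
```

At \(w = 0, b = 0\) every prediction is \(0.5\), so the loss is \(\log 2\) and the gap is zero.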

Fair Logistic Regression: Input Data

n_samples = 30
n_features = 5

Synthetic data:

X[:3, :]
array([[ 0.49671415, -0.1382643 ,  0.64768854,  1.52302986, -0.23415337],
       [-0.23413696,  1.57921282,  0.76743473, -0.46947439,  0.54256004],
       [-0.46341769, -0.46572975,  0.24196227, -1.91328024, -1.72491783]])
y
array([0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1,
       1, 1, 1, 0, 1, 0, 1, 0])
protected_groups
array([1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1,
       1, 1, 0, 0, 0, 0, 1, 0])

Model parameters:

fairness_tol = 0.10
lambda_reg = 0.0001

Decision Variables for Learned Weights

Create a new model:

env = gp.Env()
model = gp.Model("FairLogisticRegression", env=env)

Add variables for learned parameters:

w = model.addVars(n_features, lb=-GRB.INFINITY, name="weight")
{0: <gurobi.Var weight[0]>,
 1: <gurobi.Var weight[1]>,
 2: <gurobi.Var weight[2]>,
 3: <gurobi.Var weight[3]>,
 4: <gurobi.Var weight[4]>}
b = model.addVar(lb=-GRB.INFINITY, name="bias")
<gurobi.Var bias>

We need to represent, for each sample:

\[ p = \sigma(w^Tx + b) = \frac{1}{1 + e^{-(w^T x + b)}} \]

where features \(x\) are known.

Expressions for Output Probabilities

Linear score: \(z_i = w^T x_i + b\)

i = 0  # First sample

z_0 = gp.quicksum(w[j] * X[i,j] for j in range(n_features)) + b
z_0
<gurobi.LinExpr: 0.4967141530112327 weight[0] + -0.13826430117118466 weight[1] + 0.6476885381006925 weight[2] + 1.5230298564080254 weight[3] + -0.23415337472333597 weight[4] + bias>

Logistic function: \(p_i = \frac{1}{1 + \exp(-z_i)}\)

from gurobipy import nlfunc

p_0 = nlfunc.logistic(z_0)
p_0
<gurobi.NLExpr: logistic(0.4967141530112327 * weight[0] + -0.13826430117118466 * weight[1] + 0.6476885381006925 * weight[2] + 1.5230298564080254 * weight[3] + -0.23415337472333597 * weight[4] + bias)>

Loss Function

Create and store probability expressions:

p = [
    nlfunc.logistic(gp.quicksum(w[j] * X[i,j] for j in range(n_features)) + b)
    for i in range(n_samples)
]

Build loss function expression for training:

\[ \sum_{i=1}^n \left[-y_i \log(p_i) - (1-y_i)\log(1-p_i)\right] \]

sum_loss = 0
for i in range(n_samples):
    log_p = nlfunc.log(p[i])
    log_1mp = nlfunc.log(1.0 - p[i])
    loss = -(y[i] * log_p + (1-y[i]) * log_1mp)
    sum_loss += loss
<gurobi.NLExpr: 0.0 + -(0.0 * log(logistic(0.4967141530112327 * weight[0] + -0.13826430117118466 * weight[1] + 0.6476885381006925 * weight[2] + 1.5230298564080254 * weight[3] + -0.23415337472333597 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(0.4967141530112327 * weight[0] + -0.13826430117118466 * weight[1] + 0.6476885381006925 * weight[2] + 1.5230298564080254 * weight[3] + -0.23415337472333597 * weight[4] + bias))) + -(1.0 * log(logistic(-0.23413695694918055 * weight[0] + 1.5792128155073915 * weight[1] + 0.7674347291529088 * weight[2] + -0.4694743859349521 * weight[3] + 0.5425600435859647 * weight[4] + bias)) + 0.0 * log(1.0 - logistic(-0.23413695694918055 * weight[0] + 1.5792128155073915 * weight[1] + 0.7674347291529088 * weight[2] + -0.4694743859349521 * weight[3] + 0.5425600435859647 * weight[4] + bias))) + -(0.0 * log(logistic(-0.46341769281246226 * weight[0] + -0.46572975357025687 * weight[1] + 0.24196227156603412 * weight[2] + -1.913280244657798 * weight[3] + -1.7249178325130328 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(-0.46341769281246226 * weight[0] + -0.46572975357025687 * weight[1] + 0.24196227156603412 * weight[2] + -1.913280244657798 * weight[3] + -1.7249178325130328 * weight[4] + bias))) + -(0.0 * log(logistic(-0.5622875292409727 * weight[0] + -1.0128311203344238 * weight[1] + 0.3142473325952739 * weight[2] + -0.9080240755212111 * weight[3] + -1.4123037013352917 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(-0.5622875292409727 * weight[0] + -1.0128311203344238 * weight[1] + 0.3142473325952739 * weight[2] + -0.9080240755212111 * weight[3] + -1.4123037013352917 * weight[4] + bias))) + -(0.0 * log(logistic(1.465648768921554 * weight[0] + -0.22577630048653566 * weight[1] + 0.06752820468792384 * weight[2] + -1.4247481862134568 * weight[3] + -0.5443827245251827 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(1.465648768921554 * weight[0] + -0.22577630048653566 * weight[1] + 0.06752820468792384 * weight[2] + -1.4247481862134568 * weight[3] + 
-0.5443827245251827 * weight[4] + bias))) + -(0.0 * log(logistic(0.11092258970986608 * weight[0] + -1.1509935774223028 * weight[1] + 0.37569801834567196 * weight[2] + -0.600638689918805 * weight[3] + -0.2916937497932768 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(0.11092258970986608 * weight[0] + -1.1509935774223028 * weight[1] + 0.37569801834567196 * weight[2] + -0.600638689918805 * weight[3] + -0.2916937497932768 * weight[4] + bias))) + -(0.0 * log(logistic(-0.6017066122293969 * weight[0] + 1.8522781845089378 * weight[1] + -0.013497224737933914 * weight[2] + -1.0577109289559 * weight[3] + 0.822544912103189 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(-0.6017066122293969 * weight[0] + 1.8522781845089378 * weight[1] + -0.013497224737933914 * weight[2] + -1.0577109289559 * weight[3] + 0.822544912103189 * weight[4] + bias))) + -(0.0 * log(logistic(-1.2208436499710222 * weight[0] + 0.2088635950047554 * weight[1] + -1.9596701238797756 * weight[2] + -1.3281860488984305 * weight[3] + 0.19686123586912352 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(-1.2208436499710222 * weight[0] + 0.2088635950047554 * weight[1] + -1.9596701238797756 * weight[2] + -1.3281860488984305 * weight[3] + 0.19686123586912352 * weight[4] + bias))) + -(0.0 * log(logistic(0.7384665799954104 * weight[0] + 0.1713682811899705 * weight[1] + -0.11564828238824053 * weight[2] + -0.3011036955892888 * weight[3] + -1.4785219903674274 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(0.7384665799954104 * weight[0] + 0.1713682811899705 * weight[1] + -0.11564828238824053 * weight[2] + -0.3011036955892888 * weight[3] + -1.4785219903674274 * weight[4] + bias))) + -(0.0 * log(logistic(-0.7198442083947086 * weight[0] + -0.4606387709597875 * weight[1] + 1.0571222262189157 * weight[2] + 0.3436182895684614 * weight[3] + -1.763040155362734 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(-0.7198442083947086 * weight[0] + -0.4606387709597875 * weight[1] + 1.0571222262189157 * weight[2] + 0.3436182895684614 * 
weight[3] + -1.763040155362734 * weight[4] + bias))) + -(1.0 * log(logistic(0.324083969394795 * weight[0] + -0.38508228041631654 * weight[1] + -0.6769220003059587 * weight[2] + 0.6116762888408679 * weight[3] + 1.030999522495951 * weight[4] + bias)) + 0.0 * log(1.0 - logistic(0.324083969394795 * weight[0] + -0.38508228041631654 * weight[1] + -0.6769220003059587 * weight[2] + 0.6116762888408679 * weight[3] + 1.030999522495951 * weight[4] + bias))) + -(1.0 * log(logistic(0.9312801191161986 * weight[0] + -0.8392175232226385 * weight[1] + -0.3092123758512146 * weight[2] + 0.33126343140356396 * weight[3] + 0.9755451271223592 * weight[4] + bias)) + 0.0 * log(1.0 - logistic(0.9312801191161986 * weight[0] + -0.8392175232226385 * weight[1] + -0.3092123758512146 * weight[2] + 0.33126343140356396 * weight[3] + 0.9755451271223592 * weight[4] + bias))) + -(0.0 * log(logistic(-0.47917423784528995 * weight[0] + -0.18565897666381712 * weight[1] + -1.1063349740060282 * weight[2] + -1.1962066240806708 * weight[3] + 0.812525822394198 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(-0.47917423784528995 * weight[0] + -0.18565897666381712 * weight[1] + -1.1063349740060282 * weight[2] + -1.1962066240806708 * weight[3] + 0.812525822394198 * weight[4] + bias))) + -(1.0 * log(logistic(1.356240028570823 * weight[0] + -0.07201012158033385 * weight[1] + 1.0035328978920242 * weight[2] + 0.36163602504763415 * weight[3] + -0.6451197546051243 * weight[4] + bias)) + 0.0 * log(1.0 - logistic(1.356240028570823 * weight[0] + -0.07201012158033385 * weight[1] + 1.0035328978920242 * weight[2] + 0.36163602504763415 * weight[3] + -0.6451197546051243 * weight[4] + bias))) + -(0.0 * log(logistic(0.36139560550841393 * weight[0] + 1.5380365664659692 * weight[1] + -0.03582603910995153 * weight[2] + 1.564643655814006 * weight[3] + -2.6197451040897444 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(0.36139560550841393 * weight[0] + 1.5380365664659692 * weight[1] + -0.03582603910995153 * weight[2] + 
1.564643655814006 * weight[3] + -2.6197451040897444 * weight[4] + bias))) + -(0.0 * log(logistic(0.8219025043752238 * weight[0] + 0.08704706823817134 * weight[1] + -0.29900735046586785 * weight[2] + 0.0917607765355023 * weight[3] + -1.9875689146008928 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(0.8219025043752238 * weight[0] + 0.08704706823817134 * weight[1] + -0.29900735046586785 * weight[2] + 0.0917607765355023 * weight[3] + -1.9875689146008928 * weight[4] + bias))) + -(0.0 * log(logistic(-0.21967188783751193 * weight[0] + 0.3571125715117464 * weight[1] + 1.477894044741516 * weight[2] + -0.5182702182736474 * weight[3] + -0.8084936028931876 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(-0.21967188783751193 * weight[0] + 0.3571125715117464 * weight[1] + 1.477894044741516 * weight[2] + -0.5182702182736474 * weight[3] + -0.8084936028931876 * weight[4] + bias))) + -(0.0 * log(logistic(-0.5017570435845365 * weight[0] + 0.9154021177020741 * weight[1] + 0.32875110965968446 * weight[2] + -0.5297602037670388 * weight[3] + 0.5132674331133561 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(-0.5017570435845365 * weight[0] + 0.9154021177020741 * weight[1] + 0.32875110965968446 * weight[2] + -0.5297602037670388 * weight[3] + 0.5132674331133561 * weight[4] + bias))) + -(0.0 * log(logistic(0.09707754934804039 * weight[0] + 0.9686449905328892 * weight[1] + -0.7020530938773524 * weight[2] + -0.3276621465977682 * weight[3] + -0.39210815313215763 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(0.09707754934804039 * weight[0] + 0.9686449905328892 * weight[1] + -0.7020530938773524 * weight[2] + -0.3276621465977682 * weight[3] + -0.39210815313215763 * weight[4] + bias))) + -(0.0 * log(logistic(-1.4635149481321186 * weight[0] + 0.29612027706457605 * weight[1] + 0.26105527217988933 * weight[2] + 0.0051134566424609 * weight[3] + -0.23458713337514742 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(-1.4635149481321186 * weight[0] + 0.29612027706457605 * weight[1] + 0.26105527217988933 
* weight[2] + 0.0051134566424609 * weight[3] + -0.23458713337514742 * weight[4] + bias))) + -(0.0 * log(logistic(-1.4153707420504142 * weight[0] + -0.42064532276535904 * weight[1] + -0.3427145165267695 * weight[2] + -0.8022772692216189 * weight[3] + -0.16128571166600914 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(-1.4153707420504142 * weight[0] + -0.42064532276535904 * weight[1] + -0.3427145165267695 * weight[2] + -0.8022772692216189 * weight[3] + -0.16128571166600914 * weight[4] + bias))) + -(1.0 * log(logistic(0.4040508568145384 * weight[0] + 1.8861859012105302 * weight[1] + 0.17457781283183896 * weight[2] + 0.257550390722764 * weight[3] + -0.0744459157661671 * weight[4] + bias)) + 0.0 * log(1.0 - logistic(0.4040508568145384 * weight[0] + 1.8861859012105302 * weight[1] + 0.17457781283183896 * weight[2] + 0.257550390722764 * weight[3] + -0.0744459157661671 * weight[4] + bias))) + -(1.0 * log(logistic(-1.9187712152990415 * weight[0] + -0.026513875449216878 * weight[1] + 0.06023020994102644 * weight[2] + 2.463242112485286 * weight[3] + -0.19236096478112252 * weight[4] + bias)) + 0.0 * log(1.0 - logistic(-1.9187712152990415 * weight[0] + -0.026513875449216878 * weight[1] + 0.06023020994102644 * weight[2] + 2.463242112485286 * weight[3] + -0.19236096478112252 * weight[4] + bias))) + -(1.0 * log(logistic(0.30154734233361247 * weight[0] + -0.03471176970524331 * weight[1] + -1.168678037619532 * weight[2] + 1.1428228145150205 * weight[3] + 0.7519330326867741 * weight[4] + bias)) + 0.0 * log(1.0 - logistic(0.30154734233361247 * weight[0] + -0.03471176970524331 * weight[1] + -1.168678037619532 * weight[2] + 1.1428228145150205 * weight[3] + 0.7519330326867741 * weight[4] + bias))) + -(1.0 * log(logistic(0.7910319470430469 * weight[0] + -0.9093874547947389 * weight[1] + 1.4027943109360992 * weight[2] + -1.4018510627922809 * weight[3] + 0.5868570938002703 * weight[4] + bias)) + 0.0 * log(1.0 - logistic(0.7910319470430469 * weight[0] + -0.9093874547947389 * weight[1] + 
1.4027943109360992 * weight[2] + -1.4018510627922809 * weight[3] + 0.5868570938002703 * weight[4] + bias))) + -(0.0 * log(logistic(2.1904556258099785 * weight[0] + -0.9905363251306883 * weight[1] + -0.5662977296027719 * weight[2] + 0.09965136508764122 * weight[3] + -0.5034756541161992 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(2.1904556258099785 * weight[0] + -0.9905363251306883 * weight[1] + -0.5662977296027719 * weight[2] + 0.09965136508764122 * weight[3] + -0.5034756541161992 * weight[4] + bias))) + -(1.0 * log(logistic(-1.5506634310661327 * weight[0] + 0.06856297480602733 * weight[1] + -1.0623037137261049 * weight[2] + 0.4735924306351816 * weight[3] + -0.9194242342338034 * weight[4] + bias)) + 0.0 * log(1.0 - logistic(-1.5506634310661327 * weight[0] + 0.06856297480602733 * weight[1] + -1.0623037137261049 * weight[2] + 0.4735924306351816 * weight[3] + -0.9194242342338034 * weight[4] + bias))) + -(0.0 * log(logistic(1.5499344050175399 * weight[0] + -0.7832532923362371 * weight[1] + -0.3220615162056756 * weight[2] + 0.8135172173696698 * weight[3] + -1.2308643164339552 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(1.5499344050175399 * weight[0] + -0.7832532923362371 * weight[1] + -0.3220615162056756 * weight[2] + 0.8135172173696698 * weight[3] + -1.2308643164339552 * weight[4] + bias))) + -(1.0 * log(logistic(0.22745993460412942 * weight[0] + 1.307142754282428 * weight[1] + -1.6074832345612275 * weight[2] + 0.1846338585323042 * weight[3] + 0.2598827942484235 * weight[4] + bias)) + 0.0 * log(1.0 - logistic(0.22745993460412942 * weight[0] + 1.307142754282428 * weight[1] + -1.6074832345612275 * weight[2] + 0.1846338585323042 * weight[3] + 0.2598827942484235 * weight[4] + bias))) + -(0.0 * log(logistic(0.78182287177731 * weight[0] + -1.236950710878082 * weight[1] + -1.3204566130842763 * weight[2] + 0.5219415656168976 * weight[3] + 0.29698467323318606 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(0.78182287177731 * weight[0] + -1.236950710878082 * weight[1] + 
-1.3204566130842763 * weight[2] + 0.5219415656168976 * weight[3] + 0.29698467323318606 * weight[4] + bias)))>

Regularized Objective

L2 regularization component:

l2_reg = gp.quicksum(w[j] * w[j] for j in range(n_features))
l2_reg
<gurobi.QuadExpr: 0.0 + [ weight[0] ^ 2 + weight[1] ^ 2 + weight[2] ^ 2 + weight[3] ^ 2 + weight[4] ^ 2 ]>

\[ \begin{align} \min_{w, b} \quad & \frac{1}{n}\sum_{i=1}^n \left[-y_i \log(p_i) - (1-y_i)\log(1-p_i)\right] + \lambda \|w\|^2 \\ \end{align} \]

Set the objective using a resultant variable:

objvar = model.addVar(lb=-GRB.INFINITY, name="objective")
model.addConstr(objvar == sum_loss / n_samples + lambda_reg * l2_reg)

model.setObjective(objvar, GRB.MINIMIZE)

Fairness Constraint

Compute average predictions per group:

group0_idx = [i for i in range(n_samples) if protected_groups[i] == 0]
group1_idx = [i for i in range(n_samples) if protected_groups[i] == 1]

avg_p0 = gp.quicksum(p[i] for i in group0_idx) / len(group0_idx)
avg_p1 = gp.quicksum(p[i] for i in group1_idx) / len(group1_idx)

Enforce maximum difference over the training set:

gap = model.addVar(lb=-fairness_tol, ub=fairness_tol, name="fairness_gap")
model.addConstr(gap == avg_p0 - avg_p1)

\[ \begin{align} \left| \frac{1}{|G_0|}\sum_{i \in G_0} p_i - \frac{1}{|G_1|}\sum_{i \in G_1} p_i \right| \leq \epsilon \end{align} \]

Run Local Optimization

model.Params.OptimalityTarget = 1
model.optimize()
                           Residuals
Iter     Objective      Primal     Dual      Compl     Step     Time
   0   9.29389150e-01  1.99e-02  9.00e+00  3.76e+00  0.00e+00     0s
   1   9.36440941e-01  1.82e-03  2.33e-16  9.79e-01  1.00e+00     0s
   2   1.50531648e-01  3.42e-01  7.86e-05  1.50e-01  1.00e+00     0s
   3   1.50235162e-01  4.88e-01  2.23e-04  1.50e-01  1.00e+00     0s
   4   1.50164053e-01  7.22e-01  2.69e-04  1.50e-01  1.00e+00     0s
   5   1.50098036e-01  3.38e-01  6.85e-05  1.50e-01  1.00e+00     0s
  ...
  21   4.11209993e-01  7.75e-06  2.76e-06  1.90e-05  1.00e+00     0s
  22   4.11200317e-01  2.28e-08  6.92e-09  1.43e-05  1.00e+00     0s
  23   4.11186199e-01  1.52e-08  5.45e-09  1.74e-07  1.00e+00     0s
  24   4.11186180e-01  8.87e-14  2.70e-14  1.65e-07  1.00e+00     0s
  25   4.11186016e-01  2.06e-12  7.37e-13  1.00e-09  1.00e+00     0s
First-order optimal solution

Important

Running the global solver afterwards confirms global optimality in 12 seconds

Extract Learned Parameters

import numpy as np

if model.status in [GRB.OPTIMAL, GRB.LOCALLY_OPTIMAL]:
    learned_w = np.array([w[j].X for j in range(n_features)])
    learned_b = b.X

def predict_proba(X_new):
    """ For given features, return output probabilities """
    linear = X_new @ learned_w + learned_b
    return 1.0 / (1.0 + np.exp(-linear))

Check against the training data:

probs_pred = predict_proba(X)
preds = (probs_pred >= 0.5).astype(int)
accuracy = np.mean(preds == y)
print(f"Accuracy: {accuracy:.1%}")

avg_pred_g0 = np.mean(probs_pred[protected_groups == 0])
avg_pred_g1 = np.mean(probs_pred[protected_groups == 1])
fairness_gap = abs(avg_pred_g0 - avg_pred_g1)
print(f"Fairness gap: {fairness_gap:.4f}")
Accuracy: 83.3%
Fairness gap: 0.1000

Fairness Gap

Fair Logistic Regression: Takeaways

ML models can be trained via local optimization:

  • Directly model nonlinear loss functions
  • Easily incorporate constraints

Complex nonlinear expressions are handled natively:

  • Preferable to carry complex expressions right through
    • (Here we did not introduce variables for \(p_i\))
  • NL barrier often converges better if we avoid auxiliary variables
sum_loss / n_samples + lambda_reg * l2_reg
<gurobi.NLExpr: (0.0 + -(0.0 * log(logistic(0.4967141530112327 * weight[0] + -0.13826430117118466 * weight[1] + 0.6476885381006925 * weight[2] + 1.5230298564080254 * weight[3] + -0.23415337472333597 * weight[4] + bias)) + 1.0 * log(1.0 - logistic(0.4967141530112327 * weight[0] + -0.13826430117118466 * weight[1] + 0.6476885381006925 * weight[2] + 1.5230298564080254 * weight[3] + -0.23415337472333597 * weight[4] + bias))) + ... ) / 30.0 + 0.0001 * weight[0]**2 + 0.0001 * weight[1]**2 + 0.0001 * weight[2]**2 + 0.0001 * weight[3]**2 + 0.0001 * weight[4]**2>

(output abridged: one such term per training sample, 30 in total)
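Printed in full, the expression is unwieldy, but it is simply the regularized cross-entropy loss averaged over the samples. A plain-Python sketch of the same formula (function and variable names here are illustrative, not part of the model):

```python
import math

def logistic(z):
    # Numerically stable sigmoid: avoid overflow in exp for large |z|
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    return math.exp(z) / (1.0 + math.exp(z))

def regularized_loss(weights, bias, X, y, lam):
    # Mean cross-entropy over the samples plus an L2 penalty on the weights
    n = len(X)
    total = 0.0
    for xi, yi in zip(X, y):
        p = logistic(sum(w * f for w, f in zip(weights, xi)) + bias)
        total -= yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return total / n + lam * sum(w * w for w in weights)
```

At weights and bias all zero, every sample contributes log 2 to the mean, so the loss starts at about 0.693; the Gurobi model minimizes the symbolic version of this same quantity.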

Problem Description

Pack \(N\) circles in a unit square to maximize the sum of their radii

Circle Packing: Problem Formulation

Problem data is just the number of circles to place:

circles = range(50)

Decision variables (\(x_i\), \(y_i\), \(r_i\)) are simple to state:

import gurobipy as gp
from gurobipy import GRB

model = gp.Model(env=env)
x = model.addVars(circles, ub=1.0, name="x") # x-coordinate of centers
y = model.addVars(circles, ub=1.0, name="y") # y-coordinate of centers
r = model.addVars(circles, ub=0.5, name="r") # Radii (≤0.5 to fit in square)

Circle Packing: Constraints

Pairs of circles cannot overlap (nonconvex constraint!):

from itertools import combinations

for i, j in combinations(circles, r=2):
    model.addConstr(
        (x[i] - x[j])**2 + (y[i] - y[j])**2 >= (r[i] + r[j])**2
    )

Each circle must stay inside unit square:

for i in circles:
    model.addConstr(x[i] - r[i] >= 0)
    model.addConstr(x[i] + r[i] <= 1)
    model.addConstr(y[i] - r[i] >= 0)
    model.addConstr(y[i] + r[i] <= 1)

Maximize the total radius:

model.setObjective(r.sum(), GRB.MAXIMIZE)

Run Local Optimization

model.Params.OptimalityTarget = 1
model.optimize()
                           Residuals
Iter     Objective      Primal     Dual      Compl     Step     Time
   0   1.25058900e+01  1.03e+01  5.53e+02  8.33e+01  0.00e+00     0s
   1   6.33643586e+00  1.81e-01  2.97e+02  1.60e+00  9.89e-01     0s
   2   4.04667381e+00  3.70e-02  3.25e+02  6.20e-01  8.54e-01     0s
   3   9.10471390e-01  4.22e-02  2.25e+02  9.03e-01  9.27e-01     0s
   4   7.98674534e-01  1.05e-02  2.91e+02  9.80e-01  9.25e-01     0s
   5   7.08424644e-01  2.13e-02  3.24e+02  9.11e-01  6.38e-01     0s
  ...
 219   3.55966588e+00  3.10e-09  6.99e-08  1.43e-05  1.00e+00     0s
 220   3.56145171e+00  6.07e-06  1.71e-03  7.33e-07  8.37e-01     0s
 221   3.56179122e+00  2.69e-09  2.08e-07  1.65e-07  1.00e+00     0s
 222   3.56179121e+00  6.55e-11  9.64e-10  1.65e-07  1.00e+00     0s
 223   3.56181582e+00  5.98e-10  7.97e-09  9.99e-10  1.00e+00     0s
First-order optimal solution

Solutions For Different Starting Points (\(N=50\))

# Extract solution
xsol = [x[i].X for i in circles]
ysol = [y[i].X for i in circles]
rsol = [r[i].X for i in circles]
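Since the barrier algorithm only guarantees a first-order optimal point, it is worth verifying feasibility of the extracted solution directly. A small checker (illustrative helper, not part of the Gurobi API):

```python
from itertools import combinations

def check_packing(xs, ys, rs, tol=1e-6):
    """Return True if all circles fit in the unit square and no pair overlaps."""
    # Containment: each circle must lie inside [0, 1] x [0, 1]
    for x, y, r in zip(xs, ys, rs):
        if x - r < -tol or x + r > 1 + tol or y - r < -tol or y + r > 1 + tol:
            return False
    # Non-overlap: center distance at least the sum of radii
    for i, j in combinations(range(len(rs)), 2):
        if (xs[i] - xs[j])**2 + (ys[i] - ys[j])**2 < (rs[i] + rs[j])**2 - tol:
            return False
    return True
```

Calling `check_packing(xsol, ysol, rsol)` on the extracted arrays confirms that the reported point satisfies all constraints to tolerance.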

Problem Description

  • Actuated robot arm
    • Fixed geometry
    • Known weight / inertia
  • Actuator forces are limited
  • Goal: Take the fastest path between initial and final positions
  • Model using time-discretised equations of motion
  • Nonlinear constraints, complex expressions

Robot Arm: Problem Formulation

State Variables (at each time step \(i = 0, 1, \ldots, n_h\)):

  • \(ρ_i, \dot{ρ}_i\): radial position and velocity
  • \(θ_i, \dot{θ}_i\): azimuthal angle and angular velocity
  • \(φ_i, \dot{φ}_i\): polar angle and angular velocity

Control Variables (at each time step \(i = 0, 1, \ldots, n_h\)):

  • \(u_ρ^i\): control input for radial motion, \(|u_ρ^i| \leq u_ρ^{\max}\)
  • \(u_θ^i\): control input for azimuthal rotation, \(|u_θ^i| \leq u_θ^{\max}\)
  • \(u_φ^i\): control input for polar rotation, \(|u_φ^i| \leq u_φ^{\max}\)

Time Variable:

  • \(t_f\): final time (to be minimized)
  • \(h = \frac{t_f}{n_h}\): variable time step size

Define Variables

State variables:

# Assumes problem data (arm length L, control limits max_u_*, boundary
# values rho_0, ..., phi_n) is defined earlier
from math import pi

import gurobipy as gp
from gurobipy import GRB, nlfunc

m = gp.Model(env=env)

# Number of time intervals
nh = 200

# Position
rho = m.addVars(nh+1, lb=0, ub=L, name="rho")
the = m.addVars(nh+1, lb=-pi, ub=pi, name="the")
phi = m.addVars(nh+1, lb=0, ub=pi, name="phi")

# Velocities
rho_dot = m.addVars(nh+1, lb=-GRB.INFINITY, name="rho_dot")
the_dot = m.addVars(nh+1, lb=-GRB.INFINITY, name="the_dot")
phi_dot = m.addVars(nh+1, lb=-GRB.INFINITY, name="phi_dot")

Control Inputs (with bounds as limits):

u_rho = m.addVars(nh+1, lb=-max_u_rho, ub=max_u_rho, name="u_rho")
u_the = m.addVars(nh+1, lb=-max_u_the, ub=max_u_the, name="u_the")
u_phi = m.addVars(nh+1, lb=-max_u_phi, ub=max_u_phi, name="u_phi")
tf = m.addVar(lb=0, name="tf")

Objective is to minimize final time:

m.setObjective(tf, GRB.MINIMIZE)

Dynamics Constraints: Position Update

\[ \begin{align} ρ_j &= ρ_{j-1} + \frac{h}{2}(\dot{ρ}_j + \dot{ρ}_{j-1}), \quad j = 1, \ldots, n_h \\ θ_j &= θ_{j-1} + \frac{h}{2}(\dot{θ}_j + \dot{θ}_{j-1}), \quad j = 1, \ldots, n_h \\ φ_j &= φ_{j-1} + \frac{h}{2}(\dot{φ}_j + \dot{φ}_{j-1}), \quad j = 1, \ldots, n_h \end{align} \]

for j in range(1, nh+1):
    m.addConstr(rho[j] == rho[j-1] + 0.5*(tf/nh)*(rho_dot[j] + rho_dot[j-1]))
    m.addConstr(the[j] == the[j-1] + 0.5*(tf/nh)*(the_dot[j] + the_dot[j-1]))
    m.addConstr(phi[j] == phi[j-1] + 0.5*(tf/nh)*(phi_dot[j] + phi_dot[j-1]))

Boundary conditions:

rho[0].lb = rho[0].ub = rho_0
the[0].lb = the[0].ub = the_0
phi[0].lb = phi[0].ub = phi_0
rho_dot[0].lb = rho_dot[0].ub = 0
the_dot[0].lb = the_dot[0].ub = 0
phi_dot[0].lb = phi_dot[0].ub = 0
rho[nh].lb = rho[nh].ub = rho_n
the[nh].lb = the[nh].ub = the_n
phi[nh].lb = phi[nh].ub = phi_n
rho_dot[nh].lb = rho_dot[nh].ub = 0
the_dot[nh].lb = the_dot[nh].ub = 0
phi_dot[nh].lb = phi_dot[nh].ub = 0

Dynamics Constraints: Velocity Update

\[ \begin{align} I_θ^j &= \frac{(L-ρ_j)^3 + ρ_j^3}{3} \sin^2(φ_j) \\ I_φ^j &= \frac{(L-ρ_j)^3 + ρ_j^3}{3} \end{align} \]

\[ \begin{align} \dot{ρ}_j &= \dot{ρ}_{j-1} + \frac{h}{2L}(u_ρ^j + u_ρ^{j-1}), \quad j = 1, \ldots, n_h \\ \dot{θ}_j &= \dot{θ}_{j-1} + \frac{h}{2}\left(\frac{u_θ^j}{I_θ^j} + \frac{u_θ^{j-1}}{I_θ^{j-1}}\right), \quad j = 1, \ldots, n_h \\ \dot{φ}_j &= \dot{φ}_{j-1} + \frac{h}{2}\left(\frac{u_φ^j}{I_φ^j} + \frac{u_φ^{j-1}}{I_φ^{j-1}}\right), \quad j = 1, \ldots, n_h \end{align} \]

for j in range(1, nh+1):
    m.addConstr(rho_dot[j] == rho_dot[j-1] + 0.5*(tf/nh)*(u_rho[j] + u_rho[j-1])/L)
    m.addConstr(the_dot[j] == the_dot[j-1] + 0.5*(tf/nh) * (
        u_the[j] / (((L-rho[j])**3 + rho[j]**3) * nlfunc.sin(phi[j])**2 / 3.0) + 
        u_the[j-1] / (((L-rho[j-1])**3 + rho[j-1]**3) * nlfunc.sin(phi[j-1])**2 / 3.0)
    ))
    m.addConstr(phi_dot[j] == phi_dot[j-1] + 0.5*(tf/nh) * (
        u_phi[j] / (((L-rho[j])**3 + rho[j]**3) / 3.0) + 
        u_phi[j-1] / (((L-rho[j-1])**3 + rho[j-1]**3) / 3.0)
    ))

Run Local Optimization

m.Params.OptimalityTarget = 1
m.optimize()
                           Residuals
Iter     Objective      Primal     Dual      Compl     Step     Time
   0   9.76273934e-01  2.37e+00  9.00e+00  1.70e+01  0.00e+00     0s
   1   1.77239676e+00  2.14e+00  3.65e+03  1.50e+01  9.72e-02     0s
   2   1.77239676e-02  2.08e+00  1.76e+03  1.45e+01  3.08e-02     0s
   3   3.89952943e-02  1.99e+00  2.32e+03  1.30e+01  4.40e-02     0s
   4   1.58136087e+00  2.90e-02  3.97e+04  5.00e+00  1.00e+00     0s
   5   1.56389201e+00  6.31e-02  3.95e+04  4.85e+00  1.46e-02     0s
  ...
  60   9.14991648e+00  1.93e-11  1.32e-08  1.43e-05  1.00e+00     0s
  61   9.14152788e+00  7.07e-07  2.34e-04  1.46e-06  1.00e+00     0s
  62   9.14149417e+00  1.43e-10  2.15e-08  1.65e-07  1.00e+00     0s
  63   9.14149419e+00  5.05e-15  7.94e-12  1.65e-07  1.00e+00     0s
  64   9.14139610e+00  1.04e-10  1.79e-10  1.00e-09  1.00e+00     0s
First-order optimal solution

Important

  • The same solution was confirmed optimal by the global solver in 1200 seconds
  • No starting point was provided

Visualization

Robot arm animation
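To render frames like the animation above, the solved trajectory can be converted from the model's spherical coordinates to Cartesian points. A minimal sketch, assuming \(φ\) is the polar angle measured from the vertical axis (a plotting convention, not stated on the slides):

```python
from math import sin, cos, pi

def to_cartesian(rho, theta, phi):
    """Spherical (radius, azimuthal angle, polar angle) -> Cartesian (x, y, z)."""
    return (
        rho * sin(phi) * cos(theta),
        rho * sin(phi) * sin(theta),
        rho * cos(phi),
    )

# Applied to a solved model (attribute .X assumes optimize() has run):
# points = [to_cartesian(rho[j].X, the[j].X, phi[j].X) for j in range(nh + 1)]
```

Plotting the resulting points over the `nh + 1` time steps reproduces the arm-tip path shown in the animation.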