Gurobi Summit EMEAI 2024
The Decision Intelligence Summit
https://gurobi.github.io/slides/
Senior Developer
Application | Phenomenon | Nonlinearity |
---|---|---|
Finance | risk | quadratic (convex) |
Truss topology | physical forces | quadratic (convex) |
Pooling (petrochemical, mining, agriculture) | mixing products | quadratic nonconvex |
electricity distribution (ACOPF) | Alternative Current | \(\sin\) and \(\cos\) (can be made quadratic) |
machine learning | — | logistic function, \(\tanh\) |
chemical engineering | chemical reactions | anything |
Ideally, we seek to solve \[ \begin{aligned} & \min f(x)\\ & \text{s.t:}\\ & g_i(x) \le 0, i = 1,\ldots, m \\ & x_j \in \mathbb Z, j \in \mathcal I\\ & l \le x \le u \end{aligned} \]
Any segment connecting two points inside the region is inside the region.
There exists two points in the region the segment connecting them is not completely in the region.
New in Gurobi 12
\[ \begin{aligned} & \min c^T x + x^T Q^0 x\\ & \text{s.t:}\\ & {a_k}^T x + x^T Q^k x \le b_k, k = 1,\ldots, m\\ & x_j \in \mathbb Z, j \in \mathcal I\\ & l \le x \le u \end{aligned} \]
Through simple algebra, can be represented as SOC:
Very powerful but modeling sometimes far from obvious.
Not all forms recognized by solvers
Note
Our barrier algorithm needs everything converted
to SOC or linear by presolve.
New in Gurobi 12
Barrier implicitly handles bounds on variables
At each node of the tree:
In MILP and MIQP continuous relaxation usually solved by simplex.
In MISOCP/MIQCP, continuous relaxation solved by barrier.
Drop quadratic constraints and solve an LP relaxation at each node.
Integer feasible nodes are not necessarily solutions.
An exponential number of cutting planes is needed to approximate a convex quadratic form.
From \[ \sum_{i = 1}^n x_i^2 \le x_0^2, x_0 \ge 0 \]
\[ (A) \left\{ \begin{aligned} &\sum_{i = 1}^n x_i^2 \le x_0^2,\\ &x_0 \ge 0 \end{aligned} \right. \]
\[ (B) \left\{ \begin{aligned} & \sum_{i = 1}^n y_i \le x_0\\ & x_i^2 \le y_i x_0 & & i = 1, \ldots, n\\ & y \ge 0, x \ge 0 \end{aligned} \right. \]
Gurobi tries to deal with it but can be an issue.
MIQCPMethod
PreMIQCPForm
\[ \begin{aligned} & \min c^T x + x^T Q^0 x\\ & \text{s.t:}\\ & {a_k}^T x + x^T Q^k x \le b_k, & &k = 1,\ldots, m\\ & x_j \in \mathbb Z, j \in \mathcal I\\ & l \le x \le u \end{aligned} \]
Note
For simplicity sake, consider model without integer variables, in remainder.
NonConvex
Since Gurobi 11
Default behavior change: \(2\) (was \(1\)).
\[ \begin{aligned} & \min c^T x + \left\langle Q^0, Z \right\rangle\\ & \text{s.t:}\\ & {a_k}^T x + \color{red}{\left\langle Q^k , Z \right\rangle} \le b_k, & &k = 1,\ldots, m\\ & \color{red}{Z = x x^T} \\ & l \le x \le u\\ \\ &(\left\langle Q , Z \right\rangle = \sum_i \sum_j q_{ij} z_{ij}) \end{aligned} \]
Warning
Similar to disaggregation, infeasibilities in bilinear formulation can be larger than in the original model
Consider the square case: \(z = x^2\)
It is convex: \[ z^-(x_i, x_i) = x_i^2 \]
Can be dealt with by OA.
\(z^+(x_i, x_i)\) is given by the secant: \[ z^+_{ii} = (u + l) x_i - l \cdot u \]
\[ z^-_{ij} = \max \left\{ \begin{aligned} & 𝑙_j x_i+ 𝑙_i 𝑥_j - 𝑙_i 𝑙_j \\ & 𝑢_j x_i + 𝑢_i x_j - u_i u_j \end{aligned} \right\} \le z_{ij} \le z^+_{ij} = \min \left\{ \begin{aligned} & 𝑙_j x_i+ u_i 𝑥_j - u_i l_j \\ & 𝑢_j x_i + l_i x_j - l_i u_j \end{aligned} \right\} \]
RLTCuts
BQPCuts
SDPCuts
OBBT
:
Overloaded operators
Nonlinear functions
Internal representation of a nonlinear constraint:
index | opcode | data | parent |
---|---|---|---|
0 | plus | -1 | -1 |
1 | sin | -1 | 0 |
2 | multiply | -1 | 1 |
3 | constant | 2.5 | 2 |
4 | variable | 1 | 2 |
5 | variable | 2 | 0 |
Warning
In non-Python API nonlinear constraints
are entered with this representation
Similar to bilinear
Construct a disaggregated formulation
For each function, compute lower/upper envelope
Spatial branching to refine them
Additional difficulties:
Since Gurobi 9.0, can state \(𝑦=𝑓(𝑥)\)
Library of predefined functions include:
Gurobi 9 and 10: PWL approximations
Gurobi 11: Global optimization through FuncNonLinear
attribute/parameter.
Differences
Similarity
Disaggregated formulation for spatial branch-and-bound could be solved by Gurobi 11
Warning
In Gurobi 12 default value for FuncNonLinear
attribute parameter is 1
No PWL used unless specified by the user
In this example we use multinomial logistic regression to classify digits
clf = LogisticRegression(C=50.0 / 500, penalty="l1", solver="saga", tol=0.1, random_state=4)
pipeline = make_pipeline(clf)
pipeline.fit(X, y)
Pipeline(steps=[('logisticregression', LogisticRegression(C=0.1, penalty='l1', random_state=4, solver='saga', tol=0.1))])In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
Pipeline(steps=[('logisticregression', LogisticRegression(C=0.1, penalty='l1', random_state=4, solver='saga', tol=0.1))])
LogisticRegression(C=0.1, penalty='l1', random_state=4, solver='saga', tol=0.1)
\(\bar x\) is a grayscale image of \(28\times 28\) (\(=784\)) pixels
The output is a vector \(y\) of length 10 (each entry corresponding to a digit)
The image is classified according to the largest entry of \(y\)
The entries of the vector are computed using SoftMax and are probability estimates
Predicted label is 1
y = [0. 0.98 0. 0. 0. 0. 0. 0. 0.01 0. ]
The multinomial regression computes one regression vector for each label:
\[ r_l(x) = a_l^T x + b_l, \, l=0,\ldots,9 \]
Then the SoftMax function is applied to obtain the vector \(y\):
\[ y_l = g_l(x) = \frac{e^{r_l(x)}}{\sum_{k=0}^{9} e^{r_k(x)}}, \, l=0,\ldots, 9 \]
Tip
\(g\) is typically a function that is now easy to write in Gurobi 12!
\[\begin{alignat}{3} \max_{0 \le x \le 1, y} \quad & y_w\\ \mbox{s.t.} \quad & \|x - \bar{x}\|_1 \le \delta\\ & g(x) = y \\ & y_l \le 0.25 \end{alignat}\]
Notations
\(g\) the prediction function of the logistic regression
\(\delta\) the width of the neighbourhood
m = gp.Model()
delta = 6
right_label = 1
wrong_label = 5
# Create input/output variables
x = m.addMVar(xbar.shape, lb=0.0, ub=1.0, name="x")
y = m.addMVar(10, lb=-np.inf, name="y")
# Constrain x to 1-norm neighbourhood around \bar x
eta = m.addMVar(xbar.shape, lb=0, ub=1, name="abs_diff")
m.addConstr(eta >= x - xbar)
m.addConstr(eta >= -x + xbar)
m.addConstr(eta.sum() <= delta);
# Upper bound on probability of right label
y[right_label].UB = 0.25
# Objective: Maximize the probablilty of wrong label
m.setObjective(y[wrong_label], gp.GRB.MAXIMIZE)
# Insert the neural network in the gurobipy model
pred_constr = gml.add_predictor_constr(m, pipeline, x, y, predict_function="predict_proba")
Gurobi Optimizer version 12.0.0 build v12.0.0rc0 (mac64[arm] - Darwin 23.6.0 23H222)
CPU model: Apple M3 Pro
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 1589 rows, 1588 columns and 3940 nonzeros
Model fingerprint: 0xa32573da
Model has 20 general nonlinear constraints (20 nonlinear terms)
Variable types: 1588 continuous, 0 integer (0 binary)
Coefficient statistics:
Matrix range [1e+00, 1e+00]
Objective range [1e+00, 1e+00]
Bounds range [2e-01, 1e+00]
RHS range [4e-03, 6e+00]
Presolve model has 20 nlconstr
Added 11 variables to disaggregate expressions.
Presolve removed 1442 rows and 966 columns
Presolve time: 0.01s
Presolved: 248 rows, 634 columns, 4960 nonzeros
Presolved model has 10 bilinear constraint(s)
Presolved model has 10 nonlinear constraint(s)
Warning: Model contains variables with very large bounds participating
in nonlinear terms.
Presolve was not able to compute smaller bounds for these variables.
Consider bounding these variables or reformulating the model.
Solving non-convex MINLP
Variable types: 634 continuous, 0 integer (0 binary)
Root relaxation: objective 1.000000e+00, 282 iterations, 0.00 seconds (0.00 work units)
Nodes | Current Node | Objective Bounds | Work
Expl Unexpl | Obj Depth IntInf | Incumbent BestBd Gap | It/Node Time
0 0 1.00000 0 20 - 1.00000 - - 0s
0 0 1.00000 0 16 - 1.00000 - - 0s
0 0 1.00000 0 14 - 1.00000 - - 0s
0 0 1.00000 0 14 - 1.00000 - - 0s
0 0 1.00000 0 14 - 1.00000 - - 0s
H 0 0 0.7396035 1.00000 35.2% - 0s
0 2 1.00000 0 14 0.73960 1.00000 35.2% - 0s
Explored 360 nodes (3690 simplex iterations) in 0.19 seconds (0.37 work units)
Thread count was 8 (of 8 available processors)
Solution count 1: 0.739604
Optimal solution found (tolerance 1.00e-04)
Best objective 7.396035216634e-01, best bound 7.396198121843e-01, gap 0.0022%
Label probabilites for perturbed image:
[0. 0.24 0. 0. 0. 0.74 0. 0. 0.01 0. ]
# We want to write y_j = e^z_j / sum_j=1^k e^z_j
# y_j are the output variable z_j = input @ coefs + intercepts
# Store the e^z_j in a nonlinear expression
exponentials = nlfunc.exp(linear_predictor)
# The denominator is the sum over the first axis
denominator = exponentials.sum(axis=1)
# Voila!
gp_model.addConstr(
output == exponentials / denominator[:, np.newaxis]
)
linear_predictor_vars = gp_model.addMVar(
output.shape, lb=-gp.GRB.INFINITY, name="linear_predictor"
)
gp_model.addConstr(linear_predictor_vars == linear_predictor, name="linreg")
predictor_model.linear_predictor = linear_predictor_vars
exponentials = gp_model.addMVar(output.shape)
exponentials = exponentials
denominator = gp_model.addMVar((output.shape[0]), lb=epsilon)
num_gc = gp_model.NumGenConstrs
for index in np.ndindex(output.shape):
gp_model.addGenConstrExp(
linear_predictor_vars[index],
exponentials[index],
name=predictor_model._indexed_name(index, "exponential"),
)
gp_model.update()
try:
attributes = predictor_model.attributes
except AttributeError:
attributes = {}
for gen_constr in gp_model.getGenConstrs()[num_gc:]:
for attr, val in attributes.items():
gen_constr.setAttr(attr, val)
gp_model.addConstr(denominator == exponentials.sum(axis=1))
gp_model.addConstrs(
output[i, :] * denominator[i] == exponentials[i, :]
for i in range(output.shape[0])
)
Using Gurobi 12 beta:
v10 | v11 | v12 | ||||
---|---|---|---|---|---|---|
#models | # viol. | Speedup | # viol. | Speedup | # viol. | |
50 | 15 | 19.5 | 0 | 35.2 | 0 |
Presolve
Works on the full nonlinear expressions:
Branch-and-bound search
Can check original nonlinear constraints for feasibility:
Consider models from public library minlplib.org
#models | Speedup | |
---|---|---|
> 0 | 419 | 1.00 |
> 1 | 178 | 0.98 |
> 10 | 121 | 0.87 |
#models | Speedup | |
---|---|---|
> 0 | 422 | 1.30 |
> 1 | 182 | 1.63 |
> 10 | 127 | 1.69 |
v11 univ | v12 univ | v12 expr | |
---|---|---|---|
Unsolved | 30 | 36 | 16 |
Sub-Optimal | 0 | 0 | 3 |
Cases where cannot branch but solution still violated
Sub-optimal termination (unable to solve some node relaxations)
Best objective 7.000338563098e+00, best bound 6.998594162348e+00, gap 0.0249%
©️ Gurobi Optimization, 2024