Gurobi’s Ill-Conditioning Explainer

OR 2023 – Hamburg

Matthias Miltenberger
Ed Klotz

Introduction and Motivation

  • OR practitioners have to deal with unexpected, counterintuitive results

  • We need an organized approach to deal with the unexpected
  • We already have the Infeasibility Explainer (computeIIS)
  • Based on our customer tickets, an Ill-Conditioning Explainer would help

Quick Refresher on Condition Numbers

Definition of condition number \(\kappa\) of a square matrix \(B\) for solving \(Bx=b\)

\(\kappa(B) = \|B\|\cdot\|B^{-1}\|\) provides upper bound on input error amplification in the solution of a linear system \(Bx=b\). It is the normed reciprocal of the distance to singularity.

  • Effect of the input error \(\delta b\) on the output error \(\delta x\):

\[\frac{\|\delta x\|}{\|x\|} \leq \kappa(B) \cdot \frac{\|\delta b\|}{\|b\|}\]

  • Error of \(10^{-16}\) in the input \(b\) and \(\kappa \approx 10^{10}\) can cause error up to \(10^{-6}\) in \(x\)

Double precision

Default feasibility tolerance

Basic Idea

  • Approximate the distance to singularity by applying FeasRelax to minimize the sum of infeasibilities of:

\[ \begin{align} B^Ty = 0 \\ e^Ty = 1 \\ y \; \text{free} \end{align} \]

  • Rows of \(B\) with very small \(y\) component are filtered out
  • Support of \(y\) provides a row-based certificate of the ill-conditioning
  • Add constraint \(e^Ty=1\) to prevent \(y=0\)

\[ \begin{align} By = 0 \\ e^Ty = 1 \\ y \; \text{free} \end{align} \]

  • Columns of \(B\) with very small \(y\) component are filtered out
  • Support of \(y\) provides a column-based certificate of the ill-conditioning

Third case: Examine angles of pairs of matrix rows and columns

  • Vectors \(u\) and \(v\) are almost parallel if \(u^Tv - \|u\|\cdot\|v\| < \epsilon\).

Challenges

  • The FeasRelax problem is an LP whose structural columns comprise an ill-conditioned basis matrix
    • Set NumericFocus=3 and ScaleFlag=2 if row and column ratios are large
    • Provide more custom parameters via PRM file


  • Too many rows or columns in the output; hard to determine source of ill-conditioning
    • Try different explanation method (row, column, angle)
    • Sometimes one method provides a much smaller explanation

Sources of Numerical Issues

  1. Large matrix coefficient ratios
    • Especially rows and columns with common intersection

    • Example: \[\begin{align} & B=\begin{pmatrix} 1 & 0\\ M & 1 \end{pmatrix} \text{ has inverse } B^{-1}=\begin{pmatrix} 1 & 0\\ -M & 1 \end{pmatrix}\\ & \Rightarrow \|B\| \geq M \text{ and } \|B^{-1}\| \geq M\\ & \Rightarrow \kappa(B) = \|B\|\cdot\|B^{-1}\| \color{red}{\geq M^2} \end{align}\]

    • Remedies:

      • Avoid unnecessarily large big M values
      • Avoid unnecessarily small units of measurement

Sources of Numerical Issues

  1. Inconsistent or unnecessarily truncated data
    • Values that should be equal have slight differences
    • Example:
         c306: 0.416666667 x80 – 100 x90 = 0  
        [...]
       c11360: 0.416666666 x80 – 100 x90 = 0
     
    • Values appear to be calculated inconsistently
    • Values are only calculated to 9 digits -> bigger perturbation than necessary

Sources of Numerical Issues

  1. Hidden mixtures of large and small coefficients
    • \(\kappa(B)=\|B\|\cdot\|B^{-1}\|\) is large if \(\|B\|\) or \(\|B^{-1}\|\) is large
    • \(\|B^{-1}\|\) can also be large for other reasons
    • Cascade: \[ B=\begin{pmatrix} 1 & -2 & & & \\ & 1 & -2 & & \\ & & \ddots & \ddots & \\ & & & 1 & -2 \\ & & & & 1 \\ \end{pmatrix} ,\; B^{-1}=\begin{pmatrix} 1 & 2 & 4 &\dots & 2^{n-1} \\ & 1 & 2 & \dots & 2^{n-2} \\ & & \ddots & \ddots & \vdots \\ & & & 1 & 2 \\ & & & & 1 \\ \end{pmatrix} \]

Quick Start Guide

  • Installation: python -m pip install gurobi-modelanalyzer

  • Basic usage:

    import gurobipy as gp
    import gurobi_modelanalyzer as gma
    m = gp.read("badmodel.mps")
    m.optimize()
    print(m.kappa)                         # print kappa of final basis
    gma.kappa_explain(m)                   # row-based explanation
    gma.kappa_explain(m, expltype="COLS")  # column-based explanation
    print(gma.angle_explain(m))            # angle explanation
  • Modified afiro instance for demonstration:

    Subject To
     R09: - X01 + X02 + X03 = 0
     R09x: - 0.9999999 X01 + X02 + X03 = 0
    [...]

Output on Maliciously Modified afiro

  • kappa_explain() will generate a new LP or MPS file, containing the ill-conditioning certificate:
  • afiro_kappaexplain.lp:
Minimize
  0 X36 + 0 X04 + 0 X15 + 0 X16 + 0 X26 + 0 X38 + 0 X37
Subject To
 GRB_Combined_Row: 0.0303868836044176 X23 + 4.80518e-10 X01
   - 4.65661e-10 X03 = 0
 (mult=2696322.968477607)R09x: - 0.9999999000000001 X01 + X03 = 0
 (mult=-2696322.6896988587)R09: - X01 + X03 = 0
 (mult=0.2787787486643817)X46: - X03 + 0.109 X22 <= 0
 (mult=0.030386883604417606)R19: X23 - X22 + X24 + X25 = 0
 (mult=0.030386883604417606)X45: - X25 <= 0
 (mult=0.030386883604417606)X48: 0.301 X01 - X24 <= 0
Bounds
End

Angle Explanation on Modified afiro Instance

  • angle_explain() returns a list of tuples:
    1. almost parallel rows
    2. almost parallel columns
    3. associated model
    (
      [(<gurobi.Constr R09>, <gurobi.Constr R09x>)],
      [],
      <gurobi.Model Continuous instance basismodel:
        28 constrs, 28 vars, No parameter changes>
    )

“Real” Example: neos-1603965 (MIPLIB 2017)

  • Use model.relax() to relax integrality conditions
  • Final condition number 1.60000000016e+21 (original model)
  • angle_explain() does not reveal any almost parallel rows or columns
Subject To
 GRB_Combined_Row: = 1.000000015655
 (mult=0.999999999815)R24991: C8008 - C8009 >= 0
 (mult=9.9999999981499e-11)R9004: - 1e+10 C8008 <= 0
 (mult=9.9999999981499e-11)R13004: - 0.15 C7002 + 1e+10 C8009 <= 1e+10
 (mult=1.4999999997225e-11)R1030: C3030 - C7001 <= 0
 (mult=-1.4999999997225e-11)R1966: C3966 - C7001 <= 0
 (mult=-1.4999999997225e-11)R2966: C4966 - C7002 <= 0
 (mult=-1.4999999997225e-11)R28014: C0030 + C3030 = 7165
 (mult=1.4999999997225e-11)R28950: C0966 + C3966 + C4966 = 8221
 (mult=1.12499999982e-11)R0030: 1.3333333333 C0030 = 0
 (mult=-1.12499999982e-11)R0966: 1.3333333333 C0966 = 0
  • Don’t ignore rows with small \(y\) multiplier
  • big M values of \(10^{10}\) are the issue here

More Functionality

  • converttofractions():
    • takes a list of decimal values and tries to represent them as fractions
    • useful to clean-up “dirty” coefficients
  • matrix_bitmap():
    • wrapper for spy() (matplotlib) to inspect sparsity pattern
    • useful to identify cascades

Summary

  • Track down reasons for numerical issues

  • Open-source (Apache-2.0) on GitHub:

    https://github.com/Gurobi/gurobi-modelanalyzer

  • Install via pip:

    python -m pip install gurobi-modelanalyzer

  • Currently only supports LPs (use model.relax() to inspect root LP of MIPs)

  • Still in pre-release stage (version 0.2.0)

Get involved and share your feedback and ideas!

Thanks!