❮❮❮ ❮❮❮  
Sorting and ordering with linear programming
Sorting and ordering parameters and variables in mixed binary linear programming. Permutations encoded via permutation matrices.

Tags: data
Reading time: 6 min.

Sources

Basics

Permutations

A permutation on $n$ elements is a bijection from $N:=\lbrace{}1,\dotsc, n\rbrace$ onto itself.

The permutations on $N$ form a group $S_n$ under function composition. There are $n!$ permutations. The group identity is the identity mapping. The inverse of $\tau$ is both the left- and right-inverse.

A transposition is a permutation exchanging only $2$ elements of $N$ and leaving the remaining ones fixed. The permutation group $S_n$ is generated by the transpositions. The transpositions involving a single fixed element are a minimal generator.

Digraph representation

The digraph representation of a permutation consists of a collection of directed cycles on $N$. The non-trivial cycles are the ones in the cycle encoding notation of a permutation. For a permutation $\tau$, the digraph representation of $\tau^{-1}$ has the same cycle structure as the digraph representation of $\tau$, but with inverse direction (except for singletons = fixed points of $\tau$).

Permutation matrices

A $n$-dimensional permutation matrix has entries in $\lbrace{}0,1\rbrace$ and all row and column sums equal one. The inverse of a permutation matrix is its transpose and again a permutation matrix. It is both the left and right-inverse.

The digraph representation of $\tau$ is a directed graph on $N$. Its associated edge matrix is a permutation matrix.

Commuting diagrams and isomorphisms

The group operations on $S_n$ commute under taking the digraph representation and permutation matrices. The operation on the digraph representation is following arcs in the first and second digraph.

Linear algebra and LP

The linear relaxation hull of the permutation matrices (the Birkhoff polytope) has the permutation matrices as its extreme points.

Optimising an LP over a double stochastic non-negative matrix yields an optimum in a (scaled) permutation matrix.

MBLP formulations

Usages

Sorting parameters

input N in {1, 2, ...}
local I := {1, 2, ..., N}

input v[I] in ]-inf, inf[

variable S[I] in ]-inf, inf[
variable P[I][I] in {0, 1}

constraint PermMatrixRowSum[i in I] is
  1 = sum[j in I] P[i][j]
constraint PermMatrixColSum[j in I] is
  1 = sum[i in I] P[i][j]
constraint Assignment[i in I] is
  S[i] = sum[j in I] P[i][j] v[j]
constraint Sorting[i in {1, ..., N - 1}] is
  S[i] <= S[i + 1]

Sorting variables

input N in {1, 2, ...}
local I := {1, 2, ..., N}

variable V[I] in ]-inf, inf[
variable S[I] in ]-inf, inf[
variable P[I][I] in {0, 1}

constraint PermMatrixRowSum[i in I] is
  1 = sum[j in I] P[i][j]
constraint PermMatrixColSum[j in I] is
  1 = sum[i in I] P[i][j]
constraint Sorting[i in I] is
  S[i] = sum[j in I] P[i][j] V[j]
constraint Sorting[i in {1, ..., N - 1}] is
  S[i] <= S[i + 1]

This fragment is quadratic.

Bounded non-negative variables

input N in {1, 2, ...}
local I := {1, 2, ..., N}

input M in [0, inf[

variable V[I] in [0, M]
variable S[I] in [0, M]
variable P[I][I] in {0, 1}
variable SP[I][I] in [0, M] // sorted parts

constraint PermMatrixRowSum[i in I] is
  1 = sum[j in I] P[i][j]
constraint PermMatrixColSum[j in I] is
  1 = sum[i in I] P[i][j]

constraint AssignmentAdd[i in I] is
  S[i] = sum[j in I] SP[i][j]
constraint Assignment[i, j in I] is
  SP[i][j] = P[i][j] * V[j]

constraint Sorting[i in {1, ..., N - 1}] is
  S[i] <= S[i + 1]

The last constraint is quadratic but can the linearized.

Real variables

Jointly sorting parameters and bounded non-negative variables

input N_P in {1, 2, ...}
local I_P := {1, 2, ..., N_P}

input N_V in {1, 2, ...}
local I_V := {N_P + 1, ..., N_P + N_V}

param I = I_P cup I_V

variable P[I][I] in {0, 1}

constraint PermMatrixRowSum[i in I] is
  1 = sum[j in I] P[i][j]
constraint PermMatrixColSum[j in I] is
  1 = sum[i in I] P[i][j]

input v[I_P] in ]-inf, inf[
input M in [0, inf[
variable V[I_V] in [0, M]
variable S[I] in ]-inf, inf[

variable SP[I][I_V] in [0, M]

constraint AssignmentAdd[i in I] is
  S[i] = sum[j in I_V] SP[i][j] + sum[j in I_P] P[i][j] * v[j]
constraint Assignment[i in I, j in I_V] is
  SP[i][j] = P[i][j] * V[j]

constraint Sorting[i in {1, ..., N_P + N_V - 1}] is
  S[i] <= S[i + 1]

The last constraint is quadratic but can the linearized.

Optimisations

Objective function

From Solvermax: Objectives matter: Sorting using a MIP model. Use

maximize sum[i in I] S[i] * i

Alternative: Optimise for large gaps between consecutive elements:

maximize sum[i in {1, ..., N - 1}] S[i + 1] - S[i]

But this reduces to the range of the elements:

sum[i in {1, ..., N - 1}] S[i + 1] - S[i] =  S[N] - S[1] = max[i in I] v[i] - min[i in I] v[i]

But one can look at non-trivial nested ranges

maximize sum[i = 1 to n/2] S[n + 1 - i] - S[i]

If one counts all $\binom{n}{2}$ possible ranges, then this becomes

maximize sum[i = 1 to n/2] 2 * i * (S[n + 1 - i] - S[i])

Extra equality

From Kalvenhagen: Sorting using a MIP model . Add

constraint Sum is 
  sum[i in I] v[i] = sum[i in I] S[i]

This one ist implied by the permutation setup, but not that costly.

Use sorting networks

Proposed in Kalvenhagen: Sorting using a MIP model.

Tune solver parameters

From OR in an OB World: Solver Parameters Matter: tune solver parameters.

Experiments

Setup for the MBLP

  1. Data
    1. Choose a real-valued distribution with bounded (non-negative) support. Generate an i.i.d sample from it.
    2. OR. Pick a finite set of data points. Shuffle it randomly (via uniform random permutation).
  2. Sort the data.
  3. Solve the MBLP.
  4. Compare the sorted data with the MBLP solution.

Evaluating the LP relaxation

  1. Data
    1. Choose a real-valued distribution with bounded (non-negative) support. Generate an i.i.d sample from it.
    2. OR. Pick a finite set of data points. Shuffle it randomly (via uniform random permutation).
  2. Sort the data and derive the permutation matrix.
  3. Solve the LP relaxation and extract the relaxed permutation matrix.
  4. Evaluate

Note: A non-constant objective function and sorting parameters should result in an MBLP solution

Axes of variation