- SciPy v0.18.1 Reference Guide
- Optimization and root finding ( scipy.optimize )
scipy.optimize.linear_sum_assignment ¶
Solve the linear sum assignment problem.
The linear sum assignment problem is also known as minimum weight matching in bipartite graphs. A problem instance is described by a matrix C, where each C[i,j] is the cost of matching vertex i of the first partite set (a “worker”) and vertex j of the second set (a “job”). The goal is to find a complete assignment of workers to jobs of minimal cost.
Formally, let X be a boolean matrix where \(X[i,j] = 1\) iff row i is assigned to column j. Then the optimal assignment has cost
s.t. each row is assignment to at most one column, and each column to at most one row.
This function can also solve a generalization of the classic assignment problem where the cost matrix is rectangular. If it has more rows than columns, then not every row needs to be assigned to a column, and vice versa.
The method used is the Hungarian algorithm, also known as the Munkres or Kuhn-Munkres algorithm.
New in version 0.17.0.
- http://csclab.murraystate.edu/bob.pilgrim/445/munkres.html
- Harold W. Kuhn. The Hungarian Method for the assignment problem. Naval Research Logistics Quarterly , 2:83-97, 1955.
- Harold W. Kuhn. Variants of the Hungarian method for assignment problems. Naval Research Logistics Quarterly , 3: 253-258, 1956.
- Munkres, J. Algorithms for the Assignment and Transportation Problems. J. SIAM , 5(1):32-38, March, 1957.
- https://en.wikipedia.org/wiki/Hungarian_algorithm
Previous topic
scipy.optimize.linprog_verbose_callback
scipy.optimize.approx_fprime
- © Copyright 2008-2016, The Scipy community.
- Last updated on Sep 19, 2016.
- Created using Sphinx 1.2.3.
Linear assignment with non-perfect matching
Dec 8, 2020
The linear assignment problem (or simply assignment problem) is the problem of finding a matching between two sets that minimizes the sum of pair-wise assignment costs. This can be expressed as finding a matching (or independent edge set) in a bipartite graph \(G = (U, V, E)\) that minimizes the sum of edge weights. The edge weights may be positive or negative and the bipartite graph does not need to be complete : if there is no edge between two vertices then they cannot be associated. Note that a maximum-weight assignment can be obtained by negating the weights and finding a minimum-weight assignment.
The simplest form of the assignment problem assumes that the bipartite graph is balanced (the two sets of vertices are the same size) and that there exists a perfect matching (in which every vertex has a match). Let \(n\) be the number of elements in each set and let \(C\) be a square matrix of size \(n \times n\) that contains the edge weights. Missing edges are represented by \(\infty\), such that \(C_{i j} < \infty \Leftrightarrow (i, j) \in E\). The assignment problem can then be clearly expressed as an integer linear program : (note that the problem is not actually solved using a general-purpose ILP solver, it is just a convenient framework in which to express the problem)
The constraint that the sum of each row and column is equal to one ensures that each element has exactly one match. However, what happens when the graph is not balanced or does not contain a perfect matching? We cannot enforce the sums to be equal to one. Which problem should be solved?
I recommend reading the tech report “On Minimum-Cost Assignments in Unbalanced Bipartite Graphs” by Lyle Ramshaw and Robert E. Tarjan. I will summarize some of the main points here.
Let us consider a more general, rectangular problem of size \(r \times n\) and assume (without loss of generality) that \(r \le n\). If \(r = n\) then the problem is balanced, if \(r < n\) it is unbalanced.
Clearly an unbalanced probem cannot have a perfect matching, since there will be at least \(n - r\) unmatched elements in the larger set. However, it may be possible to find a matching in which every vertex in the smaller set has a match. This is referred to as a one-sided perfect matching and the optimization problem can be expressed:
The inequality constraints enforce that each element in the smaller set has exactly one match while each element in the larger set has at most one match. Ramshaw and Tarjan outline a method to reduce from an unbalanced problem to a balanced problem while preserving sparsity. A simple alternative is to add \(n - r\) rows of zeros and then exclude these edges from the eventual solution. Most libraries for the assignment problem solve either the balanced or unbalanced version of this problem (see the later section).
However, whether balanced or unbalanced, it may still occur that the constraint set is infeasible, meaning that there does not exist a (one-sided) perfect matching. Let \(\nu(W) \le r\) denote the size of the largest matching in the graph. If \(\nu(W) < r\), then there does not exist a one-sided perfect matching and all possible matchings are imperfect.
Ramshaw and Tarjan actually outline three different versions of the assignment problem:
- perfect matching
- imperfect matching
- minimum-weight matching
The imperfect matching problem is to find the matching of size \(\nu(G)\) with the minimum cost. The minimum-weight matching problem is to find the matching of any size with the minimum cost. If \(\nu(G) = r = n\), then perfect and imperfect matching coincide. Otherwise, when \(\nu(G) < n\), there does not exist a perfect matching.
The imperfect matching problem can be expressed
and the minimum-weight matching problem can be expressed
Ramshaw and Tarjan show that both of these problems can be reduced to finding a perfect matching in a balanced graph. When using linear assignment, we should carefully consider which of the three problems we actually want to solve.
In support of minimum-weight matching
The minimum-weight matching problem is often the most natural choice, since it puts no constraint on the size of the matching. To illustrate the difference between this and the other problems, consider the following balanced problem:
The solution to perfect (or imperfect) matching is to choose -1 and -2 for a total score of -3 and a cardinality of 2. The solution to minimum-weight matching is to choose -4 with a cardinality of 1.
Minimum-weight matching will never select an edge with positive cost: it is better to simply leave it unselected. Edges with zero cost have no impact.
It may be more natural to consider the maximization of positive weights than the minimization of negative costs.
Min-cost imperfect matching with positive weights
Be careful when solving imperfect matching problems with positive edge weights! I would avoid this situation altogether due to the tension that exists between maximizing the number of matches and minimizing the sum of (positive) costs. This may result in the unexpected behaviour that adding an edge to the graph increases the minimum cost. For example, compare the following two problems:
Quick and dirty transformations
Ramshaw and Tarjan above describes some clever techniques to transform imperfect and minimum-weight matching problems into perfect matching problems while preserving sparsity. Here we describe some quick and dirty alternatives.
We can always transform an unbalanced (and therefore imperfect) problem into a balanced problem by adding \(n - r\) rows of zeros. The resulting balanced graph has a perfect matching if and only if the original unbalanced graph had a matching of size \(r\) (in which every vertex in the smaller set is matched).
If we need to solve imperfect matching but we only have a solver for perfect matching, it suffices to replace the infinite edge weights with a large, finite cost (e.g. larger than the total absolute value of all weights). The resulting graph must contain a perfect matching since it is a complete bipartite graph, and each high-cost edge is worth more than all original edges combined. The high-cost edges can be excluded at the end.
Most existing packages either solve perfect, one-sided perfect or imperfect matching. To use one of these solvers for the minimum-weight matching problem, it suffices to replace all positive edges (including infinite edges) with zero. If using a solver that leverages sparsity, it is better to use the technique described by Ramshaw and Tarjan.
Python packages
The table below outlines the different behaviour of several popular packages. The code that was used to determine the behaviour is available as a Jupyter notebook .
- Docs »
- 4. API ( l-o ) »
- 4.5. Optimization and root finding ( scipy.optimize ) »
- 4.5.4.4. scipy.optimize.linear_sum_assignment
- View page source
4.5.4.4. scipy.optimize.linear_sum_assignment ¶
Solve the linear sum assignment problem.
The linear sum assignment problem is also known as minimum weight matching in bipartite graphs. A problem instance is described by a matrix C, where each C[i,j] is the cost of matching vertex i of the first partite set (a “worker”) and vertex j of the second set (a “job”). The goal is to find a complete assignment of workers to jobs of minimal cost.
Formally, let X be a boolean matrix where \(X[i,j] = 1\) iff row i is assigned to column j. Then the optimal assignment has cost
s.t. each row is assignment to at most one column, and each column to at most one row.
This function can also solve a generalization of the classic assignment problem where the cost matrix is rectangular. If it has more rows than columns, then not every row needs to be assigned to a column, and vice versa.
The method used is the Hungarian algorithm, also known as the Munkres or Kuhn-Munkres algorithm.
New in version 0.17.0.
- http://www.public.iastate.edu/~ddoty/HungarianAlgorithm.html
- Harold W. Kuhn. The Hungarian Method for the assignment problem. Naval Research Logistics Quarterly , 2:83-97, 1955.
- Harold W. Kuhn. Variants of the Hungarian method for assignment problems. Naval Research Logistics Quarterly , 3: 253-258, 1956.
- Munkres, J. Algorithms for the Assignment and Transportation Problems. J. SIAM , 5(1):32-38, March, 1957.
- https://en.wikipedia.org/wiki/Hungarian_algorithm
IMAGES
VIDEO