Standard Pooling Problem

Introduction

The pooling problem is a challenging problem in the petrochemical refining, wastewater treatment, and mining industries. This problem can be regarded as a generalization of the minimum-cost flow problem and the blending problem. It is indeed important because of the significant savings it can generate, so it comes at no surprise that it has been studied extensively since Haverly pointed out the non-linear structure of this problem in 1978.

Proposed Solution

Two alternative formulations based on Bilinear Programming, a subclass of non-convex Quadratic Programming problems, are presented, namely:

  • P-formulation (concentration).
  • Q-formulation (proportion).

Key Features of the Solution

  • Deployment of Bilinear Programs.
  • Optimization based on Spatial Branch and Bound (sB&B).
  • Benchmark run on an instance of the Standard Pooling Problem to compare the performance of the formulations mentioned above.

Added Value

It is shown that solving Bilinear Programs with Gurobi is as easy as configuring a single global parameter. The dramatic difference in the performance of alternative formulations is also highlighted.

Modeling Example

Access the Jupyter Notebook 

 

 

 


Objective

One of the new features of Gurobi 9.0 is the addition of a bilinear solver, which enables finding the optimal solution of non-convex quadratic programming problems (i.e. QPs, QCQPs, MIQPs and MIQCQPs). This notebook will show you how to use this feature by tackling an instance of the standard pooling problem, one of the most well-known bilinear programs. Moreover, we will present two alternative formulations, cast as Quadratically Constrained Quadratic Problems, to contrast their performance when calling the optimizer.

Problem Description

The Minimum-Cost Flow Problem (MCFP) seeks to find the cheapest way of sending a certain amount of flow from a set of source nodes to a set of target nodes, possibly via transshipment nodes⁠, in a directed capacitated network. The Blending Problem is a type of MCFP with only source and target nodes, where raw materials with different attribute qualities are blended together to create end products in such a way that their attribute qualities are within tolerances.

The Pooling Problem combines features of both problems, as flow streams from different sources are mixed at intermediate pools and blended again at the target nodes. The non-linearity is, in fact, the direct result of considering pools, as the quality of a given attribute at a pool —defined as the weighted average of the qualities of the incoming streams— is an unknown quantity and thus needs to be captured by a decision variable. We refer to this problem as the Standard Pooling Problem when the network can be represented by a tripartite graph, i.e. three disjoint sets of nodes such that no nodes within the same set are adjacent.

In a nutshell, it can be stated as follows: Given a list of source nodes with raw materials containing known attribute qualities, what is the cheapest way of mixing these materials at intermediate pools so as to meet the demand and tolerances at multiple target nodes? (Gupte et al., 2017) [4]. Several different formulations for the Standard Pooling Problem and its extensions exist in the literature, which can be classified into two main categories: one that consists of flow and quality variables, and the other that uses flow proportions instead of quality variables. Both categories will be considered in this notebook.

Problem Instance

As an illustrative example, we will solve the second pooling problem posed by Rehfeldt and Tisljar in 1997 and cited by Audet et al. in 2004:

To that end, let’s declare the required data structures to represent this problem instance:

In [1]:

Solution Approach

Mathematical programming is a declarative approach where the modeler formulates a mathematical optimization model that captures the key aspects of a complex decision problem. The Gurobi optimizer solves such models using state-of-the-art mathematics and computer science.

A mathematical optimization model has five components, namely:

  • Sets and indices.
  • Parameters.
  • Decision variables.
  • Objective function(s).
  • Constraints.

A quadratic constraint that involves only products of disjoint pairs of variables is often called a bilinear constraint, and a model that contains bilinear constraints is often called a Bilinear Program. Bilinear constraints are a special case of non-convex quadratic constraints.

This type of problems are typically solved using spatial Branch and Bound (sB&B). This algorithm explores the entire search space, so it provides a globally valid lower bound on the optimal objective value and —given enough time— it will find a globally optimal solution (subject to tolerances). The interested reader is referred to references [3], [6] and [7].

We now present two alternative Bilinear Programs for the Standard Pooling Problem:

P-formulation (Concentration)

Sets and Indices

$G=(V,E)$: Directed graph.

$i,j \in V$: Set of nodes.

$(i,j) \in E \subset V \times V$: Set of edges.

$N(i)^+ = \{j \in V \mid (i,j) \in E \}$: Set of successor nodes receiving outflow from node $i$.

$N(j)^- = \{i \in V \mid (i,j) \in E \}$: Set of predecessor nodes sending inflow to node $i$.

$k \in \text{Attrs}$: Set of attributes.

$s \in \text{Sources} \subset V$: Set of source nodes, i.e. $N(s)^-= \emptyset$.

$t \in \text{Targets} \subset V$: Set of target nodes, i.e. $N(t)^+= \emptyset$.

$p \in \text{Pools} = V \setminus (\text{Sources} \cup \text{Targets})$: Set of pools.

Parameters

$\text{Cost}_s \in \mathbb{R}^+$: Cost of acquiring one unit of raw material at source node $s$.

$\text{Supply}_s \in \mathbb{R}^+$: Maximum number of units of raw material available at source node $s$.

$\text{Content}_{s,k} \in \mathbb{R}^+$: Content of attribute $k$ in raw material at source node $s$.

$\text{Price}_t \in \mathbb{R}^+$: Price for selling one unit of final blend at target node $t$.

$\text{Demand}_t \in \mathbb{R}^+$: Minimum number of units required of final blend at target node $t$.

$\text{Min_tol}_{t,k} \in \mathbb{R}^+$: Minimum tolerance for attribute $k$ in final blend at target node $t$.

$\text{Max_tol}_{t,k} \in \mathbb{R}^+$: Maximum tolerance for attribute $k$ in final blend at target node $t$.

$\text{Cap}_p \in \mathbb{R}^+$: Maximum Capacity to store intermediate blend at pool $p$.

$\text{UB}_{i,j}\in \mathbb{R}^+$: Maximum flow from node $i$ to node $j$.

Decision Variables

$\text{flow}_{i,j} \in [0, \text{UB}_{i,j}]$: Flow from node $i$ to node $j$.

$\text{quality}_{p,k} \in \mathbb{R}^+$: Concentration of attribute $k$ at pool $p$.

Objective Function

Profit: Maximize total profits.

\begin{equation} \text{Max} \quad Z = \sum_{t \in \text{Targets}}{\sum_{i \in N(t)^-}{\text{Price}_t \cdot \text{flow}_{i,t}}} – \sum_{s \in \text{Sources}}{\sum_{j \in N(s)^+}{\text{Cost}_s \cdot \text{flow}_{s,j}}} \tag{0} \end{equation}

Constraints

Flow conservation: Total inflow of pool $p$ must be equal to its toal outflow (nothing is stored in them).

\begin{equation} \sum_{t \in N(p)^+}{\text{flow}_{p,t}} – \sum_{s \in N(p)^-}{\text{flow}_{s,p}} = 0 \quad \forall p \in \text{Pools} \tag{1} \end{equation}

Source capacity: Total outflow of source $s$ cannot exceed its capacity.

\begin{equation} \sum_{j \in N(s)^+}{\text{flow}_{s,j}} \leq \text{Supply}_s \quad \forall s \in \text{Sources} \tag{2} \end{equation}

Pool capacity:Total outflow of pool $p$ cannot exceed its capacity.

\begin{equation} \sum_{t \in N(p)^+}{\text{flow}_{p,t}} \leq \text{Cap}_p \quad \forall p \in \text{Pools} \tag{3} \end{equation}

Target demand: Total inflow of target $t$ must at least meet its minimum demand.

\begin{equation} \sum_{i \in N(t)^-}{\text{flow}_{i,t}} \geq \text{Demand}_t \quad \forall t \in \text{Targets} \tag{4} \end{equation}

Pool concentration: Concentration of attribute $k$ at pool $p$ is expressed as the weighted average (linear blending) of the concentrations associated to the incoming flows (notice the bilinear terms on the right-hand side).

\begin{equation} \sum_{s \in N(p)^-}{\text{Content}_{s,k} \cdot \text{flow}_{s,p}} = \text{quality}_{p,k} \cdot \sum_{t \in N(p)^+}{\text{flow}_{p,t}} \quad \forall (p,k) \in \text{Pools} \times \text{Attrs} \tag{5} \end{equation}

Target tolerances: Concentration of attribute $k$ at target $t$ is also the result of linear blending, and must be within tolerances (notice the bilinear terms on the second expression of the left-hand side).

\begin{equation} \sum_{s \in N(t)^- \cap \text{Sources}}{\text{Content}_{s,k} \cdot \text{flow}_{s,t}}+ \sum_{p \in N(t)^- \cap \text{Pools}}{\text{quality}_{p,k} \cdot \text{flow}_{p,t}} \geq \text{Min_tol}_{t,k} \cdot \sum_{i \in N(t)^-}{\text{flow}_{i,t}} \quad \forall (t,k) \in \text{Targets} \times \text{Attrs} \tag{6.1} \end{equation}

\begin{equation} \sum_{s \in N(t)^- \cap \text{Sources}}{\text{Content}_{s,k} \cdot \text{flow}_{s,t}}+ \sum_{p \in N(t)^- \cap \text{Pools}}{\text{quality}_{p,k} \cdot \text{flow}_{p,t}} \leq \text{Max_tol}_{t,k} \cdot \sum_{i \in N(t)^-}{\text{flow}_{i,t}} \quad \forall (t,k) \in \text{Targets} \times \text{Attrs} \tag{6.2} \end{equation}

The number of bilinear terms in this formulation is proportional to the number of attributes. An alternative formulation relies on decision variables that represent fractions of flow instead of concentrations so that the bilinear terms are no longer associated to the number of attributes. Two types of decision variables may be used:

  • fraction of total inflow at pool 𝑝 , coming from source 𝑠 .
  • fraction of total outflow at pool 𝑝 , going to terminal 𝑑 .

A model based on the first option is now presented:

Q-formulation (Proportion)

Decision Variables

$\text{flow}_{i,j} \in [0, \text{UB}_{i,j}]$: Flow from node $i$ to node $j$.

$\text{prop}_{s,p} \in \mathbb{R}^+$: fraction of total inflow at pool $p$, coming from source $s$.

Objective Function

  • Profit: Maximize total profits (notice the bilinear terms on the second expression).

\begin{equation} \text{Max} \quad Z = \sum_{t \in \text{Targets}}{\sum_{i \in N(t)^-}{\text{Price}_t \cdot \text{flow}_{i,t}}} – \sum_{s \in \text{Sources}}{\text{cost}_s \cdot \left( \sum_{t \in N(s)^+ \cap \text{Targets}}{\text{flow}_{s,t}} + \sum_{p \in N(s)^+ \cap \text{Pools}}{\text{prop}_{s,p} \cdot \sum_{t \in N(p)^+}{\text{flow}_{p,t}}} \right) } \tag{0} \end{equation}

Constraints

  • Source capacity: Total outflow of source $s$ cannot exceed its capacity (notice the bilinear terms on the first expression of the left-hand side).

\begin{equation} \sum_{p \in N(s)^+ \cap \text{Pools}}{\text{prop}_{s,p} \cdot \sum_{t \in N(p)^+}{\text{flow}_{p,t}}} + \sum_{t \in N(s)^+ \cap \text{Targets}}{\text{flow}_{s,t}} \leq \text{Supply}_s \quad \forall s \in \text{Sources} \tag{1} \end{equation}

  • Pool capacity: Total outflow of pool 𝑝 cannot exceed its capacity.

\begin{equation} \sum_{t \in N(p)^+}{\text{flow}_{p,t}} \leq \text{Cap}_p \quad \forall p \in \text{Pools} \tag{2} \end{equation}

  • Target demand: Total inflow of target 𝑑 must at least meet its minimum demand.

\begin{equation} \sum_{i \in N(t)^-}{\text{flow}_{i,t}} \geq \text{Demand}_t \quad \forall t \in \text{Targets} \tag{3} \end{equation}

  • Pool inflow: The sum of the contributions of all incoming sources to pool 𝑝 must equal one.

\begin{equation} \sum_{s \in N(p)^-}{\text{prop}_{s,p}} = 1 \quad \forall p \in \text{Pools} \tag{4} \end{equation}

  • Target tolerances: Concentration of attribute π‘˜ at target 𝑑 is also the result of linear blending, and must be within tolerances (notice the bilinear terms on the second expression of the left-hand side).

\begin{equation} \sum_{s \in N(t)^- \cap \text{Sources}}{\text{Content}_{s,k} \cdot \text{flow}_{s,t}} + \sum_{p \in N(t)^- \cap \text{Pools}}{\text{flow}_{p,t} \cdot \sum_{s \in N(p)^-}{\text{content}_{s,k} \cdot \text{prop}_{s,p}}} \geq \text{Min_tol}_{t,k} \cdot \sum_{i \in N(t)^-}{\text{flow}_{i,t}} \\ \forall (t,k) \in \text{Targets} \times \text{Attrs} \tag{5.1} \end{equation}

\begin{equation} \sum_{s \in N(t)^- \cap \text{Sources}}{\text{Content}_{s,k} \cdot \text{flow}_{s,t}} + \sum_{p \in N(t)^- \cap \text{Pools}}{\text{flow}_{p,t} \cdot \sum_{s \in N(p)^-}{\text{content}_{s,k} \cdot \text{prop}_{s,p}}} \leq \text{Max_tol}_{t,k} \cdot \sum_{i \in N(t)^-}{\text{flow}_{i,t}} \\ \forall (t,k) \in \text{Targets} \times \text{Attrs} \tag{5.2} \end{equation}

One drawback is that if some of the source-to-pool edges have flow capacity, we need to define additional constraints instead of just specifying upper bounds on the associated decision variables. Such constraints can be defined with bilinear terms as follows:

  • Flow limit: Flow from source 𝑠 to pool 𝑝 cannot exceed the installed capacity (notice the bilinear terms on the left-hand side).

\begin{equation} \text{prop}_{s,p} \cdot \sum_{t \in N(p)^+}{\text{flow}_{p,t}} \leq \text{UB}_{s,p} \quad \forall (i,j) \in E \cap \left( \text{Sources} \times \text{Pools} \right) \mid \text{UB}_{i,j} < \infty \tag{6} \end{equation}

Python Implementation

Solving Bilinear Programs with Gurobi is as easy as configuring the global parameter nonConvex. When setting this parameter to a value of 2, non-convex quadratic problems are solved by means of translating them into bilinear form and applying sB&B. We will first deploy the P-formulation model, and then compare it with the Q-formulation model:

P-formulation (Concentration)

In [2]:

   

 

  • 24 decision variables.
  • 10 linear constraints.
  • 20 bilinear constraints.
  • a linear objective function.

As can be seen, we still observe a gap of 53.30% after reaching the time limit of five minutes (at this point, the incumbent solution induces a total profit of 434,301.95 USD). In fact, even after 20 minutes, the solver does not make much progress in closing the gap.

Q-Formulation (proportion)

Let’s now see how the q-formulation model performs:

In [3]:

 

The Q-formulation for this instance has:

  • 16 decision variables.
  • 7 linear constraints.
  • 15 bilinear constraints.
  • a bilinear objective function.

Notice it has fewer decision variables and also fewer bilinear constraints. Now Gurobi was able to find the optimal solution of 439,182.59 USD in less than one second, which is five orders of magnitude faster than before.

Analysis

Let’s see the optimal flows found using the Q-formulation:

Flow from Sources to Targets

In [4]:

Out [4]:

Flow from Pools to Targets

In [5]:

Out [5]:

Flow from Sources to Pools

In [6]:

Out [6]:

Conclusion

This notebook showed how easy it is to solve Bilinear Programs using Gurobi. It also highlighted the dramatic difference in performance of alternative formulations when solving challenging problems, such as the Standard Pooling Problem. It is thus of utmost importance to analyze carefully the context of the problem at hand, and to weigh the pros and cons of alternative models.

References

  1. Alfaki, M. (2012). Models and solution methods for the pooling problem.
  2. Audet, C., Brimberg, J., Hansen, P., Digabel, S. L., & Mladenović, N. (2004). Pooling problem: Alternate formulations and solution methods. Management science, 50(6), 761-776.
  3. Dombrowski, J. (2015, June 07). McCormick envelopes. Retrieved from https://optimization.mccormick.northwestern.edu/index.php/McCormick_envelopes
  4. Gupte, A., Ahmed, S., Dey, S. S., & Cheon, M. S. (2017). Relaxations and discretizations for the pooling problem. Journal of Global Optimization, 67(3), 631-669.
  5. Haverly, C. A. (1978). Studies of the behavior of recursion for the pooling problem. Acm sigmap bulletin, (25), 19-28.
  6. Liberti, L. (2008). Introduction to global optimization. Ecole Polytechnique.
  7. Zhuang E. (2015, June 06). Spatial branch and bound method. Retrieved from https://optimization.mccormick.northwestern.edu/index.php/Spatial_branch_and_bound_method