By Dr. Cara Touretzky, Dr. Robert Luce, Dr. David Torres Sanchez, and Dr. Byron Tasseff 

Mixed-integer linear programming (MILP or MIP) models are a core element of decision-making processes across industries. Given the recent rise in GPU-accelerated computing power, it is no surprise that the optimization community is curious about the impact this hardware will have on the algorithms used to solve MIP models.   

Solving LPs is a fundamental step within modern MIP solvers. So to answer the question, “How do GPUs improve the performance of MIP algorithms?” we must also address the question, “How do GPUs improve the performance of LP algorithms?” 

We’ll explore these questions and potential future implications in this post.  

Solver Performance on GPUs: Key Takeaways 

The performance of a solver on a particular model depends on the capabilities of both the algorithm and the hardware used to solve it. For example, algorithmic improvements delivered with every new Gurobi release yield performance speedups without the user needing to invest in new hardware. Conversely, a sophisticated optimization algorithm that leverages powerful hardware, like the latest GPUs and CPUs, can solve problems that previously seemed intractable.

Based on the recent advancements in solver algorithms and hardware, here are the key takeaways for LP and MIP: 

  • Large LPs that are difficult to solve with the simplex or barrier methods may perform well with the PDHG algorithm.  
  • MIP models that spend significant time at the root node (i.e., solving a large LP) may also benefit from using PDHG at the root node.  
  • The majority of today’s LPs and MIPs still perform better with the established barrier or simplex methods.   

We can already see that using first-order methods on GPUs to solve LPs (like Gurobi’s PDHG algorithm implementation) may yield a performance improvement for some very large LP model instances. However, accelerating the Branch-and-Bound algorithms for MIPs using GPUs is still an active area of research, which we discuss in detail below.    
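The takeaways above can be sketched as a rough decision rule. The nonzero-count threshold and the returned labels below are illustrative assumptions, not Gurobi settings; the right choice for a given model can only be confirmed by benchmarking.

```python
# Rough decision sketch for picking a starting LP algorithm, following the
# takeaways above. VERY_LARGE is a hypothetical threshold, not a Gurobi
# default; always benchmark your own models.

def choose_lp_algorithm(num_nonzeros: int, has_gpu: bool,
                        barrier_struggles: bool = False) -> str:
    """Return a suggested starting algorithm for an LP instance."""
    VERY_LARGE = 100_000_000  # hypothetical nonzero-count threshold

    # Very large LPs that defeat simplex/barrier are PDHG candidates,
    # especially when a GPU is available.
    if has_gpu and (num_nonzeros >= VERY_LARGE or barrier_struggles):
        return "pdhg"

    # Most LPs today are still best served by the established methods.
    return "simplex-or-barrier"
```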

How Do GPUs Improve the Performance of LP Algorithms? 

Consider the main steps of solving an LP, shown in Figure 1. Notice that: 

  • All three of the main algorithm options (simplex, barrier, and PDHG) can run on a CPU.  (Gurobi even offers a concurrent method, where multiple LP algorithms are run in parallel.) 
  • Barrier and PDHG require an additional step, Crossover, to achieve a basic solution. Basic solutions are used as a starting point for the Branch-and-Bound search for MIP models. Generally, the time spent in Crossover depends on the quality of the solution from barrier or PDHG. Most PDHG implementations terminate by default with lower-quality solutions, so Crossover will likely require additional time.   
  • The PDHG algorithm is the most amenable to running on GPU architectures. It relies only on sparse matrix-vector multiplication (unlike barrier and simplex), which can be performed efficiently on a GPU. Gurobi’s PDHG method is available as part of version 13.0. 
  • While Gurobi does not yet provide a publicly available GPU-enabled barrier method, it is well established that matrix factorization operations used in the barrier method can be implemented on a GPU.    

Figure 1. LP Framework.
*Gurobi’s LP barrier method for GPUs is not yet publicly available.   

First-order methods like PDHG and components of the barrier method can both run on a GPU; however, PDHG has received significant attention in recent years. This is because the mechanics of the PDHG algorithm are much simpler to implement on GPUs. If you’re wondering why the primal and dual simplex methods are not part of this discussion, it is because these algorithms are already difficult to parallelize on CPU hardware, and even more so on GPUs.   
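To see why PDHG maps so naturally onto GPUs, consider a bare-bones sketch of the iteration for an equality-form LP (minimize cᵀx subject to Ax = b, x ≥ 0). This is an illustrative toy, not Gurobi's implementation; production PDLP-style solvers add preconditioning, restarts, and adaptive step sizes. The entire loop consists of matrix-vector products and elementwise updates:

```python
# Minimal PDHG sketch for an equality-form LP:  min c'x  s.t.  Ax = b, x >= 0.
# Each iteration needs only matrix-vector products (A x and A'y) plus
# elementwise work -- exactly the operations that map well onto a GPU.
# Convergence requires tau * sigma * ||A||^2 < 1.

def pdhg_lp(A, b, c, tau=0.5, sigma=0.5, iters=5000):
    m, n = len(A), len(c)
    x = [0.0] * n
    y = [0.0] * m
    for _ in range(iters):
        # Primal step: gradient uses A'y, then project onto x >= 0.
        aty = [sum(A[i][j] * y[i] for i in range(m)) for j in range(n)]
        x_new = [max(0.0, x[j] - tau * (c[j] - aty[j])) for j in range(n)]
        # Dual step: uses A applied to the extrapolated point 2*x_new - x.
        xbar = [2 * x_new[j] - x[j] for j in range(n)]
        ax = [sum(A[i][j] * xbar[j] for j in range(n)) for i in range(m)]
        y = [y[i] + sigma * (b[i] - ax[i]) for i in range(m)]
        x = x_new
    return x, y

# Tiny instance: min x1 + 2*x2  s.t.  x1 + x2 = 1, x >= 0  (optimum x = (1, 0)).
x, y = pdhg_lp([[1.0, 1.0]], [1.0], [1.0, 2.0])
```

Because every operation is a matvec or an elementwise update, the same loop ported to sparse GPU kernels parallelizes cleanly, which is much harder to achieve for the pivoting and factorization steps of simplex and barrier.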

PDHG may yield runtime improvements for some large-scale LPs compared to alternative options, especially when run on a GPU. Typically, these are giant LP models where PDHG can achieve a high-quality solution, and the Crossover step does not add significant additional runtime in the framework. This caveat regarding Crossover has already prompted new research into the field of concurrent Crossover methods. In the rare situation where a basic solution is not required, the solution from PDHG alone may also be acceptable if computed with a sufficient level of precision.   

At Gurobi, we include the Crossover time when benchmarking LPs on GPUs, because our customers care about the overall runtime of their applications. Optimization application stakeholders should be aware of this difference: other benchmarks might not account for the Crossover step, instead reporting only the time to reach a non-basic solution. Other benchmarks may also use looser PDHG convergence tolerances, which shorten the reported runtime.  
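This benchmarking caveat can be made concrete with a small timing harness. Here, `solve_with_pdhg` and `crossover_to_basic` are hypothetical placeholders for whatever solver interface you use; the point is simply to time both phases and report their sum as the headline number.

```python
# Sketch of the benchmarking point above: time the PDHG phase and the
# Crossover phase separately, but report the total. The two callables are
# hypothetical stand-ins for your solver interface.

import time

def benchmark_lp(model, solve_with_pdhg, crossover_to_basic):
    t0 = time.perf_counter()
    interior_solution = solve_with_pdhg(model)
    t1 = time.perf_counter()
    basic_solution = crossover_to_basic(model, interior_solution)
    t2 = time.perf_counter()
    return {
        "pdhg_seconds": t1 - t0,
        "crossover_seconds": t2 - t1,
        "total_seconds": t2 - t0,  # the number application owners care about
    }
```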

How Do GPUs Impact the Branch-and-Bound Framework for MIPs? 


Figure 2. A highly simplified Branch-and-Bound framework.  
*Gurobi’s LP barrier method for GPUs is not yet publicly available.       

LP solves form the backbone of today’s MIP algorithms. The simplified framework in Figure 2 illustrates the Branch-and-Bound search used when solving a MIP. There are multiple areas where GPUs can aid in the workflow: 

  • The initial LP solve that occurs at the root node of the Branch-and-Bound tree: It is possible to solve the root node LP relaxation on a GPU using PDHG in Gurobi 13.0. 
  • Some heuristics and presolve components may be amenable to GPU acceleration; Gurobi is actively researching implementations of these components. 

Notice that solving node relaxation LPs (a step within the Branch-and-Bound search tree that uses the LP algorithms yet again) with PDHG on a GPU is not recommended at this time. Achieving a performance improvement for node relaxation LPs with first-order methods is challenging, given the excellent warm-starting capabilities of the simplex method and its extensive integration into the Branch-and-Bound framework. PDHG (and the added Crossover step) would likely increase the overall runtime of this step. For these reasons, the simplex method remains the standard algorithm for node relaxation LPs for most models.  
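As a toy illustration of the Branch-and-Bound loop in Figure 2, consider a 0/1 knapsack where each node's bound comes from the LP relaxation (which, for knapsack, has a closed-form greedy solution, so no general LP solver is needed). This is a didactic sketch, not Gurobi's algorithm; a real MIP solver re-solves each node LP with dual simplex, warm-started from the parent node, which is exactly the advantage discussed above.

```python
# Toy Branch-and-Bound on a 0/1 knapsack (maximization). Node bounds come
# from the LP relaxation, which here reduces to the classic greedy rule:
# take items by value/weight ratio, with one fractional item at the end.

def knapsack_bnb(values, weights, capacity):
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i] / weights[i], reverse=True)
    v = [values[i] for i in order]
    w = [weights[i] for i in order]
    best = 0.0

    def bound(k, cap, val):
        # LP-relaxation bound: fill greedily, taking a fraction of one item.
        for i in range(k, n):
            if w[i] <= cap:
                cap -= w[i]
                val += v[i]
            else:
                return val + v[i] * cap / w[i]
        return val

    def node(k, cap, val):
        nonlocal best
        if k == n:                       # leaf: all variables fixed
            best = max(best, val)
            return
        if bound(k, cap, val) <= best:   # prune: node LP bound cannot
            return                       # beat the incumbent
        if w[k] <= cap:
            node(k + 1, cap - w[k], val + v[k])  # branch: take item k
        node(k + 1, cap, val)                    # branch: skip item k

    node(0, capacity, 0.0)
    return best
```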

Regarding what is possible today for MIPs, models where the majority of solve time is spent in the initial root LP relaxation may benefit from using GPUs for that step in the Branch-and-Bound framework (e.g., the rmine25 model in Example 2 here). Figure 3 illustrates this concept: Model B sees a larger improvement in Time to Optimality than Model A, because the LP phase takes up a larger portion of its overall runtime. If your model behaves like Model B, it is certainly worth evaluating whether using PDHG on a GPU for the initial LP step improves overall performance.    

  

Figure 3. Example of runtime improvements for a MIP model that has a challenging LP relaxation step. This illustration assumes no significant increase in Crossover time when changing from barrier to PDHG. This is an oversimplification; in practice, Crossover could take longer, which would reduce the improvement in Time to Optimality.  

How Do Algorithms Use GPUs and CPUs Together? 

Keep in mind that no solver framework runs purely on GPU hardware. The CPU handles many computational tasks and orchestrates the solve. Only key components of certain algorithms can make use of the GPU, making the CPU-GPU pairing an essential consideration when selecting hardware.   

As you can see in Figure 2, there are still many steps in the solver framework that run on the CPU. Currently, algorithmic components like node selection, presolve, cut separation, and heuristics require irregular control flow and dynamic data operations—characteristics that are better suited to CPUs. These components, together with solving node relaxation LPs, typically dominate the overall runtime of a MIP model (see the Model A example in Figure 3), and not all of them are GPU-enabled. This is why running your MIP on a GPU may not yet yield a major performance improvement compared to running on a CPU.   

However, as certain heuristics and presolve components are enabled on GPUs, it is expected that MIP problems using those components will benefit from GPU acceleration. For now, solving MIPs on GPU makes sense only if the root LP relaxation dominates the overall runtime.   
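As a rough screen in the spirit of this guidance, you can check what fraction of total solve time the root LP relaxation consumes. The 0.5 threshold below is an illustrative assumption; only benchmarking the model on both kinds of hardware can settle the question.

```python
# Rough screen for whether a MIP is a candidate for GPU acceleration of its
# root LP. The dominance threshold is an illustrative assumption, not a
# Gurobi recommendation -- always benchmark.

def gpu_candidate(root_lp_seconds: float, total_seconds: float,
                  dominance_threshold: float = 0.5) -> bool:
    """True if the root LP relaxation dominates total solve time."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return root_lp_seconds / total_seconds >= dominance_threshold

# "Model B" style instance: 80% of runtime in the root LP -> worth trying PDHG.
```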

As the underlying algorithms and hardware continue to evolve, these conclusions may shift. We believe that sharing our rationale behind these takeaways is valuable for our optimization community. By understanding how the outlook on MIP performance will impact your individual models, you can plan more effectively for the future adoption of GPUs in your optimization tech stack.   

Our Research Continues 

To summarize, there are dozens of different components that play a role in the Branch-and-Bound framework for MIPs (Figure 2 is highly simplified). Improvements in the underlying algorithms for each of these steps yield performance improvements, regardless of the hardware they run on.  

At this time, the LP components outside of the Branch-and-Bound framework have been GPU-accelerated and have helped improve the overall performance of giant LPs (and MIPs with challenging root LP relaxations). As more components of the Branch-and-Bound framework are enabled on GPUs, more MIP models may benefit from GPU acceleration.   

The key when evaluating CPU vs. GPU hardware is to understand the interaction between these components. Because it is difficult to predict the effect on overall performance for a particular model instance, we will always recommend benchmarking and tuning your specific model when deciding which hardware to run on.   

This remains an exciting and active area of research, and Gurobi continues to evaluate and adopt new algorithms that show promise on GPUs in this rapidly evolving field. If you’re interested in seeing the Gurobi Optimizer in action, including with GPU acceleration, request a free evaluation license today or send us your models for benchmarking.  
