Tips for evaluating benchmarks & solvers
Double-check the defaults.
Because benchmark tests are usually run with a solver’s default settings, it’s important to understand what those defaults are. Defaults are chosen to perform well across a broad range of models, so they’re often not optimal for any one model.
Read benchmark results in the context of the defaults they were run with. Use them as a starting point, and ultimately test solvers against your own models.
Dig deeper than face value.
Some benchmark tests can be misleading, intentionally or not. If a company cherry-picks models and tunes its solver for that subset, it may be able to claim superiority over recognized industry-leading solvers. With a deeper look, you may find that the selected models are purely academic and not reflective of real-world problems, or that tuning the competing solver would yield much better performance than the test settings indicate.
Make sure the results you’re seeing aren’t being manipulated or misconstrued to appear more impressive than they are.
Look for meaningful measures.
It’s important to determine whether a test measures something that is meaningful to you in practice. A test that measures the time required to produce poor-quality solutions isn’t relevant if your application requires high-quality solutions.
Evaluate the benchmark test and the solver’s performance based on the problems and models you need to solve.
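To make the point about meaningful measures concrete, here is a minimal sketch in Python. It assumes you have logged, for each solver, the time at which each improved solution was found and its relative optimality gap. The traces, the solver names, and the 1% target are all hypothetical and only illustrate how "time to first solution" and "time to a solution of the quality you need" can rank solvers differently.

```python
from typing import Optional, Sequence, Tuple

# Each trace is a list of (elapsed_seconds, relative_gap) pairs recorded
# whenever a solver improves its incumbent. All numbers are hypothetical.
Trace = Sequence[Tuple[float, float]]


def time_to_gap(trace: Trace, target_gap: float) -> Optional[float]:
    """Return the first time at which the gap is at or below target_gap,
    or None if the solver never reaches it."""
    for elapsed, gap in sorted(trace):
        if gap <= target_gap:
            return elapsed
    return None


# Solver A finds a solution quickly but stalls at a 20% gap.
solver_a: Trace = [(0.5, 0.40), (2.0, 0.25), (60.0, 0.20)]
# Solver B starts more slowly but gets below a 1% gap.
solver_b: Trace = [(3.0, 0.30), (8.0, 0.05), (15.0, 0.009)]

# Measure 1: time to the first feasible solution of any quality.
time_to_first_solution = {"A": solver_a[0][0], "B": solver_b[0][0]}
# Measure 2: time to a solution within 1% of optimal.
time_to_one_percent = {
    "A": time_to_gap(solver_a, 0.01),
    "B": time_to_gap(solver_b, 0.01),
}

print(time_to_first_solution)
print(time_to_one_percent)
```

In this made-up data, solver A wins on the first measure and never reaches the target on the second, which is why the measure has to match what your application actually needs.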
Tune the parameters.
When testing a solver, you need the opportunity to tune its performance for your specific models. Gurobi includes over 100 adjustable parameters and an Automatic Tuning Tool that intelligently explores parameter settings and recommends specific settings you can use to optimize the solver for your model(s).
With its default settings, Gurobi has the fastest out-of-the-box performance. Using the Automatic Tuning Tool to tune the parameters for each individual model increases mean performance across the models by 68%. Our distributed tuning capabilities show a 152% performance improvement in the same amount of tuning time.
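If you want to produce a comparable "mean improvement" figure from your own default and tuned runs, one common approach in benchmark reporting is the geometric mean of per-model speedups. The sketch below assumes that approach; the text above does not state how its percentages were computed, and the model names and timings here are hypothetical placeholders for your own measurements.

```python
import math
from typing import Dict


def geometric_mean_speedup(baseline: Dict[str, float], tuned: Dict[str, float]) -> float:
    """Geometric mean of baseline_time / tuned_time over the models
    that appear in both runs."""
    common = sorted(set(baseline) & set(tuned))
    if not common:
        raise ValueError("no models in common between the two runs")
    # Average the log-ratios, then exponentiate, so that one very large
    # speedup does not dominate the summary the way an arithmetic mean would.
    log_sum = sum(math.log(baseline[m] / tuned[m]) for m in common)
    return math.exp(log_sum / len(common))


# Hypothetical solve times in seconds, keyed by model name.
default_seconds = {"model_1": 120.0, "model_2": 45.0, "model_3": 300.0}
tuned_seconds = {"model_1": 60.0, "model_2": 45.0, "model_3": 150.0}

speedup = geometric_mean_speedup(default_seconds, tuned_seconds)
print(f"geometric mean speedup: {speedup:.2f}x")
print(f"improvement over defaults: {(speedup - 1) * 100:.0f}%")
```

Reporting the per-model ratios alongside the summary is usually worthwhile, since a single mean can hide models that did not improve at all.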