In the fast-paced world of optimization, trust is everything. Whether you’re solving complex logistics problems, designing university timetables, or developing machine learning models, you need to know your algorithm will perform—not just in ideal conditions, but in the real world, under stress. 

That’s exactly what Professor Kate Smith-Miles, a leading mathematician, Pro Vice-Chancellor (Research Capability) at the University of Melbourne and Director of the ARC Training Centre in Optimization Technologies, Integrated Methodologies, and Applications (OPTIMA), is helping researchers and practitioners achieve. In a recent Gurobi webinar, she shared a rigorous, science-backed approach to answering one of the toughest questions in optimization: When can I trust my algorithm? 

Rethinking Standard Practice 

For decades, optimization researchers have tested algorithms using small collections of benchmark instances, often selected without transparency or clear justification. Performance results are typically averaged across these test sets, and the algorithm with the best average wins the headline. 

But as Professor Smith-Miles pointed out, this approach hides a dangerous truth: algorithms don’t perform equally well across all types of instances. An average can mask both critical weaknesses and underexplored strengths. 

During her presentation, she asked a provocative question: “How do we attach a warning label to an algorithm?” In other words, how can we understand the terms and conditions under which an algorithm, model, solver, or parameter setting will deliver trustworthy results?

To answer that question, her team, including Dr. Simon Bowly, now a Senior Developer at Gurobi, developed a methodology called “instance space analysis.”

What Is Instance Space Analysis? 

Instance space analysis flips the standard approach on its head. Instead of averaging performance across opaque sets of instances, it focuses on understanding the diversity of problem instances and mapping where each algorithm performs well — and where it doesn’t. 

This is achieved by projecting high-dimensional problem features into a two-dimensional “instance space,” using feature selection, dimensionality reduction, and optimization techniques. Each point in the space represents a problem instance, and its location is defined by mathematically selected features that influence algorithm behavior. 

Once you have this space, you can visualize the footprint of each algorithm, that is, the region where it performs reliably. Just as importantly, you can spot the gaps: types of instances that have not been tested, where algorithm performance is unknown.
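To make the projection step concrete, here is a minimal, hypothetical sketch. It assumes you already have a feature matrix for a set of instances and a record of which algorithm performed best on each one, and it uses an off-the-shelf PCA projection purely for illustration, not Matilda’s own optimized projection.

```python
# Minimal sketch of an instance-space projection (NOT Matilda's actual method).
# Assumes: `features` is an (n_instances x n_features) array of instance
# features, and `best_algo` records which algorithm performed best on each
# instance. PCA stands in for Matilda's optimized 2-D projection.
import numpy as np
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
features = rng.normal(size=(200, 10))          # hypothetical instance features
best_algo = rng.choice(["A", "B"], size=200)   # hypothetical per-instance winner

# Project the high-dimensional feature vectors onto a 2-D "instance space".
coords = PCA(n_components=2).fit_transform(features)

# Plot each instance, colored by the algorithm that performed best on it.
# Clusters of one marker hint at that algorithm's footprint; empty regions
# are untested parts of the space.
for algo, marker in [("A", "o"), ("B", "x")]:
    mask = best_algo == algo
    plt.scatter(coords[mask, 0], coords[mask, 1], marker=marker, label=f"Algorithm {algo}")
plt.xlabel("Projection axis 1")
plt.ylabel("Projection axis 2")
plt.legend()
plt.show()
```

In the real methodology, the features and the projection are chosen so that performance differences separate cleanly in the 2-D space; the sketch above only conveys the idea of plotting instances and reading off footprints and gaps.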

As Smith-Miles explained, this helps move from black-box benchmarking to transparent and evidence-based algorithm assessment. 

From Theory to Real-World Impact 

To make the methodology accessible, Smith-Miles and her team developed Matilda, an online tool that helps researchers and developers run instance space analyses without needing deep expertise in the underlying mathematics. 

Using Matilda, she walked through a case study comparing two algorithms for university timetabling. Initially, one algorithm appeared to be better on average. But after mapping their footprints, it became clear that each had different strengths in different regions of the instance space. In some parts of the space, neither algorithm performed well. This insight would have been missed if only averages were reported.

The same methodology can be applied not only to algorithms, but also to modeling choices and solver configurations. In another study, her team applied instance space analysis to compare four different MIP models for bin packing problems, using both Gurobi and CPLEX solvers. They discovered subtle but important differences in which solver performed best, and for which kinds of instances. 

In some cases, only one solver could solve the instance within the time limit, despite both being state-of-the-art. The takeaway: solver performance also depends heavily on instance characteristics, and these differences deserve scrutiny. 
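For readers who want to try this style of per-instance stress test themselves, the sketch below shows one standard assignment-style bin packing MIP solved with gurobipy under a time limit, recording whether the instance was solved in time. It is illustrative only, not one of the four specific models compared in the study, and the item sizes are hypothetical.

```python
# Minimal sketch: an assignment-style bin packing MIP solved with gurobipy
# under a time limit. Illustrative only; not one of the four models from the study.
import gurobipy as gp
from gurobipy import GRB

sizes = [4, 8, 1, 4, 2, 1, 6, 3]   # hypothetical item sizes
capacity = 10
n = len(sizes)                      # at most n bins can be needed

m = gp.Model("binpacking")
x = m.addVars(n, n, vtype=GRB.BINARY, name="x")   # x[i, j] = 1 if item i goes in bin j
y = m.addVars(n, vtype=GRB.BINARY, name="y")      # y[j] = 1 if bin j is used

# Each item is assigned to exactly one bin.
m.addConstrs(x.sum(i, "*") == 1 for i in range(n))
# Respect bin capacity, and only pack into opened bins.
m.addConstrs(
    gp.quicksum(sizes[i] * x[i, j] for i in range(n)) <= capacity * y[j]
    for j in range(n)
)

m.setObjective(y.sum(), GRB.MINIMIZE)   # minimize the number of bins used

m.Params.TimeLimit = 60                 # per-instance time budget, in seconds
m.optimize()

# Record the per-instance outcome, the kind of data instance space analysis aggregates.
solved = m.Status == GRB.OPTIMAL
print(f"Solved to optimality: {solved}; bins used: {m.ObjVal if m.SolCount else 'n/a'}")
```

Running a model like this over a diverse instance set, and logging which formulation and solver finish within the time limit on each instance, produces exactly the per-instance performance data that gets mapped into the instance space.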

Towards Algorithmic Trust 

Why does all this matter? Because in today’s world, where optimization is being deployed in high-stakes settings like autonomous vehicles, power grids, and financial systems, we can no longer accept vague claims of “good on average.” 

As Smith-Miles noted, “Twenty years ago, people weren’t talking much about trust. Now it’s a hot topic.” The public and regulators alike are demanding AI systems that are explainable, reliable, and stress-tested. That begins with how we evaluate the building blocks of these systems: the algorithms, solvers, and models themselves.

Instance space analysis offers a path forward. It’s a methodology that brings scientific rigor, visual clarity, and repeatable insights to algorithm assessment. It also helps identify blind spots in benchmarks, inspires the design of better algorithms, and offers a new level of confidence in performance claims. 

And perhaps most importantly, it’s already changing the field. As Smith-Miles noted, journal editors are now starting to recommend instance space analysis in peer review. This is a sign that the community is waking up to the need for more robust evaluation practices. 

“We’re trying to facilitate a more objective assessment of the power of an algorithm,” she said. “Not just cherry-picked examples.” 

With tools like Matilda and methodologies like instance space analysis, optimization researchers and practitioners can finally move from trust assumptions to trust evidence—and in doing so, make smarter, safer, and more transparent decisions. 

To explore this methodology in greater depth and see it applied to real-world optimization challenges, be sure to watch the full webinar from Gurobi, “Stress-testing Algorithms, Solvers, and Models via Instance Space Analysis.” 

AUTHOR

Lindsay Montanari

Senior Director of Academic Programs

Lindsay brings over 13 years of experience working at the intersection of technology and education. Prior to Gurobi, Lindsay worked as an Operations leader at Opex Analytics, a product and services firm dedicated to solving complex business problems using the power of Artificial Intelligence. While there, she focused on growth and business development, product launch, and marketing. Lindsay spent 10 years working in various leadership capacities at universities including Columbia University, Northwestern University, and the University of Chicago. From 2013 to 2017, she worked to establish and grow the Master of Science in Analytics degree at Northwestern University’s School of Engineering, which was one of the earliest MS degrees focused on an applied data science curriculum. During her time with Northwestern, she managed external and corporate relations, helped hire and onboard new faculty and subject-matter experts in various disciplines of analytics, directed recruiting, admissions, and student advising, and managed a team of administrative professionals. Prior to Northwestern, she spent over 5 years working in Advancement at Columbia University’s School of Engineering and Applied Science. She completed her Bachelor’s Degree in English and Fine Art at Sewanee: The University of the South and her Master’s Degree in Nonprofit Management at Columbia University.
