Industry Signals
UPMC Anesthesiologist Staffing Model Delivers $800K Annual Savings Across 11 Hospitals
Foundation: Hospitals generate structural cost deficits when trained anesthesiologists sit unassigned at one location while adjacent sites simultaneously face demand surges that draw in expensive last-minute overtime. Researchers from the Indian School of Business, UCLA Anderson, and UPMC model this as a three-stage mixed-integer programme: an optimisation framework that encodes yes/no staff-assignment decisions as binary variables and minimises total labour cost subject to hospital capacity, surgeon scheduling, and regulated working-time constraints. Stage 1 assigns anesthesiologists to hospital sites or a shared on-call pool six weeks ahead; Stage 2 re-deploys staff using updated surgical schedules three days out; Stage 3 makes final same-day adjustments, committing resources progressively as demand forecasts narrow.
Validated on historical UPMC data against the prior ad-hoc approach, the model delivers a net daily saving of $8,382, equivalent to over $800,000 annually across the 11-hospital network.
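To make the assignment core concrete, here is a minimal single-stage sketch in Python using PuLP. The staff list, costs, and coverage numbers are invented for illustration, and the published UPMC formulation is considerably richer (three stages, working-time rules, overtime costs); this shows only the binary-assignment skeleton.

```python
# Minimal Stage-1-style assignment sketch (illustrative data, not the published model).
# x[a, s] = 1 if anesthesiologist a is assigned to site s; "POOL" is the on-call pool.
from pulp import LpProblem, LpMinimize, LpVariable, lpSum, LpBinary, PULP_CBC_CMD

staff = ["A1", "A2", "A3", "A4"]                 # hypothetical anesthesiologists
sites = ["HospX", "HospY", "POOL"]               # two sites plus a shared on-call pool
cost = {(a, s): 10 + (i + 2 * j) % 5             # stand-in daily assignment costs
        for i, a in enumerate(staff) for j, s in enumerate(sites)}
demand = {"HospX": 2, "HospY": 1}                # minimum coverage per hospital

prob = LpProblem("stage1_assignment", LpMinimize)
x = {(a, s): LpVariable(f"x_{a}_{s}", cat=LpBinary) for a in staff for s in sites}

prob += lpSum(cost[a, s] * x[a, s] for a in staff for s in sites)  # total labour cost

for a in staff:                                  # each person works exactly one site
    prob += lpSum(x[a, s] for s in sites) == 1
for s, d in demand.items():                      # each hospital meets minimum coverage
    prob += lpSum(x[a, s] for a in staff) >= d

prob.solve(PULP_CBC_CMD(msg=False))
print({a: s for (a, s), var in x.items() if var.value() == 1})
```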
Any integrated health system using ad-hoc anesthesiologist scheduling can apply the three-stage MILP formulation directly; the Indian School of Business / UCLA Anderson / UPMC model is published in full detail, and UPMC’s 11-hospital deployment confirms the $8,382/day saving holds in a live multi-site environment rather than in simulation alone.
ECCO Shoes’ Stochastic Replenishment Programme Cuts Global Inventory Costs 1.09% Across 536 Stores
Foundation: A global retailer with 536 stores in 27 countries generates nearly 300,000 replenishment orders per month: deciding how much inventory to allocate to each store before the season opens, then adjusting once early demand signals arrive, is the canonical form of stochastic programming, a framework that separates Stage 1 decisions made under uncertainty from Stage 2 recourse decisions made after uncertainty partly resolves. ECCO Shoes built a two-stage stochastic mixed-integer programme for this problem, where Stage 1 fixes global inventory allocation and Stage 2 allows store-level recourse once seasonal demand signals arrive. At 536-store scale the monolithic model is intractable within the weekly planning cycle, so a divide-and-conquer decomposition splits the problem into country-level subproblems coordinated by shared global inventory constraints.
ECCO deployed the model across 536 stores in 27 countries, replacing a largely manual ordering process. The decomposition makes the stochastic MIP tractable at the scale of nearly 300,000 monthly replenishment orders across hundreds of product lines. Validated against the prior system, the model reduces global operational cost by 1.09%, a saving that compounds in a fashion supply chain where inventory mis-allocation forecloses next-season margin. The case study below details the formulation and decomposition structure.
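A minimal sketch of the two-stage structure in Python with PuLP, using invented stores, scenarios, and costs; ECCO's production model is a far larger MIP with country-level decomposition, but the Stage 1 / Stage 2 split looks like this:

```python
# Two-stage stochastic programme in miniature (illustrative data, not ECCO's model).
# Stage 1: allocate units to stores before demand is known.
# Stage 2: per-scenario shortfall (lost sales) is the recourse cost.
from pulp import LpProblem, LpMinimize, LpVariable, lpSum, LpInteger

stores = ["S1", "S2"]
supply = 100                                         # shared global units (assumed)
scenarios = {"low": 0.3, "base": 0.5, "high": 0.2}   # probability-weighted scenarios
demand = {("S1", "low"): 30, ("S1", "base"): 45, ("S1", "high"): 60,
          ("S2", "low"): 25, ("S2", "base"): 40, ("S2", "high"): 70}
hold, short = 1.0, 4.0                               # per-unit holding / shortage cost

prob = LpProblem("two_stage_allocation", LpMinimize)
y = {s: LpVariable(f"alloc_{s}", lowBound=0, cat=LpInteger) for s in stores}  # Stage 1
u = {(s, w): LpVariable(f"short_{s}_{w}", lowBound=0)                         # Stage 2
     for s in stores for w in scenarios}

# Holding cost on the allocation plus expected shortage cost across scenarios.
prob += lpSum(hold * y[s] for s in stores) + \
        lpSum(scenarios[w] * short * u[s, w] for s in stores for w in scenarios)

prob += lpSum(y[s] for s in stores) <= supply        # global supply couples all stores

for s in stores:                                     # recourse: shortfall covers the gap
    for w in scenarios:
        prob += u[s, w] >= demand[s, w] - y[s]

prob.solve()
print({s: y[s].value() for s in stores})
```

Note that y carries no scenario index: that is the non-anticipativity of the first stage, decided before demand resolves.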
Research Papers
PyVRP+ Adds Time Windows and Mixed-Fleet Support to HGS, Matching Best-Known Benchmarks 🔗
Foundation: The vehicle routing problem (finding minimum-cost routes for a fleet of vehicles to serve all customers subject to capacity, time-window, and fleet constraints) is one of the most benchmarked problem classes in combinatorial optimisation, yet open-source solvers lagged best-known solutions on time-window and heterogeneous-fleet variants. PyVRP is an open-source solver built on large neighbourhood search inside a hybrid genetic search (HGS) framework: an evolutionary metaheuristic that maintains a population of solutions and applies large neighbourhood search operators to generate new offspring, balancing intensification and diversification. This paper extends the HGS core with time-window feasibility enforcement and mixed-fleet support, targeting the two variants most common in real-world deployments: the vehicle routing problem with time windows (VRPTW) and the heterogeneous-fleet VRP.
PyVRP+ adds two modules to the existing HGS core: a time-window constraint checker that repairs infeasible routes during search, and a heterogeneous-fleet module that tracks vehicle-type capacities separately across the population. On standard VRPTW and heterogeneous-fleet benchmark suites the extended solver matches best-known solution quality on commodity hardware. The codebase is open-source, backward-compatible with the existing PyVRP interface, and includes all benchmark scripts used in the evaluation.
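To show what the first module does, here is a generic forward time-propagation feasibility check for a single route; this is an illustrative sketch of the standard technique, not PyVRP+'s internal code.

```python
# Generic time-window feasibility check for one route (illustrative; not PyVRP+'s
# implementation). Propagate earliest arrival times forward; a route fails when
# any stop is reached after its window closes; arriving early means waiting.

def route_is_feasible(route, travel, windows, service):
    """route: stops in visit order; travel[(i, j)]: travel time between stops;
    windows[i]: (earliest, latest) service start; service[i]: service duration."""
    first = route[0]
    early, late = windows[first]
    time = max(0.0, early) + service[first]
    for prev, cust in zip(route, route[1:]):
        time += travel[(prev, cust)]
        early, late = windows[cust]
        if time > late:                           # window already closed: infeasible
            return False
        time = max(time, early) + service[cust]   # wait if early, then serve
    return True

# Tiny usage example with made-up data: depot 0, customers 1 and 2.
travel = {(0, 1): 5, (1, 2): 4}
windows = {0: (0, 100), 1: (10, 20), 2: (12, 18)}
service = {0: 0, 1: 3, 2: 3}
print(route_is_feasible([0, 1, 2], travel, windows, service))  # True
```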
Any logistics team currently running homogeneous-fleet or time-unconstrained routing can upgrade to time windows and mixed fleets with no solver change; PyVRP+ is a drop-in extension that preserves the HGS benchmark quality already validated across standard VRPTW instance sets.
Sparse Learning Speeds Up Branch-and-Bound: Bayramoglu, Nemhauser & Sahinidis Generalise Across MIPLIB Instance Families 🔗
Foundation: Choosing which variable to branch on at each node of a branch-and-bound tree is the branching-policy decision that determines how quickly a mixed-integer programme finds and proves an optimal solution; strong branching (evaluating every candidate variable by partial LP solves) is the most accurate policy but adds prohibitive per-node cost on large instances. Recent machine-learning approaches train branching policies on historical solved instances, but rely on dense feature vectors extracted by solving LP relaxations at every node, limiting scalability. Bayramoglu, Nemhauser, and Sahinidis show that sparse features (problem-structure statistics extractable at near-zero cost from the MIP instance itself, without LP relaxation) are sufficient to train branching policies that match strong-branching accuracy across diverse MIPLIB instance families.
The sparse-feature policy uses variable type, coefficient statistics, and constraint-participation counts that are available directly from the MIP coefficient matrix, requiring no LP-relaxation solves during training or inference. On MIPLIB benchmark families the sparse policy matches strong-branching solution quality at a fraction of the per-node computation. The approach generalises across instance classes without retraining, making it applicable to practitioners running heterogeneous MIP workloads on a shared solver.
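A sketch of the kind of matrix-level features the paper describes, computable without any LP solve; the exact feature set and names here are assumptions, since the paper defines its own:

```python
# Sparse, LP-free per-variable features read straight off the MIP instance
# (illustrative; Bayramoglu et al.'s actual feature set differs in detail).
import numpy as np
from scipy.sparse import csc_matrix

def sparse_branching_features(A, is_integer, c):
    """A: m x n constraint matrix (csc), is_integer: bool mask, c: objective coeffs."""
    n = A.shape[1]
    feats = np.zeros((n, 4))
    for j in range(n):
        col = A.getcol(j)                       # one variable's constraint column
        feats[j, 0] = float(is_integer[j])      # variable type
        feats[j, 1] = c[j]                      # objective coefficient
        feats[j, 2] = col.nnz                   # constraint-participation count
        feats[j, 3] = np.abs(col.data).mean() if col.nnz else 0.0  # coeff magnitude
    return feats                                # input to a learned branching policy

# Tiny usage example: 2 constraints, 3 variables, made-up coefficients.
A = csc_matrix(np.array([[1.0, 2.0, 0.0],
                         [0.0, 3.0, 1.0]]))
print(sparse_branching_features(A, np.array([True, True, False]),
                                np.array([5.0, 1.0, 2.0])))
```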
MIP practitioners running Gurobi or CPLEX can test sparse learned branching as a plug-in policy; Bayramoglu et al.’s MIPLIB result shows no branching-quality loss at substantially lower per-node cost, and the cross-family generalisation means the policy does not need to be retrained for each new problem type.
Term of the Day
Dual Decomposition
"Duplicate the coupled variable, assign a private copy to each subproblem, and enforce agreement through prices; the market does the coordination that the monolith did by constraint." — Adapted from Dimitri Bertsekas, Nonlinear Programming, 3rd edition (2016)
Dual decomposition decomposes a large optimisation problem by duplicating any variable that appears in more than one subproblem (creating a private copy for each subproblem that holds it) and enforcing that the copies agree with one another through Lagrange multiplier prices. Each subproblem is solved independently (and in parallel if desired), treating the current price vector as a fixed penalty on disagreement between copies of the shared variable. After all subproblems return their solutions, the prices are updated: raised wherever a subproblem's copy of the shared variable overshoots the value the other copies propose, lowered wherever it falls short. This price-update-resolve cycle continues until all copies agree, at which point the prices are the dual variables certifying optimality of the joint problem. In stochastic programming the shared variables are the first-stage decisions that must be the same across all scenarios regardless of which scenario realises (the non-anticipativity requirement); in large-scale logistics, the shared variables are depot capacities or vehicle counts that multiple route planners compete for. The key computational advantage over solving the monolithic problem is that all subproblems run in parallel: the coordinator only exchanges a compact price vector at each iteration, making the communication cost negligible relative to subproblem solve time.
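In symbols, for the simplest case of two subproblems coupled by one shared variable (a generic textbook statement, not tied to any case in this issue): duplicate the variable into copies, price their disagreement with a multiplier vector, and update the prices by a subgradient step until the copies coincide.

```latex
% Duplicate the shared variable, price the disagreement, update prices by subgradient.
\min_{x_1,\,x_2}\; f_1(x_1) + f_2(x_2) \quad \text{s.t. } x_1 = x_2
\qquad\Longrightarrow\qquad
g(\lambda) = \min_{x_1}\bigl[f_1(x_1) + \lambda^{\top}x_1\bigr]
           + \min_{x_2}\bigl[f_2(x_2) - \lambda^{\top}x_2\bigr],
\qquad
\lambda^{k+1} = \lambda^{k} + \alpha_k\bigl(x_1^{k} - x_2^{k}\bigr).
```

Maximising g(λ) recovers the optimal dual prices; each inner minimisation is an independent subproblem.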
A concrete example
A city logistics operator must allocate 20 refrigerated trucks across 5 delivery zones for the next week. Each truck is a shared resource: the fleet manager’s model has a single constraint capping total allocation at 20, but each zone planner needs to run its own routing model to know how many trucks it can usefully employ given its customer locations and time windows.
Dual decomposition handles this by giving each zone planner a private copy of the truck count. Initially the coordinator sets prices (shadow prices on trucks) at zero; each zone planner solves its routing problem independently and requests as many trucks as its model can absorb: say Zones 1–5 ask for 8, 6, 5, 4, and 4 trucks respectively (total: 27, exceeding the 20 available). The coordinator raises the truck price to penalise over-demand. Zone planners re-solve with the higher price and reduce their requests. After several iterations total demand converges to 20, and the final allocation equates marginal value across zones: no zone can profitably acquire a truck that another zone values more.
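A runnable miniature of that loop; the zone value functions, caps, and step schedule are invented, chosen so the first round reproduces the 27-truck over-request above:

```python
# Dual-decomposition price loop for the 20-truck example (all numbers invented).
# Each zone planner solves a private subproblem: choose the truck count that
# maximises its concave, diminishing-returns value minus the coordinator's
# price; the coordinator adjusts the one shared price by a subgradient step.
import math

CAPACITY = 20
zones = [(8.0, 8), (6.0, 6), (5.0, 5), (4.0, 4), (4.0, 4)]  # (value scale, absorbable max)

def zone_request(scale, max_trucks, price):
    """Zone subproblem: best integer truck count at the current price."""
    return max(range(max_trucks + 1),
               key=lambda n: scale * math.sqrt(n) - price * n)

price = 0.0
for it in range(200):
    requests = [zone_request(s, cap, price) for s, cap in zones]
    excess = sum(requests) - CAPACITY          # subgradient of the dual function
    if excess == 0:                            # demand matches capacity: stop
        break
    price = max(0.0, price + (0.05 / (it + 1)) * excess)  # diminishing step, floor at 0

print(f"iteration {it}: price={price:.3f}, requests={requests}, total={sum(requests)}")
```

The first iteration requests [8, 6, 5, 4, 4] (total 27); the price then rises until the integer requests sum to exactly 20.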
The ECCO case study in this issue uses the same principle at larger scale: the global inventory allocation (shared variable) is separated from country-level demand-fulfilment subproblems, with prices enforcing that total inventory assigned across all countries never exceeds the available global supply.
Why practitioners misread this
Most practitioners conflate dual decomposition with Lagrangian relaxation. The distinction is architectural: Lagrangian relaxation dualises coupling constraints directly into the objective of a single model, turning constraint violations into penalties the solver balances internally; the problem remains one model, just with a modified objective. Dual decomposition goes a step further: it creates physically separate subproblems, each holding private copies of the shared variable, and the subproblems communicate only through the price vector the coordinator broadcasts. The difference matters when subproblems can run on separate machines or when they have fundamentally different structures (a routing subproblem and an inventory subproblem, for instance) that are better handled by different solvers. Lagrangian relaxation tightens a single model's bounds; dual decomposition enables independent parallel solvers.
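The same notation as the block above makes the contrast visible; this is a generic statement of the two architectures, not drawn from any specific paper:

```latex
% Lagrangian relaxation: one model, the coupling constraint Ax = b priced into one objective.
\min_{x}\; f(x) + \lambda^{\top}(Ax - b)
% Dual decomposition: separate models with private copies, prices forcing agreement.
\min_{x_1}\bigl[f_1(x_1) + \lambda^{\top}x_1\bigr] \;+\; \min_{x_2}\bigl[f_2(x_2) - \lambda^{\top}x_2\bigr]
```

The first is still one solve over a single x; the second is two independent solves that share nothing but λ.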
A related misread confuses dual decomposition with Benders decomposition. Benders splits a problem by variable type: complicating variables go into a master problem and the remaining variables into a subproblem; the subproblem returns Benders cuts to the master until convergence. The split is one-way and structural. Dual decomposition splits by problem owner: each subproblem holds its own copy of the shared variable and communicates back only a proposed value, not a cut. In practice the two can resemble each other when Benders cuts carry dual information, but the coordination mechanism differs: Benders decomposes by what the variable does (it complicates the master); dual decomposition decomposes by who owns it (each subproblem needs its own copy to be solvable independently).
Where this shows up in practice
- Stochastic programming: Non-anticipativity constraints coupling all scenario subproblems are enforced through prices; this formulation is also known as the progressive hedging algorithm. Each scenario is solved independently on a separate processor.
- Network flow and logistics: Shared arc capacities in multi-commodity network flow are the coupling constraint; dual decomposition lets each commodity's flow problem run separately, coordinated by arc-capacity prices.
- Power systems: Unit commitment problems with shared transmission constraints are routinely decomposed by generator subproblem, with locational marginal prices as the coordinating dual variables.
- Multi-facility inventory: Shared warehouse capacity couples store-level replenishment plans; the shadow price on capacity is the ECCO case study's decomposition signal.
- UPMC Anesthesiologist Staffing: A three-stage mixed-integer linear programme assigns anesthesiologists across 11 UPMC hospitals at shrinking planning horizons (six weeks out, three days out, and same-day), delivering a net daily saving of $8,382 ($800K+ annually) over the prior ad-hoc staffing approach.
- ECCO Global Inventory: A two-stage stochastic MIP with divide-and-conquer decomposition automates nearly 300,000 monthly replenishment decisions across ECCO's 536-store network in 27 countries, reducing global operational costs 1.09% against the prior manual ordering system.
- PyVRP+: Extending PyVRP's hybrid genetic search core with time-window feasibility enforcement and heterogeneous-fleet support produces a solver that matches best-known benchmark solutions on VRPTW and mixed-fleet instances, covering the two routing variants most common in real-world deployments.
- MIP Sparse Learning: Bayramoglu, Nemhauser, and Sahinidis show that sparse problem-structure features (extractable without LP-relaxation solves) are sufficient to train branching policies that match strong-branching accuracy across diverse MIPLIB instance families.