
Augmented Lagrangian

Penalties command. Prices persuade.

A loop that discovers the right per-unit price for every rule, so the solver gets cheap answers that still respect them.

Why it's needed

Every industrial optimisation problem has the same form: find the cheapest (or fastest, or best) way to do something, as long as certain rules hold. The second half of that sentence is where things go wrong.

Typical rules
  • Keep the budget under $10M.
  • Don't exceed 5,000 gallons per hour.
  • Total production must equal demand exactly.

Turn a solver loose on the problem, and here's what you get:

IT CHEATS
Finds a cheap answer, then quietly ignores the rules.
OR
IT OVERPAYS
Honours every rule, at a cost nobody wants to pay.

What you actually want is both: the cheapest answer and every rule respected, at the same time.

On a realistic industrial problem, that combination is genuinely hard. Two classical tricks have been tried. Each has a fatal flaw.

THE FINE
Idea: Add a fixed cost for every unit by which the rule is broken. Overspend the $10M budget by $1? Pay a $1,000 fine in the objective. The solver weighs the fine against the savings.
Catch: A small fine is cheap to pay, so the solver just pays it and keeps cheating. To force the rule to hold, the fine has to be astronomical: millions, billions. At that scale, tiny rounding errors get amplified into huge penalty swings, and the solver can't tell a real answer from numerical noise.
THE PRICE
Idea: Instead of forbidding rule-breaking, charge a per-unit price, like buying electricity. At the right price (the shadow price), the solver stops exactly where the rule demands.
Catch: You don't know that price in advance. Hunting for it round by round can swing wildly: too low, too high, too low again, especially with scheduling or routing decisions.
[Figure: two panels over iterations. THE FINE grows without bound (→ ∞); THE PRICE oscillates around the true price without settling.]
Why neither classical trick works alone

Each trick, alone, breaks for its own reason. The augmented Lagrangian is what you get when you stop choosing between them.


What it does

The augmented Lagrangian carries both tricks at once inside a loop, but gives them different jobs.

Each round:

  1. Solve the problem with the current price and the small fixed fine, both sitting in the objective.
  2. Measure how much the rule was broken in that solution.
  3. Nudge the price by an amount proportional to the violation: up if the rule was broken, down if it was obeyed with room to spare. The fine stays put.
  4. Solve again.
[Figure: solve with the current price → measure how much the rule was broken → adjust the price up or down → repeat until the price settles.]
The augmented Lagrangian loop
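
In code, the loop is only a few lines. Here is a minimal sketch in Python; solve_subproblem and violation are placeholder names standing in for your actual solver and rule measurement, not any real library's API.

def augmented_lagrangian_loop(solve_subproblem, violation,
                              fine_weight=1.0, rounds=50, tol=1e-6):
    price = 0.0                            # the price starts at zero
    for _ in range(rounds):
        # 1. Solve with the current price and the small fixed fine.
        x = solve_subproblem(price, fine_weight)
        # 2. Measure how much the rule was broken in that solution.
        v = violation(x)
        # 3. Nudge the price in proportion to the violation: up if
        #    broken (v > 0), down if obeyed with slack (v < 0).
        #    The step equals the fine weight, mirroring λ := λ + ρ • h(x).
        price = price + fine_weight * v
        # 4. Stop once the price has settled; otherwise solve again.
        if abs(v) < tol:
            break
    return x, price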
Why the fine matters even though it's small
Without the fine, the price update can overshoot and swing: a large violation demands a big price hike, which over-corrects, which demands a big price cut next round. The fine adds a gentle pull toward the last solution, damping those swings. The price still does the enforcing; the fine just keeps the hunt from going off the rails.

Round by round, the price homes in on its true value. Once it stops changing, the rule holds exactly. You never guessed the price in advance, and you never sent the fine to infinity. The iteration found both.


Core idea

For the math-curious, here's the same idea in formal notation. Each symbol maps directly to the plain-English version above.

Take a constrained problem in standard form: minimise f(x) subject to h(x) = 0. Read f(x) as the cost you want to minimise, and h(x) as the rule-violation vector (zero exactly when the rule holds).

The ordinary Lagrangian attaches a price λ (the "multiplier") to the violation:

L(x, λ) = f(x) + λᵀ h(x)

The second term is literally "price × violation", the per-unit charge from the price trick. If λ is set to the right shadow price, minimising this surrogate gives the correct answer with no constraint in sight.

The augmented Lagrangian adds a quadratic fine with weight ρ:

Lρ(x, λ) = f(x) + λᵀ h(x) + (ρ/2) • ‖h(x)‖²

That third term is the fine: a cost proportional to the squared violation. ρ is kept small; on its own it cannot enforce the rule, but it damps how far each round's solution can jump from the last.

The algorithm alternates two steps that mirror the loop above:

  1. Solve: minimise Lρ over x with λ held fixed.
  2. Update the price: λ := λ + ρ • h(x). The nudge is exactly ρ times the violation just observed.

For inequality constraints g(x) ≤ 0, the quadratic term becomes a one-sided penalty (max(0, g(x) + λ/ρ))², and the price update keeps the multiplier non-negative: λ := max(0, λ + ρ • g(x)). The structure is unchanged: outer loop updates multipliers, inner loop minimises the surrogate.

Hestenes and Powell showed in 1969 that ρ does not need to diverge to enforce feasibility. A finite ρ, combined with the multiplier update, drives h(x) to zero at convergence. The multiplier λ, not the penalty ρ, is what makes the constraint bind at the solution. In the vocabulary above: the price does the enforcing. The fine only stabilises.
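
To see that claim in numbers, here is a toy problem small enough to solve by hand: minimise f(x) = x² subject to h(x) = x − 1 = 0, whose answer is x = 1 with shadow price λ = −2. The closed-form inner solve below is a luxury of the toy; a real problem would call a solver at that step.

# Toy problem: minimise x^2 subject to x - 1 = 0.
# Known answer: x* = 1, shadow price λ* = -2 (from 2x + λ = 0 at x = 1).
# Inner step: minimising Lρ(x, λ) = x^2 + λ(x - 1) + (ρ/2)(x - 1)^2
# gives the closed form x = (ρ - λ) / (2 + ρ).

rho = 1.0            # small, finite fine weight; never sent to infinity
lam = 0.0            # the price λ, starting from zero
for _ in range(30):
    x = (rho - lam) / (2.0 + rho)    # solve: minimise the surrogate
    lam = lam + rho * (x - 1.0)      # update: price += ρ × violation
print(x, lam)        # ≈ 1.0 and ≈ -2.0: feasible, with ρ still just 1

The violation shrinks by the factor 2/(2 + ρ) each round, so feasibility arrives in the limit of iterations, not in the limit of ρ.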


Concrete example: two plants sharing a supplier cap
Scenario: two plants, one shared supplier cap

The setup. Plant A and Plant B each plan their own weekly production schedule. Both buy the same raw material from one supplier, who sells no more than 10,000 kg per week in total across the two plants. Neither plant will share its full plan with the other. So a neutral coordinator sits in the middle: it doesn't schedule anything, it just sets one shared price for the raw material each week. Each plant uses that price to decide how much to buy.

Try price only (plain Lagrangian relaxation). The coordinator quotes a price. Each plant runs its own schedule, treating the material as if it cost that price. The coordinator sees the combined purchase, notes any cap violation, and adjusts the price for the next round. The catch: scheduling is full of yes/no decisions (run this batch, skip that line). When the price crosses a threshold, a plant's purchase flips discretely. The combined purchase jumps between "way over" the cap and "way under". The price bounces with it. No convergence.

Now add the fine (augmented Lagrangian). Each plant's objective carries both the price and a small quadratic fine on how far the combined purchase strayed from the cap last round. The fine is too small to enforce the cap on its own; that's still the price's job. Its role is to punish big swings between rounds. The oscillation damps. The price settles. The cap holds. The coordinator updates the price after each round, by an amount proportional to the violation observed.

The symbols, for the record. The price is λ. The fine weight is ρ, kept small. Each plant's objective is (its own cost) + λ • (its share of the combined purchase) + (ρ/2) • max(0, combined purchase − cap)². The coordinator updates λ := max(0, λ + ρ • (combined purchase − cap)) each round, so the price falls when the cap has slack and never goes negative.
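
A toy, continuous rendering of the story in Python. The profit curves, the grid-search "scheduler", and all the numbers are invented for the sketch; the grid search stands in for each plant's private scheduling solver, and the simple alternating sweep is illustrative, not a convergence-guaranteed scheme.

import numpy as np

CAP = 10_000.0                           # shared supplier cap, kg/week
GRID = np.linspace(0.0, 12_000.0, 1201)  # candidate purchases, 10 kg steps

def plant_best_buy(a, b, lam, rho, other_q):
    # One plant's private decision, given the price and the other plant's
    # last-round purchase: minimise
    #   -(own profit) + λ·q + (ρ/2)·max(0, q + other_q - CAP)^2
    cost = (-(a * GRID - b * GRID**2)
            + lam * GRID
            + 0.5 * rho * np.maximum(0.0, GRID + other_q - CAP) ** 2)
    return GRID[np.argmin(cost)]

lam, rho = 0.0, 1e-4      # price starts at zero; the fine weight stays small
qA = qB = 0.0
for _ in range(100):
    qA = plant_best_buy(2.0, 1e-4, lam, rho, qB)  # plant A solves locally
    qB = plant_best_buy(1.5, 1e-4, lam, rho, qA)  # plant B solves locally
    lam = max(0.0, lam + rho * (qA + qB - CAP))   # coordinator's update
print(qA, qB, qA + qB, lam)

With these made-up profit curves the analytic answer is qA = 6,250 and qB = 3,750 at λ = 0.75, and the loop settles there: the combined purchase lands on the cap while each plant only ever saw the price and the other's last purchase.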


The geometry of augmentation
[Figure: surrogate value plotted against constraint violation h(x). The plain Lagrangian is linear in h; the quadratic term curls that line into a bowl, so the augmented surrogate is convex in h and its minimiser lands at the feasible point h(x) = 0.]
Plain Lagrangian is linear in the constraint violation; augmented Lagrangian adds a quadratic that makes the surrogate strictly convex in the violation, giving a unique minimiser at or near feasibility without sending ρ to infinity.

Common misreads
It is not the same as Lagrangian relaxation

Lagrangian relaxation uses only the price, not the fine. It gives a lower bound on the best possible answer and a family of candidate plans, but on problems with yes/no decisions the price oscillates and nothing settles. Augmented Lagrangian needs both halves working together: drop the fine and you get Lagrangian relaxation's oscillations back; drop the price update and you get a pure penalty method's numerical problems. Neither half alone carries the convergence guarantee.
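
The oscillation is easy to reproduce. A minimal sketch with invented numbers: two identical yes/no units, each worth 5 and each consuming 6 units of a shared resource capped at 6, so exactly one should run. With price alone, both units flip at the same threshold and the price bounces forever:

# Price-only dual ascent on a yes/no decision. Each unit runs iff its
# profit beats its resource bill at the current price: 5 - 6*lam > 0.
lam, step = 0.0, 0.1
for k in range(8):
    run = 5.0 - 6.0 * lam > 0           # both units share one threshold,
    usage = 12.0 if run else 0.0        # so they flip together
    lam = max(0.0, lam + step * (usage - 6.0))
    print(k, run, usage, round(lam, 2))
# usage alternates 12, 12, 0, 12, 0, ... and never lands on 6

No price can make the combined usage equal 6, because no single unit ever sees a reason to behave differently from its identical twin.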

It is not a pure penalty method

A pure penalty method uses only the fine, not the price. To force a rule to hold, the fine must be astronomical, and at astronomical sizes the solver becomes numerically unstable: tiny rounding errors get amplified into huge penalty swings. The augmented Lagrangian gets away with a small fine because the price, not the fine, is what makes the rule hold. Cranking ρ up higher and higher defeats the whole point of the method.
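
On the toy problem from the Core idea section (minimise x² subject to x = 1), the penalty-only failure is one line of algebra: the minimiser of x² + (ρ/2)(x − 1)² is x = ρ/(2 + ρ), short of feasible for every finite ρ.

# Penalty only, no price: minimise x^2 + (rho/2)(x - 1)^2.
# Setting the derivative to zero gives x = rho / (2 + rho) < 1 always.
for rho in (1.0, 1e2, 1e4, 1e8):
    print(rho, rho / (2.0 + rho))   # 0.333..., 0.980..., 0.9998, ~1 - 2e-8

Compare the augmented loop earlier, which reached x = 1 exactly with ρ held at 1.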

It does not guarantee the globally best answer

On a non-convex problem (anything with integer variables qualifies), the method finds a locally best answer, not necessarily the globally best one. Different starting guesses can land in different local solutions. The standard fix is to run the loop from several starting points and keep the best answer found, or to layer a global-search method on top. Formally, the method converges to a Karush-Kuhn-Tucker (KKT) stationary point. The CALADIN and Mix-CALADIN families inherit this caveat and are typically deployed with multi-start strategies.

The price update is not a tuning knob

The rule λ := λ + ρ • (violation observed) looks simple, but it is not a free parameter you can adjust by feel. Every part of the formula has a mathematical derivation behind it (it's a steepest-ascent step on the dual of the augmented problem). Replacing it with a plausible-sounding heuristic, for example "multiply λ by 1.1 whenever the violation grows", breaks the math, and the method can stop converging. Legitimate accelerations like Nesterov momentum on λ exist, but they must be derived from the underlying theory, not guessed.
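
For the record, that derivation is short. Stated in the notation above, this is the standard dual-ascent reading:

  d(λ) = min over x of Lρ(x, λ)           — the dual function
  ∇d(λ) = h(x(λ))                         — envelope theorem: the gradient is the violation at the inner minimiser x(λ)
  λ := λ + ρ • ∇d(λ) = λ + ρ • h(x(λ))    — steepest ascent on d with step size ρ

Both the direction (the violation) and the step size (ρ) fall out of the ascent; change either by feel and you are no longer climbing the dual.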


Where this shows up in practice
ADMM
The Alternating Direction Method of Multipliers is an augmented Lagrangian on a splitting variable, used across convex signal processing, distributed consensus, and large-scale machine learning.
CALADIN & Mix-CALADIN
Distributed non-convex consensus methods with integer-variable support. The outer augmented-Lagrangian loop coordinates agents while inner subproblems stay local and private.
Proximal methods
Proximal point, proximal gradient, and proximal ADMM all use augmented-Lagrangian machinery to handle constraints or regularisers in large-scale Machine Learning optimisation.
Method of Multipliers solvers
LANCELOT and successors solve large-scale non-linear programmes via a method-of-multipliers outer loop around a bound-constrained trust-region inner loop.
SDP interior point
Semidefinite Programming solvers handle equality constraints through augmented-Lagrangian penalties rather than direct elimination, for numerical stability.
Constrained RL
Constrained Policy Optimisation and related safe-Reinforcement-Learning algorithms carry a Lagrangian multiplier on the safety-budget constraint and update it with augmented-Lagrangian rules.

The Hestenes–Powell principle

A finite penalty is enough, as long as it rides on top of the right multiplier. The multiplier does the enforcing; the penalty only stabilises.


Related concepts