Stochastic Methods · First introduced 21 Apr 2026

Conditional Value at Risk

"Do not cross a river if it is four feet deep on average." Nassim Nicholas Taleb, The Bed of Procrustes (Incerto, 2012)

A coherent risk measure that reports the average loss inside the worst-α tail of a distribution, not just the boundary of that tail — and one auxiliary variable turns that average into a clean linear program.

Why it's needed

You have to make a decision today whose cost depends on something you don't know yet. A gas trader locks in procurement tonight, before tomorrow's weather is settled. A bank sets capital reserves before the market moves. A utility commits generators before demand spikes. In every case you know the shape of what might happen — the distribution of possible outcomes — but not the outcome itself.

The question is: what single number from that distribution do you budget against?

PLAN FOR AVERAGE
Cheap, but blind to risk. A handful of bad days can erase a year of thin margins.
OR
PLAN FOR THE WORST
Safe, but wasteful. Hoarding capital for a once-in-a-century event that may never come.

What you actually want is the middle: a number that ignores the best-case noise, takes the tail seriously, but doesn't bet everything on the single worst outcome.

Two classical compromises have been tried. Each has a fatal flaw.

VALUE AT RISK (VaR)
Idea: Pick a confidence level, say 95%. VaR is the loss you stay under 95% of the time. The other 5% is the tail; VaR reports where the tail begins.
Catch: It ignores tail depth. Two portfolios with the same 95th-percentile loss have the same VaR, even if one's worst 5% averages €10M and the other's €100M. And VaR is incoherent: merging two portfolios can make their combined VaR go up, contradicting the idea that diversification reduces risk.
EXPECTED LOSS GIVEN THE TAIL
Idea: Average only the bad outcomes, conditional on being past the VaR threshold. Looks like the right answer.
Catch: Written as E[L | L ≥ VaR], this conditional expectation is not convex in the decision variables. Convex solvers — the workhorses of operations research — can't touch it.

Conditional Value at Risk fixes both problems with one small move: it averages all of the losses at or past the VaR boundary, defined in a way that stays convex. That single change restores coherence (subadditivity holds, so diversification never raises CVaR), restores convexity (any linear-program solver can handle it), and makes tail depth a first-class input to the decision.


What it does

Fix a tail fraction α — typically 1%, 5%, or 10%. CVaR at level α is the average loss across the worst α-fraction of outcomes. You can read it off in three steps.

  1. Run every scenario you care about and collect the loss under each one.
  2. Rank the losses from smallest to largest. Find the threshold that marks the top α-fraction — that threshold is VaR.
  3. Average every loss at or beyond that threshold. That average is CVaR.

That's it. CVaR reports the mean of the tail; VaR reports the start of the tail. Same scenarios, same α, different question answered.

A concrete read-off
Suppose you have 100 scenarios for tomorrow's gas cost and α = 5%. VaR is the 5th-worst cost. CVaR is the average of the 5 worst costs. If those 5 costs are €11M, €13M, €14M, €18M, €40M, then VaR is €11M but CVaR is €19.2M. A risk manager sizing reserves against €11M would be blindsided by the €40M tail. CVaR captures it.

The shift from boundary to mean is the whole game. CVaR never lets a bad tail hide behind a tolerable quantile.
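The rank-and-average recipe is short enough to sketch directly. The scenario set below is hypothetical, padded so that its five worst costs match the read-off above (all figures in €M):

```python
import numpy as np

def var_cvar(losses, tail_frac):
    """Empirical read-off: sort, find the tail boundary (VaR), average the tail (CVaR)."""
    losses = np.sort(np.asarray(losses, dtype=float))   # step 2: rank smallest to largest
    k = int(round(tail_frac * losses.size))             # size of the worst tail_frac slice
    tail = losses[-k:]                                  # step 3: the k worst losses
    return tail[0], tail.mean()                         # (VaR = boundary, CVaR = tail mean)

# 100 hypothetical scenario costs whose 5 worst match the example above
rng = np.random.default_rng(0)
body = rng.uniform(2.0, 10.0, size=95)                  # illustrative body of the distribution
losses = np.concatenate([body, [11.0, 13.0, 14.0, 18.0, 40.0]])

var, cvar = var_cvar(losses, tail_frac=0.05)
print(var, cvar)   # 11.0 19.2
```

Same scenarios, same α: `tail[0]` answers the VaR question, `tail.mean()` the CVaR one.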


Core idea

For the math-curious, here's the same idea in formal notation. Each symbol maps directly to the plain-English version above.

Let L be a random loss with cumulative distribution function FL. For a confidence level α (typically 0.95 or 0.99; this is the standard finance convention, so the tail fraction used above is 1 − α), the Value at Risk is the quantile VaRα(L) = inf{x : FL(x) ≥ α}. CVaR is the conditional expectation

CVaR (continuous-distribution form) CVaRα(L) = E[L ∣ L ≥ VaRα(L)]

VaR reports the boundary of the tail. CVaR reports the mean of what lives inside it. That shift from boundary to mean is why regulators and optimisation modellers prefer CVaR: it is a coherent risk measure in the Artzner–Delbaen–Eber–Heath sense, which means it satisfies monotonicity, positive homogeneity, translation invariance, and (critically) subadditivity. So the CVaR of a diversified book is never worse than the sum of its parts. VaR fails subadditivity. A regulator who caps VaR can, in principle, force a bank to break up a well-diversified book into pieces that each look safer on paper yet are more dangerous in aggregate; a regulator who caps CVaR cannot.


Concrete example: an energy trader buying gas for tomorrow
Scenario — day-ahead gas procurement under price uncertainty

Setup. An energy desk runs 200 simulated price paths for tomorrow's natural-gas spot market, built from weather forecasts and storage levels. The desk has to decide how much gas to lock in today at a fixed forward price versus how much to buy at spot tomorrow. Locking in more costs the forward premium; locking in less exposes the desk to spot spikes.

Three candidate strategies and what each number says. Strategy A minimises expected cost across the 200 scenarios and lands at an average bill of €4.1M. Strategy B additionally caps the 95%-VaR of cost at €5.0M, which guarantees that at most 10 scenarios (the worst 5%) exceed €5.0M but says nothing about how far they exceed it. Strategy C instead caps the 95%-CVaR at €5.5M, which constrains the mean of those 10 worst scenarios.

What changes. Under Strategy B, the optimiser discovers it can satisfy the VaR cap by engineering a distribution where 95% of scenarios are clean and the remaining 5% cluster at €8–12M. VaR is happy. The trading desk is not. Under Strategy C, the optimiser cannot hide the tail: every euro of extra loss in a worst-5% scenario pulls the CVaR average up by one-tenth of a euro, so the solver is forced to hedge the tail directly.

The practical consequence. Strategy C buys more forward gas than Strategy B in exactly the scenarios where it matters (cold-snap weather paths with storage draw-downs). The expected-cost penalty versus Strategy A is about €250k. The saving versus a bad realisation of Strategy B is measured in millions. The trader who asks "what is my average bill in a bad month?" gets a real answer from CVaR and a misleading one from VaR.


The geometry of tail averaging
[Figure: density f(L) of the loss L (euros), showing the body of the distribution, the tail boundary VaRα, and the tail mean CVaRα.]
VaR marks where the tail starts; CVaR marks the mean of what lives inside it.

Common misreads
It is not just a tighter VaR

CVaR and a tighter VaR (say, 99% instead of 95%) both push the binding scenario further into the tail, but they are not substitutes. A tighter VaR still reports a boundary. CVaR reports a mean. If the tail is long and jagged, tightening VaR moves the boundary but still tells you nothing about what is past it.

It is coherent; VaR is not

Subadditivity is the load-bearing property. A risk measure is subadditive when R(A + B) ≤ R(A) + R(B), i.e. diversification never increases risk. VaR can violate this: two uncorrelated credit portfolios each with a small tail can combine into a portfolio whose joint tail sits above the sum of the parts, and VaR will misreport it. CVaR cannot. This is why the Basel Committee replaced VaR with Expected Shortfall (which equals CVaR on continuous distributions) in the Fundamental Review of the Trading Book (FRTB) capital rules.
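The classic violation can be checked in a few lines. The numbers below are a hypothetical pair of independent bonds, each losing 100 with 4% probability: at 95% confidence each bond alone has VaR 0, the combined book has VaR 100, yet CVaR stays subadditive throughout.

```python
def var_discrete(pmf, conf):
    """VaR at confidence `conf` for a discrete loss pmf given as [(loss, prob), ...]."""
    cum = 0.0
    for loss, p in sorted(pmf):                  # ascending losses
        cum += p
        if cum >= conf - 1e-12:                  # smallest loss with F(loss) >= conf
            return loss

def cvar_discrete(pmf, conf):
    """CVaR: probability-weighted mean of the worst (1 - conf) tail mass."""
    tail = 1.0 - conf
    remaining, acc = tail, 0.0
    for loss, p in sorted(pmf, reverse=True):    # worst losses first
        take = min(p, remaining)                 # split the atom at the quantile if needed
        acc += take * loss
        remaining -= take
        if remaining <= 1e-12:
            break
    return acc / tail

# Two independent bonds, each losing 100 with probability 4% (else 0)
bond = [(0.0, 0.96), (100.0, 0.04)]
combined = [(0.0, 0.96**2), (100.0, 2 * 0.96 * 0.04), (200.0, 0.04**2)]
conf = 0.95

print(var_discrete(bond, conf), var_discrete(combined, conf))                      # 0.0 100.0
print(round(2 * cvar_discrete(bond, conf), 1), round(cvar_discrete(combined, conf), 1))  # 160.0 103.2
```

VaR of the merged book (100) exceeds the sum of the standalone VaRs (0 + 0), while CVaR of the merged book (103.2) stays below the sum of the standalone CVaRs (80 + 80).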

It does not make your model non-linear

The Rockafellar–Uryasev form (see the principle section below) is linear in the scenario-based case. A Mixed Integer Linear Programming (MILP) formulation that already encodes demand uncertainty as scenarios absorbs a CVaR objective or constraint with nothing more than one extra auxiliary variable per scenario and the variable t. Solver choice does not need to change.

CVaR, Expected Shortfall, and Average VaR are (almost) the same thing

On continuous loss distributions the three names are synonyms. On discrete or atomic distributions the definitions can diverge by how they handle the probability mass at the VaR quantile. Rockafellar–Uryasev's CVaR uses a convex-combination resolution that preserves coherence in the discrete case; Basel's Expected Shortfall uses a specific percentile formula. For most optimisation work the distinction does not bite.


Where this shows up in practice
Banking Regulation
Basel III FRTB replaced 99%-VaR with 97.5%-Expected Shortfall (equivalent to CVaR) for trading-book capital, because ES is coherent and captures tail depth rather than just tail onset.
Energy Dispatch
Day-ahead unit commitment with wind and solar uncertainty uses CVaR objectives to budget reserve capacity against cold-snap or dunkelflaute scenarios, not just the 95th-percentile scenario.
Supply Chain
Safety-stock and sourcing decisions under demand-tail risk (pandemic spikes, chip shortages, tariff shocks) use CVaR constraints to size inventory against average tail cost, not worst case.
Insurance & Reinsurance
Catastrophe reinsurance uses tail-VaR (a CVaR variant) to price attachment points and limits for hurricane, earthquake, and cyber-event layers where the worst single scenario dominates classic VaR.
Portfolio Optimisation
The mean-CVaR frontier is the coherent analogue of the mean-variance frontier: pick a target expected return, minimise CVaR at a chosen confidence level, solve an LP rather than a QP.
Safe Reinforcement Learning
CVaR-constrained policy optimisation makes an agent risk-aware: it bounds the expected cost on the worst α-fraction of trajectories rather than the expected cost across all trajectories, yielding policies that refuse tail-catastrophic actions.
Hydro & Water Resources
Reservoir operation under inflow uncertainty uses CVaR constraints on drought-year water releases, so operators size spill and draw-down plans against mean drought severity rather than a single design year.
Healthcare Operations
Pandemic-surge planning and ICU capacity decisions use CVaR on peak-occupancy scenarios to avoid the VaR artefact of being "just over capacity" on the worst 5% of days with no budget for how much over.

The Rockafellar–Uryasev principle

Measuring CVaR is easy: rank and average. Optimising with CVaR — deciding what to do in order to minimise it — is harder, and the reason is a hidden feature of the landscape.

Think of optimisation as walking downhill on a landscape, looking for the lowest point. An expected-cost objective gives you rolling hills: the ground slopes gradually, and every step downhill is obvious. Switching to a CVaR objective puts sharp ridges in that landscape. A ridge appears wherever a scenario crosses in or out of the "worst α%" group, and the downhill direction flips abruptly across each one. Standard solvers stumble at these ridges because they build their next step from a local slope that lies on one side of the ridge.

Rockafellar and Uryasev's 2000 result makes this problem disappear. They showed that minimising CVaR over a decision variable x, for loss function L(x, ω) at level α, is equivalent to minimising a different function, jointly over x and a single auxiliary variable t:

Rockafellar–Uryasev (2000):   min over x, t of   t + (1 / (1 − α)) · Eω[ max(L(x, ω) − t, 0) ]

The auxiliary t behaves like a free-floating threshold: at the optimum it settles exactly at VaRα. When L is linear in x and the expectation is a sum over a finite scenario set, the max term flattens into one auxiliary variable and one linear constraint per scenario, and the whole objective becomes a Linear Programming (LP) problem. The ridges are still in the underlying CVaR function, but they have been absorbed into the structure of the LP — and LP solvers don't care about ridges.
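The equivalence is easy to sanity-check numerically. The sketch below uses hypothetical normally distributed scenario losses; since the Rockafellar–Uryasev objective is convex and piecewise linear in t with kinks at the scenario losses, scanning the kinks finds its exact minimum, which matches the rank-and-average CVaR.

```python
import numpy as np

rng = np.random.default_rng(7)
losses = rng.normal(0.0, 1.0, size=1000)   # hypothetical equally weighted scenario losses
alpha = 0.95                                # confidence level; tail mass = 1 - alpha

def ru(t):
    # Rockafellar-Uryasev objective: t + E[max(L - t, 0)] / (1 - alpha)
    return t + np.maximum(losses - t, 0.0).mean() / (1.0 - alpha)

# Direct rank-and-average CVaR: mean of the worst 5% of scenarios
k = round((1 - alpha) * losses.size)        # 50 tail scenarios
cvar_direct = np.sort(losses)[-k:].mean()

# Minimising the RU objective over t (it suffices to check every kink)
ru_min = min(ru(t) for t in losses)
print(abs(ru_min - cvar_direct) < 1e-9)     # True
```

In a real model `losses` would depend on the decision x, and the `max` term would become one nonnegative auxiliary variable and one linear constraint per scenario inside the LP; here t is the only free variable, which is enough to see the two definitions agree.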

Why this matters operationally
You don't have to guess VaR in advance. You don't have to hand-code which scenarios are in the tail. The reformulation finds both on its own, as a by-product of the optimisation. Any model that could accept a linear objective can accept a CVaR objective with the same solver.

Introducing one auxiliary variable converts a non-smooth tail statistic into a convex program that off-the-shelf solvers handle. That single trick turned CVaR from a pretty definition into an operational one: the one that banks, utilities, and supply-chain planners now solve at production scale.


Related concepts