We measured "probability of reaching a level": fixed horizons lie at the end of the session

Plenty of indicators print lines like "probability of touching this level: 79%". We build one of those guides ourselves, so we asked the uncomfortable question: how honest can that number actually be? We measured it on ~19,000 out-of-sample observations per H1 dataset. Here are the three results, including the one that refutes a favorite volume-profile belief and the one that forced us to redesign our own guide.

Summary: (1) counting frequencies on your own chart beats formulas, if the sample is large; (2) the VPOC does not attract price — the "magnet" is distance plus volatility; (3) a fixed 24-bar horizon overestimates by up to 2.4× late in the session, and bounding the horizon to the current session fixes it (31–77% Brier improvement).

The experiment

Event to predict: "price touches the nearest volume zone (VPOC/VAH/VAL) within the next 24 bars". Data: EURUSD and GBPUSD on H1 plus EURUSD on H4 (OANDA candles, 2021–2026). The volume profile replicates exactly what our indicator computes (200 bars, 30 bins, 70% value area, distances in ATR(14)). Scoring is the Brier score, the mean squared error of the predicted probabilities against the 0/1 outcomes: 0 is a perfect oracle, and "always predict the base rate" scores p(1 − p), which is ~0.25 here because the base rate is close to 50%. Everything is evaluated out of sample (the final 40% of each series, ~19,000 observations per H1 dataset).
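A minimal sketch of the scoring, in plain Python. The function names and the toy data are illustrative assumptions, not the indicator's code; the skill figure is simply the fractional improvement over a reference score.

```python
def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must have the same length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def constant_baseline(outcomes):
    """Brier score of always predicting the base rate p: equals p * (1 - p)."""
    base = sum(outcomes) / len(outcomes)
    return base * (1.0 - base)


def skill(score, reference):
    """Fractional improvement over a reference score (1.0 would be perfect)."""
    return 1.0 - score / reference


# Toy data, invented for illustration only.
outcomes = [1, 0, 1, 1, 0, 0, 1, 0]
probs = [0.8, 0.3, 0.6, 0.9, 0.2, 0.4, 0.7, 0.1]

print("Brier:", brier_score(probs, outcomes))
print("Constant baseline:", constant_baseline(outcomes))
print("Skill:", skill(brier_score(probs, outcomes), constant_baseline(outcomes)))
```

The baseline only lands near 0.25 when the base rate is near 50%; for a rarer event the constant forecast scores lower, so skill should always be quoted against the baseline of the same dataset.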

Finding 1 — counting beats formulas (if you count enough)

We compared five estimators: the empirical frequency over 300 samples (what we had deployed), a first-passage Brownian model, the frequency over 5,000 samples, the Brownian model with isotonic calibration, and a blend.

Method (Brier score, lower is better)	EURUSD H1	GBPUSD H1	EURUSD H4
Constant (no skill)	0.2499	0.2494	0.2499
Frequency, 300 samples	0.1737	0.1676	0.1523
Brownian (formula)	0.1680	0.1605	0.1498
Frequency, 5,000 samples	0.1667	0.1576	0.1489
Calibrated Brownian	0.1686	0.1601	0.1521
Blend (Brownian + frequency)	0.1662	0.1579	0.1483

Three takeaways. First: every method beats the constant forecast by a wide margin (roughly 30–41% lower Brier score across the table, ~33% on EURUSD H1), so the concept of a measured reach probability works. Second: the large-sample frequency is almost perfectly calibrated: when it says 26%, the event happens 27% of the time; when it says 87%, it happens 87% of the time. Third: the flaw of the 300-sample method isn't the concept, it's the short sample: in the lowest decile it predicts ~1% while the realized rate is 9%. That "0% in 24 bars" you sometimes see on a chart is a statistical artifact, not a probability. With 5,000 samples the same decile predicts 5% and the realized rate is 7%.

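Blending in the Brownian model improves the Brier score by only 0.3–0.5% on the two EURUSD datasets, and on GBPUSD H1 the blend scores marginally worse than the plain 5,000-sample frequency (0.1579 vs 0.1576). That is not worth losing the most valuable property of the empirical frequency: anyone can count it on their own chart and verify it.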
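The decile comparison behind the calibration takeaways ("predicts ~1%, reality is 9%") can be reproduced with a small reliability table. This is a sketch under our own assumptions: equal-count bins over sorted forecasts, and a hypothetical function name; the exact binning used in the study is not specified here.

```python
def reliability_by_decile(probs, outcomes, n_bins=10):
    """Sort forecasts, split them into equal-count bins, and return
    (bin index, sample count, mean forecast, realized frequency) per bin."""
    pairs = sorted(zip(probs, outcomes))
    n = len(pairs)
    rows = []
    for b in range(n_bins):
        lo = b * n // n_bins
        hi = (b + 1) * n // n_bins
        chunk = pairs[lo:hi]
        if not chunk:
            continue  # fewer samples than bins: skip empty bins
        mean_p = sum(p for p, _ in chunk) / len(chunk)
        freq = sum(o for _, o in chunk) / len(chunk)
        rows.append((b + 1, len(chunk), mean_p, freq))
    return rows


# Toy data, invented for illustration: two bins of five forecasts each.
probs = [0.1] * 5 + [0.9] * 5
outcomes = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]

for bin_no, count, mean_p, freq in reliability_by_decile(probs, outcomes, n_bins=2):
    print(f"bin {bin_no}: n={count} forecast={mean_p:.2f} realized={freq:.2f}")
```

A well-calibrated estimator shows forecast and realized columns that track each other in every bin; a short sample typically shows up as extreme forecasts in the first and last bins that the realized frequency does not follow.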

Finding 2 — the VPOC is not a magnet

A repeated volume-profile belief says the VPOC "attracts" price. We tested it: compare the realized touch rate against what pure distance + volatility implies (the driftless Brownian model). If the magnet existed, the realized rate should sit above it.

Result: it sits below. For the VPOC the excess is −2.6 percentage points on EURUSD H1 and −2.0 on H4; for VAH/VAL it stays within ±1pp, which is indistinguishable from noise. The base rate of touching the nearest level within 24 bars is ~52–53%, and it is fully explained by how close the level is and how volatile the market is. We also tested the "zone already tested this session" variant: controlling for distance, it adds +0.3pp, which is nothing. The spectacular raw gap ("66% for tested zones vs 32% for fresh ones") exists because a zone that was just tested is, by definition, close.

We publish this knowing it goes against standard volume-profile marketing, including our own: we cannot claim the VPOC "attracts" price, because our own data says it doesn't.
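The magnet test in this finding compares a realized touch rate with a driftless Brownian benchmark. A minimal sketch of that comparison follows, using the standard reflection-principle result P = erfc(d / (σ·√(2n))) for touching a level at distance d within n bars. The per-bar volatility `sigma_per_bar` (in the same ATR units as the distance) is an assumed parameter: how it is calibrated in the study is not stated here, and the continuous-time formula only approximates touches measured on discrete bars.

```python
import math


def brownian_touch_prob(distance, sigma_per_bar, n_bars):
    """P(driftless Brownian motion touches a level `distance` away
    within `n_bars`), from the reflection principle."""
    if distance <= 0:
        return 1.0  # already at the level
    if sigma_per_bar <= 0 or n_bars <= 0:
        return 0.0  # no volatility or no time left
    z = distance / (sigma_per_bar * math.sqrt(n_bars))
    return math.erfc(z / math.sqrt(2.0))  # equals 2 * (1 - Phi(z))


def excess_pp(touched, model_probs):
    """Realized touch rate minus mean model probability, in percentage points.
    A positive value would be evidence of a 'magnet'."""
    realized = sum(touched) / len(touched)
    implied = sum(model_probs) / len(model_probs)
    return 100.0 * (realized - implied)


# Hypothetical inputs for illustration: level 1 ATR away, 24-bar horizon,
# assumed per-bar volatility of 0.25 ATR.
print(brownian_touch_prob(1.0, 0.25, 24))
print(excess_pp([1, 0, 1, 0], [0.6, 0.6, 0.6, 0.6]))
```

The benchmark knows nothing about volume, only distance, volatility, and time, which is what makes a negative excess informative.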

Finding 3 — fixed horizons lie at the end of the session

This is the result that forced a redesign. Typical reach guides (ours included) answer "does price touch the level within the next 24 bars?". But an intraday trader doesn't think in 24 bars: they think in this session. A fixed 24-bar window happily crosses from London into New York and into Asia.

We measured the alternative: bound the horizon to the bars remaining in the current session (using the CDF of time-to-reach from that session's pool). Event: "price covers d·ATR upward before the current session ends". We trained on 2023 and tested out of sample on 2024 (5,391 samples per distance):

Distance (ATR)	Fixed 24-bar Brier	Session-bounded Brier	Brier improvement
0.5	0.3366	0.2330	31%
1.0	0.4166	0.2164	48%
1.5	0.3811	0.1449	62%
2.0	0.3092	0.0934	70%
3.0	0.1777	0.0412	77%

The intuition behind the failure is simple: the fixed estimator prints "~79%" through the entire London session, but the real probability of covering 1 ATR before London closes depends brutally on the clock: 31% with one bar left, 49% with two, 59% with three, 65% with four. Late in the session, the fixed number overestimates by ~2.4×.
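The session-bounded estimate is an empirical CDF of time-to-reach, read off at the number of bars left. A minimal sketch, with assumptions of ours: the function name is hypothetical, each sample is the number of bars it took to cover the distance, and `None` marks a sample where the distance was not covered inside the measurement window (which must be at least as long as the largest `bars_left` queried).

```python
def reach_prob_within(times_to_reach, bars_left):
    """Fraction of historical samples that covered the distance
    in `bars_left` bars or fewer (empirical CDF of time-to-reach)."""
    if not times_to_reach:
        return None  # no history yet: report nothing rather than a fake 0%
    hits = sum(1 for t in times_to_reach if t is not None and t <= bars_left)
    return hits / len(times_to_reach)


# Toy pool for one session and one distance bucket, invented for illustration.
times = [1, 3, None, 2, 5, None, 1, 4, 2, None]

for k in (1, 2, 3, 4, 24):
    print(f"{k} bars left: {reach_prob_within(times, k)}")
```

Because the same pool answers every value of `bars_left`, the estimate falls automatically as the session runs out, which is the behavior a fixed 24-bar window cannot reproduce.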

Replication on fresh data

The session results above were measured on 2023–2024. To rule out a dataset artifact, we repeated the measurement on 404 H1 bars from June 2026, downloaded from a different source (TradingView/OANDA):

Bars left in session	P(1 ATR), 2023–24	P(1 ATR), Jun-2026
1	0.31	0.33
2	0.49	0.43
3	0.59	0.55
4	0.65	0.65

The reproduction, two years later and on a different data source, is close: the values match within 2pp at one and four bars left and within 4–6pp at two and three. The pattern is structural, not a quirk of the backtest window.

What we changed because of this

We redesigned our reach guide so the intraday horizon is "what remains of the current session" (time-to-reach histograms per session and distance bucket, updated causally), while keeping the property that everything is counted on the chart itself. One extra lesson from the calibration study: a frozen historical pool over-predicts by 5–13pp when the regime shifts (2023→2024) — the window must roll. More history is not always better history.
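One way to keep per-session, per-bucket histograms that roll is a bounded queue per key, so old samples fall out as new ones arrive. This is a sketch of the idea, not our production code: the class name, the key layout, and the `max_samples=5000` default are illustrative assumptions. To stay causal, a sample should only be added once its outcome is known (the distance was covered, or the session ended).

```python
from collections import defaultdict, deque


class RollingReachPool:
    """Rolling time-to-reach samples keyed by (session, distance bucket)."""

    def __init__(self, max_samples=5000):
        # deque(maxlen=...) silently drops the oldest sample when full.
        self._pools = defaultdict(lambda: deque(maxlen=max_samples))

    def add(self, session, bucket, time_to_reach):
        """Record a finished sample; use None if the distance was not covered."""
        self._pools[(session, bucket)].append(time_to_reach)

    def prob(self, session, bucket, bars_left):
        """Empirical P(reach within `bars_left`), or None without history."""
        pool = self._pools.get((session, bucket))
        if not pool:
            return None
        hits = sum(1 for t in pool if t is not None and t <= bars_left)
        return hits / len(pool)


# Tiny window to make the rolling behavior visible.
pool = RollingReachPool(max_samples=3)
for t in (1, None, 2, 5):
    pool.add("london", 1.0, t)  # the first sample (1) is evicted by the fourth

print(pool.prob("london", 1.0, bars_left=2))
print(pool.prob("new_york", 1.0, bars_left=2))
```

The window length is the trade-off named above: too short and the extremes turn into fake 0%s and 100%s, too long and a regime shift keeps biasing the estimate.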

Limitations (read before quoting)

Major FX pairs (EURUSD, GBPUSD) on H1/H4; other assets and timeframes may behave differently.
The session study measures upward d·ATR moves; downside symmetry is not tested here.
Distances are ATR(14)-normalized; in extreme volatility the ATR lags and biases the buckets.
Probabilities ≠ signals: a well-calibrated probability informs context, it does not tell you to trade.
None of this is financial advice. Trading involves risk of loss.

Takeaways

If your indicator gives you a "probability of touching a level", ask it two questions: over how many samples? and over what horizon?
Short samples fabricate fake 0%s and 100%s; fixed horizons fabricate optimism late in the session.
The VPOC does not attract price: distance and volatility run the show.
Honest numbers can be counted and verified on the chart itself — and they still have to be counted right.