Backtesting the watchlist: do our signals actually anticipate designations?

A watchlist of "entities tied to sanctioned parties" is only worth anything if those ties show up before the designation, not after. So we hold our own method to that standard and publish the result: a point-in-time backtest that asks one question — of the entities that have since been designated, what fraction would our signals have flagged in advance, and how far ahead?

The discipline: no hindsight

The easy version of this test cheats. It looks at a sanctioned entity, finds any link to another sanctioned entity, and calls it a hit — ignoring that both designations may have happened on the same day, or that the "predictive" link was only visible in retrospect.

We don't count those. A signal qualifies only when both conditions hold:

1. the linked counterparty was already sanctioned by some jurisdiction at the moment of the court case, ownership stake, shared directorship, or contract; and
2. that moment predates the entity's own first designation.

Everything else is discarded. What survives is a genuinely point-in-time record: relationships that were observable, and informative, before the entity in question was listed by anyone.
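
The two conditions can be sketched as a small predicate. This is an illustration under stated assumptions, not our production code: the `Link` record, its field names, and the `qualifies` function are hypothetical names introduced for this example, and treating a same-day counterparty designation as "already sanctioned" is an assumption as well.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Link:
    """One observed relationship between an entity and a counterparty (hypothetical schema)."""
    channel: str                                    # e.g. "court", "ownership", "directorship", "procurement"
    event_date: date                                # date of the court case, stake, directorship, or contract
    counterparty_first_sanctioned: Optional[date]   # earliest listing of the counterparty, any jurisdiction
    entity_first_designated: date                   # earliest listing of the entity itself, any jurisdiction


def qualifies(link: Link) -> bool:
    """Return True only if the link was observable and informative in advance."""
    if link.counterparty_first_sanctioned is None:
        return False
    # Condition 1: the counterparty was already sanctioned when the event happened.
    # Assumption: a listing on the same day as the event counts as "already".
    counterparty_known = link.counterparty_first_sanctioned <= link.event_date
    # Condition 2: the event strictly predates the entity's own first designation.
    in_advance = link.event_date < link.entity_first_designated
    return counterparty_known and in_advance


# Toy examples with made-up dates.
early = Link("court", date(2019, 5, 1), date(2014, 7, 16), date(2023, 2, 24))
hindsight = Link("court", date(2023, 5, 1), date(2014, 7, 16), date(2023, 2, 24))
print(qualifies(early), qualifies(hindsight))
```

The second toy link is the hindsight case: the relationship is real, but it only became visible after the entity was itself listed, so it is discarded.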

What we measure — and what we don't

This is a recall test, not a precision test. We are not asking "how many of our flags came true." We are asking the inverse: of the population that did get designated, how many did we anticipate, and with how much lead time. Precision, the share of our flags that go on to be designated, is a different question that the forward watchlist addresses separately.
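
To make the two measured quantities concrete, here is a minimal sketch of recall and median lead time over a designated cohort. The function name, the input shape, and the toy dates are all assumptions made for illustration; none of the figures below are results from the backtest.

```python
from datetime import date
from statistics import median


def recall_and_lead(cohort):
    """cohort maps entity id -> (first designation date, earliest qualifying signal date or None)."""
    # Lead time in days, only for entities that had a qualifying signal in advance.
    leads = [
        (designated - flagged).days
        for designated, flagged in cohort.values()
        if flagged is not None
    ]
    recall = len(leads) / len(cohort) if cohort else 0.0
    median_lead = median(leads) if leads else None
    return recall, median_lead


# Toy cohort: every entity here was eventually designated.
toy = {
    "A": (date(2023, 1, 1), date(2018, 1, 1)),
    "B": (date(2024, 6, 1), None),
    "C": (date(2022, 9, 1), date(2021, 9, 1)),
    "D": (date(2025, 3, 1), None),
}
recall, lead = recall_and_lead(toy)
print(f"recall={recall:.1%}, median lead={lead} days")
```

Note that the input contains only entities that were designated. Flagged entities that were never designated do not appear in it, which is why this setup can measure recall but not precision.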

Five channels feed the test: shared court litigation, ownership at the ≥50% control threshold, ownership in the 10–50% minority band, shared directorship, and procurement (a contract signed with an already-sanctioned counterparty). Designations are anchored against OFAC plus the EU, UK, Ukraine, Canada, Australia, Japan, Switzerland, and New Zealand lists.

The result, as of this run

On the court-present cohort first designated since 2022, the signals anticipate 41.4% of subsequent designations. The two channels we added most recently, procurement and minority ownership, are what moved it there: on an apples-to-apples comparison over the same cohort, recall rises from 33.3% with the legacy three channels to 41.4% with all five. The honest current-state figure, on the broadened court-or-procurement denominator, is 40.7%. It is slightly lower because the procurement channel brings entities into the denominator that its signals do not always anticipate. We report both numbers and keep them separate; conflating a channel change with a denominator change is exactly the kind of quiet inflation we'd rather not do.
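
The distinction between a channel change and a denominator change can be sketched in a few lines. This is a toy illustration: the channel labels, the `recall` helper, and the five-entity dataset are hypothetical, and the resulting fractions are not the backtest's figures.

```python
# Hypothetical channel labels for the legacy three and the full five.
LEGACY = {"court", "ownership_control", "directorship"}
ALL_FIVE = LEGACY | {"ownership_minority", "procurement"}


def recall(cohort, signals, channels):
    """Share of cohort entities with at least one qualifying signal in `channels`."""
    if not cohort:
        return 0.0
    hits = sum(1 for entity in cohort if signals.get(entity, set()) & channels)
    return hits / len(cohort)


# Toy data: channels on which each designated entity had a qualifying advance signal.
signals = {
    "A": {"court"},
    "B": {"procurement"},
    "C": set(),
    "D": {"ownership_minority"},
    "E": set(),
}
court_cohort = {"A", "B", "C"}              # fixed denominator
broad_cohort = court_cohort | {"D", "E"}    # broadened denominator

legacy_same = recall(court_cohort, signals, LEGACY)    # channel set varies,
all_same = recall(court_cohort, signals, ALL_FIVE)     # denominator held fixed
all_broad = recall(broad_cohort, signals, ALL_FIVE)    # denominator varies
print(legacy_same, all_same, all_broad)
```

Comparing `legacy_same` with `all_same` isolates the effect of the added channels. Comparing `all_same` with `all_broad` isolates the effect of the wider denominator. Comparing `legacy_same` directly with `all_broad` would mix the two.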

The lead time is the part worth sitting with. The median anticipated entity was flagged ~1,843 days — roughly five years — before its designation. The longest leads exceed eleven years. This is not a nowcast; it is a structural early-warning signal with a long fuse.

Recall is also rising in recent cohorts: 51.7% for entities first designated in 2025, 60.6% for 2026, against ~35–42% earlier in the window. As the relationship graph deepens and designations increasingly land on entities already embedded in known networks, the method catches more of them.

A finding that fell out of it: allies lead OFAC

One cross-jurisdiction result is useful on its own. Of OFAC-designated entities with a Russian tax ID, 14.7% were already on an allied list before OFAC acted, by a median of 382 days. For that subset, allied designations are themselves a leading indicator for US action. If you screen only against OFAC, you are a median of roughly a year behind on the entities your peers in Brussels, London, and Kyiv have already named.
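
The allied-lead measurement reduces to comparing each entity's OFAC date with its earliest date on any other list. The sketch below assumes a simple mapping from entity to per-jurisdiction first-designation dates; the function name, input shape, and toy dates are hypothetical and are not drawn from our data.

```python
from datetime import date
from statistics import median


def allied_lead(designations):
    """designations maps entity id -> {jurisdiction: first designation date}."""
    leads = []
    ofac_total = 0
    for lists in designations.values():
        ofac_date = lists.get("OFAC")
        if ofac_date is None:
            continue  # never designated by OFAC, so outside the denominator
        ofac_total += 1
        allied_dates = [d for jurisdiction, d in lists.items() if jurisdiction != "OFAC"]
        # Count the entity only if some allied list named it strictly before OFAC did.
        if allied_dates and min(allied_dates) < ofac_date:
            leads.append((ofac_date - min(allied_dates)).days)
    share = len(leads) / ofac_total if ofac_total else 0.0
    return share, (median(leads) if leads else None)


toy = {
    "A": {"OFAC": date(2023, 2, 24), "EU": date(2022, 2, 24)},  # allies first
    "B": {"OFAC": date(2022, 4, 1)},                            # OFAC only
    "C": {"OFAC": date(2022, 6, 1), "UK": date(2022, 9, 1)},    # OFAC first
    "D": {"EU": date(2022, 5, 1)},                              # no OFAC listing
}
share, lead_days = allied_lead(toy)
print(f"share={share:.1%}, median lead={lead_days} days")
```

The median here is taken only over entities where an ally acted first, which is why the lead time describes that subset rather than every OFAC designation.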

What it's for

The backtest is the foundation the forward-looking watchlist stands on. We don't ask anyone to take the watchlist's flags on faith. We show the method's historical hit rate, its lead time, and its honest limitations first — and only then point it at the entities that aren't sanctioned yet. That companion piece is the next entry.