Discrimination Backtest — Does Watchlist Rank Predict Designation?

Generated 2026-07-02T23:41:38+00:00

Result. Among entities our data links to a sanctioned party, watchlist rank separates the companies designated in the following year from those never designated: AUC-within-visible rises from 0.6623 at the 2022-01-01 landmark (the start of the invasion wave) to 0.7864 at 2025-01-01, improving at every landmark. A company on the qualified watchlist is designated in the next year at 117–238× the rate of a random Russian company. This is the ranking counterpart to the recall backtest: that one asks whether we flag designations at all; this one asks whether we rank them near the top.

The recall backtest asks a yes/no question (did any pre-designation link exist) and never uses the score that orders the watchlist. This backtest asks the ranking question directly: are the entities we rank highest the ones actually designated next? The design is a point-in-time landmark case-control study: at each January 1 landmark T, every entity is scored using only links dated strictly before T, and a link counts only if the counterparty was already sanctioned when the link formed.
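The point-in-time gate described above can be sketched as follows. This is a minimal illustration, not the production scorer: the `Link` schema, its field names, the link-type labels, and the helper names are all hypothetical, and treating a counterparty sanctioned on the same day the link formed as "already sanctioned" is an assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Link:
    """One tie between an entity and a counterparty (hypothetical schema)."""
    entity_id: str
    counterparty_id: str
    link_type: str                              # illustrative labels only
    formed_on: date                             # date the link formed
    counterparty_sanctioned_on: Optional[date]  # None if never sanctioned


def links_visible_at(links, landmark):
    """Keep links dated strictly before the landmark whose counterparty
    was already sanctioned when the link formed (same day counts: assumption)."""
    return [
        link for link in links
        if link.formed_on < landmark
        and link.counterparty_sanctioned_on is not None
        and link.counterparty_sanctioned_on <= link.formed_on
    ]


def features_at(links, landmark):
    """Per entity: (distinct link types, distinct sanctioned counterparties)."""
    seen = {}
    for link in links_visible_at(links, landmark):
        types, ties = seen.setdefault(link.entity_id, (set(), set()))
        types.add(link.link_type)
        ties.add(link.counterparty_id)
    return {entity: (len(t), len(c)) for entity, (t, c) in seen.items()}


T = date(2022, 1, 1)
links = [
    Link("A", "S1", "ownership", date(2021, 6, 1), date(2020, 1, 1)),  # kept
    Link("A", "S2", "director", date(2022, 3, 1), date(2020, 1, 1)),   # formed after T
    Link("B", "S3", "ownership", date(2021, 6, 1), date(2021, 9, 1)),  # sanctioned after link formed
    Link("C", "N1", "ownership", date(2021, 6, 1), None),              # never sanctioned
]
features = features_at(links, T)
```

Only entity A survives the gate at the 2022-01-01 landmark, with one link type and one sanctioned tie. Re-running the same function at a later landmark picks up A's second link, which is how each landmark gets its own leak-free feature set.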

Two AUCs, two questions. AUC-population compares designated entities against random EGRUL companies (symmetric zeros), so it folds selection and ranking together: being on the list plus where on it. AUC-within-visible compares only entities already linked to sanctioned parties, isolating pure ranking quality. 0.50 = no better than a phone book; 1.0 = perfect. Recall-visible is the share of window designations our data could see at all before T (the recall ceiling). A never-designated control is not a false positive, because designation is a throttled sample of the sanctionable set; this backtest therefore scores ranking, not calibrated probability.
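To make the difference between the two AUCs concrete, here is a small sketch using the pairwise (Mann-Whitney) definition of AUC. The scores are toy values, not backtest data. It assumes that an entity with no qualifying link scores 0 whether it is a case or a control (one reading of "symmetric zeros") and that "visible" means score > 0.

```python
def auc(case_scores, control_scores):
    """Probability that a random case outranks a random control.
    Ties count as half a win (pairwise Mann-Whitney form)."""
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Toy scores: 0 means no qualifying link before the landmark (assumption).
cases = [3, 2, 0, 0]      # designated in the window; two were never linked
controls = [1, 0, 0, 0]   # random never-designated companies; one is linked

# AUC-population: everyone takes part, unlinked entities tie at 0.
auc_population = auc(cases, controls)

# AUC-within-visible: restrict both sides to entities with a link.
visible_cases = [s for s in cases if s > 0]
visible_controls = [s for s in controls if s > 0]
auc_within_visible = auc(visible_cases, visible_controls)
```

In this toy set the ranking among linked entities is perfect, yet AUC-population is pulled toward 0.5 by the designated entities that carry no link. That is why the two columns answer different questions and need not move together. The nested loop is quadratic and only suitable for illustration.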

Per-landmark discrimination

Landmark T	Cases (visible)	Recall-vis	AUC-population	AUC-within-visible
2022-01-01	1,589 (513)	32.3%	0.6605	0.6623
2023-01-01	3,917 (968)	24.7%	0.6220	0.7331
2024-01-01	1,578 (419)	26.6%	0.6302	0.7546
2025-01-01	1,031 (299)	29.0%	0.6414	0.7864
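The Recall-vis column can be re-derived from the Cases (visible) column as visible cases divided by all cases in the window. A short check, using only the counts printed in the table above:

```python
# (cases in window, cases visible before T), copied from the table
rows = {
    "2022-01-01": (1589, 513),
    "2023-01-01": (3917, 968),
    "2024-01-01": (1578, 419),
    "2025-01-01": (1031, 299),
}

# Recall-visible in percent, rounded to one decimal as in the table
recall_visible = {
    landmark: round(100 * visible / cases, 1)
    for landmark, (cases, visible) in rows.items()
}
```

The computed values match the table's 32.3%, 24.7%, 26.6%, and 29.0%.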

Designation enrichment by score bucket (sampling-corrected, next-1yr window)

Lift = how many times more likely a company in the bucket is to be designated in the next year than a random EGRUL company. "qualified" is the published watchlist definition (≥2 link types or ≥3 sanctioned ties).

Landmark T	Base rate	flagged lift	qualified lift	multi-channel lift	≥3-channel lift
2022-01-01	0.0129%	157.4×	237.7×	990.7×	829.9×
2023-01-01	0.0318%	73.1×	133.8×	441.1×	761.6×
2024-01-01	0.0128%	46.7×	121.1×	312.0×	299.9×
2025-01-01	0.0084%	34.7×	117.3×	309.9×	1645.6×
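As a rough sketch of how a lift figure of this kind can be computed: the bucket's designation rate divided by the population base rate. The report does not spell out its sampling correction, so the version below assumes that every case is retained and that controls are subsampled, with each sampled control standing in for `control_weight` companies. That parameter, the function names, and the example numbers are illustrative only. The `is_qualified` predicate follows the definition stated above.

```python
def is_qualified(link_types: int, sanctioned_ties: int) -> bool:
    """Published watchlist definition: >=2 link types or >=3 sanctioned ties."""
    return link_types >= 2 or sanctioned_ties >= 3


def lift(bucket_cases, bucket_controls_sampled, control_weight,
         total_cases, population_size):
    """Bucket designation rate divided by the population base rate.

    Assumption: all cases are kept and each sampled control represents
    `control_weight` real companies, so controls are re-weighted upward.
    """
    est_controls = bucket_controls_sampled * control_weight
    bucket_rate = bucket_cases / (bucket_cases + est_controls)
    base_rate = total_cases / population_size
    return bucket_rate / base_rate


# Toy numbers, not backtest data: 20 cases and 198 sampled controls in the
# bucket, a 1-in-10 control sample, 100 cases among 1,000,000 companies.
example_lift = lift(
    bucket_cases=20,
    bucket_controls_sampled=198,
    control_weight=10,
    total_cases=100,
    population_size=1_000_000,
)
```

With these toy inputs the bucket rate is 1% against a 0.01% base rate, a lift of about 100×. Read in reverse, the table's 237.7× qualified lift on a 0.0129% base rate corresponds to a next-year designation rate of roughly 3.1% within the qualified bucket.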

Pooled (indicative — controls recur across years)

Cases 8,115 (2,199 visible pre-window)
AUC-population 0.6333 · AUC-within-visible 0.7666
Base designation rate 0.0165%; being on the qualified watchlist raises it 137.6×, and being in the ≥3-channel bucket raises it 726.1×.
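The pooled counts follow directly from the per-landmark rows. The pooled base rate can be approximately reproduced if one assumes it is total cases divided by total company-years, with each landmark's population inferred from its rounded base rate. That pooling rule is an assumption for this consistency check, not a documented method.

```python
# (cases, visible cases, base rate in percent), copied from the tables above
landmarks = [
    (1589, 513, 0.0129),
    (3917, 968, 0.0318),
    (1578, 419, 0.0128),
    (1031, 299, 0.0084),
]

total_cases = sum(cases for cases, _, _ in landmarks)
total_visible = sum(visible for _, visible, _ in landmarks)

# Population implied by each landmark's (rounded) base rate, then summed
# into company-years across the four landmarks (assumed pooling rule).
implied_company_years = sum(
    cases / (base_pct / 100) for cases, _, base_pct in landmarks
)
pooled_base_pct = 100 * total_cases / implied_company_years
```

The totals come out to 8,115 cases and 2,199 visible, and the pooled base rate rounds to 0.0165%, in line with the figures reported. Because the same controls recur across landmarks, the pooled numbers remain indicative rather than independent estimates.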

Reading it. AUC-within-visible isolates ranking given that a link exists: does breadth/tie-count order the linked crowd correctly? AUC-population adds selection, that is, being flagged at all versus a random company. Recall-visible is the orthogonal recall ceiling, the share of designations our data sees before designation. The enrichment table is the buyer-facing number: work the qualified list and you hit next-year designations at many times the base rate. Because designation is a throttled sample of the sanctionable set, a never-designated control is not a false positive; the table reports enrichment of real designations, not a precision claim.