Superseded numbers — canonical-target re-estimation (June 4, 2026)

This analysis note documents a historical run under the earlier validation label. On June 4, 2026 the paper adopted a reproducible, non-circular target (651 always-loser cobidders; frequent-loser flag never used in the label) and re-estimated every result. Where this page conflicts with the paper or the changelog, the paper wins.

AN-034: Sequential gatekeeping envelope (joint vs sequential cost-of-evidence)

Intuition (plain-language)

Opening full bid-level microdata is the expensive forensic step. Can a near-free award screen order which firms enter that cost first? An FL → Imhof pipeline keeping the top 2,000 firms recovers 74% of the true positives the full joint model finds, while substantially shrinking the bid-row footprint pulled into forensic analysis (on the order of a third, not the firm-count reduction). This is the paper's cost-of-evidence argument read honestly: it maps a cost–recall frontier where one operating point trades evidence for recall — there is no single optimal cutoff, and the joint benchmark is a full-observability upper bound, not an attainable target.

Question

When deployed sequentially (FL gatekeeper → Imhof forensic stage) vs jointly, how does the cost-of-evidence trade-off look across precision targets and Stage-1 cutoffs? The joint scoring of AN-010 is the full-observability upper bound; the sequential architecture is the operational deployment.

Design

Rules compared:
Award-layer only (FL log_tc): no bid microdata required.
Bid-layer only (Imhof full): requires full bid microdata.
Joint scoring (FL + Imhof, single model): full microdata.
Sequential FL → Imhof, Stage-1 keeps top K ∈ {1000, 2000, 4000}.
At k = 50 firms (top-50 of the relevant rule), report:
Smallest k achieving precision targets 0.10 / 0.15 / 0.20.
True positives (TP) at smallest k.
Recall at smallest k.
Bid-microdata footprint (number of bid records that must be recovered to apply the rule).
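
The sequential rule and the "smallest k" report can be sketched as follows. This is a minimal illustration on made-up data, not the pipeline that produced sequential_envelope.csv: the function names, the score dictionaries, the firm IDs, and the k grid are all hypothetical, and the two scores simply stand in for the FL and Imhof outputs.

```python
from typing import Dict, List, Optional, Sequence, Set


def sequential_rank(award_score: Dict[str, float],
                    bid_score: Dict[str, float],
                    stage1_keep: int) -> List[str]:
    """Stage 1 keeps the top `stage1_keep` firms by award-layer score;
    Stage 2 re-ranks only those firms by bid-layer score."""
    kept = sorted(award_score, key=award_score.get, reverse=True)[:stage1_keep]
    # Bid microdata is only needed for firms that survived Stage 1.
    return sorted(kept, key=lambda firm: bid_score[firm], reverse=True)


def tp_at_k(ranking: Sequence[str], positives: Set[str], k: int) -> int:
    """Count labelled positives among the top-k firms of a ranking."""
    return sum(1 for firm in ranking[:k] if firm in positives)


def smallest_k_for_precision(ranking: Sequence[str], positives: Set[str],
                             target: float, k_grid: Sequence[int]) -> Optional[int]:
    """Smallest k on the grid whose precision TP/k meets the target, else None."""
    for k in sorted(k_grid):
        if k <= len(ranking) and tp_at_k(ranking, positives, k) / k >= target:
            return k
    return None


# Toy data (hypothetical): firm D has the highest bid-layer score but is
# screened out at Stage 1, which is how the sequential rule can lose recall.
award_score = {"A": 0.9, "B": 0.8, "C": 0.7, "D": 0.2, "E": 0.1}
bid_score = {"A": 0.1, "B": 0.9, "C": 0.8, "D": 0.95, "E": 0.5}
positives = {"B", "D"}

ranking = sequential_rank(award_score, bid_score, stage1_keep=3)
print(ranking)                                   # ['B', 'C', 'A']
print(tp_at_k(ranking, positives, k=2))          # 1
print(smallest_k_for_precision(ranking, positives, 0.5, [2, 3]))  # 2
```

In this sketch the footprint of the sequential rule is driven entirely by `stage1_keep`, since bid-layer scores are looked up only for the firms Stage 1 retains.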

Results

At precision target ≥ 0.10, the smallest qualifying k is 50 for every rule:

Rule	TP @ k=50	Recall @ k=50	Bid-microdata footprint (records)
Award-layer only (FL log_tc)	15	7.8%	0
Bid-layer only (Imhof full)	12	6.2%	11,676
Joint scoring (FL + Imhof)	23	11.9%	11,676
Sequential FL → Imhof, K = 1,000	17	8.8%	1,000
Sequential FL → Imhof, K = 2,000	17	8.8%	2,000
Sequential FL → Imhof, K = 4,000	15	7.8%	4,000

Source: output/architecture_gatekeeper/sequential_envelope.csv.

Figure: cost-of-evidence Pareto plot. X-axis: bid-microdata footprint (records to recover). Y-axis: TP at k=50 (precision target 0.10). Joint scoring is the upper bound of the frontier (23 TP, 11,676 records); Sequential K=1,000 and K=2,000 reach 17 TP at roughly 9% and 17% of the joint microdata cost, respectively; Award-only (0 records, 15 TP) is the zero-cost benchmark.
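
To make the frontier in the figure concrete, the sketch below applies a standard Pareto-dominance check to the six (footprint, TP) pairs from the results table. It takes the footprint column at face value and uses TP at k=50 as the only benefit measure; the function name and the dictionary layout are illustrative assumptions, not part of the project code.

```python
from typing import Dict, List, Tuple

# (bid-microdata footprint, TP at k=50), copied from the results table.
POINTS: Dict[str, Tuple[int, int]] = {
    "Award-layer only": (0, 15),
    "Bid-layer only": (11_676, 12),
    "Joint scoring": (11_676, 23),
    "Sequential K=1,000": (1_000, 17),
    "Sequential K=2,000": (2_000, 17),
    "Sequential K=4,000": (4_000, 15),
}


def pareto_efficient(points: Dict[str, Tuple[int, int]]) -> List[str]:
    """Rules that no other rule beats on both axes: a rule is dominated if
    another has cost <= and TP >=, with at least one strict inequality."""
    efficient = []
    for name, (cost, tp) in points.items():
        dominated = any(
            c <= cost and t >= tp and (c < cost or t > tp)
            for other, (c, t) in points.items()
            if other != name
        )
        if not dominated:
            efficient.append(name)
    return efficient


frontier = pareto_efficient(POINTS)
print(frontier)
# ['Award-layer only', 'Joint scoring', 'Sequential K=1,000']
```

On these six points K=2,000 ties K=1,000 on TP at twice the footprint, and K=4,000 ties Award-only at a positive footprint. With a single k and a single precision target, differences of a few TP may not be robust, so this check describes the table rather than ranking the architectures.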

Cost-of-evidence trade-off

Architecture	TP @ k=50	Microdata cost (records)	TP per microdata record recovered
Award-only	15	0	∞ (no microdata)
Joint	23	11,676	0.0020
Sequential K=1,000	17	1,000	0.017 (8.6× more efficient than joint)
Sequential K=2,000	17	2,000	0.0085 (4.3× more efficient than joint)
Sequential K=4,000	15	4,000	0.0038 (1.9× more efficient than joint)

Recovery as % of joint upper bound

Architecture	TP @ k=50	% of joint	Microdata as % of joint
Award-only	15	65%	0%
Joint	23	100%	100%
Sequential K=1,000	17	74%	8.6%
Sequential K=2,000	17	74%	17.1%
Sequential K=4,000	15	65%	34.3%
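
For reference, both percentage columns are simple ratios against the joint row. The symbols below ($\mathrm{TP}_r$ for the true positives of rule $r$ at k=50, $F_r$ for its microdata footprint) are notation introduced here for illustration, not taken from the manuscript.

```latex
\[
  \text{TP share}_r = \frac{\mathrm{TP}_r}{\mathrm{TP}_{\text{joint}}},
  \qquad
  \text{footprint share}_r = \frac{F_r}{F_{\text{joint}}}
\]
% Worked example for Sequential K = 2,000:
\[
  \frac{17}{23} \approx 0.739 \;(74\%),
  \qquad
  \frac{2{,}000}{11{,}676} \approx 0.171 \;(17.1\%)
\]
```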

Interpretation

The envelope quantifies the operational trade-off:

Joint scoring is the upper bound (23 TP, 11.9% recall at k = 50) but requires full bid microdata on every firm (11,676 records).
Sequential FL → Imhof at Stage-1 K = 2,000 captures 74% of joint recall (17 TP vs 23) using 17% of the bid-microdata footprint (2,000 vs 11,676); K = 1,000 reaches the same 17 TP at 8.6% of the footprint. This is the operational architecture that approximates the full-observability upper bound at substantially lower forensic cost.
Award-only achieves 65% of joint recall (15 TP) with zero bid-microdata cost. For agencies that cannot recover bid microdata at all, the award-layer screen alone preserves roughly two-thirds of the joint model's true positives at k = 50.
The bid layer alone is the weakest of the four architectures (12 TP, 6.2% recall): Imhof requires participation features to reach its headline performance (AN-010 reports Imhof CV-only at 0.585, near chance level). Bid-distribution features alone are not a substitute for award-layer information.

For H:award-bid-complementarity: the sequential envelope confirms the complementarity claim at the operational level. Award-layer signal is necessary; bid-layer signal adds incremental discrimination at additional microdata cost; joint is the upper bound; sequential approximates joint at lower cost.

This is the architecture defended in §6 of the manuscript: the award layer orders forensic priority, deciding where the bid layer is opened — a division of labor, not a cartel detector. The envelope is a cost–recall frontier: K = 2,000 is a reported operating point, not an optimal cutoff, and the joint benchmark is a full-observability upper bound rather than an attainable deployment target. AN-013 reports the temporal-holdout precision metrics under this architecture.

Follow-ups

Same envelope under temporal holdout (rules formed on 2009–2016, applied to 2017–2019; partially available in temporal_holdout_table.csv).
Sensitivity to Stage-1 cutoff (smooth between K = 500 and K = 5,000).
Precision targets 0.25 and 0.30 (the table only reports through 0.20 because Award-only doesn't reach higher).
Add macros \valSeqEnvJointTP, \valSeqEnvSeqKTwoTP, \valSeqEnvSeqKTwoMicrodata to the scripts/99_make_paper_values.R pipeline.