Sequential gatekeeping traces a cost-recall frontier¶
Intuition (plain-language)
Forensic bid analysis is costly, so you cannot run it on every firm. Route it instead through the top of the cheap loser-side ranking: you decide how big a survivor pool to forward to the expensive stage, and each choice buys you a level of recall. There is no single magic cutoff and no optimal K — there is a frontier. And "how much you save" depends on what you count: by firms, a pool of 2,000 looks like an 88% cut; but those survivors are heavy bidders, so by bid rows — the data you actually have to recover — the saving is only about 33%. A smaller pool (K₁ = 1,000) even beats K₁ = 2,000 at the same k here. This is a retrospective recovery-footprint design, not measured agency budget savings.
🟡 Used as a sequential gatekeeper on the validated incumbent ranking, the award-layer ranking traces a cost-recall frontier: each survivor-pool size forwarded to the costly bid stage buys a level of recall against the 651 adjudication-anchored cobidders. There is no universal cut and no optimal K — only a frontier of operating points.
The honest cost depends on what you count, with explicit denominators. On pool A (16,731 firms, 651 positives) at the operating point K₁ = 2,000 survivors (sequential, evaluated at k = 500):
- By firms opened (denominator 16,731 firms), the pool falls
~88.1% (
\valCostFirmRedTwoK). - By bid rows to recover — the data an agency actually pays for —
the saving is only ~32.7% (
\valCostBidRowRedTwoK), because the survivors are high-participation firms that carry most of the bid volume. The firm count overstates the burden saving.
No optimal K. A smaller pool, K₁ = 1,000, recovers more true positives at k = 500 (124 TP, prec 0.248) than K₁ = 2,000 (116 TP, prec 0.232) — one more reason no operating point is "optimal" (AN-034, AN-035). Sequential at K₁ = 1,000 recovers 124 of the joint upper bound's 133 true positives at k = 500 (≈93%) at far lower informational cost.
The frontier — not a single optimal cutoff — is the design object. An agency picks an operating point that fits its forensic budget; the paper supplies the trade-off curve, not a prescription.
Caveat. Because strict timing for the bid rerank is not available with the current LANCES features, the frontier should be read as a retrospective cost-footprint design conditional on the validated incumbent ranking, not a fully prospective deployment test. These are recovery-footprint reductions with stated denominators, not measured agency budget savings; do not quote the 88% firm figure as the burden saving. The reading is 🟡 pending independent replication on a non-BEC procurement panel.
Sources.
- Own analysis: AN-012 (in-sample precision@k), AN-013 (temporal- holdout audit), AN-014 (leakage audit — defensible verdict), AN-034 (sequential envelope — no optimal K; K₁=1,000 beats K₁=2,000 at k=500), AN-035 (full architecture × k × regime matrix — recovery-footprint accounting), AN-036 (K-fold CV precision SD ≤ 0.011).
- Cross-refs: H:gatekeeping-cost-of-evidence; docs/results.md.
- Macros:
\valCostFirmRedTwoK(88.1% firm reduction at K₁=2,000, denominator 16,731 firms),\valCostBidRowRedTwoK(32.7% bid-row reduction — the honest burden figure),\valCostTPSeqOneK(124 TP at K₁=1,000),\valCostTPSeqTwoK(116 TP at K₁=2,000),\valCostTPJoint(133 joint upper-bound TP),\valCostPoolN(16,731),\valMainCobidders(651),\valFL(2,735). - Validation: backing scripts
scripts/42_operational_metrics.R,scripts/43_precision_at_k_audit.R,scripts/40_leakage_audit_d3.R.