AN-023: Theory operationalization audit¶
Intuition (plain-language)
This page audits the bridge between concept and code: "loser-side concentration" (the theory) is operationalized as FL14 (the rule). Is FL14 special? No — the continuous score beats every binary version (FL10, FL20, Tukey, percentile ranks; 0.939 vs 0.924 and below). The point is deliberately deflationary: FL14 is the auditable, deployable layer, but it is not ontologically privileged. The economic object is continuous concentration; the cutoff is an engineering choice you can defend without pretending it is a law of nature.
Question¶
Does the operational mapping from theory (loser-side concentration) to implementation (FL14) survive an explicit audit against alternative operationalizations? The audit anchors the locked rule of engagement: loser-side concentration is the concept; frequent losers is the implementation. The paper does not defend FL14 as ontologically special.
Design¶
- Sample: 16,843 always-loser firms in BEC 2009–2019.
- Operationalizations evaluated:
- Continuous log(1 + tenders_count) — the underlying signal.
- FL14 (paper convention): median + 1.5 × IQR, integer cutoff 14.
- Tukey Q3 + 1.5 × IQR (alternative IQR rule).
- Strict-train FL7 (cutoff retrained on 2009–2016 only).
- Outcome: AUC against the cobidder target.
Results¶
| Operationalization | AUC | 95% CI | N firms |
|---|---|---|---|
| Continuous log(1+tenders_count) | 0.939 | [0.932, 0.946] | 16,843 |
| FL14 (paper) | 0.924 | [0.921, 0.926] | 2,735 |
| Tukey Q3 + 1.5 × IQR | 0.834 | [0.804, 0.863] | 1,981 |
| Strict-train FL7 firm-level | 0.767 | [0.734, 0.800] | (train-pool) |
| Strict-train continuous (train) | 0.750 | [0.706, 0.795] | (train-pool) |
Macros: \valAUClogtc, \valAUCFLfirm, \valAUCQThreeIQR,
\valAUCStrictFirmFL, \valAUCStrictFirmTC, \valFLQThreeIQR,
\valFL, \valAlwaysLosers.
Figure: AUC point estimates across alternative FL operationalizations — continuous log_tc (0.939), FL14 (0.924), Tukey Q3 + 1.5 × IQR (0.834), strict-train FL7 (0.767). Continuous dominates; FL14 sits on the high plateau; tighter cutoffs lose discrimination. The paper's choice is auditable, not ontologically privileged.
Interpretation¶
The continuous score dominates every binary operationalization. FL14 is not ontologically privileged: it is the auditable, deployable layer on top of an underlying continuous primitive. Three readings:
- FL14 vs continuous (0.924 vs 0.939): the auditable binary loses ~0.015 AUC relative to the full-information continuous score — the trade-off price of an auditable cutoff.
- FL14 vs Tukey (0.924 vs 0.834): the paper's
median + 1.5 × IQRcutoff substantially outperforms the TukeyQ3 + 1.5 × IQRalternative. The choice is documented, not arbitrary. - Full-panel vs strict-train (0.924 vs 0.767, FL14 binary): in-sample numbers are inflated; the train-cutoff variant gives the honest discrimination (AN-006).
The audit forecloses the JLEO-reviewer suspicion that the paper is over-defending an arbitrary cutoff. The rule of engagement is explicit: the construct is the continuous primitive; FL14 is the operational rule.
Follow-ups¶
- Robustness to alternative IQR definitions (Q1+x×IQR, median+kσ).
- Sensitivity of the continuous score to alternative transformations (rank-percentile, raw counts).
- Persistence across sub-periods.
