H:placebo-and-dynamics — The urgent-procurement pattern is specific to litigated items; dynamic evidence is diagnostic, not the primary design¶

Two robustness pieces close the argument. First, a placebo: if the urgent-procurement pattern were a general artifact of the data rather than something tied to the litigation margin, it should reproduce on items that are never subject to litigation. It does not — the placebo coefficient is economically and statistically null. Second, a dynamic event study (BJS) that traces the urgent margin over time around exposure. The dynamic estimates are informative in direction but do not survive Honest-DiD sensitivity at the observed maximum pre-period scale, so the dynamic evidence is presented as a diagnostic, not as the paper's primary identification. The primary design remains the selection-bounded UTG comparison and the within-firm-buyer-item test.

Economic intuition

Two final checks. The first asks whether the pattern is real or just a quirk of the data: we run the same test on medicines that are never litigated. If the effect showed up there too, we would worry it was an artifact — but it comes back flat, essentially zero, which tells us the pattern is specific to the litigation margin. The second is a timing study that watches the urgent margin evolve around exposure. It points the right way, but when we stress-test it for pre-trend violations it does not hold up at the largest pre-period deviation we actually observe. So we report it honestly as a diagnostic that is consistent with the story, not as the backbone of the identification. The backbone is the bounded under-the-gun comparison and the same-firm test.

Evidence strength: Partial (strongly supported). The placebo on never-litigated items returns a negotiated-price coefficient of −0.020 (SE 0.032), null (AN-008) — the pattern does not reproduce off the litigation margin. A stricter buyer-by-class placebo battery (AN-013) restricts never-litigated items to procurement environments that also contain litigated purchases; the negotiated-price coefficient remains null (−0.046, SE 0.053), as do the quantity-controlled negotiated-price, reference-price, and bidder-count placebos. The BJS event study (AN-010) gives a first post-exposure estimate of 0.052 (SE 0.018) rising to 0.147 (SE 0.026) by the fifth period, but Honest-DiD sensitivity does not survive deviations at the observed maximum pre-period scale, so the dynamic evidence is diagnostic, not the primary design. The partial-strong status rests on the placebo specificity battery, not on treating the dynamics as causal identification.

Theory¶

A credible mechanism story should be specific: the cost margin should attach to the litigation margin, not to urgent procurement in general or to the data universe at large. A placebo on never-litigated items operationalizes this specificity — there is no litigation channel there, so any apparent effect would signal a confound rather than the mechanism. The dynamic event study addresses a different question: the timing of the margin around exposure. Modern heterogeneity-robust event-study estimators (Borusyak, Jaravel & Spiess, 2024) recover an interpretable dynamic path, but their credibility depends on parallel-trends assumptions that cannot be tested directly. Honest-DiD sensitivity analysis (Rambachan & Roth, 2023) asks how large a pre-trend violation the dynamic estimates could tolerate before the conclusion flips; when the answer is "smaller than the pre-period deviations we actually observe," the dynamic design cannot carry the primary identification and is properly demoted to a diagnostic.

Prediction¶

Placebo: the urgent-procurement price coefficient on never-litigated items should be economically and statistically zero.
Dynamics: the BJS event study should show a post-exposure rise in the urgent margin (directionally consistent with the main results), while Honest-DiD sensitivity will reveal whether that path is robust to plausible pre-trend violations.

Competing prediction¶

General artifact. If the urgent-procurement pattern were a generic feature of the data — an item-mix effect, a reporting artifact, a general urgency phenomenon unrelated to litigation — it would reproduce on never-litigated items. The null placebo coefficient (−0.020, SE 0.032) rejects this: the pattern is specific to the litigation margin. For the dynamics, the competing reading would treat the rising event-study path as primary causal evidence; the Honest-DiD result blocks that overclaim, which is exactly why the paper labels the dynamic evidence diagnostic rather than load-bearing.

Setting evidence¶

The BEC pharmaceutical data contain many items that are never subject to right-to-health litigation, providing a natural placebo universe with the same procurement procedures but no litigation channel. The same data support an exposure-timed event study around the onset of litigation for affected items. The institutional account in docs/paper.md describes how the litigation margin is identified in the data and why never-litigated items form a clean placebo group for specificity (not for the main causal contrast).

Empirical test¶

Placebo outcome: log negotiated price on never-litigated items, urgent contrast.
Dynamic outcome: BJS heterogeneity-robust event-study coefficients on the urgent margin around exposure, with a Rambachan-Roth Honest-DiD sensitivity overlay.
Specifications: placebo regression mirroring the main urgent specification on the never-litigated subsample; BJS estimator with pre- and post-exposure leads/lags; Honest-DiD bounds anchored to the observed maximum pre-period deviation.
Sample: never-litigated items for the placebo; the exposure panel for the dynamics.

Data requirements and limitations¶

Requires the never-litigated subsample for the placebo and an exposure-timed panel for the event study. The placebo is a specificity check and supports the mechanism by rule-out; a null there is consistent with, but not proof of, the main contrast. The dynamic event study is explicitly diagnostic: because Honest-DiD sensitivity does not survive deviations at the observed maximum pre-period scale, the event-study path should not be read as the primary causal design, and the paper does not rest any headline magnitude on it. The primary identification is the selection-bounded UTG comparison (H:utg-gap-selection-bounded) and the within-firm-buyer-item test (H:no-broad-same-firm-markup).

Evidence¶

Analysis	Bearing	Key takeaway
AN-008	Supports (via rule-out)	Placebo on never-litigated items: negotiated-price coefficient −0.020 (SE 0.032), economically and statistically null. The pattern is specific to the litigation margin.
AN-013	Supports	Stricter buyer-by-class placebo: matched never-litigated negotiated-price coefficient −0.046 (SE 0.053), −0.020 (SE 0.070) with quantity control; reference-price and bidder-count placebos are also null.
AN-010	Diagnostic	BJS event study: first post-exposure estimate 0.052 (SE 0.018), five-period estimate 0.147 (SE 0.026), directionally consistent. Honest-DiD does not survive deviations at the observed maximum pre-period scale — diagnostic, not primary.

Open tests¶

Classifier-threshold sensitivity¶

The placebo battery now includes a matched buyer-by-class never-litigated comparison. A remaining implementation check is to re-run the placebo battery under stricter classifier-confidence thresholds if regime-level confidence scores are exposed in the public replication files.

Alternative dynamic estimators as a triangulation¶

Re-estimating the dynamic path with a second heterogeneity-robust estimator and reporting its own Honest-DiD bound would triangulate the diagnostic reading. This would not promote the dynamics to primary status — that is precluded by the pre-period sensitivity — but it would document the diagnostic more fully.

Classifier robustness underpinning the samples¶

All of these samples rest on the urgent-class classifier (764,362 classified purchase orders; 98.6% exact agreement vs 179,148 ground-truth POs; urgent-class F1 0.93 judicial, 0.96 administrative, macro-F1 0.94). A sensitivity sweep re-running the placebo and dynamics under stricter classification thresholds would confirm the robustness pieces do not hinge on borderline classifications.