AN-028: RAIS-validated SME winner composition
Intuition (plain-language)
Use employer records (RAIS) to confirm that the 'SME' winners are genuinely small firms. The price effect survives restricting to RAIS-validated small winners, but the distance effect vanishes, exactly as a firm-size composition story predicts (the distant winners were the larger non-SMEs). This isolates the composition channel.
Question
The composition channel in AN-008 loaded +185% of the Gelbach gap on the SME-winner mediator, but the SME-winner indicator there is from BEC's own SME classification. Two complementary questions: (a) does the price effect survive restricting to RAIS-validated SME winners (i.e., firms with ≤49 formal employment links in 2017)? (b) does the distance effect vanish in the SME-validated subsample, as the geographic-catchment channel would predict if non-SMEs drive distance widening?
Design
- Sample: BEC items, 18-month window, with winner CNPJ linked to RAIS 2017 employment counts via CNPJ raiz (82.6% match rate; project memory). Five subsamples by RAIS employment:
    - Full (N = 649,714)
    - Winner in RAIS (N = 599,020; drops unmatched winners)
    - SME (≤49 employment links) (N = 532,538)
    - Micro (≤9 employment links) (N = 437,975)
    - Not large (<100 employment links) (N = 544,757)
- Specification (a): same DiDiR as AN-001 on each subsample.
- Specification (b): outcome = \(1\{\text{winner has } \leq 49 \text{ RAIS employment links}\}\), with item fixed effects; otherwise the same DiDiR as in (a).
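The subsample cuts listed above can be sketched in a few lines. This is a minimal illustration, not the project's code: the names `cnpj_raiz` and `rais_subsamples` are hypothetical, the CNPJ raiz is taken to be the first 8 digits of the 14-digit CNPJ, and an unmatched winner is assumed to carry `None` as its employment count.

```python
import re
from typing import Optional, Set


def cnpj_raiz(cnpj: str) -> str:
    """Return the 8-digit CNPJ root (raiz) of a 14-digit CNPJ.

    Punctuation is stripped and the number is left-padded with zeros,
    because leading zeros are often lost when CNPJs are stored as integers.
    """
    digits = re.sub(r"\D", "", cnpj).zfill(14)
    if len(digits) != 14:
        raise ValueError(f"not a 14-digit CNPJ: {cnpj!r}")
    return digits[:8]


def rais_subsamples(links: Optional[int]) -> Set[str]:
    """Return every subsample an item belongs to, given the winner's
    RAIS 2017 employment links (None = winner not matched in RAIS).

    Label strings are illustrative; thresholds follow the Design list.
    """
    labels = {"Full"}
    if links is None:
        return labels
    labels.add("Winner in RAIS")
    if links <= 49:
        labels.add("SME")
    if links <= 9:
        labels.add("Micro")
    if links < 100:
        labels.add("Not large")
    return labels


print(cnpj_raiz("12.345.678/0001-95"))
print(sorted(rais_subsamples(60)))
```

The subsamples are nested rather than mutually exclusive (every Micro winner is also SME and Not large), which is why the function returns a set of labels instead of a single category.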
Results
Panel A — Log price across RAIS subsamples (tab_rais_validation.tex):
| Subsample | β on \(g65 \times \text{Pre}\) | SE | N |
|---|---|---|---|
| Full | −0.109*** (sign-flipped baseline) | (0.012) | 649,714 |
| Winner in RAIS | −0.102*** | (0.012) | 599,020 |
| SME (≤49) | −0.121*** | (0.010) | 532,538 |
| Micro (≤9) | −0.089*** | (0.013) | 437,975 |
| Not large (<100) | −0.118*** | (0.010) | 544,757 |
Panel B — Distance across RAIS subsamples:
| Subsample | β on \(g65 \times \text{Pre}\) | SE | p |
|---|---|---|---|
| Full | +14.25*** | (2.36) | <0.01 |
| Winner in RAIS | +17.10*** | (2.37) | <0.01 |
| SME (≤49) | −0.28 | (2.44) | >0.10 (null) |
| Micro (≤9) | +3.06 | (2.81) | >0.10 |
| Not large (<100) | +6.68*** | (2.48) | <0.01 |
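As a sanity check on the p column, the reported coefficients and standard errors can be turned into z-statistics and two-sided p-values. The sketch below assumes a standard normal reference distribution; the note does not state the clustering or degrees of freedom behind the reported p-values, so the numbers are approximate. The names `PANEL_B` and `two_sided_p` are illustrative.

```python
import math

# (beta, se) pairs copied from Panel B (distance outcome).
PANEL_B = {
    "Full": (14.25, 2.36),
    "Winner in RAIS": (17.10, 2.37),
    "SME (<=49)": (-0.28, 2.44),
    "Micro (<=9)": (3.06, 2.81),
    "Not large (<100)": (6.68, 2.48),
}


def two_sided_p(beta: float, se: float) -> float:
    """Two-sided p-value for H0: beta = 0 under a normal approximation.

    Uses the identity 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    """
    z = beta / se
    return math.erfc(abs(z) / math.sqrt(2))


for name, (beta, se) in PANEL_B.items():
    print(f"{name:<18} z = {beta / se:+6.2f}   p = {two_sided_p(beta, se):.4f}")
```

Under this approximation the SME and Micro rows stay well above the 0.10 threshold, while the Full, Winner in RAIS, and Not large rows fall below 0.01, in line with the table.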
Panel C — Log firms (entry) across RAIS subsamples: all subsamples show positive log-firm effects (0.07 to 0.12), all p<0.01.
Winner composition DiDiR (tab_winner_rais.tex): outcome \(1\{\text{winner is SME (≤49 RAIS)}\}\):
| Window | β | SE |
|---|---|---|
| 6-month | −0.110*** | (0.008) |
| 12-month | −0.168*** | (0.007) |
| 18-month | −0.213*** | (0.007) |
Open competition reduces the probability of an RAIS-validated SME winner by 21 percentage points at 18 months.
Output: output/tables/tab_rais_validation.tex, output/tables/tab_winner_rais.tex.
Interpretation
The price effect survives SME-only validation, and is slightly larger in the SME (≤49) subsample than in the full sample (−0.121 vs −0.109). Across the RAIS-validated subsamples the price coefficient stays between −0.089 (Micro) and −0.121 (SME): all negative, all p<0.01. The composition channel cannot fully explain the price effect: even conditional on the winner being a verified SME, the policy regime moves prices. The reduced-form result therefore reflects both a competition effect and a composition effect (per the Gelbach decomposition in AN-008); neither component is residual.
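To read the Panel A coefficients as price changes, log points can be converted to percent with \(100 \times (e^{\beta} - 1)\). The sketch below assumes the outcome is the natural log of price, as the panel title suggests; `PANEL_A` and `pct_change` are illustrative names.

```python
import math

# Coefficients on g65 x Pre copied from Panel A (log price outcome).
PANEL_A = {
    "Full": -0.109,
    "Winner in RAIS": -0.102,
    "SME (<=49)": -0.121,
    "Micro (<=9)": -0.089,
    "Not large (<100)": -0.118,
}


def pct_change(beta: float) -> float:
    """Exact percent change implied by a log-point coefficient."""
    return 100.0 * (math.exp(beta) - 1.0)


for name, beta in PANEL_A.items():
    print(f"{name:<18} beta = {beta:+.3f}   implied change = {pct_change(beta):+.1f}%")
```

For coefficients this small the exact conversion stays close to the log-point reading, so the ranking of subsamples is unchanged.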
The distance effect vanishes in the SME-validated subsamples. The full-sample distance coefficient (+14.25 km) is driven by non-SME winners: restricting to SME (≤49) gives β = −0.28 (null); Micro (≤9) gives β = +3.06 (null). Adding back firms with fewer than 100 employment links recovers β = +6.68. This is the cleanest evidence available for H:distance-widens-under-open: the geographic-catchment widening is entirely a non-SME composition effect. RAIS-validated SMEs win locally; non-SMEs win at a wider radius.
The composition shift is large. Open competition reduces the RAIS-validated SME-winner probability by 21 percentage points at 18 months — a structural shift in winner composition. This is the strongest test of H:sme-winner-share-falls: the BEC SME-status flag could be unreliable (firms self-classify), but RAIS employment is administratively recorded and orthogonal to BEC behavior. The 21 pp composition shift on the cleaner indicator matches the Gelbach reading.
Reading-bridge. Together, AN-008 (Gelbach decomposition: +185% composition / −85% competition / large unexplained), AN-009 (reduced-form SME-winner shift on BEC indicator), and this AN (RAIS-validated 21 pp shift) form a triangulation: all three identify the same composition channel through different lenses; none separately explains the full reduced-form price effect.
Confidence: yellow. RAIS is the cleanest available firm-size proxy in Brazilian data: administratively recorded, lagged relative to BEC behavior, and free of self-reporting bias. The 82.6% match rate is high. The yellow caveat is that the SME (≤49) cut is only a proxy for the federal SME definition (Lei Complementar 123/2006), which uses revenue and employment thresholds, whereas the cut here uses employment alone.
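The coverage of each subsample can be checked directly from the item counts in the Design section. The arithmetic below is a sketch with illustrative names (`COUNTS`, `share`). The item counts imply that about 92% of items have a winner found in RAIS, which does not reproduce the 82.6% match rate, so that figure presumably refers to a different unit (for example firms rather than items); that reading is an assumption, not something the note states.

```python
# Item counts copied from the Design section.
COUNTS = {
    "Full": 649_714,
    "Winner in RAIS": 599_020,
    "SME (<=49)": 532_538,
    "Micro (<=9)": 437_975,
    "Not large (<100)": 544_757,
}


def share(n: int, base: int) -> float:
    """Percentage of `base` items retained in a subsample of size `n`."""
    return 100.0 * n / base


full = COUNTS["Full"]
matched = COUNTS["Winner in RAIS"]
for name, n in COUNTS.items():
    print(
        f"{name:<18} {share(n, full):5.1f}% of all items, "
        f"{share(n, matched):5.1f}% of RAIS-matched items"
    )
```

The "of RAIS-matched items" column exceeds 100% for the Full row by construction, since the full sample includes unmatched winners.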
Follow-ups
- Revenue-based SME validation: link to Receita Federal CNPJ revenue records to complement RAIS employment. The dual SME definition (employment + revenue) is closer to the legal indicator used by BEC.
- Firm-age cross-cut: combine RAIS employment with firm age (project script 15) to isolate young SMEs vs old micro firms — the policy may benefit one more than the other.
- CNAE heterogeneity (project script 16): the composition shift may differ across sector codes within Group 65 (e.g., medical equipment vs disposables); document the per-CNAE breakdown.