From bd639ce3b66325452648b7e15c7c045cf7f250d0 Mon Sep 17 00:00:00 2001
From: Max Ghenis <mghenis@gmail.com>
Date: Fri, 22 May 2026 06:26:09 -0400
Subject: [PATCH 1/2] Clarify simultaneous-dollar-and-count calibration vs
 caseload-only
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

PE-US's L0 reweighting and TRIM3's caseload-driven calibration sound
similar on the surface ("aligned to administrative totals") but differ
in what is actually solved for. The previous YAML wording understated
the difference. Tightened three imputation rows:

PE-US SNAP: description now states that L0 simultaneously matches
  state and national SNAP outlay dollars AND state-level SNAP
  recipient-household counts in a single constrained optimization
  pass, so total dollars and caseloads are both pinned to
  administrative targets rather than just one or the other. The
  calibration-targets list adds the previously implicit
  recipient-household-count target (it's in
  policyengine-us-data/calibration/target_config.yaml but wasn't
  surfaced here).

TRIM3 SNAP: clarifies that total SNAP benefit outlay dollars are an
  emergent property of (participation × rule-computed benefit), not
  an explicit calibration target. The benefit-band composition
  target approximates the dollar total via the participant
  distribution, but doesn't solve for dollars directly. Cites
  Wheaton & Tran (Urban) on SNAP anti-poverty effects.

TRIM3 TANF: same structural note. TANF outlay dollars are not an
  explicit constraint in TRIM3, whereas PE-US's L0 has them as a
  state and national target alongside recipient-unit counts.
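
The structural difference can be sketched with a toy example. This is
an illustration only: the two household groups, weights, benefits, and
targets are made-up numbers, the function names are hypothetical, and
the exact 2x2 solve merely stands in for the L0-regularized
optimization that PE-US actually runs.

```python
# Toy data (hypothetical): two groups of recipient households.
weights = [60.0, 40.0]     # starting survey weights per group
benefits = [100.0, 300.0]  # rule-computed benefit per household in each group
COUNT_TARGET = 120.0       # administrative recipient-household count
DOLLAR_TARGET = 30000.0    # administrative outlay dollars


def caseload_only(weights, benefits, count_target):
    """Scale weights uniformly to hit the count target; dollars just emerge."""
    scale = count_target / sum(weights)
    new_weights = [w * scale for w in weights]
    dollars = sum(w * b for w, b in zip(new_weights, benefits))
    return new_weights, dollars


def count_and_dollars(benefits, count_target, dollar_target):
    """Solve w1 + w2 = count and w1*b1 + w2*b2 = dollars in one step."""
    b1, b2 = benefits
    w2 = (dollar_target - b1 * count_target) / (b2 - b1)
    w1 = count_target - w2
    return [w1, w2]


# Caseload-driven: the count matches, the dollar total is whatever falls out.
w_caseload, dollars_caseload = caseload_only(weights, benefits, COUNT_TARGET)

# Simultaneous: both the count and the dollar total are pinned.
w_joint = count_and_dollars(benefits, COUNT_TARGET, DOLLAR_TARGET)
dollars_joint = sum(w * b for w, b in zip(w_joint, benefits))

print("caseload-only:", sum(w_caseload), dollars_caseload)
print("count+dollars:", sum(w_joint), dollars_joint)
```

In the caseload-only case the dollar total depends entirely on the
starting mix of low- and high-benefit households; the joint solve has
to shift weight between the groups to reach both targets.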

Other rows are unchanged: TRIM3 SSI explicitly calibrates both
caseload AND benefits per its own docs, which our existing row
already captures, and the PE-US TANF, SSI, and Medicaid rows already
list both dollar and count targets.

61/61 tests pass; lint clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 data/comparisons/imputations.yaml | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/data/comparisons/imputations.yaml b/data/comparisons/imputations.yaml
index 856927d..c393222 100644
--- a/data/comparisons/imputations.yaml
+++ b/data/comparisons/imputations.yaml
@@ -15,12 +15,18 @@ imputations:
       Enhanced CPS first assigns `takes_up_snap_if_eligible` using a
       USDA take-up-rate prior while preserving CPS-reported SNAP
       recipients as take-up anchors. Calibration/local-area builds then
-      rerandomize the SNAP gate with block-level seeded draws and L0
-      reweighting governs final state and national SNAP totals.
+      rerandomize the SNAP gate with block-level seeded draws, and L0
+      reweighting solves a single constrained optimization that
+      simultaneously matches state and national SNAP outlay dollars
+      AND state-level SNAP recipient-household counts (so total
+      dollars and caseloads are both pinned to administrative targets
+      in the same pass, rather than benefits emerging from caseload
+      alignment).
     baseDataset: Enhanced CPS
     calibrationTargets:
-      - state-level SNAP dollars
-      - national SNAP outlays
+      - state-level SNAP outlay dollars
+      - national SNAP outlay dollars
+      - state-level SNAP recipient-household counts
     documentationUrl: https://github.com/PolicyEngine/policyengine-us-data
     reproducible: yes
     sources:
@@ -46,7 +52,12 @@ imputations:
       selects other eligible units by comparing participation
       probabilities to program-specific random numbers. Probabilities
       vary by unit type, benefit level, state, and citizenship; the
-      baseline is aligned to administrative caseload targets.
+      baseline is aligned to administrative caseload counts and
+      composition. Total benefit outlay dollars are an emergent
+      property (participation × rule-computed benefit) rather than an
+      explicit calibration target — they're approximated through the
+      benefit-band composition target, not solved for directly. See
+      Wheaton & Tran (Urban) on SNAP anti-poverty effects.
     baseDataset: CPS-ASEC
     calibrationTargets:
       - administrative caseload by state
@@ -160,6 +171,10 @@ imputations:
       employment, disability, and state-status predictors; baseline
       probabilities and random numbers are adjusted so true reporters
       are included and the caseload matches administrative targets.
+      Total TANF outlays are not an explicit calibration constraint —
+      they emerge from (caseload × rule-computed benefit). PE-US, by
+      contrast, includes TANF outlay dollars (state + national) and
+      TANF-recipient unit counts as simultaneous L0 targets.
     baseDataset: CPS-ASEC
     calibrationTargets:
       - administrative TANF caseload size

From 725fcfb06fa8d0cf05397d255dd2cf3ca3317e69 Mon Sep 17 00:00:00 2001
From: Max Ghenis <mghenis@gmail.com>
Date: Fri, 22 May 2026 06:28:30 -0400
Subject: [PATCH 2/2] Medicaid: spending target is derivative of per-capita
 constant, not formula
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Per-program calibration framing needs one more distinction. PE-US's
SNAP / TANF / SSI rows have rule-computed per-unit benefits (benefit
= f(income, household size, deductions, etc.)), so calibrating BOTH
caseload counts AND dollar outlays is genuinely informative — the
two targets constrain different dimensions of the imputation.

Medicaid is different. PE-US has no per-individual benefit formula
in the model; per-capita spending is assigned as a CMS-derived
constant. So "Medicaid spending" effectively falls out as (enrolled
people × per-capita spend), and adding a separate L0 dollar target
provides little information beyond the enrollment-count target.
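
A minimal sketch of why the dollar target is close to redundant here,
assuming a single per-capita constant as described above. The constant,
enrollment flags, and candidate weight vectors are hypothetical values
chosen only for illustration, not figures from the model.

```python
PER_CAPITA_SPEND = 8000.0   # hypothetical per-enrollee constant
enrolled = [1, 0, 1, 1, 0]  # hypothetical enrollment flag per record


def enrollment_total(weights):
    """Weighted count of enrolled people."""
    return sum(w * e for w, e in zip(weights, enrolled))


def spending_total(weights):
    """Weighted spending when every enrollee carries the same constant."""
    return sum(w * e * PER_CAPITA_SPEND for w, e in zip(weights, enrolled))


# Any reweighting moves enrollment and spending in lockstep.
candidate_weights = [
    [10.0, 20.0, 30.0, 40.0, 50.0],
    [5.0, 5.0, 5.0, 5.0, 5.0],
    [100.0, 0.0, 1.0, 2.0, 3.0],
]
ratios = [spending_total(w) / enrollment_total(w) for w in candidate_weights]
print(ratios)
```

Because the ratio is fixed for every weight vector, the spending
constraint is a scalar multiple of the enrollment constraint: it
either agrees with the count target or conflicts with it, but it
cannot pin down a separate dimension of the weights.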

Updated the PE-US Medicaid imputation row to call this out
explicitly, and re-labelled the calibration targets to mark
enrollment counts as primary and national spending as derivative.

61/61 tests pass; lint clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 data/comparisons/imputations.yaml | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/data/comparisons/imputations.yaml b/data/comparisons/imputations.yaml
index c393222..adc3484 100644
--- a/data/comparisons/imputations.yaml
+++ b/data/comparisons/imputations.yaml
@@ -252,13 +252,22 @@ imputations:
       Medicaid eligibility is computed rule-by-rule by state. The
       Enhanced CPS assigns `takes_up_medicaid_if_eligible` with
       state-specific KFF / MACPAC-derived priors and preserves reported
-      Medicaid coverage at interview as an enrollment anchor. Calibration
-      rerandomizes the Medicaid gate for take-up-affected targets and L0
-      reweighting governs final Medicaid enrollment counts and spending.
+      Medicaid coverage at interview as an enrollment anchor.
+      Calibration rerandomizes the Medicaid gate for take-up-affected
+      targets and L0 reweighting governs final Medicaid enrollment
+      counts. Important methodological caveat: unlike SNAP / TANF /
+      SSI where the per-unit benefit is a rule-computed dollar
+      amount (a function of income, household size, deductions),
+      Medicaid has no per-individual benefit formula in the model —
+      spending is assigned as a per-capita constant from CMS
+      administrative data. So "Medicaid spending" effectively falls
+      out as (enrolled people × per-capita spend) and a separate
+      L0 dollar target adds little information beyond the enrollment
+      count target.
     baseDataset: Enhanced CPS
     calibrationTargets:
-      - state and national Medicaid enrollment counts
-      - national Medicaid spending
+      - state and national Medicaid enrollment counts (primary)
+      - national Medicaid spending (derivative; equals enrolled × per-capita constant)
     documentationUrl: https://github.com/PolicyEngine/policyengine-us-data
     reproducible: yes
     sources: