Make the model comparison properly reflect PolicyEngine: audited data + host-model UI #43
Open
MaxGhenis wants to merge 4 commits into
Comparison tables now distinguish 'this model' from peers: the PolicyEngine column/row gets a teal tint and a 'This model' chip in the coverage matrix, validation benchmarks, behavioral tables, calibration, pipeline, and About panels.

The coverage matrix pins the program column while scrolling horizontally, the compare drawer groups peers by sector with model-type sublabels, and validation rows show percent deviation from the administrative target (with GBP formatting for UK benchmarks).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
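The percent-deviation display described in this commit could be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the names `percentDeviation`, `formatDeviation`, and `formatAmount` are hypothetical, and the locale choice for GBP is an assumption.

```typescript
// Hypothetical helpers; names and signatures are assumptions, not taken from the PR.
type Currency = "USD" | "GBP";

/** Signed deviation of a model estimate from its administrative target, in percent. */
function percentDeviation(modelValue: number, target: number): number {
  if (target === 0) {
    throw new RangeError("target must be non-zero");
  }
  return ((modelValue - target) / target) * 100;
}

/** Render a deviation with an explicit sign and one decimal place, e.g. "+2.3%". */
function formatDeviation(pct: number): string {
  const sign = pct > 0 ? "+" : "";
  return `${sign}${pct.toFixed(1)}%`;
}

/** Format a benchmark amount in the currency of its country (GBP for UK rows). */
function formatAmount(value: number, currency: Currency): string {
  const locale = currency === "GBP" ? "en-GB" : "en-US";
  return new Intl.NumberFormat(locale, {
    style: "currency",
    currency,
    maximumFractionDigits: 0,
  }).format(value);
}
```

Keeping the deviation calculation separate from the formatting means the same number can drive both the table cell and any sorting or colouring logic.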
Corrects nine PolicyEngine US rows that no longer matched the code: estate tax is partial (the IRC 2001(c) schedule and unified credit are implemented over an exogenous taxable estate), OASI documents the full PIA/AIME benefit engine alongside the survey-reported microsim input, UI notes the NJ and PA benefit engines, LIHEAP corrects a nonexistent federal-module claim to the DC/IL/MA/TX(+Riverside) state programs, CCDF updates to the ~21-state ruleset, TANF quantifies 28 states + DC, SSI state supplements list all 20 modeled states, state EITC/CTC counts match the code (28+DC, 17+DC), and Section 8 is partial per the model's own metadata (AMI inputs cover selected geographies).

Adds eight programs PolicyEngine models end to end that the matrix lacked (school meals, Pell Grant, Head Start, Lifeline, ACP, local income taxes (NYC/Philadelphia/MD), AMT, and NIIT), with statute citations and coverage rows grounded in variable paths and test counts.

Also resolves peer cells: TRIM3 models LIHEAP (Urban overview + ASPE TRIM3 brief; the older boreas list is non-exhaustive) and ACA subsidies live in Urban's sibling HIPSM model rather than TRIM3; TPC treats SNAP/TANF/SSI as TRIM3-adjusted data inputs, not simulated programs; Tax-Calculator is federal-only (no state EITC/CTC); and entitledto includes the Scottish Child Payment.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
PolicyEngine US rows replace 'predicted value pending' placeholders with actual model runs (2026-06-09, versions noted per row) against 2024 administrative targets: SNAP +2.3% vs USDA FY2024, EITC -3.2% vs IRS TY2024, Medicaid enrollment +5.5% vs CMS December 2024, CTC +11.5% vs the latest complete IRS SOI total, and SSI -13%/-22% with the documented take-up gap stated plainly.

Income tax is deliberately excluded: the current enhanced CPS build overstates AGI via an inflated miscellaneous_income imputation (tracked upstream in policyengine-us-data#1107), and publishing that number would misrepresent the model.

Peer benchmarks give the page independent grounding: the Census Bureau's evaluation of TRIM3 (87% of the IRS income-tax target, 73% of EITC, 101% of CTC) and TAXSIM (88%, 73%) for TY2012, Tax-Calculator's sub-dollar agreement with TAXSIM-35, and six UKMOD simulated-vs-official 2023 benchmarks from the CeMPA country report, including the documented 37% Housing Benefit shortfall.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
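The PolicyEngine rows above are quoted as signed deviations (for example -3.2%), while the Census Bureau peer figures are quoted as a share of the target (for example 87%). A small sketch of putting both on the same scale, using hypothetical function names that are not from the PR:

```typescript
// Hypothetical helpers for illustration; not part of the PR's code.

/** Convert "model reaches X% of the target" into a signed percent deviation. */
function shareOfTargetToDeviation(sharePct: number): number {
  return sharePct - 100;
}

/** Render a whole-number deviation with an explicit sign. */
function signedPercent(pct: number): string {
  return `${pct > 0 ? "+" : ""}${pct}%`;
}

// Figures quoted in the commit message for the TRIM3 evaluation (TY2012).
const trim3Shares = { incomeTax: 87, eitc: 73, ctc: 101 };

const trim3Deviations = {
  incomeTax: signedPercent(shareOfTargetToDeviation(trim3Shares.incomeTax)), // "-13%"
  eitc: signedPercent(shareOfTargetToDeviation(trim3Shares.eitc)), // "-27%"
  ctc: signedPercent(shareOfTargetToDeviation(trim3Shares.ctc)), // "+1%"
};
```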
…e facts

Usage rows replace blanket unknowns with verified, sourced facts: HM Treasury's Algorithmic Transparency Record documenting its PolicyEngine UK pilot (with The Times coverage), the Nuffield Foundation grant, a US House press release citing PolicyEngine estimates, the Niskanen Center CTC report, MyFriendBen and Benefit Navigator API integrations, the NBER TAXSIM-emulator MoU, and exact GitHub/PyPI counts retrieved 2026-06-09.

Transparency rows fill contributor counts (132 US / 23 UK) and test counts (21,697 / 1,084 named cases) with API-backed sources.

The JOSS citation is corrected everywhere: DOI 10.21105/joss.04494 belongs to an unrelated paper; PolicyEngine's JOSS submission is under review, so the rows now link the actual review thread and academicCitations stays unknown.

Behavioral rows add the optional CBO-derived elasticity presets (income -0.05, substitution 0.22-0.31 by decile, capital gains -0.79), clearly marked off-by-default, complementing the existing static-default rows.

Modeling mechanics add state/congressional-district and local-authority/constituency calibrated weights, the continuous test suite + TAXSIM cross-validation, and the off-by-default behavioral-response design.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
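The off-by-default elasticity presets mentioned in this commit could be represented along these lines. This is a sketch under stated assumptions: the interface, field names, and `resolveElasticities` are hypothetical, and only the numeric values come from the commit message.

```typescript
// Illustrative only: the shape and names are assumptions; the values are those
// quoted in the commit message.
interface ElasticityPreset {
  label: string;
  income: number;
  /** Lowest and highest substitution elasticity across income deciles. */
  substitutionRange: readonly [number, number];
  capitalGains: number;
  enabledByDefault: boolean;
}

const cboDerivedPreset: ElasticityPreset = {
  label: "CBO-derived (optional)",
  income: -0.05,
  substitutionRange: [0.22, 0.31],
  capitalGains: -0.79,
  enabledByDefault: false, // static analysis remains the default
};

/** Return the preset only when it is on by default or the user opted in. */
function resolveElasticities(
  preset: ElasticityPreset,
  userOptedIn: boolean,
): ElasticityPreset | null {
  return preset.enabledByDefault || userOptedIn ? preset : null;
}
```

Returning `null` when the preset is inactive makes the static default explicit at the call site rather than relying on zero-valued elasticities.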
Improves the open microsimulation reference so it accurately and richly represents PolicyEngine alongside its peers, on both the data and UI sides.
Make the comparison reflect the actual model
Every PolicyEngine row was audited against the model code at v1.722.5 (2026-06-09):
Estate tax is partial, not not-implemented (IRC §2001(c) schedule + §2010 unified credit are implemented over an exogenous taxable estate); OASI notes now document the full PIA/AIME benefit engine (88 test cases) alongside the survey-reported microsim input; UI notes the NJ + PA benefit-determination engines; LIHEAP drops a nonexistent federal-module claim in favor of the actual DC/IL/MA/TX(+Riverside County) implementations; CCDF (~21 states), TANF (28 + DC), SSI state supplements (20 states), state EITC (28 + DC), and state CTC (17 + DC) counts now match the code; Section 8 is partial per the model's own metadata.

Income tax is deliberately excluded from the validation rows: the current enhanced CPS build overstates AGI via an inflated miscellaneous_income imputation (tracked in policyengine-us-data#1107, "enhanced_cps_2024 overshoots CBO income_tax target by ~1.86x across 2024-2026: loss weighting drowns out aggregate targets"), so publishing it would misrepresent the model.

PolicyEngine's JOSS submission is still under review, so the rows keep academicCitations: unknown.

Fairer, deeper peer rows
UI: the host model reads as "ours"
Tests: 77 passing (11 new), bun run build clean, eslint . clean.

🤖 Generated with Claude Code