Make microimpute.Imputer the canonical regime-gated sequential imputer by MaxGhenis · Pull Request #193 · PolicyEngine/microimpute

MaxGhenis · 2026-06-06T06:22:28Z

Summary

Makes microimpute.Imputer the canonical, opinionated imputer — sign-regime gating + QRF base + sequential chained-equations imputation, all on by default — and bakes in what we know works while exposing knobs only for what's worth A/B-testing.

This started as "add chaining to ZeroInflatedImputer" and grew into the right API:

Sequential chaining is always on (and was the missing piece). Imputing a list of targets conditions each on the previously-imputed ones, so the imputed vector preserves cross-variable joint structure. The old per-variable-independent path — which was accidentally comonotonic (each target's base QRF is seeded identically, so they drew the same per-row quantile) — is gone, along with its sequential flag. Turning chaining off is never what you want.
signregime: bool = True — the {neg, 0, pos} gate (which fixes the "QRF on y>0 drops the negative tail" bug, e.g. capital/business losses). signregime=False imputes with the base model directly, for comparison.
base_imputer_class (default QRF) — the model knob, for experiments (OLS, MDN, …).
Rename: ZeroInflatedImputer → Imputer (it was a misnomer — it's a regime-gated/hurdle model, not zero-inflation); the old abstract base Imputer → BaseImputer (still exported). Breaking — see migration below.
Lineage: the fitted result's .lineage() returns per-variable VariableLineage: regime, the chained predictor set, training-support counts, the fitted models, and feature importances where the model exposes them.
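To make the sign-regime gate concrete, here is a minimal sketch of the idea using only NumPy. It is not microimpute's implementation: the function `impute_sign_gated`, the binned hot-deck "base model", and the toy data are all assumptions introduced for illustration. It first draws each row's regime in {neg, 0, pos}, then draws a magnitude from donors in that regime, and contrasts this with a model fit on y>0 only, which cannot produce negative values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy donor data (hypothetical): a target with negative, zero, and positive regimes.
n = 4000
x = rng.normal(size=n)
regime = rng.choice([-1, 0, 1], size=n, p=[0.15, 0.50, 0.35])
magnitude = np.exp(0.5 * x + rng.normal(scale=0.5, size=n))
y = regime * magnitude


def impute_sign_gated(x_donor, y_donor, x_recv, rng, n_bins=5):
    """Binned hot-deck stand-in for a regime-gated imputer (illustrative only)."""
    # Bin the predictor on donor quantiles so every bin contains donors.
    edges = np.quantile(x_donor, np.linspace(0, 1, n_bins + 1)[1:-1])
    donor_bin = np.digitize(x_donor, edges)
    recv_bin = np.digitize(x_recv, edges)
    donor_sign = np.sign(y_donor)
    out = np.zeros(len(x_recv))  # rows gated to the zero regime stay at 0
    for b in range(n_bins):
        rows = np.flatnonzero(recv_bin == b)
        in_bin = donor_bin == b
        # Gate: draw each receiver's regime from the bin's empirical regime shares.
        signs = rng.choice(donor_sign[in_bin], size=len(rows))
        for s in (-1.0, 1.0):
            take = rows[signs == s]
            if len(take) == 0:
                continue
            # Value: draw from donors in the same bin and the same regime.
            pool = y_donor[in_bin & (donor_sign == s)]
            out[take] = rng.choice(pool, size=len(take))
    return out


x_recv = rng.normal(size=n)
gated = impute_sign_gated(x, y, x_recv, rng)
# Fitting on positive donors only is the pattern that drops the negative tail.
pos_only = impute_sign_gated(x[y > 0], y[y > 0], x_recv, rng)

print("donor share negative:   ", np.mean(y < 0))
print("gated share negative:   ", np.mean(gated < 0))
print("pos-only share negative:", np.mean(pos_only < 0))
```

In the real package the gate and the per-regime value model would be fitted models (QRF by default, per the summary) rather than binned donor draws; the sketch only shows why the gate is needed to keep losses in the imputed distribution.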

Migration

from microimpute.models.zero_inflated import ZeroInflatedImputer → from microimpute import Imputer
references to the old base class Imputer (subclassing / isinstance) → BaseImputer
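For code that has to run against both pre- and post-rename releases during the transition, a small resolver along these lines may help. This is a sketch under assumptions: the helper name `resolve_imputer` is hypothetical, and it assumes that `BaseImputer` is exported from the top-level `microimpute` package only after the rename, so its presence can be used to tell the new concrete `Imputer` apart from the old abstract base of the same name.

```python
import importlib


def resolve_imputer():
    """Return the regime-gated imputer class, or None if it cannot be found."""
    try:
        pkg = importlib.import_module("microimpute")
    except ImportError:
        return None  # microimpute is not installed in this environment

    # Assumption: only post-rename releases export BaseImputer at top level,
    # so its presence signals that `Imputer` is the new concrete class.
    if hasattr(pkg, "BaseImputer") and hasattr(pkg, "Imputer"):
        return pkg.Imputer

    # Otherwise fall back to the pre-rename import path given in the migration notes.
    try:
        legacy = importlib.import_module("microimpute.models.zero_inflated")
    except ImportError:
        return None
    return getattr(legacy, "ZeroInflatedImputer", None)


ImputerClass = resolve_imputer()
```

Once every environment is on a post-rename release, the resolver can be replaced by the plain `from microimpute import Imputer` shown above.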

Testing

test_regime_gated_chaining.py: a target pair correlated only through an unobserved latent factor is recovered by one chained list call (corr −0.92 vs true −0.93) but not by separate per-variable calls (the old microplex per-column pattern); .lineage() reports chained predictors/metrics/importances; signregime=False disables gating.
Full suite: 312 passed, 3 skipped. The 8 test_autoimpute failures are pre-existing — they reference the optional rpy2-backed Matching imputer, absent in this env.
black --line-length 79 + isort clean.
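The latent-factor check can be reproduced in miniature without the package. The sketch below is not the code in test_regime_gated_chaining.py: it swaps the QRF base for ordinary least squares with Gaussian residual draws, and the data-generating process, coefficients, and helper names (`make_data`, `fit_ols`, `draw`) are assumptions chosen for illustration. It shows the same qualitative result: conditioning the second target on the imputed first target recovers a correlation that runs through an unobserved factor, while imputing each target from the predictors alone does not.

```python
import numpy as np

rng = np.random.default_rng(42)


def make_data(n, rng):
    """Two targets that share an observed predictor x and an unobserved latent z."""
    x = rng.normal(size=n)
    z = rng.normal(size=n)  # latent factor, never shown to the imputer
    y1 = 0.2 * x + z + 0.3 * rng.normal(size=n)
    y2 = 0.2 * x - z + 0.3 * rng.normal(size=n)
    return x, y1, y2


def fit_ols(features, target):
    """Fit OLS with an intercept; return coefficients and residual std."""
    X = np.column_stack([np.ones(len(target))] + list(features))
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    sigma = np.std(target - X @ beta)
    return beta, sigma


def draw(model, features, rng):
    """Stochastic imputation: linear prediction plus a Gaussian residual draw."""
    beta, sigma = model
    n = len(features[0])
    X = np.column_stack([np.ones(n)] + list(features))
    return X @ beta + rng.normal(scale=sigma, size=n)


x, y1, y2 = make_data(5000, rng)   # donor records with both targets observed
x_new = rng.normal(size=5000)      # receiver records with predictors only

# The first target is imputed from the original predictors in both variants.
y1_hat = draw(fit_ols([x], y1), [x_new], rng)

# Independent: y2 sees only x, so the latent link to y1 is lost.
y2_indep = draw(fit_ols([x], y2), [x_new], rng)

# Chained: y2 is fit on (x, y1) and predicted from (x_new, imputed y1).
y2_chain = draw(fit_ols([x, y1], y2), [x_new, y1_hat], rng)

true_corr = np.corrcoef(y1, y2)[0, 1]
chain_corr = np.corrcoef(y1_hat, y2_chain)[0, 1]
indep_corr = np.corrcoef(y1_hat, y2_indep)[0, 1]
print("true:", true_corr, "chained:", chain_corr, "independent:", indep_corr)
```

The exact correlations depend on the seed and on the toy coefficients, so they will not match the figures reported for the real test.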

🤖 Generated with Claude Code

When imputing a list of targets, each numeric target is now conditioned on the original predictors plus the previously-imputed targets, so the imputed vector preserves cross-variable joint structure instead of imputing each variable independently. This is the correct way to reproduce dependence that runs through the targets themselves (e.g. tax components on the same return) rather than only through the shared predictors.

New `sequential` parameter (default True); single-target lists unaffected.

Added a test that a target pair correlated only through an unobserved latent factor is recovered by chaining (corr -0.92 vs true -0.93) but not by independent per-variable imputation (-0.21).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
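The conditioning order described in this commit message can be written down as a plan that maps each target to the columns it is fit on. The helper `chained_predictor_sets` and the column names below are hypothetical and are not part of microimpute; the sketch only illustrates how the predictor set grows as targets are imputed in list order.

```python
def chained_predictor_sets(predictors, targets):
    """Map each target to the original predictors plus all earlier targets."""
    plan = {}
    available = list(predictors)
    for target in targets:
        plan[target] = list(available)  # snapshot before this target is added
        available.append(target)        # later targets may condition on it
    return plan


plan = chained_predictor_sets(
    predictors=["age", "filing_status"],
    targets=["wages", "capital_gains", "business_income"],
)
for target, cols in plan.items():
    print(target, "<-", cols)
# wages <- ['age', 'filing_status']
# capital_gains <- ['age', 'filing_status', 'wages']
# business_income <- ['age', 'filing_status', 'wages', 'capital_gains']
```

A single-target list yields only the original predictors, which is consistent with the note that single-target lists are unaffected.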

vercel · 2026-06-06T06:22:31Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
microimpute-dashboard	Ready	Preview, Comment	Jun 6, 2026 6:29am

vercel Bot deployed to Preview June 6, 2026 06:23 View deployment

Format zero-inflated chaining changes

49a831b

MaxGhenis mentioned this pull request Jun 6, 2026

Chain numeric zero-inflated imputations #194

Closed

vercel Bot deployed to Preview June 6, 2026 06:29 View deployment

MaxGhenis marked this pull request as ready for review June 6, 2026 06:42

MaxGhenis merged commit 36c10ba into main Jun 6, 2026
7 checks passed

MaxGhenis changed the title ~~Sequential (chained-equations) imputation in ZeroInflatedImputer~~ Make microimpute.Imputer the canonical regime-gated sequential imputer Jun 6, 2026

MaxGhenis mentioned this pull request Jun 6, 2026

Make microimpute.Imputer the canonical regime-gated sequential imputer #196

Draft

MaxGhenis merged 2 commits into main from claude/zeroinflated-chaining
