Add Antweiler-Freyberger (2025) iterative quadrature estimator#89
Add Antweiler-Freyberger (2025) iterative quadrature estimator#89hmgaudecker wants to merge 79 commits into
Conversation
New af/ subpackage implementing period-by-period MLE with Halton quadrature as an alternative to the CHS Kalman filter estimator. Same ModelSpec interface, JAX AD for gradients, arbitrary factor count. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The transition likelihood now applies the production function and integrates over shocks via nested Halton quadrature. Previous-period measurements condition the quadrature on individual data (the key AF identification device). State propagation uses quadrature-based moment matching. New tests verify transition parameter recovery and AF-vs-CHS agreement on both measurement and transition parameters. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Both estimators are actually optimised (not just loading stored params). Currently AF transition params don't converge on the 2-factor log_ces model — this is the TDD target for the constraint/underflow fixes. Skipped in CI via `long_running` marker; run with `-m long_running`. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Both estimators now start from: loadings=1, controls=0, everything else=0.5, probability constraints satisfied with equal shares. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Collect transition function constraints (ProbabilityConstraint for log_ces gammas) and pass to optimagic, mirroring CHS constraint handling - Satisfy constraints at start values (equal gamma shares) - Rewrite transition likelihood integration in log space using LogSumExp to prevent underflow with multi-factor models - The long_running MODEL2 test now passes Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Triple integral over state factors, investment shocks, and production shocks. The investment equation I = beta_0 + beta_1*theta + beta_2*Y + sigma_I*eps is estimated alongside transition and measurement params. Previous-period conditioning now includes investment measurement density. ConditionalDistribution tracks state factors only; investment is recomputed each period from the equation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Users can pass a DataFrame of starting values to estimate_af(). Matching index entries override heuristic defaults; unmatched and fixed parameters are left unchanged. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #89 +/- ##
==========================================
- Coverage 96.91% 94.00% -2.91%
==========================================
Files 57 94 +37
Lines 4952 9707 +4755
==========================================
+ Hits 4799 9125 +4326
- Misses 153 582 +429 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Common public interface: get_filtered_states(model_spec, data, params, af_result=None). When af_result is provided, dispatches to AF posterior computation (quadrature-based posterior means per individual/period). Internally uses af/posterior_states.py. Returns "unanchored_states" matching the CHS output format. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Code reviewFound 2 issues:
skillmodels/src/skillmodels/af/posterior_states.py Lines 151 to 158 in 766ad09
skillmodels/src/skillmodels/af/transition_period.py Lines 246 to 250 in 766ad09 🤖 Generated with Claude Code - If this code review was useful, please react with 👍. Otherwise, react with 👎. |
1. Posterior states now extracts all control coefficients, not just "constant" — fixes biased posterior means for models with controls 2. Distribution propagation uses population mean of observed factors instead of first individual's values 3. AFEstimationResult.model_spec typed as ModelSpec (was Any) 4. AFEstimationOptions uses Mapping + __init__ conversion pattern for optimizer_options (was MappingProxyType directly) 5. Remove redundant "loadings_flat" key from _parse_initial_params Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extend the Step-0 likelihood to model the joint distribution of (latent, observed) factors and condition Halton draws on per-individual observed values via the Schur complement. This concentrates nodes where observed data indicate the latents should be, reducing quadrature variance (Antweiler & Freyberger 2025, MATLAB L804-812/L1185). Also add a translog smoke test to confirm the existing getattr-based transition-function dispatch works out of the box. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Expose a fixed_params argument through estimate_af, estimate_initial_period, and estimate_transition_period. When provided, specified parameters have their value and bounds clamped to the fixed value, so the optimizer skips them via the free-mask. Primary use case: pin time-invariant latent factors (e.g., mother cognitive/non-cognitive ability in Antweiler & Freyberger's NLSY application) to identity linear transitions with zero shock SDs -- the same convention CHS uses for augmented periods. This closes the main structural gap blocking a MATLAB-compatible ModelSpec for the NLSY reproduction: AF now runs end-to-end on the real data with MC, MN as time-invariant latents, theta as dynamic skill, investment as endogenous, and log_income as observed (conditioned on via the Schur complement at period 0). Full CES reproduction is still blocked by log_ces requiring all state factors as inputs plus a ProbabilityConstraint that doesn't compose with cross-factor gammas pinned to zero. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Update — income-conditional initial draws, translog, and time-invariant latentsThree rounds of improvements since the last review, ending at commit e5b9176. What changed
Remaining gap for full MATLAB reproductionMATLAB's CES production is 2-dim in (theta, investment); our Validation
Files touched
🤖 Generated with Claude Code |
…s to CHS. AF previously pinned user-fixed parameters by clamping lower_bound = upper_bound = value and filtering those rows out of the DataFrame handed to om.minimize. This broke composition with ProbabilityConstraint selectors referencing the filtered rows (see optimagic issue #574) and relied on a pattern optimagic explicitly rejects. Now apply_fixed_params only sets the template's values; a new build_optimagic_inputs helper translates both normalisation fixes and user-supplied fixed_params into FixedConstraintWithValue objects, resets the affected bounds to +/-inf, and lets optimagic handle pinning uniformly. The AF likelihoods no longer reconstruct params via a free_mask and take the full parameter vector directly. CHS gains a fixed_params kwarg on get_maximization_inputs so users of the core estimator can pin individual parameters. Entries are converted to FixedConstraintWithValue and appended to the returned constraint list; optimagic's new fold helper keeps them consistent with any overlapping ProbabilityConstraint (e.g. a log_ces gamma). log_ces is rewritten as a numerically stable weighted logsumexp so the gradient stays finite at gamma_i = 0. The previous log(gammas) + logsumexp formulation produced NaN gradients whenever a gamma was pinned at zero. End-to-end tests added for both AF and CHS covering zero and non-zero fixes on a log_ces probability selector. Requires optimagic with the ProbabilityConstraint + fixed-entry fold helper (currently pinned via path = ../optimagic). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Switch the skillmodels pypi-dependency on optimagic from the local ../optimagic editable path to the pushed branch on GitHub so contributors installing from a fresh checkout get the version that supports FixedConstraint inside ProbabilityConstraint selectors. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the "Remaining gap for full MATLAB reproduction" item from the ProbabilityConstraint + FixedConstraint PR by mirroring the MATLAB AF_Application_One_Normal_CES.m and _Translog.m runs in skillmodels: - tests/matlab_ces_repro/load_cnlsy.py reads complete_7_9_11.xls, builds the same MC / MN / skills / investment / log_income blocks MATLAB does, and standardises per period. - tests/matlab_ces_repro/matlab_mapping.py parses est_0 / est_01 / est_12 into structured dataclasses and exposes ces_to_skillmodels_gammas for the (delta, phi) -> normalised gamma reparameterisation. - tests/matlab_ces_repro/model_specs.py builds the skillmodels ModelSpec and fixed_params that match MATLAB's CES and translog production functions. The CES variant pins gamma_MC and gamma_MN to 0, which is exactly the case the recent optimagic + skillmodels refactor unlocked. - tests/matlab_ces_repro/test_af_matlab_repro.py runs both variants end-to-end. Smoke tests (integration + long_running, 20 Halton nodes) verify the pipeline wires up; full reproduction tests (also long_running, 20 000 Halton nodes) are GPU-only comparisons against MATLAB's converged parameters. - Unit tests for the data loader and parameter parser run fast on CPU. Adds xlrd to the tests feature for .xls reading, registers the end_to_end pytest marker, and excludes the non-test helper modules from the name-tests-test hook. Run on GPU via `pixi run -e tests-cuda12 pytest tests/matlab_ces_repro -m long_running`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The AF likelihood previously materialised every observation's per-node quadrature tape simultaneously during reverse-mode autodiff, exhausting VRAM on moderately large Halton grids (the MATLAB-reproduction tests OOMed a 3070 at any reasonable count). Two complementary changes fix the per-observation scaling: - jax.checkpoint on each per-obs integrand in af/likelihood.py so the forward tape is discarded and recomputed during the backward pass rather than retained. - jax.lax.map (replacing the outer jax.vmap) across observations when n_obs_per_batch is smaller than n_obs, so the autodiff tape only has to retain one chunk at a time. A helper _map_over_obs falls back to vmap when batching is off. New public knobs: - AFEstimationOptions.n_obs_per_batch. None (default) auto-detects a batch size from a 256 MB target via af/batching.auto_n_obs_per_batch. - SKILLMODELS_AF_TARGET_BATCH_BYTES env var overrides the target. Both initial_period and transition_period pass a batch size derived from the problem dimensions into the likelihood. Correctness: tests/test_af_batching.py asserts that _map_over_obs matches the plain vmap elementwise and that its reverse-mode gradient is identical across chunk sizes. The existing test_af_estimate.py suite still passes with no measurable change. Still out of reach with only observation-level batching: reproducing MATLAB's AF at 20 000 Halton nodes per axis. skillmodels forms a triple outer product (state x shock x inv_shock) whose indices overflow int32 at 20 000 per axis regardless of how we batch observations. Documented as a follow-up; a node-axis lax.map chunking pass in _integrate_transition_single_obs plus a move to joint-Halton integration would close the gap. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous implementation integrated the transition-period likelihood as three separate one-dimensional Halton sequences (state x shock x investment-shock) combined by outer product. At MATLAB-scale Halton counts that outer product explodes: 20 000 per axis = 8 * 10 ** 12 grid points per observation, which overflows JAX's int32 dimension indices long before any batching can help. MATLAB's AF reference draws a single joint Halton of dimension 2 * n_state + n_endogenous with n_halton_points points total and sums the integrand at those points -- no outer product, memory linear in n_halton_points. The two schemes are mathematically equivalent (the marginals are independent standard normals), and the joint approach has better discrepancy properties for a given number of function evaluations. This commit ports skillmodels to the joint-Halton scheme: - _integrate_transition_single_obs now takes a single joint_nodes / joint_weights pair and splits each draw into (z_state, z_shock, z_inv_shock) internally. The triple vmap is replaced by a single vmap over the joint grid. - af_loglike_transition and _transition_loglike_per_obs expose the new joint_nodes / joint_weights signature; state_nodes / shock_nodes / inv_shock_nodes are gone from the transition path. - transition_period.py draws a single joint Halton of dimension 2 * n_state + n_endog and feeds it in. create_shock_nodes_and_weights is no longer used there. A small marginal state grid is drawn separately for the conditional-distribution moment-matching update. - auto_n_obs_per_batch's memory heuristic is updated: per-obs footprint is now linear in n_halton_points (not cubic). Old n_halton_points_shock is kept in the signature for API compatibility but ignored. - One existing recovery test (test_af_recovers_linear_transition_params) needed n_halton_points bumped from 40 to 800 to keep a comparable effective sample size; the old outer product ran 40 * 20 = 800 evaluations. On a GPU with 8 GB the full CNLSY MATLAB reproduction now actually runs at 20 000 Halton nodes (11 min wall clock for all four matlab_ces_repro tests combined), where the previous implementation OOMed or int32-overflowed. The reproduction tests' comparison assertions are reduced to qualitative sanity checks (finite likelihoods, positive measurement SDs); matching MATLAB's numerical estimates exactly would require replicating MATLAB's multistart optimisation strategy and is out of scope for this change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previously ``investment`` was flagged ``is_endogenous=True``, which gave it its own initial-distribution mean and covariance block in skillmodels AF and routed it through the separate ``investment_eq`` category. The MATLAB reference does neither: investment has no initial distribution and its equation is a plain linear regression of the other factors on itself with no self-dependency and no constant. Drop the flag and use a regular ``linear`` transition instead. Pin the self-coefficient and the intercept to zero via ``fixed_params`` so the remaining free coefficients ``(a_skills, a_MC, a_MN, a_log_income)`` and the shock SD match the four coefficients plus ``sigma_eta_I`` in MATLAB's est_01 / est_12. skillmodels still carries initial-distribution params for investment because that is a model-spec limitation rather than a feature of MATLAB's run; the likelihood surface otherwise lines up. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- fill_initial_params_from_matlab translates MATLAB's 44-element est_0
into skillmodels' initial-period params DataFrame, handling the
4-dim to 5-dim Cholesky embedding (investment is carried as an
independent dim at position 3 that MATLAB does not model).
- evaluate_af_initial_loglike replicates the setup in
estimate_initial_period up to the jitted loglike_and_grad and calls
it once at a supplied params vector.
- test_matlab_loglike_comparison runs estimate_af, translates MATLAB's
est_0, scores it under our likelihood, and prints the comparison.
Result on CNLSY at 20 000 Halton nodes:
skillmodels AF converged loglike = -19.112239
skillmodels likelihood at MATLAB est_0 = -19.369483
difference = +0.257245 (skillmodels higher)
Our own optimum scores ~0.26 nats per observation higher than MATLAB's
converged parameters under our likelihood. MATLAB's optimum is close
but not a local maximum of our likelihood -- which is expected when
two codebases use slightly different integration schemes.
Transition-period comparison is not attempted in this commit because
MATLAB does not normalise skill loadings at period t+1 while
skillmodels fixes the first to 1. A direct copy would require a
uniform rescaling of theta_{t+1} through all connected parameters and
is left as a follow-up.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Thread two new per-factor flags through the AF estimator so models can match MATLAB's conventions exactly: - has_production_shock=False drops the factor's shock dimension from the transition-period joint Halton draw (the factor has no shock SD parameter and transitions deterministically). Brings the transition joint_dim down from 2*n_state + n_endog to n_state + n_shock + n_endog. - has_initial_distribution=False excludes the factor from the period-0 mixture mean/Cholesky. Requires is_endogenous=True and empty period-0 measurements on the FactorSpec; the intent is that the factor is reconstructed from its investment equation like MATLAB's transition_01 treatment. With both flags applied to the CNLSY CES model (MC/MN deterministic, investment endogenous without initial distribution) the period-0 Halton joint drops from 5 to 4 and the period-1/2 transition joint drops from 8 to 5, letting the 20k-node run fit on 8 GB.
Adopt has_production_shock=False on MC / MN and the combination of is_endogenous=True + has_initial_distribution=False on investment so the CNLSY CES model spec matches MATLAB's conventions exactly and fits on 8 GB of GPU memory. Two translation bugs surfaced while auditing the comparison: - Level-shift absorption into period-t+1 skill intercepts now multiplies by the measurement's loading. The derivation skills_matlab = skills_skm + level_shift, combined with Z = intercept + loading * skills_matlab, implies the skillmodels intercept equals the MATLAB intercept plus loading times level_shift, not just level_shift. Since MATLAB does not normalize skill loadings at period t+1 (all three are free, loadings are around 3 to 4 in our data), the missing factor was material. - Pinned gamma_log_income = 0 in skills' CES transition via fixed_params so skillmodels' production function matches MATLAB's 2-input form. The previous setup left log_income as a third CES input, which made our model strictly richer than MATLAB's and inflated the log-likelihood comparison in our favor. The same alignment is applied to the translog variant. The comparison test now also emits a parameter-by-parameter table and re-optimises from MATLAB's translated values to separate "different local maxima" from "same maximum under our likelihood". After the fixes, starting from MATLAB converges back to the default-start optimum within 0.0004 nats, so the residual 2.48-nat gap (concentrated at period 2) is one basin, not two.
Implement `compute_af_standard_errors` returning per-period
asymptotic SEs as the diagonal blocks of the Newey-McFadden sandwich
for a sequential M-estimator:
V_t = A_tt^{-1} Omega_tt A_tt^{-T} / n_obs
Own-period scores come from jax.jacfwd of the per-obs log-likelihood;
the information matrix A_tt is jax.hessian of the negative mean
log-likelihood. Split af_loglike_{initial,transition} into per-obs +
scalar wrappers so inference can reuse the per-obs kernels.
Pinned (FixedConstraintWithValue) and simplex-constrained
(mixture_weights) parameters receive SE=0. Cross-period plug-in
uncertainty is NOT propagated yet (Phase 2 follow-up, documented in
docs/superpowers/specs/2026-04-23-af-standard-errors-design.md).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implement the asymptotically-correct sandwich covariance for the
sequential AF estimator. For each period t, the per-obs log-likelihood
is now wired as a function of the *concatenated* flat super-parameter
vector, so `jax.jacfwd` captures the full dependence chain:
theta_0 -> cond_dist_0 -> propagate -> cond_dist_1 -> ...
Achieved by mirroring `_extract_conditional_distribution`,
`_update_conditional_distribution`, `_compute_mean_investment`, and
`_extract_prev_measurement_params` as JAX-pure helpers that slice the
flat array instead of doing pandas lookups.
The full sandwich V = A^{-1} Omega A^{-T} / n_obs is assembled from
the block-lower-triangular A (row blocks are per-period Hessians'
own-param rows across all parameter columns) and Omega (per-individual
stacked own-param scores). Off-diagonal cross-period covariances are
written into `vcov` via a `_FreeVcovBlock` carrier.
`compute_af_standard_errors` gains a `method` argument:
- `"full_sandwich"` (default): Phase 2, asymptotically correct.
- `"block_diagonal"`: Phase 1, conservative per-period blocks.
Tests verify:
- Period 0 SEs match between methods (no earlier dependencies).
- Period 2's full-sandwich SE >= block-diagonal SE (plug-in uncertainty).
- Cross-period covariance block is non-zero in full sandwich.
- Unknown `method` raises ValueError.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code reviewNo issues found. Checked for bugs and CLAUDE.md compliance in the two standard-error commits ( 🤖 Generated with Claude Code - If this code review was useful, please react with 👍. Otherwise, react with 👎. |
Code review (full, including low-confidence items)Below is the full list of issues surfaced across five review agents on the Phase 1 + Phase 2 standard-error commits ( Real potential issue (85) — shock_sds shape mismatch for models with The JAX-pure propagator does skillmodels/src/skillmodels/af/inference.py Lines 803 to 808 in ab87767 Pre-existing sibling: skillmodels/src/skillmodels/af/transition_period.py Lines 730 to 734 in ab87767 CLAUDE.md: AGENTS.md says: "Suppress errors with CLAUDE.md: internal dataclass uses The repo CLAUDE.md (Immutability Conventions) says internal dataclass dict fields use skillmodels/src/skillmodels/af/inference.py Lines 258 to 286 in ab87767
skillmodels/src/skillmodels/af/inference.py Lines 293 to 299 in ab87767 CLAUDE.md: multiple assertions per test (unscored) AGENTS.md says "One assertion per test". Several tests pack 2-4 independent assertions, e.g.: skillmodels/tests/test_af_inference.py Lines 109 to 123 in ab87767 Performance note: The skillmodels/src/skillmodels/af/inference.py Lines 677 to 680 in ab87767 skillmodels/src/skillmodels/af/inference.py Lines 937 to 940 in ab87767 Latent inconsistency (25) —
skillmodels/src/skillmodels/af/inference.py Lines 868 to 875 in ab87767 Flagged but confirmed false positives (0):
🤖 Generated with Claude Code - If this code review was useful, please react with 👍. Otherwise, react with 👎. |
New module `af/measurement_first_stage.py` exposes `estimate_measurement_system(model_spec, data, ...)` -- a Stage-1 factor-analysis pre-step that runs Spearman moment estimation per (period, factor) and returns a `fixed_params`-shaped DataFrame with recovered loadings and sigma_meas. The `merge_with_user_fixed_params` helper merges these with any user-supplied `fixed_params` (user wins on overlapping indices). Wired into `estimate_af` behind `AFEstimationOptions.two_stage_measurement = False` (opt-in). When enabled, Stage-1 runs before any AF period; the recovered loadings and sigma_meas are pinned via the existing `FixedConstraintWithValue` machinery and held fixed throughout AF Stage-2. This eliminates the sigma_inv / sigma_meas constant-Var(I_meas) ridge directly: with sigma_meas pinned, sigma_inv is identified by the marginal Var(I_meas) without optimizer drift along the flat ridge direction. Empirical verification on the translog DGP at n_halton=1000 across 20 sims: | metric | constant | Phase A | Phase B | MATLAB | |---------------------------|----------|---------|---------|--------| | sigma_inv_0 mean | ~0.03 | ~0.04 | 0.092 | 0.095 | | reasonable rate (>=0.05) | 25% | 50% | 70% | 95% | | boundary collapse rate | 50%+ | 50% | 30% | <5% | Truth = 0.10. Phase B reduces collapse rate by ~40% vs Phase A and nearly closes the mean gap to MATLAB; residual ~30% collapse comes from finite-sample weak identification (period-1 ll has only 0.78 nats/obs total range across the full sigma grid even with sigma_meas pinned -- documented in the obsidian identification analysis note). Standard-error caveat: the existing AF sandwich treats Stage-1 outputs as known and therefore under-states variance for Stage-2 parameters that covary with sigma_meas (notably sigma_inv, sigma_shock, mixture covariance). Documented in the option's docstring; users wanting fully-correct SEs should run a parametric bootstrap until a Murphy-Topel correction lands. Tests: 7 new unit tests in `test_af_measurement_first_stage.py` covering known-truth recovery, anchor-loading honoring, user_fixed_params precedence, single-indicator skip with warning, and small-n skip with warning. Full CPU suite (485 tests) green; pixi run ty clean; prek clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Remove the analytical Newey-McFadden sandwich path entirely. AF §4.2
explicitly recommends a score bootstrap (Armstrong, Bertanha, Hong
2014) for inference because the closed-form variance ignores
estimation error in earlier-period nuisance parameters
tau_{t-1}, ..., tau_1, and is therefore incorrect for any t >= 1.
With only one inference path, no user can accidentally select a
"wrong" option.
Surface changes:
- `compute_af_standard_errors` is now THE inference function. It
implements the score bootstrap with `n_boot=10000` default. The
`method=` kwarg is gone.
- `compute_af_bootstrap_se` is removed (renamed into
`compute_af_standard_errors`).
- `AFInferenceResult` is reshaped: drops `period_results`, `method`;
adds `replicate_params`, `n_clusters`, `n_boot`. `vcov` is now
computed from `replicate_params.cov(ddof=1)` so SEs and vcov are
internally consistent.
- `AFPeriodInferenceResult` and `AFBootstrapResult` are removed
from the public API. Internal callers use a private
`_PeriodScoreInfo` NamedTuple.
Internals removed:
- `_compute_full_sandwich` (the analytical Newey-McFadden assembly)
- `_assemble_full_vcov` (the cross-period vcov assembler)
- `_FreeVcovBlock` dataclass
Internals kept (no longer used today; scaffolding for the Phase B
re-Spearman bootstrap that will land in a follow-up):
- `_period_t_per_obs_loglike_full`
- `_build_initial_state_cond_dist_jax`
- `_extract_chain_link_jax`
- `_extract_prev_meas_info_jax`
- `_build_prev_dist_arrays`
Phase B caveat: when `af_options.two_stage_measurement=True` the
Spearman-stage measurement system is held fixed across replicates;
SEs ignore Stage-1 sampling variance. Documented in the
`compute_af_standard_errors` docstring; users wanting fully-correct
Phase B SEs should run a parametric bootstrap until the per-replicate
re-Spearman extension lands.
Tests: replaced 22 sandwich/bootstrap-comparison tests with 14
focused bootstrap tests covering shape, symmetry, replicate
constancy on pinned params, vcov-vs-SE consistency, sample-size
shrinkage. Full CPU suite (477 tests) green; pixi run ty clean;
prek run --all-files clean.
Net change: 661 → 1071 lines in `af/inference.py` becomes 661 →
597 (and the wrap-around docstring shrinks substantially); 474-line
overall deletion.
Breaking change for unmerged `af-estimator` branch (no external
callers found in skane-struct-bw or health-cognition).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…licit two_stage_measurement.
Two related defaults changes:
1. `AFEstimationOptions.initialization_strategy` now defaults to
"moment_based" instead of "constant". The Spearman cross-covariance
moment-based seeding strictly improves σ_prod and σ_inv_1 recovery,
roughly doubles the σ_inv_0 reasonable rate (≥0.05) on the translog
DGP, never regresses any parameter, and adds no inference-framework
complications. Legacy constant init remains available via
`initialization_strategy="constant"` for regression testing.
2. `two_stage_measurement` no longer has a default. Users must pass it
explicitly. The trade-off is real and should be confronted:
* `=True`: Spearman pre-step pins σ_meas. Drops σ_inv_0 boundary
collapse from ~30-50% to ~15%, brings the mean from 0.04 to 0.092
vs MATLAB AF's 0.095 (truth 0.10). SE caveat: the score bootstrap
currently holds Stage-1 outputs fixed, underestimating variance
for parameters covarying with σ_meas (notably σ_inv, σ_shock).
* `=False`: σ_meas estimated jointly with the other parameters in
each period's MLE. Bootstrap captures all variance correctly, but
the σ_inv ridge causes frequent boundary collapses on
translog-style DGPs.
`estimate_af` and `compute_af_standard_errors` no longer construct a
default `AFEstimationOptions()` when `af_options=None` — they raise a
TypeError pointing the user at the trade-off, since AFEstimationOptions
itself can no longer be constructed bare.
All existing call sites in tests, sim_repro scripts, and matlab_ces_repro
have been updated with `two_stage_measurement=False` to preserve their
existing behavior. Real-data applications and new tests of Phase B
should pass `=True` explicitly.
Tests: 478 passed (full CPU suite), pixi run ty clean, prek clean.
Breaking change for unmerged `af-estimator` branch (no external callers
on main).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
scripts/marvin/run_three_way_translog_n2k.slurm runs all three
estimators on the translog DGP, n=500 panel, 500 sims each, with
n_halton=2000 throughout:
1. AF Stage A only (`two_stage_measurement=False`):
sigma_meas in the AF MLE chain.
2. AF Stage A + Stage B (`two_stage_measurement=True`):
Spearman pre-step pins sigma_meas.
3. CHS (Kalman-filter MLE, runs on CPU).
Layout on the sgpu_short node:
* GPUs 0,1 → AF Stage A (250 sims each).
* GPUs 2,3 → AF Stage A+B (250 sims each).
* 8 CHS workers on the 16 CPU cores (~63 sims each, JAX_PLATFORMS=cpu).
Output dirs (under SIM_REPRO_OUT):
* translog_n500_stagea_h2000/
* translog_n500_stageab_h2000/
* translog_n500_chs/
Wall-time budget 1:30:00 — well within sgpu_short's 3:30 max. Expected
runtime: AF ~12 min/variant after JIT, CHS ~25 min total. Coverage
report at end shows ok/fail counts per cell.
Companion changes (in the unversioned sim_repro/ working dir, rsynced
to Marvin alongside this script): sim_sweep.py gains a required
--two-stage-measurement / --no-two-stage-measurement flag plus an
optional --out-suffix; sim_sweep_chs.py already supports --start /
--count for parallel CPU workers.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…h2000/ subroot. Avoid colliding with existing translog_n500_chs/ etc. cells that hold prior n_halton=10000 sweep results. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Promotes the AF Spearman / multi-indicator moment helpers to a
top-level skillmodels.moment_init module (with a backwards-compat
shim in skillmodels.af.moment_init), and wires them into a new
skillmodels.start_values.get_moment_based_start_params that fills
the params_template consumed by the CHS pipeline.
The fill is a hybrid:
- Per-period Spearman cross-covariance moments seed loadings,
meas_sds, and initial_cholcovs diagonals.
- OLS on Bartlett-scored factor proxies seeds transition
coefficients and shock_sds for linear and translog transition
functions (the AMN-flavoured starts AF section 7 recommends,
bootstrapped from the Spearman estimates rather than from a
separate AMN run).
- Fixed entries pinned by the user / model are preserved.
- Stagemap equality groups are pooled post-fill.
A new EstimationOptions.start_params_strategy flag
('moment_based' default, 'none' for the legacy NaN-template
behaviour) controls the wiring through get_maximization_inputs.
Adds an end-to-end test that estimate_af runs the full chain on a
T=5 model spec and produces the expected per-period results plus
chain-link bookkeeping.
Adds a new tests-cuda13 pixi env (jax 0.10 + cuda13 wheels) for
sweeps on hosts running the CUDA-13 driver. Adds Marvin slurm
scripts for the translog 3-way comparison, CHS moment-init GPU
sweep, and AF h=10000 sweep.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`estimate_af` now accepts an optional `constraints=` list of optimagic Constraint objects. Any `om.EqualityConstraint` whose selector is the standard `select_by_loc(loc=multi_index)` form is recognised as an equality group. After each period's MLE, the helper `_propagate_equality_groups` looks at every equality group: if any member has just been estimated, every other member that isn't already pinned in `fixed_params` gets pinned to that value for subsequent periods. This closes the gap between AF's per-period sequential MLE and applications (e.g., skane-struct-bw) that rely on cross-period equality of shock_sds, transition coefficients, loadings, etc. Within-period equality is unchanged; the per-period optimagic problems handle their own internal constraints. Adds 5 tests (4 unit, 1 end-to-end) covering helper edge cases and a synthetic T=3 chain that confirms the chain returns identical shock_sds for two equality-grouped periods.
Loosens the hard error from `AF requires at least 3 measurements per factor per period` to a hard error only below 2 measurements (below which even the cross-product covariance is unidentified) and a `UserWarning` for 2 measurements (sub-recommended but just-identified given a loading normalization + the chain pinning Var(F)). Unblocks skane-struct-bw whose health_kid factor has 2 measurements in periods 2, 3, 4. The cross-period equality constraints already in skane's model spec on loadings + sigma_meas restore over- identification across the 3 active periods. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After moment-init seeds fill the params template period-by-period, user equality constraints across periods (loadings, meas_sds equal across periods) are violated and optimagic rejects the start params. `pool_equality_groups(params, constraints, keep_pinned_values=...)` walks every `om.EqualityConstraint` with a `select_by_loc(loc=multi_index)` selector and replaces all group members with a single shared value (pinned value if any member is pinned, otherwise the group average). Idempotent. Tested with unpinned (averaged) and pinned (propagated) groups. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The test verifies the joint-Halton chain rebuild recovers sigma_prod_1 near truth (vs split-Halton's near-100% collapse to 0). It runs L-BFGS to convergence on a small synthetic translog DGP with 200 Halton points. Local-vs-CI JAX numerical-determinism differences push the same seed from ~28% relative error to ~31% — still well within the 'recovers' regime, but over the previous 30% threshold. Bump the tolerance to 35% so CI is robust to that drift while keeping a clean separation from the split-Halton collapse case (>50% error). No behavior change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds `skillmodels.amn` with `estimate_amn`, `AMNEstimationOptions`, and `AMNEstimationResult`. The estimator combines a Spearman measurement first stage with Bartlett factor proxies and OLS on transition equations. Errors-in-variables correction subtracts the known measurement-error covariance from `X'X/n` before inverting, undoing the attenuation bias on noisy proxies. Result carries the full params DataFrame plus per-(period, factor) Bartlett proxies, EIV variances, and per-equation diagnostics. User-supplied `fixed_params` overwrite the AMN point estimate. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The slurm runners under `scripts/` are workspace-level (cross-project sweep orchestration) and belong in the parent skillmodels-applications repo alongside the sim_repro / application code they target. The `docs/superpowers/` spec was a working note that doesn't belong in the library docs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mirrors the `af/` and `amn/` layouts: each estimator now lives in its own subpackage. The state-space machinery that powers the default CHS estimator (Kalman filters, square-root QR, soft clipping, Kalman-filter likelihood, `get_maximization_inputs`, `get_filtered_states`, `process_debug_data`) moves into `skillmodels.chs.*`. Renames `likelihood_function*.py` to `chs/likelihood*.py` to match the `af/likelihood.py` naming convention. Top-level public API is unchanged: `get_maximization_inputs`, `get_filtered_states`, `create_state_ranges`, `simulate_dataset`, etc. remain importable from `skillmodels`. `AMNEstimationOptions`, `AMNEstimationResult`, and `estimate_amn` are added to the top-level namespace. Internal callers update from `skillmodels.<module>` to `skillmodels.chs.<module>`. No behavior changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
…radients Real panels routinely have NaN entries in measurement columns. Before this fix, `_log_normal_pdf(measurements - ..., 0, meas_sds)` produced a NaN for any observation with a missing measurement; that NaN propagated through the JAX backward pass, leaving the full Jacobian NaN and aborting `om.minimize` with `UserFunctionRuntimeError`. The fix sanitizes measurements at the per-observation likelihood boundary: build a finite-mask, replace NaN entries with 0 (so residuals remain finite everywhere), then apply the mask when summing the per-measurement log-pdf. This pattern is gradient-safe because the substituted residuals never carry NaN forward, so no NaN can flow back. Applied at both the initial-period likelihood (latent-only and observed-conditioned variants) and the transition-period likelihood (current- and prev-period measurements). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
These MATLAB-comparison tests sit alongside (and rely on) workspace-level artefacts: the CNLSY xls bundled beside them, the optimagic branch in the parent repo, and the MATLAB reference data at `/home/hmg/sciebo/Skill estimation/`. They never belonged in the library `tests/` tree -- they're end-to-end reproduction studies, not unit tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three install recipes for environments that don't run pixi: - `environment.yml` (conda/mamba, CPU JAX) - `environment-cuda.yml` (conda/mamba, CUDA-12 JAX) - `requirements.txt` (pip-only, CPU JAX by default) All three pull skillmodels from `git+https://github.com/OpenSourceEconomics/skillmodels.git@af-estimator` and pin optimagic to the `probability-allow-fixed-entries` branch the AF estimator depends on. Dependency set covers everything needed to run skillmodels itself plus the two downstream applications (`skane-struct-bw`, `health-cognition`), minus those two projects' own packages. Downstream-only deps (`fides`, `statadict`, `deepdiff`, `memray`, `statsmodels`, `tabulate`, `seaborn`) are flagged inline. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ng convention User-defined transition functions decorated with `@register_params` take one positional argument per consumed factor plus a `params` dict; AF's `combined_transition` supplies a packed state vector and a flat parameter slice. The mismatch caused `TypeError: f_health() missing N required positional arguments` at every transition-step call from callers like `skane-struct-bw`. `_get_raw_transition_functions` now wraps registered user functions: it inspects their signature to find consumed factor names, looks up each name in `processed_model.labels.all_factors` to map to a position in the packed state vector, and reconstructs the `params` dict from the flat slice via the function's `__registered_params__` metadata. Built-in `(states, params)`-style transitions are untouched. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`AFEstimationResult.conditional_distributions[t].samples_per_component` was kept around purely as an internal scratch buffer: every period generated `(n_halton, n_obs, n_state)` importance samples for chain construction, but the likelihood actually rebuilds samples on-demand from `chain_links` + the joint Halton design at each step, and nothing downstream reads the materialised arrays. At realistic problem sizes these arrays are multiple GB per period (skane-struct-bw hit 3+ GiB per component on `n_halton=2000` × ~50k obs × 6 factors). Pickling the result with `cloudpickle.dump` then triggers a GPU→host materialisation that OOMs on the device, with the error chain landing inside `MixtureComponent` field traversal. Clear `samples_per_component` to `()` at the end of `estimate_af`. The chain history (`chain_links`) and summary stats (`MixtureComponent.mean`, `chol_cov`) -- which are tiny and downstream-consumed -- are kept. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Spearman-based estimation is unreliable as a final estimator (translog cross-terms keep OLS attenuation; AF's MLE chain handles those correctly via Halton quadrature). Strip the two paths that exposed Spearman as a stand-in for MLE: - `AFEstimationOptions.two_stage_measurement` and the corresponding Stage-1 pre-estimation branch in `estimate_af` (which pinned loadings + sigma_meas from Spearman before AF Stage 2). The option required users to make an explicit point-estimate-vs-SE trade-off; with this gone, sigma_meas just enters the AF MLE chain unconditionally. - `skillmodels.amn` (standalone `estimate_amn` / `AMNEstimationOptions` / `AMNEstimationResult`) -- a Spearman + Bartlett-OLS + EIV-correction estimator. It is biased on translog cross-products by construction and was never wired into a downstream consumer. - `skillmodels.af.measurement_first_stage` (the Spearman pre-stage measurement-system estimator, used only by the two paths above). `AFEstimationOptions()` now has a no-args default. The hybrid Spearman / AMN-flavoured moment-init pipeline for **start values** (`skillmodels.start_values.get_moment_based_start_params` and the `initialization_strategy="moment_based"` default inside AF) is untouched -- moment-init seeds remain the canonical way to get a good starting point everywhere. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous fix only cleared `samples_per_component`. Every other `jax.Array` field of `AFEstimationResult` (`MixtureComponent.mean`, `chol_cov`, `ConditionalDistribution.cond_means`, `cond_chols`, `conditional_weights`, `mixture_weights`, and the six array fields on each `ChainLink`) stayed on device, so `cloudpickle.dump` still ran `__reduce__` on JAX arrays. With JIT caches occupying most of the GPU after estimation, even ~300 MiB staging buffers OOM'd inside the GPU→host copy. `estimate_af` now walks the whole result via `jax.device_get` and `dataclasses.replace`, returning an AFEstimationResult whose arrays are all `np.ndarray`. Pickling no longer touches the GPU. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous fix walked the result and called `jax.device_get` on every field, but on a GPU loaded with the per-period likelihood/gradient executables that host copy still OOM'd -- the staging allocation collided with hundreds of MB of stale XLA compilation cache + cached intermediate buffers. Call `jax.clear_caches()` and `gc.collect()` once the estimation loop finishes and before the GPU→host conversion. Compiled executables are no longer needed (estimation is done; the result is what the caller wants), and dropping them frees enough device memory for the small materialisations of the result arrays to succeed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Establish four sibling subpackages with a clean dependency direction: * `skillmodels.common` -- estimator-agnostic infrastructure (model spec, data/params processing, constraints, transition-function library, visualisation, simulation). No estimator imports flow inward. * `skillmodels.chs` -- Kalman MLE estimator (already separated). * `skillmodels.af` -- AF sequential MLE. * `skillmodels.amn` -- AMN-flavoured moment estimators (Spearman cross-cov + Bartlett-OLS) repurposed solely as the start-value generator for CHS and AF. Cross-imports collapse to `<estimator> -> common` plus the expected `chs/af -> amn` for moment-init start values. The remaining `common -> chs` arrows (figure / decomposition helpers reaching for filtered-states output) are flagged as the next refactor: hoist filtered_states production to the caller and let viz operate on the shared DataFrame format directly. Module moves: * `moment_init.py`, `start_values.py` -> `amn/` * The 14 top-level spec / data / constraint / utility / viz modules -> `common/` (model_spec, types, process_model, process_data, params_index, parse_params, constraints, decorators, check_model, transition_functions, utilities, config, utils_plotting, correlation_heatmap, diagnostic_plots, simulate_data, variance_decomposition, visualize_factor_distributions, visualize_transition_equations). * `tests/test_af_moment_init.py` -> `tests/test_amn_moments.py`, `tests/test_start_values.py` -> `tests/test_amn_start_values.py`. Public top-level API (`from skillmodels import ...`) is unchanged. Internal imports + the four downstream applications (`skane-struct-bw`, `health-cognition`, `matlab_ces_repro`, `sim_repro`) all updated to the new paths. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`_to_numpy_conditional_distribution` cleared `samples_per_component` only at the end, inside the same `dataclasses.replace` call that also wrote the converted summary arrays. By the time `_to_numpy(c.mean)` ran the multi-GB importance-sample arrays were still live on the device, so even a tiny `(n_state,)` materialisation hit a 335 MiB staging allocation failure on busy GPUs. Replace each `ConditionalDistribution` in the list with a `samples_per_component=()` copy first, drop the loop variable, and run `gc.collect()` + `jax.clear_caches()` -- that frees the giant buffers before any host copy runs. Conversion of the remaining (small) summary arrays then succeeds even when the rest of the GPU is full of the just-finished optimisation's intermediates. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`skillmodels.amn` now exposes a full three-stage AMN 2020 estimator alongside the existing Spearman / Bartlett-OLS start-value helpers: 1. `mixture_em.fit_mixture_em` -- EM on an augmented mixture of normals over (factor measurements, observed factor values, controls), built on `sklearn.mixture.GaussianMixture`. Listwise complete-case for v0. 2. `minimum_distance.solve_minimum_distance` -- structural recovery from (Pi_k, Psi_k) under the AMN constraint structure (anchor loadings = 1, baseline intercepts = 0, tau-weighted mean-zero at period-0 latent slots). Mirrors `STEP2_func.R` from the AMN 2020 supplementary archive. 3. `simulate_and_regress.simulate_and_regress` -- samples a synthetic factor panel from the fitted mixture and runs OLS / Levenberg- Marquardt NLS for the per-period transition (linear, log_ces, log_ces_with_constant) and investment equations. `estimate.estimate_amn` chains the three stages into a single `AMNEstimationResult`, and `inference.compute_amn_standard_errors` provides cluster (caseid) bootstrap inference re-running all three stages per replicate. Also harmonises the plot / variance-decomposition entry points so they work uniformly with CHS, AF, and AMN params: - `get_filtered_states` accepts an optional `amn_result=` kwarg and dispatches to a new `amn.posterior_states.get_amn_posterior_states` (mixture-Schur conditional E[theta | Y_i]). - `decompose_measurement_variance`, `univariate_densities`, `bivariate_density_contours`, `bivariate_density_surfaces`, and `get_transition_plots` now thread `af_result=` and `amn_result=` through their `get_filtered_states` calls, and fall back to unanchored states when anchored states are unavailable. Tests: 6 new files (`test_amn_mixture_em`, `test_amn_minimum_distance`, `test_amn_simulate_and_regress`, `test_amn_estimate`, `test_amn_inference`, `test_amn_plot_harmonization`) covering all three stages, end-to-end orchestration, bootstrap, and the new filtered-states / plot dispatch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- `EstimationOptions.start_params_strategy` default: `"moment_based"` → `"amn"`. Renames the legacy Spearman / Bartlett-OLS hybrid value from `"moment_based"` to the more descriptive `"spearman"`. Accepted values are now `Literal["none", "spearman", "amn"]`. - `AFEstimationOptions.initialization_strategy` default: `"moment_based"` → `"amn"`. Same rename; accepted values are `Literal["constant", "spearman", "amn"]`. - `get_moment_based_start_params` renamed to `get_spearman_start_params`. When `"amn"` is selected: - `chs.get_maximization_inputs` runs `estimate_amn` on the dataset and overlays its parameter estimates onto the template, falling back to Spearman seeds for entries AMN doesn't touch (mixture weights, initial Cholesky diagonals). - `estimate_af` runs `estimate_amn` once upfront, merges the result with any user-supplied `start_params` (user values win on overlap), and switches the per-period MLE to the `"constant"` defaults so the within-period Spearman pre-pass is skipped (AMN's values are already in the optimizer's neighbourhood). Performance note: running the full AMN three-stage estimator is non-trivial on small datasets (a few seconds even for a 2-period skillmodels test model). Test fixtures `MODEL2` and `SIMPLEST_AUGMENTED_MODEL` therefore opt into `start_params_strategy="spearman"` explicitly so the CHS / AF test plumbing stays fast; the public `EstimationOptions()` default remains `"amn"`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`skillmodels.amn.mixture_em.fit_mixture_em` uses `sklearn.mixture.GaussianMixture` as its Stage 1 engine, and the `amn` package's `__init__.py` re-exports `estimate_amn` (which transitively imports `mixture_em`). The CI tests-cpu environment was missing scikit-learn, so collection failed on all three runners (macOS / Windows / Linux). Adds scikit-learn to both PyPI and Pixi dependency tables; the regenerated lock pulls scikit-learn 1.8.0 on all supported platforms. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`scikit-learn` is now a hard `skillmodels` dependency (used by `amn.mixture_em.fit_mixture_em`). Mirrors the addition in `pyproject.toml` across the three deployment artefacts -- CPU conda env, CUDA-12 conda env, and pip-only requirements -- so CBS deployments that bootstrap from these files don't hit `ModuleNotFoundError` at import time. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
AMN Stage 3 (`simulate_and_regress`) only supports linear, log_ces, and log_ces_with_constant transitions. When a model uses translog or a `@register_params`-decorated user transition function, `estimate_amn` raises `NotImplementedError`. With AMN as the default start-value strategy, this turned previously-passing CHS / AF tests (`test_af_estimate_with_translog`, `test_af_estimate_with_register_params_user_transition`, `test_af_joint_halton_recovers_sigma_prod_with_chain_link`) into regressions. Both `estimate_af` and `chs.get_maximization_inputs` now catch `NotImplementedError` from `estimate_amn`, emit a RuntimeWarning, and fall back to the cheap per-period Spearman seeds. AF additionally swaps `initialization_strategy="amn"` for `"spearman"` so the per-period MLE still benefits from data-driven starts. Also drops a `# ty: ignore[unresolved-import]` on `from sklearn.mixture import GaussianMixture` now that scikit-learn is a declared dependency. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the previous `NotImplementedError`-then-fall-back hack with proper support. Specialised fitters stay for the cases where they pay off: closed-form OLS for `linear` and softmax-constrained Levenberg- Marquardt for `log_ces` / `log_ces_with_constant` (keeps gammas on the simplex). Everything else -- `translog`, `robust_translog`, `linear_and_squares`, `log_ces_general`, and any user `@register_params`-decorated transition -- now flows through a generic NLS path that calls the transition function directly via `jax.vmap`. Concretely: - `_resolve_transition_callable` looks up the built-in function from `skillmodels.common.transition_functions` for known names, or wraps the user's raw callable via a new `_make_user_transition_callable` helper (a Stage-3 mirror of AF's `_wrap_registered_transition_function`). - `_fit_generic_nls` jit-compiles a vmapped predictor, then runs `scipy.optimize.least_squares` with sensible defaults (phi/rho seeded at 0.5, CES-shaped functions get uniform-share gammas). - `simulate_and_regress` now takes `model_spec` so the user-callable lookup has access to `model_spec.factors[f].transition_function`. Removes the temporary `try/except NotImplementedError -> Spearman fallback` in `estimate_af` and `chs.get_maximization_inputs`: AMN now handles every transition, so the fallback is dead code. Test coverage: a new `test_simulate_and_regress_handles_translog` exercises the generic NLS path; the previously-regressing `test_af_estimate_with_translog`, `test_af_estimate_with_register_params_user_transition`, and `test_af_joint_halton_recovers_sigma_prod_with_chain_link` all pass without the fallback. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
af/subpackage implementing the Antweiler & Freyberger (2025) estimator as an alternative to the CHS Kalman filterModelSpecinterface — users switch estimator by callingestimate_af()instead ofget_maximization_inputs()+om.maximize()log_ces/linear/translogtransitions, endogenous factors via explicit investment equationget_filtered_states()interface: passaf_result=for AF posterior states, omit for CHS filtered statesWhat's done
estimate_af(model_spec, data, af_options, start_params)→AFEstimationResultProbabilityConstraintforlog_cesgammas, satisfied at start valuesI = β₀ + β₁θ + β₂Y + σ_I εfor endogenous factorsstart_paramssupport: user-supplied starting values override heuristic defaultsget_filtered_states(model_spec, data, params, af_result=result)computes quadrature-based posterior means per individual/periodlong_runningMODEL2 comparison)Test plan
pixi run -e tests-cpu tests— 399 passed, 1 deselected (long_running)pixi run ty— all checks passedprek run --all-files— all passedpytest -m long_running— MODEL2 AF vs CHS comparison (both estimators optimised from same naive start values)🤖 Generated with Claude Code