discreteNPIV

Interface Selection

Choosing an entry path

The package separates a low-level NPIV interface from a surrogate application layer. Choose the route that matches your data.

The homepage stays package-first, while the surrogate application lives on its own page.

Use the core API when the NPIV problem is already explicit.

Use this route when you already have observed covariates or basis features X, a discrete instrument Z, outcomes Y, and a target covariate sample X_new defining the linear functional of interest.

fit_structural_nuisance
fit_dual_nuisance
estimate_average_functional

Structural nuisance
Dual / Riesz nuisance
Debiased functional inference

See core interface

Use the surrogate API when historical experiments are the instrument.

Use this route when you have historical experiments with surrogate and long-term outcomes, plus a novel experiment that contributes only surrogate outcomes. The wrappers translate this application structure into the core NPIV estimand.

encode_experiment_arms
estimate_long_term_mean_from_surrogates
estimate_long_term_effect_from_surrogates

Historical arms -> Z
Novel surrogates -> X_new
Support diagnostics
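As a rough illustration of the "historical arms -> Z" step, the sketch below maps (experiment, arm) pairs to integer instrument levels with NumPy. The helper name `encode_arms_sketch` and its two input arrays are hypothetical; this is not the signature of `encode_experiment_arms`, which is documented on the surrogate page.

```python
import numpy as np

def encode_arms_sketch(experiment_ids, arm_ids):
    """Map each (experiment, arm) pair to one integer instrument level.

    Hypothetical helper for illustration only, not part of discreteNPIV.
    """
    pairs = np.column_stack([np.asarray(experiment_ids), np.asarray(arm_ids)])
    # Each distinct (experiment, arm) row becomes one level 0..K-1.
    levels, z = np.unique(pairs, axis=0, return_inverse=True)
    return z.ravel(), levels

experiment_ids = np.array([0, 0, 0, 1, 1, 2])
arm_ids = np.array([0, 1, 1, 0, 1, 0])
z, levels = encode_arms_sketch(experiment_ids, arm_ids)
print(z)       # one instrument label per unit
print(levels)  # the (experiment, arm) pair behind each label
```

Two units share a label only when they come from the same arm of the same experiment, so each historical arm acts as one level of the discrete instrument.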

See surrogate interface

Install

Editable install and repo entry points

The repository documents local development and reproduction from the checkout. The install command below matches the README.

The repository layout is already documentation-friendly: src/discreteNPIV for the supported package code, docs for notes, scripts for runnable examples, and tests for validation.

Install

python3 -m pip install -e .

This is the documented path for working from the current source tree.

Core reproduction

The repository includes paper-style and validation scripts such as scripts/reproduce_small_paper_experiment.py and scripts/evaluate_npjive_validation.py.

Surrogate case study

The long-term application workflow is illustrated in scripts/reproduce_surrogate_case_study.py and expanded on in the separate long-term page.

Core NPIV

Fit the structural nuisance, fit the dual nuisance, estimate the functional.

At the core level, the package solves three tasks: fit the structural nuisance for the minimum-norm NPIV solution, fit the dual / Riesz nuisance for a target covariate law, and estimate a debiased average linear functional with uncertainty.

The notation in the docs is consistent throughout: X for observed features or basis representation, Z for the discrete instrument, Y for the observed outcome, and X_new for the target covariate sample.

Example

```python
from discreteNPIV import (
    estimate_average_functional,
    fit_dual_nuisance,
    fit_structural_nuisance,
    generate_synthetic_data,
)

# Simulate a small dataset; the result holds X, Z, Y, and the target sample X_new.
data = generate_synthetic_data(
    n_per_instrument=20,
    n_instruments=8,
    n_features=5,
    n_target_samples=4000,
    random_state=7,
)

# Structural nuisance: fitted from the observed (X, Z, Y).
structural = fit_structural_nuisance(
    data["X"],
    data["Z"],
    data["Y"],
    n_splits=2,
    random_state=7,
)

# Dual / Riesz nuisance: fitted from (X, Z) and the target sample X_new.
dual = fit_dual_nuisance(
    data["X"],
    data["Z"],
    data["X_new"],
    n_splits=2,
    random_state=7,
)

# Debiased estimate of E[h(X_new)] with its standard error.
result = estimate_average_functional(
    data["X"],
    data["Z"],
    data["Y"],
    data["X_new"],
    n_splits=2,
    random_state=7,
)

print(structural.selected_method)
print(dual.selected_method)
print(result.selected.estimate, result.selected.se)
```

Main Entry Points

`fit_structural_nuisance`

Estimates the structural nuisance, that is, the minimum-norm NPIV solution for the structural map represented through X.

`fit_dual_nuisance`

Estimates the dual or Riesz representer nuisance for the target functional defined by X_new. This is the nuisance needed to debias the plug-in functional.

`estimate_average_functional`

Combines the structural and dual nuisances to estimate the debiased average functional E[h(X_new)], together with a standard error and confidence interval.

Workflow

The package is organized around nuisance fitting plus a debiased functional estimator.

The core estimator has three pieces: estimate the structural function h, estimate the dual or Riesz nuisance for the target functional, and combine them in a debiased estimator of E[h(X_new)].

Here X_new is the target covariate sample. It defines the average functional of interest; it is not a new outcome sample, so no outcomes are needed for it.

Figure: workflow diagram for the core NPIV estimator stack. Observed data `(X, Z, Y)` identify the structural function and the dual nuisance, while `X_new` defines the target average `E[h(X_new)]`. These pieces are combined in `estimate_average_functional`.
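One common way to write such a combination is a plug-in average plus a residual correction. The display below is a generic sketch in the style of the debiased-inference literature, not the package's exact formula. The symbols are assumed notation: \hat q is the fitted dual nuisance evaluated at the instrument, m is the number of rows in X_new, and n is the number of observed units.

```latex
\hat\theta
  = \frac{1}{m}\sum_{j=1}^{m} \hat h\bigl(X^{\mathrm{new}}_j\bigr)
  + \frac{1}{n}\sum_{i=1}^{n} \hat q(Z_i)\,\bigl(Y_i - \hat h(X_i)\bigr)
```

The first term averages the structural fit over the target sample. The second reweights the observed residuals so that first-order errors in \hat h cancel. The package may layer sample splitting or jackknife corrections on top of this basic form.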

Structural fit

The structural fit estimates the minimum-norm NPIV solution for the structural map represented through X. The returned object includes the selected estimator, the npJIVE fit, the 2SLS baseline, and tuning details.

Dual fit

The dual fit estimates the dual or Riesz representer nuisance for the target average defined by X_new. This is the nuisance required for the debiasing step, and it is returned with the same selected / npJIVE / 2SLS structure.

Inference result

The final NPIVInferenceResult reports the debiased estimate of E[h(X_new)] together with standard errors, confidence intervals, and the fitted structural and dual nuisance objects.
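For readers who want to see how an estimate and a standard error translate into an interval, here is a minimal sketch of a two-sided normal-approximation (Wald) interval. The function name `wald_interval` and the numbers are illustrative assumptions; whether the package builds its intervals exactly this way is not stated here.

```python
from statistics import NormalDist

def wald_interval(estimate, se, level=0.95):
    """Two-sided normal-approximation interval: estimate +/- z * se."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for level=0.95
    return estimate - z * se, estimate + z * se

# Illustrative numbers, not output from the package.
lower, upper = wald_interval(estimate=1.20, se=0.10)
print(f"95% CI: [{lower:.3f}, {upper:.3f}]")
```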

Theoretical Framing

Many weak instruments

The package is built for settings where the instrument is discrete and the number of instrument levels can grow with the sample size, while the support available within each instrument level may remain limited.

The motivating example in the README is long-term causal inference: many past experiments, but only a limited number of units in each.
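A quick way to see how much support each instrument level has is to count units per level. The sketch below does this with NumPy; the helper `level_counts` and the threshold `min_units` are hypothetical and are not package defaults.

```python
import numpy as np

def level_counts(z, min_units=2):
    """Count units per instrument level and list levels below `min_units`.

    `min_units` is an illustrative threshold, not a package default.
    """
    levels, counts = np.unique(np.asarray(z), return_counts=True)
    thin = levels[counts < min_units]
    return dict(zip(levels.tolist(), counts.tolist())), thin.tolist()

z = np.array([0, 0, 0, 1, 1, 2, 3, 3, 3, 3])
counts, thin = level_counts(z, min_units=2)
print(counts)  # units available in each level
print(thin)    # levels with fewer than `min_units` units
```

A level with a single unit has no leave-one-out mean, so a check like this is worth running before fitting.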

Target functional and interpretation

The core estimand is a linear functional of the minimum-norm NPIV solution, represented in the API by the target average E[h(X_new)]. This keeps the package focused on functionals of the structural map rather than full point identification of that map.
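To make "minimum-norm solution" concrete, the sketch below works through a deliberately simplified case: a linear structural map h(x) = x @ beta and exact conditional means, with fewer instrument levels than features. All names and numbers are illustrative assumptions, and the package's estimators are not implemented this way. The point is only that the moment equations can have many solutions, that the pseudoinverse picks the one with the smallest norm, and that some linear functionals take the same value at every solution.

```python
import numpy as np

rng = np.random.default_rng(0)
n_levels, n_features = 3, 5                   # fewer levels than features

M = rng.normal(size=(n_levels, n_features))   # row z plays the role of E[X | Z = z]
beta_true = rng.normal(size=n_features)
mu = M @ beta_true                            # E[Y | Z = z] under h(x) = x @ beta_true

# Minimum-norm solution of the moment equations M @ beta = mu.
beta_min = np.linalg.pinv(M) @ mu

# Another exact solution: shift along a null-space direction of M.
null_dir = np.linalg.svd(M)[2][-1]            # last right singular vector
beta_alt = beta_min + 2.0 * null_dir

# A target mean lying in the row space of M gives the same functional value.
x_new_mean = M.T @ rng.normal(size=n_levels)

print(np.linalg.norm(beta_min), np.linalg.norm(beta_alt))
print(x_new_mean @ beta_min, x_new_mean @ beta_alt)
```

Both vectors reproduce the conditional means exactly, but `beta_min` has the smaller norm, and the functional agrees across the two solutions even though the coefficient vectors differ.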

npJIVE is the many-instrument correction

npJIVE uses jackknife (leave-one-out) estimation to remove the bias that arises with many weak instruments. The package reports the npJIVE estimate alongside a grouped 2SLS baseline.
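To see where this bias comes from, consider a simplified calculation. It assumes independent, mean-zero errors \varepsilon_i with variance \sigma^2 inside an instrument level z containing n_z units; these symbols are introduced here for illustration. The level mean that includes unit i is correlated with that unit's own error, while the leave-one-out mean is not:

```latex
\mathbb{E}\bigl[\varepsilon_i\,\bar\varepsilon_{z}\bigr]
  = \frac{1}{n_z}\sum_{k:\,Z_k = z}\mathbb{E}[\varepsilon_i\varepsilon_k]
  = \frac{\sigma^2}{n_z},
\qquad
\mathbb{E}\bigl[\varepsilon_i\,\bar\varepsilon_{z,-i}\bigr]
  = \frac{1}{n_z - 1}\sum_{\substack{k:\,Z_k = z \\ k \neq i}}\mathbb{E}[\varepsilon_i\varepsilon_k]
  = 0.
```

When each level has only a few units, the term \sigma^2 / n_z does not shrink as more levels are added, so an estimator built on own-observation level means stays biased. Dropping unit i from its own level mean removes the term.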

Bias correction and tuning

Group-level leave-one-out means are used inside the npJIVE nuisance construction, while regularization parameters are chosen by a stratified K-fold cross-validation routine.
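The two ingredients named above can be sketched in a few lines of NumPy. The functions `loo_group_means_sketch` and `stratified_folds_sketch` are hypothetical stand-ins written for this page, handle only a 1-D array of values, and are not the package's `leave_one_out_group_means` or `make_stratified_folds`.

```python
import numpy as np

def loo_group_means_sketch(values, z):
    """Mean of `values` within each unit's instrument level, excluding that unit.

    Hypothetical helper; every level needs at least two units.
    """
    values = np.asarray(values, dtype=float)
    _, inverse, counts = np.unique(z, return_inverse=True, return_counts=True)
    if counts.min() < 2:
        raise ValueError("each instrument level needs at least two units")
    sums = np.bincount(inverse, weights=values)
    # (level sum - own value) / (level size - 1)
    return (sums[inverse] - values) / (counts[inverse] - 1)

def stratified_folds_sketch(z, n_splits, seed=0):
    """Assign folds so each level's units are spread evenly across folds."""
    rng = np.random.default_rng(seed)
    z = np.asarray(z)
    folds = np.empty(len(z), dtype=int)
    for level in np.unique(z):
        idx = rng.permutation(np.flatnonzero(z == level))
        folds[idx] = np.arange(len(idx)) % n_splits
    return folds

z = np.array([0, 0, 0, 1, 1, 1, 1])
y = np.array([1.0, 2.0, 3.0, 10.0, 20.0, 30.0, 40.0])
loo = loo_group_means_sketch(y, z)
folds = stratified_folds_sketch(z, n_splits=2)
print(loo)
print(folds)
```

Stratifying by instrument level keeps every level represented in each training split, which matters when levels are small.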

API

Public package surface

The current public exports are intentionally compact: core NPIV estimators, surrogate wrappers, grouped utilities, simulation helpers, and typed result objects.

This site adds documentation, not new package interfaces.

Core estimators

fit_structural_nuisance
fit_dual_nuisance
estimate_average_functional

Surrogate wrappers

encode_experiment_arms
estimate_long_term_mean_from_surrogates
estimate_long_term_effect_from_surrogates

Utilities and results

group_means, leave_one_out_group_means, make_stratified_folds
generate_synthetic_data
NPIVInferenceResult, StructuralFitResult, DualFitResult

References

Related resources

The long-term application has its own page because it is a distinct entry path built on the same core NPIV estimand.

Paper
GitHub repository
Long-term causal inference page
Core NPIV section

The website text is drawn from the current repository sources: README.md, docs/api_reference.md, docs/paper_notation_map.md, docs/experiment_encoding.md, and docs/loo_jackknife.md.

For the application layer and design-support discussion, continue to the dedicated long-term page.