Merged
1 change: 1 addition & 0 deletions CHANGELOG.md

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion TODO.md
@@ -128,13 +128,13 @@ Deferred items from PR reviews that were not addressed before merge.
| Conley + survey weights / `survey_design`. Score-reweighted meat `s_i = w_i · X_i · ε_i` is mechanical, but PSU clustering interaction with the spatial kernel and replicate-weights variance under spatial correlation are non-trivial (Bertanha-Imbens 2014 covers cluster-sample but not the explicit Conley case). Phase 5 of the spillover-conley initiative; paper review prerequisite. Currently raises `NotImplementedError` at the linalg validator. | `linalg.py::_validate_vcov_args` | Phase 5 (spillover-conley) | Medium |
| `SyntheticDiD(vcov_type="conley")` support. Currently raises `TypeError` at `__init__` because SyntheticDiD uses `variance_method ∈ {bootstrap, jackknife, placebo}` rather than the analytical sandwich that Conley plugs into. Wiring would require either reimplementing an analytical sandwich path for SyntheticDiD or designing a spatial-block bootstrap (new methodology, Politis-Romano 1994 territory). | `synthetic_did.py::SyntheticDiD` | follow-up (spillover-conley) | Low |
| `SpilloverDiD` Gardner GMM first-stage uncertainty correction at stage 2. Wave B MVP uses standard `solve_ols` variance (HC1 / Conley / cluster) without the influence-function adjustment for stage-1 FE estimation. Extending `two_stage.py::_compute_gmm_variance` to accept a Conley kernel matrix in place of HC1's identity at the IF outer-product step gives the full Butts (2021) Section 3.1 + Gardner (2022) Section 4 composition. See plan Risks #2 for the IF formula. | `spillover.py::SpilloverDiD.fit`, `two_stage.py::_compute_gmm_variance` | follow-up (Wave B) | Medium |
-| `SpilloverDiD(event_study=True)` per-event-time × ring decomposition (Butts Section 5 / Table 2 `S^k_{it}` / `Ring^k_{it,j}`). Currently raises `NotImplementedError`. The implementation adds event-time dummies × ring covariates to the stage-2 design and emits a MultiIndex on `spillover_effects`. | `spillover.py::SpilloverDiD.fit` | follow-up (Wave B) | Medium |
| `SpilloverDiD(survey_design=...)` integration. Currently raises `NotImplementedError`. Requires threading survey weights through the inline stage 1 + stage 2 and lifting `two_stage.py`'s survey path patterns. | `spillover.py::SpilloverDiD.fit` | follow-up (Wave B) | Low |
| `SpilloverDiD(ring_method="count")` extension. Currently only the nearest-treated-ring specification is exposed. Count-of-treated-in-ring (paper Section 3.2 end) is methodologically supported by Butts but re-introduces functional-form dependence; expose with an explicit kwarg gate and documentation warning. | `spillover.py::SpilloverDiD.fit` | follow-up | Low |
| `SpilloverDiD` data-driven `d_bar` selection (Butts 2021b / Butts 2023 JUE Insight cross-validation). | `spillover.py::SpilloverDiD` | follow-up | Low |
| `SpilloverDiD` T22 TVA tutorial (`docs/tutorials/22_spillover_did.ipynb`): synthetic TVA-style DGP reproducing Butts (2021) Section 4 Table 1 Panel A bias-correction direction (~40% understatement). Split from the methodology PR per user-confirmed scope split (2026-05-15). | `docs/tutorials/`, `tests/test_t22_*_drift.py` | follow-up (Wave B) | Medium |
| Extend `TwoStageDiD` with Conley vcov as a first-class feature (mirrors Wave A's TWFE/MPD/DiD extension). Currently `TwoStageDiD.__init__` lacks `vcov_type` / `conley_*` kwargs; `SpilloverDiD` works around this by threading Conley directly via `solve_ols` at stage 2. Promoting Conley to TwoStageDiD's API removes the workaround and lets non-spillover users access Conley + Gardner two-stage. | `diff_diff/two_stage.py` | follow-up | Medium |
| `SpilloverDiD` sparse cKDTree path for the staggered nearest-treated-distance helper (mirrors the static helper's sparse branch). Currently `_compute_nearest_treated_distance_staggered` always builds dense `(n_units, n_treated_by_onset)` pairwise distance matrices per cohort; on large staggered panels with many cohorts this is avoidable memory/runtime. Add a sparse k-d-tree branch analogous to `_compute_nearest_treated_distance_sparse`, gated on `n > _CONLEY_SPARSE_N_THRESHOLD`. | `spillover.py::_compute_nearest_treated_distance_staggered` | follow-up (Wave B) | Low |
+| `SpilloverDiDResults` in `DiagnosticReport` dispatch tables. Wave C event-study emits a TwoStageDiD-compatible `event_study_effects: Dict[int, Dict]` alias that `plot_event_study` consumes via the new `reference_period` attribute fallback in `_extract_plot_data`, but `SpilloverDiDResults` is NOT registered in `DiagnosticReport`'s `_APPLICABILITY` / `_PT_METHOD` tables — so `DiagnosticReport(spillover_result)` doesn't currently route to event-study diagnostics. Registering requires (a) deciding which diagnostics apply (parallel trends, pre-trends power, heterogeneity, design-effect) AND (b) adding an end-to-end test. | `diff_diff/diagnostic_report.py::_APPLICABILITY`, `_PT_METHOD` | follow-up (Wave C) | Low |
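
For reviewers weighing the sparse cKDTree item in the table above, here is a minimal sketch of how a dense and a k-d-tree nearest-treated-distance path could sit behind one size gate. It is not the library's implementation: the function names, the `SPARSE_N_THRESHOLD` value, and the use of planar Euclidean distance are all assumptions for illustration, and it handles a single cohort only.

```python
import numpy as np
from scipy.spatial import cKDTree

# Hypothetical gate; the real `_CONLEY_SPARSE_N_THRESHOLD` value is not shown in this PR.
SPARSE_N_THRESHOLD = 5_000


def nearest_treated_dense(coords, treated_mask):
    """Dense path: full (n_units, n_treated) pairwise matrix, then row-wise min."""
    coords = np.asarray(coords, dtype=float)
    treated = coords[np.asarray(treated_mask, dtype=bool)]
    if treated.shape[0] == 0:
        return np.full(coords.shape[0], np.inf)
    diffs = coords[:, None, :] - treated[None, :, :]
    return np.linalg.norm(diffs, axis=2).min(axis=1)


def nearest_treated_sparse(coords, treated_mask):
    """Sparse path: k-d tree over treated units, one nearest-neighbour query per unit."""
    coords = np.asarray(coords, dtype=float)
    treated = coords[np.asarray(treated_mask, dtype=bool)]
    if treated.shape[0] == 0:
        return np.full(coords.shape[0], np.inf)
    dist, _ = cKDTree(treated).query(coords, k=1)
    return dist


def nearest_treated_distance(coords, treated_mask, threshold=SPARSE_N_THRESHOLD):
    """Pick the sparse branch only when the panel is large enough to need it."""
    n = len(coords)
    fn = nearest_treated_sparse if n > threshold else nearest_treated_dense
    return fn(coords, treated_mask)
```

In a staggered design this would presumably be called once per cohort onset with the mask of units treated by that onset; treated units get distance 0 to themselves under this sketch.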

#### Performance

7 changes: 3 additions & 4 deletions diff_diff/guides/llms-full.txt
@@ -477,8 +477,8 @@ SpilloverDiD(
     cluster: str | None = None,
     alpha: float = 0.05,
     anticipation: int = 0,
-    event_study: bool = False, # Deferred: raises NotImplementedError if True
-    horizon_max: int | None = None, # Deferred (event-study mode)
+    event_study: bool = False, # Wave C: per-event-time × ring decomposition (Butts Table 2)
+    horizon_max: int | None = None, # Bin event-times outside [-H,+H] into endpoint pools (event-study mode); H>=1 or None — H=0 rejected (use event_study=False for aggregate spec)
     rank_deficient_action: str = "warn",
 )
```
@@ -502,8 +502,7 @@ sp.fit(

- `covariates=` raises `NotImplementedError`. Gardner two-stage requires covariate effects estimated on the untreated-and-unexposed Omega_0 subsample at stage 1; appending raw covariates only at stage 2 silently biases `tau_total` / `delta_j` on panels with time-varying covariates. Planned follow-up.
- `survey_design=` raises `NotImplementedError` (planned: SurveyDesign integration)
-- `event_study=True` raises `NotImplementedError` (planned: per-event-time × ring decomposition per Butts Table 2)
-- `horizon_max=` raises `NotImplementedError` (used only with event_study)
+- `event_study=True` SHIPPED (Wave C): emits per-event-time `tau_k` and per-(ring, event-time) `delta_jk` as `att_dynamic: pd.DataFrame` (indexed by event-time `k`) plus MultiIndex `spillover_effects: pd.DataFrame` (levels `(ring_label, event_time)`). TwoStageDiD-compatible `event_study_effects: Dict[int, Dict]` alias also emitted for `plot_event_study` consumption — `_extract_plot_data` prefers the new `reference_period` attribute over the legacy `n_obs==0` heuristic. (DiagnosticReport integration: NOT yet wired; queued as a follow-up.) (schema: `{k: {"effect", "se", "n_obs", "t_stat", "p_value", "conf_int": (low, high)}}` mirroring `two_stage.py:1355-1389`). Reference period `ref_period = -1 - anticipation` (TwoStageDiD `two_stage.py:486` convention); reference row uses `coef=0.0, se=0.0, n_obs=0, conf_int=(0.0, 0.0)`. Scalar `att` field becomes a sample-share-weighted average of post-treatment `tau_k` (`att = sum_{k>=0} w_k * tau_k` with `w_k = n_treated_at_k / total`) with SE from linear-combination inference `Var(att) = w' V_subset w` on the post-treatment vcov block — no separate fit. **Two-clock K_it:** direct-effect clock is `K_direct = t - effective_first_treat(i)` for ever-treated rows; spillover clock is `K_spill = t - earliest-in-range-cohort-onset(i)` (running min across activated cohorts, NaN pre-trigger). `K_spill >= 0` structurally; negative-k spillover cells are rectangularly emitted with `coef = NaN, n_obs = 0`. **`horizon_max` semantics:** bins event-times outside `[-H, +H]` into endpoint pools (no observations dropped — divergence from TwoStageDiD which filters; intentional, per `feedback_no_silent_failures`). With `horizon_max=None`, auto-detects bin set from observed K. **Validation:** `horizon_max < 0` raises `ValueError`; `ref_period < -horizon_max` (i.e., `anticipation > horizon_max - 1`) raises `ValueError` — silently floor-shifting the reference would change identification. **Reduce-to-aggregate:** under constant-tau DGP with `horizon_max=None`, the share-weighted scalar `att` reproduces Wave B's aggregate bit-identically. **Note:** `horizon_max=0` does NOT reduce to Wave B (binning collapses pre-treatment K values to `k=0`, making `D^0 = D_i` ever-treated indicator rather than `D_it`). Per-event-time SEs share the same Wave B Gardner-GMM caveat (biased downward by a few percent; Wave D follow-up).
- Stage-2 variance is `solve_ols` HC1 / Conley / cluster — Gardner GMM first-stage uncertainty correction NOT applied (planned follow-up; SE is biased downward / too small, CIs too narrow, p-values too small — treat reported significance conservatively until the GMM correction lands)
- Only nearest-treated rings supported; `ring_method="count"` (count of treated neighbors in ring) not yet exposed

121 changes: 97 additions & 24 deletions diff_diff/results.py
@@ -408,37 +408,102 @@ class SpilloverDiDResults(DiDResults):
     event_study: Optional[bool] = field(default=None)
     stage1_n_obs: Optional[int] = field(default=None)
     anticipation: Optional[int] = field(default=None)
+    # Wave C event-study fields (None when event_study=False):
+    att_dynamic: Optional[pd.DataFrame] = field(default=None)
+    # Per-event-time direct effects DataFrame indexed by integer k.
+    # Columns: ["coef", "se", "t_stat", "p_value", "ci_low", "ci_high", "n_obs"].
+    # Includes the reference period row with coef=0.0, se=0.0, n_obs=0.
+    event_study_effects: Optional[Dict[int, Dict[str, Any]]] = field(default=None)
+    # TwoStageDiD-compatible alias for ``att_dynamic`` consumable by
+    # ``plot_event_study`` (wired in Wave C via the ``reference_period``
+    # attribute fallback in ``_extract_plot_data``). ``DiagnosticReport``
+    # routing is NOT yet wired — registering ``SpilloverDiDResults`` in
+    # ``DiagnosticReport``'s applicability/method tables is a planned
+    # follow-up (see TODO.md).
+    # Schema mirrors ``two_stage.py:1355-1389``:
+    # {k: {"effect", "se", "n_obs", "t_stat", "p_value", "conf_int": (low, high)}}
+    # Reference row uses ``conf_int = (0.0, 0.0)`` (TwoStageDiD parity).
+    horizon_max: Optional[int] = field(default=None)
+    reference_period: Optional[int] = field(default=None)

     def summary(self, alpha: Optional[float] = None) -> str:
-        """Extended summary with ATT row plus per-ring rows."""
+        """Extended summary with ATT row, per-event-time direct block, and
+        per-(ring, event-time) spillover block."""
         base = super().summary(alpha=alpha)
-        if self.spillover_effects is None or self.spillover_effects.empty:
+        insert_blocks: List[str] = []
+
+        # Wave C event-study: per-event-time direct effects block.
+        if self.att_dynamic is not None and not self.att_dynamic.empty:
+            insert_blocks.append("")
+            insert_blocks.append("Dynamic Direct Effects by Event Time".center(70))
+            insert_blocks.append("-" * 70)
+            insert_blocks.append(
+                f"{'k':<15} {'Estimate':>12} {'Std. Err.':>12} "
+                f"{'t-stat':>10} {'P>|t|':>10} {'':>5}"
+            )
+            insert_blocks.append("-" * 70)
+            for k, row in self.att_dynamic.iterrows():
+                coef = row.get("coef", np.nan)
+                se = row.get("se", np.nan)
+                t_stat = row.get("t_stat", np.nan)
+                p_value = row.get("p_value", np.nan)
+                stars = _get_significance_stars(p_value)
+                k_str = f"{int(k):+d}"
+                insert_blocks.append(
+                    f"{k_str:<15} {coef:>12.4f} {se:>12.4f} "
+                    f"{t_stat:>10.3f} {p_value:>10.4f} {stars:>5}"
+                )
+            insert_blocks.append("-" * 70)
+
+        # Spillover block (per-ring OR per-(ring, k) under MultiIndex).
+        # When the index is a MultiIndex (event-study mode), the ring and `k`
+        # are rendered as separate columns so distinct horizons within the same
+        # ring remain visually distinguishable. The non-MultiIndex aggregate
+        # path retains the single `Ring` column for Wave B compatibility.
+        if self.spillover_effects is not None and not self.spillover_effects.empty:
+            insert_blocks.append("")
+            insert_blocks.append("Spillover Effects (ring-indicator, Butts 2021)".center(70))
+            insert_blocks.append("-" * 70)
+            is_multi = isinstance(self.spillover_effects.index, pd.MultiIndex)
+            if is_multi:
+                header = (
+                    f"{'Ring':<15} {'k':>5} {'Estimate':>12} {'Std. Err.':>12} "
+                    f"{'t-stat':>10} {'P>|t|':>10} {'':>5}"
+                )
+            else:
+                header = (
+                    f"{'Ring':<15} {'Estimate':>12} {'Std. Err.':>12} "
+                    f"{'t-stat':>10} {'P>|t|':>10} {'':>5}"
+                )
+            insert_blocks.append(header)
+            insert_blocks.append("-" * len(header.rstrip()))
+            for label, row in self.spillover_effects.iterrows():
+                coef = row.get("coef", np.nan)
+                se = row.get("se", np.nan)
+                t_stat = row.get("t_stat", np.nan)
+                p_value = row.get("p_value", np.nan)
+                stars = _get_significance_stars(p_value)
+                if is_multi and isinstance(label, tuple):
+                    ring_str = str(label[0])[:15]
+                    k_str = f"{int(label[1]):+d}"
+                    insert_blocks.append(
+                        f"{ring_str:<15} {k_str:>5} {coef:>12.4f} {se:>12.4f} "
+                        f"{t_stat:>10.3f} {p_value:>10.4f} {stars:>5}"
+                    )
+                else:
+                    label_str = str(label)[:15]
+                    insert_blocks.append(
+                        f"{label_str:<15} {coef:>12.4f} {se:>12.4f} "
+                        f"{t_stat:>10.3f} {p_value:>10.4f} {stars:>5}"
+                    )
+            insert_blocks.append("-" * len(header.rstrip()))
+
+        if not insert_blocks:
             return base
         lines = base.split("\n")
-        # Find the closing separator line and inject ring rows before it.
-        ring_rows = ["", "Spillover Effects (ring-indicator, Butts 2021)".center(70), "-" * 70]
-        header = (
-            f"{'Ring':<15} {'Estimate':>12} {'Std. Err.':>12} "
-            f"{'t-stat':>10} {'P>|t|':>10} {'':>5}"
-        )
-        ring_rows.append(header)
-        ring_rows.append("-" * 70)
-        for label, row in self.spillover_effects.iterrows():
-            coef = row.get("coef", np.nan)
-            se = row.get("se", np.nan)
-            t_stat = row.get("t_stat", np.nan)
-            p_value = row.get("p_value", np.nan)
-            stars = _get_significance_stars(p_value)
-            label_str = str(label) if not isinstance(label, tuple) else f"{label[0]} k={label[1]}"
-            ring_rows.append(
-                f"{label_str[:15]:<15} {coef:>12.4f} {se:>12.4f} "
-                f"{t_stat:>10.3f} {p_value:>10.4f} {stars:>5}"
-            )
-        ring_rows.append("-" * 70)
-        # Insert ring block before the final "==..." line (last row of base).
         for idx in range(len(lines) - 1, -1, -1):
             if lines[idx].startswith("="):
-                lines = lines[:idx] + ring_rows + lines[idx:]
+                lines = lines[:idx] + insert_blocks + lines[idx:]
                 break
         return "\n".join(lines)

@@ -460,6 +525,14 @@ def to_dict(self) -> Dict[str, Any]:
                 "event_study": self.event_study,
                 "stage1_n_obs": self.stage1_n_obs,
                 "anticipation": self.anticipation,
+                "att_dynamic": (
+                    self.att_dynamic.reset_index().to_dict(orient="records")
+                    if self.att_dynamic is not None
+                    else None
+                ),
+                "event_study_effects": self.event_study_effects,
+                "horizon_max": self.horizon_max,
+                "reference_period": self.reference_period,
             }
         )
         return base