Make _handle_stats_nesting tolerant of missing drop columns #257

Draft
thodson-usgs wants to merge 1 commit into DOI-USGS:main from thodson-usgs:fix/handle-stats-nesting-errors-ignore
@thodson-usgs
Collaborator

Summary

_handle_stats_nesting has two .drop(columns=...) calls, one in the pandas branch and one in the geopandas branch, that hardcode literal column names: ["type", "properties.data"] and ["data"] respectively. If a stats response ever comes back in a slightly different shape (renamed key, missing optional key, edge-case feature without the expected nesting), drop() raises KeyError and aborts the helper.

The sibling pd.json_normalize(...) call later in the same function already passes errors="ignore", so this PR adds the same to the two drop() calls for parity.
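For reviewers less familiar with the `errors` argument of `DataFrame.drop`, here is a minimal sketch of the behavior this PR relies on. The frame below is a made-up stand-in for a normalized stats response that happens to lack the `type` column; it is not taken from a real API response.

```python
import pandas as pd

# Hypothetical frame: "properties.data" is present, "type" is not.
df = pd.DataFrame({"id": ["a", "b"], "properties.data": [[1], [2]]})

# Default behavior (errors="raise"): a missing label raises KeyError.
try:
    df.drop(columns=["type", "properties.data"])
    raised = False
except KeyError:
    raised = True

# With errors="ignore", only the labels that exist are dropped.
tolerant = df.drop(columns=["type", "properties.data"], errors="ignore")
remaining = list(tolerant.columns)
```

Note that `errors="ignore"` silences every missing label, so a typo in a column name would also pass unnoticed; that trade-off seems acceptable here because the dropped columns are cleanup targets rather than required inputs.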

Diff

# dataretrieval/waterdata/utils.py:894-901
if not geopd:
    df = pd.json_normalize(body["features"]).drop(
        columns=["type", "properties.data"], errors="ignore"  # <-- added
    )
    df.columns = df.columns.str.split(".").str[-1]
else:
    df = gpd.GeoDataFrame.from_features(body["features"]).drop(
        columns=["data"], errors="ignore"  # <-- added
    )

Test plan

  • New test test_handle_stats_nesting_tolerates_missing_drop_columns constructs a stats body whose features lack the top-level type key and confirms the function returns a populated DataFrame.
  • Verified that the new test fails on main with KeyError: "['type'] not found in axis".
  • Full waterdata_utils test suite passes (5 tests).
  • Full test suite (excluding deprecated nwis_test.py): 197 passed.
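The shape of the new test can be sketched roughly as follows. This is an illustration only: `normalize_features` is a hypothetical stand-in that copies the pandas branch from the diff above, since the real signature of `_handle_stats_nesting` is not shown here, and the feature fields (`monitoring_location_id`, `value`) are invented for the example.

```python
import pandas as pd


def normalize_features(body: dict) -> pd.DataFrame:
    """Hypothetical stand-in mirroring the pandas branch in the diff."""
    df = pd.json_normalize(body["features"]).drop(
        columns=["type", "properties.data"], errors="ignore"
    )
    # Keep only the last segment of dotted column names.
    df.columns = df.columns.str.split(".").str[-1]
    return df


# Made-up stats body whose feature lacks the top-level "type" key.
body = {
    "features": [
        {
            "id": "f1",
            "properties": {
                "monitoring_location_id": "USGS-01",
                "data": [{"value": 1.0}],
            },
        }
    ]
}

result = normalize_features(body)
```

Without `errors="ignore"`, the same call would fail with the `KeyError` noted in the test plan, because `type` is absent from the normalized columns.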

🤖 Generated with Claude Code

Both `.drop()` calls in `_handle_stats_nesting` (for the geopandas and
pandas branches) hardcoded literal column names — `["type",
"properties.data"]` and `["data"]`. If a stats response is ever returned
in a slightly different shape (or one of those keys is renamed/removed),
`drop()` raises `KeyError` and aborts the helper. The sibling
`pd.json_normalize(...)` call later in the same function already passes
`errors="ignore"`, so add the same to the two `drop()` calls for parity.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@thodson-usgs changed the title from "Make tolerant of missing drop columns" to "Make _handle_stats_nesting tolerant of missing drop columns" on May 3, 2026