Skip to content

feat: Go walking skeleton for cmd/ingestion#57

Merged
koinsaari merged 1 commit into
mainfrom
feat/cmd-ingestion-walking-skeleton
May 23, 2026
Merged

feat: Go walking skeleton for cmd/ingestion#57
koinsaari merged 1 commit into
mainfrom
feat/cmd-ingestion-walking-skeleton

Conversation

@koinsaari
Copy link
Copy Markdown
Contributor

What

Replaces the placeholder cmd/ingestion with a working Go binary that reads a .osm.pbf file, filters to POIs, transforms them into models.Place records, and bulk-upserts them into the places table.

Changes

  • Add internal/osm package: PBF streaming via paulmach/osm, allowlist-based POI filter, node-to-place transformer, rank derivation
  • Add internal/place repository with UpsertBatch using (osm_id, osm_type) conflict key
  • Add migration 000007 with partial unique index on (osm_id, osm_type) WHERE osm_id <> 0
  • Expand models.Category to cover the OSM POI taxonomy (Bar, Healthcare, Education, Finance, Entertainment, Government, Transport, Social, Worship)
  • Replace the cmd/ingestion placeholder with a full-import dispatcher; diff-sync is wired up to print "not implemented" for now
  • Add end-to-end integration test against a small Andorra fixture PBF
  • Add testhelpers.StartPostgresWithConnInfo so cmd/ingestion tests can build their own configs from a testcontainers PostGIS instance
  • Update CLAUDE.md to describe the new internal/osm and internal/place packages and the real cmd/ingestion role

Notes

  • First slice of the larger Kotlin → Go migration captured in docs/superpowers/migration-roadmap.md. Accessibility ingestion, diff-sync, and place soft-vs-hard-delete logic come in later sub-projects.
  • The migration's partial predicate (WHERE osm_id <> 0) lets existing cmd/api tests keep creating places without OSM IDs. In production every place is OSM-sourced and has a non-zero osm_id, so the predicate covers every real row. A follow-up issue should settle whether POST /places (the endpoint that justifies the partial predicate) should be removed.
  • Andorra fixture is 3.2 MB committed to testdata/; gives the smoke test ~492k nodes to process and ~976 places to import.

Test plan

  • go test ./... — all unit tests pass
  • go test -tags integration -timeout 300s ./... — all integration tests pass (including the pre-existing cmd/api ones that rely on the partial-index predicate)
  • go vet ./... clean

@claude
Copy link
Copy Markdown

claude Bot commented May 23, 2026

Claude finished @koinsaari's task in 3m 12s —— View job


Review

  • Read CLAUDE.md
  • Read changed files
  • Analyze for must-fix issues
  • Post review

No must-fix issues found. The PR is clean and ready to merge.

A few observations for completeness:

internal/osm/filter.go — building branch

buildingQualifyingKeys includes "amenity" and "public_transport". A node with building=yes + amenity=<unknown> (e.g. amenity=bench) would slip through as CategoryOther instead of being filtered out, because the amenity branch only catches values in amenityToCategory. In practice this is harmless — building=yes almost never appears on OSM nodes (buildings are mapped as ways/relations), so this code path is essentially unreachable in real data. Worth knowing about if the pipeline is ever extended to process ways.

internal/place/repository.gosource not in DoUpdates

source is intentionally omitted from the conflict-update columns. Since every row from this pipeline has source="osm", this is correct. Just noting it so a future reader doesn't add it by mistake.

cmd/ingestion/main_integration_test.go — double migration

StartPostgresWithConnInfo runs migrations, and then runFullImport runs them again on the same container. golang-migrate handles this gracefully (ErrNoChange), so no issue.

Replace the placeholder cmd/ingestion with a working Go binary that reads
a .osm.pbf file, filters to POIs, transforms them into models.Place
records, and bulk-upserts them into the places table.

- Add internal/osm package: PBF streaming via paulmach/osm, allowlist-based POI filter, node-to-place transformer, rank derivation
- Add internal/place repository with UpsertBatch using (osm_id, osm_type) conflict key
- Add migration 000007 with partial unique index on (osm_id, osm_type) WHERE osm_id <> 0
- Expand models.Category to cover OSM POI taxonomy (Bar, Healthcare, Education, Finance, Entertainment, Government, Transport, Social, Worship)
- Replace cmd/ingestion placeholder with full-import dispatcher; diff-sync stays unimplemented
- Add end-to-end integration test against an Andorra fixture PBF
- Add testhelpers.StartPostgresWithConnInfo so cmd/ingestion tests can build their own configs
@koinsaari koinsaari force-pushed the feat/cmd-ingestion-walking-skeleton branch from 9533413 to 6581f48 Compare May 23, 2026 10:56
@claude
Copy link
Copy Markdown

claude Bot commented May 23, 2026

Claude encountered an error —— View job


Tasks

  • Read CLAUDE.md
  • Read changed files
  • Analyze for must-fix issues
  • Post review

@koinsaari koinsaari merged commit 5862acd into main May 23, 2026
6 of 8 checks passed
@koinsaari koinsaari deleted the feat/cmd-ingestion-walking-skeleton branch May 23, 2026 11:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant