feat(elasticsearch_api): pushdown start/end timestamp on _mapping endpoint #6355

Open
congx4 wants to merge 1 commit into quickwit-oss:main from congx4:cong.xie/es-mapping-time-range-pushdown

congx4 commented Apr 29, 2026

Summary

Adds two optional URL query parameters to the ES-compatibility _mapping endpoint (GET /_elastic/{index}/_mapping and /_mappings):

  • start_timestamp (epoch seconds)
  • end_timestamp (epoch seconds)

When present, they are forwarded into quickwit_proto::search::ListFieldsRequest.start_timestamp / end_timestamp verbatim, enabling timestamp-based split pruning during field discovery. When absent, behaviour is identical to today (both proto fields stay None).
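
As a client-side illustration of the two optional parameters, here is a minimal sketch of building the request path. `mapping_path` is a hypothetical helper, not part of Quickwit; it assumes the index name needs no percent-encoding.

```rust
/// Builds the request path for the ES-compat `_mapping` endpoint.
/// Hypothetical client-side helper: assumes `index` is already URL-safe.
fn mapping_path(
    index: &str,
    start_timestamp: Option<i64>,
    end_timestamp: Option<i64>,
) -> String {
    let mut path = format!("/_elastic/{index}/_mapping");
    // Only emit the parameters that are set; an empty query string
    // keeps the pre-existing behaviour (no time filter).
    let mut params: Vec<String> = Vec::new();
    if let Some(start) = start_timestamp {
        params.push(format!("start_timestamp={start}"));
    }
    if let Some(end) = end_timestamp {
        params.push(format!("end_timestamp={end}"));
    }
    if !params.is_empty() {
        path.push('?');
        path.push_str(&params.join("&"));
    }
    path
}
```

Both values are epoch seconds, so a one-day window would be expressed as two timestamps 86400 apart.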

The proto already documents these fields:

// Time filter, expressed in seconds since epoch.
// That filter is to be interpreted as the semi-open interval:
// [start_timestamp, end_timestamp).
optional int64 start_timestamp = 3;
optional int64 end_timestamp = 4;

The infrastructure (list_relevant_splits honouring the window) has long been wired through ListFieldsRequest; this change just exposes the existing knob at the HTTP surface.
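
To make the semi-open interval concrete, the following is a rough sketch of the kind of overlap check that split pruning implies. It is not the code in list_relevant_splits; `split_overlaps_window` is a hypothetical name, and the sketch assumes each split records an inclusive range of document timestamps in epoch seconds.

```rust
use std::ops::RangeInclusive;

/// Returns true if a split may contain documents inside
/// `[start_timestamp, end_timestamp)`.
/// Assumption: `split_range` is the inclusive min..=max timestamp of the split.
fn split_overlaps_window(
    split_range: &RangeInclusive<i64>,
    start_timestamp: Option<i64>,
    end_timestamp: Option<i64>,
) -> bool {
    if let Some(start) = start_timestamp {
        // The start bound is inclusive: prune only if the newest
        // document is strictly older than `start`.
        if *split_range.end() < start {
            return false;
        }
    }
    if let Some(end) = end_timestamp {
        // The end bound is exclusive: a split whose oldest document
        // sits exactly at `end` is already outside the window.
        if *split_range.start() >= end {
            return false;
        }
    }
    // With both bounds absent, every split is kept (today's behaviour).
    true
}
```

The asymmetry between the two comparisons is the whole point of the `[start_timestamp, end_timestamp)` convention: adjacent windows never count the same second twice.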

Why

Today es_compat_index_mapping hardcodes start_timestamp: None, end_timestamp: None, so every call scans every published split for every dynamic field. On large clusters with many custom.* / tag.* fields and months of data, the response can grow to tens of megabytes and take 30+ seconds, long enough to time out behind any modestly aggressive HTTP gateway. With the time window pushed down, only splits that overlap the window are scanned, so mapping responses shrink accordingly.

Changes

  • New quickwit-serve/src/elasticsearch_api/model/index_mapping_query_params.rs: IndexMappingQueryParams struct with Option<i64> fields, serde::Deserialize, #[serde(deny_unknown_fields)] (matching SearchQueryParams / CatIndexQueryParams convention), plus 5 parser-level unit tests.
  • Updated model/mod.rs: module declaration + pub use re-export.
  • Updated elasticsearch_api/filter.rs::elastic_index_mapping_filter: now extracts (String, IndexMappingQueryParams) via .and(warp::query()), mirroring the pattern in elasticsearch_filter and elastic_index_count_filter.
  • Updated elasticsearch_api/rest_handler.rs::es_compat_index_mapping: accepts params: IndexMappingQueryParams as its second argument and reads params.start_timestamp / params.end_timestamp into ListFieldsRequest. The factory es_compat_index_mapping_handler needs no edits because warp threads the new tuple element through positionally.
  • Updated Cargo.toml (workspace) and quickwit-serve/Cargo.toml: serde_urlencoded added as a dev-dependency for the parser-level tests (already a transitive dep via warp).
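
The parsing rules described above (both fields optional, unknown keys rejected) can be sketched without any dependencies. This is a hand-written stand-in for the serde derive used in the PR, not the actual implementation: `parse_mapping_query` is a hypothetical function, and it does not attempt to reproduce serde's handling of duplicate keys or percent-encoding.

```rust
/// Mirrors the shape described in the PR: two optional epoch-second fields.
#[derive(Debug, Default, PartialEq)]
struct IndexMappingQueryParams {
    start_timestamp: Option<i64>,
    end_timestamp: Option<i64>,
}

/// Hypothetical std-only parser that imitates `deny_unknown_fields`.
fn parse_mapping_query(query: &str) -> Result<IndexMappingQueryParams, String> {
    let mut params = IndexMappingQueryParams::default();
    // An empty query string yields no pairs, so both fields stay `None`.
    for pair in query.split('&').filter(|pair| !pair.is_empty()) {
        let (key, value) = pair.split_once('=').unwrap_or((pair, ""));
        let parsed = value
            .parse::<i64>()
            .map_err(|err| format!("invalid value for `{key}`: {err}"));
        match key {
            "start_timestamp" => params.start_timestamp = Some(parsed?),
            "end_timestamp" => params.end_timestamp = Some(parsed?),
            // Any other key is an error, which the HTTP layer would
            // surface as a 400 response.
            unknown => return Err(format!("unknown field `{unknown}`")),
        }
    }
    Ok(params)
}
```

The resulting struct would then be copied field by field into the request, which is all that "forwarded verbatim" amounts to.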

Backward compatibility

Backward-compatible for clients that omit the query parameters: they get exactly today's behaviour. The one visible change comes from deny_unknown_fields, which matches the existing convention on neighbouring endpoints (SearchQueryParams, CatIndexQueryParams): a stray ?pretty=true would now return 400. This matches _search's existing behaviour, so the surface is consistent across endpoints.

Test plan

  • cargo test -p quickwit-serve — 154 tests passed, 0 failed (5 new tests included).
  • cargo build -p quickwit-serve — builds clean.
  • cargo clippy -p quickwit-serve --all-features --tests — clean.
  • Unit tests cover: empty query string (None/None), both params present, only one present, unknown field rejected.

Notes

  • Did not update CHANGELOG.md because the [Unreleased] section is HTML-commented out in the file, suggesting maintainers update it during release. Happy to add an entry if preferred.
  • No proto regeneration is required because the ListFieldsRequest fields already exist.

🤖 Generated with Claude Code

…point

Today, GET /_elastic/{index}/_mapping always builds a ListFieldsRequest
with start_timestamp = end_timestamp = None, scanning every published
split to compute the response. On large clusters with many dynamic fields
this can produce multi-second latencies and tens of megabytes of payload
before any HTTP gateway timeout fires.

ListFieldsRequest already supports time-based split pruning via
optional int64 start_timestamp / end_timestamp (epoch seconds, half-open
interval) — that infrastructure has just never been exposed at the
ES-compat HTTP surface. This change adds two optional URL query params
with the same names and unit, deserializes them via warp::query(), and
forwards them into ListFieldsRequest verbatim. Behaviour with no query
string is identical to today (both fields stay None).

Files changed:
- New IndexMappingQueryParams model with serde derives + 5 unit tests
  covering parser defaults, both-present, partial, and the
  deny_unknown_fields rejection path.
- elastic_index_mapping_filter now extracts (String, IndexMappingQueryParams)
  via .and(warp::query()).
- es_compat_index_mapping takes the params struct as its second arg and
  threads start_timestamp / end_timestamp into ListFieldsRequest.
- serde_urlencoded added as a dev-dependency for parser-level tests
  (already a transitive dep via warp).

Backward-compatible: any client sending no query string keeps today's
exact behaviour.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@congx4 congx4 requested review from PSeitz and guilload April 30, 2026 18:09