Problem Statement
The openshell-server gained request-level logging via TraceLayer (#892), but every request is anonymous: there is no unique identifier tying a request's log lines together. Under concurrent load, interleaved log output from multiple requests sharing the same method and path is indistinguishable:
INFO request{method=POST path=/openshell.v1.OpenShell/CreateSandbox}: response status=200 latency_ms=45
INFO request{method=POST path=/openshell.v1.OpenShell/CreateSandbox}: response status=200 latency_ms=312
Which CreateSandbox call took 312ms? Without a request ID, operators have no way to:
Correlate logs across the middleware stack, handler, and downstream services for a single request
Reference requests in bug reports — clients cannot quote a request ID when reporting failures
Trace requests end-to-end — the sandbox inference proxy already passes through x-request-id headers (tested at crates/openshell-sandbox/src/l7/inference.rs:508), but the gateway never generates or propagates one
Build toward distributed tracing — a request ID is the minimum unit of correlation before adopting full OpenTelemetry
This is a natural follow-up to #892, completing the request observability story.
Proposed Design
Add a request-ID middleware using tower-http's request-id feature that generates a UUID for each inbound request (or preserves a client-supplied one), records it in the tracing span, and returns it in the response.
Implementation sketch
1. Enable the request-id feature on tower-http
In workspace Cargo.toml:
tower-http = { version = "0.6", features = ["cors", "trace", "request-id"] }
2. Implement MakeRequestId using the existing uuid crate
    use http::{HeaderValue, Request};
    use tower_http::request_id::{MakeRequestId, RequestId};

    #[derive(Clone)]
    struct UuidRequestId;

    impl MakeRequestId for UuidRequestId {
        fn make_request_id<B>(&mut self, _req: &Request<B>) -> Option<RequestId> {
            // A hyphenated UUID is plain ASCII, so it is always a valid header value.
            let id = uuid::Uuid::new_v4().to_string();
            Some(RequestId::new(HeaderValue::from_str(&id).unwrap()))
        }
    }
3. Layer ordering in multiplex.rs
SetRequestId must run before TraceLayer so the span captures the ID. PropagateRequestId runs after (on the response path) to copy the ID into the response headers.
    use http::HeaderName;
    use tower_http::request_id::{PropagateRequestIdLayer, SetRequestIdLayer};

    let x_request_id = HeaderName::from_static("x-request-id");

    let grpc_service = ServiceBuilder::new()
        // Outermost: assign (or keep) the ID before the span is created.
        .layer(SetRequestIdLayer::new(x_request_id.clone(), UuidRequestId))
        .layer(
            TraceLayer::new_for_http()
                .make_span_with(make_request_span)
                .on_request(())
                .on_response(log_response),
        )
        // Copies the request's ID onto the response headers.
        .layer(PropagateRequestIdLayer::new(x_request_id.clone()))
        .service(grpc_service);
    // Same for http_service
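The ordering constraint can be illustrated with a small std-only model. This is a sketch, not the server code: it assumes the usual ServiceBuilder behavior in which the first layer added is the outermost one, and the function name call_order is hypothetical.

```rust
/// Models an "onion" of middleware: layers are listed in the order they are
/// added to the builder, with the first one being the outermost.
/// Returns the sequence in which each layer sees the request and the response.
fn call_order(layers: &[&str]) -> Vec<String> {
    let mut events = Vec::with_capacity(layers.len() * 2 + 1);

    // Request path: outermost layer first, innermost layer last.
    for name in layers {
        events.push(format!("{name} (request)"));
    }

    events.push("inner service".to_string());

    // Response path: the same layers, unwound in reverse.
    for name in layers.iter().rev() {
        events.push(format!("{name} (response)"));
    }

    events
}
```

Calling `call_order(&["SetRequestId", "Trace", "PropagateRequestId"])` lists SetRequestId first on the request path, which is why the header is already present when the span is built. On the response path the innermost layer, PropagateRequestId, is the first to see the response.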
4. Record request ID in the tracing span
Update make_request_span to extract the ID from the request header (which SetRequestIdLayer has already inserted):
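A minimal, std-only sketch of the extraction step, under stated assumptions: the raw header bytes are taken from the request (for example via HeaderValue::as_bytes), and a placeholder is recorded when the header is missing or not valid UTF-8. The helper name request_id_field and the "-" placeholder are hypothetical and not taken from the server code.

```rust
/// Placeholder recorded when no usable ID is present (an assumption of this sketch).
const MISSING_REQUEST_ID: &str = "-";

/// Turns the raw `x-request-id` header bytes into the string that would be
/// recorded as the span's `request_id` field.
fn request_id_field(raw_header: Option<&[u8]>) -> &str {
    raw_header
        // Header values are bytes; only record them if they are valid UTF-8.
        .and_then(|bytes| std::str::from_utf8(bytes).ok())
        // Treat an empty header the same as a missing one.
        .filter(|id| !id.is_empty())
        .unwrap_or(MISSING_REQUEST_ID)
}
```

In make_request_span, the returned string would be recorded as a request_id field alongside the existing method and path fields.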
Expected log output
INFO request{method=POST path=/openshell.v1.OpenShell/CreateSandbox request_id=a1b2c3d4-...}: response status=200 latency_ms=45
INFO request{method=POST path=/openshell.v1.OpenShell/CreateSandbox request_id=e5f6a7b8-...}: response status=200 latency_ms=312
Client-supplied IDs
If a client sends x-request-id: my-correlation-id, SetRequestIdLayer preserves it: the layer only generates an ID when the header is absent from the request. This lets CLI and SDK callers trace their own requests without server coordination.
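The preserve-or-generate rule can be sketched in a few lines of std-only Rust. This models the behavior described above rather than reproducing tower-http's implementation; the function name resolve_request_id and the closure-based generator are assumptions made for illustration.

```rust
/// Returns the ID to use for a request: the client's value if one was sent,
/// otherwise a freshly generated one.
fn resolve_request_id<F>(client_supplied: Option<&str>, generate: F) -> String
where
    F: FnOnce() -> String,
{
    match client_supplied {
        // A client-supplied ID is kept verbatim so the caller can correlate logs.
        Some(id) => id.to_owned(),
        // The generator is only invoked when the header is absent.
        None => generate(),
    }
}
```

For example, `resolve_request_id(Some("my-correlation-id"), || "generated".to_string())` returns the client's value, while passing `None` returns whatever the generator produces (a UUID v4 string in this proposal).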
Scope boundaries
Header name: x-request-id — the de facto standard, and already used by the sandbox inference proxy
Health check endpoints: Will receive request IDs like any other request. The health listener on the separate unauthenticated port (health_router() in lib.rs:195) does not get this middleware — it has no Tower layer stack
gRPC metadata: gRPC clients can set x-request-id as metadata; tonic maps custom metadata to HTTP/2 headers transparently
Response header: The response always includes x-request-id, whether server-generated or client-supplied
ID format: UUID v4 (128 bits, 122 of them random, so ~2^122 possible values). The probability of a collision across instances is negligible
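To put a rough number on the collision risk for the ID format, the birthday-bound approximation p ≈ n(n-1) / (2 · 2^122) can be evaluated directly. This is a back-of-the-envelope sketch that assumes IDs are drawn uniformly at random; the function name is hypothetical.

```rust
/// Approximate probability that at least two of `ids` randomly generated
/// UUID v4 values collide, using the birthday bound n(n-1) / (2 * 2^122).
/// The approximation is only meaningful while the result is much smaller than 1.
fn uuid_v4_collision_probability(ids: f64) -> f64 {
    // A UUID v4 carries 122 random bits; the other 6 encode version and variant.
    let space = 2f64.powi(122);
    ids * (ids - 1.0) / (2.0 * space)
}
```

Even at one billion generated IDs (`uuid_v4_collision_probability(1e9)`), the estimate stays below 1e-18, so no coordination between gateway instances is needed.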
Alternatives Considered
Full OpenTelemetry with traceparent (W3C Trace Context) — provides distributed trace/span IDs, baggage propagation, and exporter integration. Significantly heavier: requires opentelemetry, opentelemetry-otlp, tracing-opentelemetry crates and a collector endpoint. The right long-term direction, but request-ID is the pragmatic first step that delivers immediate value. The two are not mutually exclusive — x-request-id can coexist with traceparent.
Custom middleware without tower-http — the logic is ~30 lines, but tower-http's SetRequestIdLayer and PropagateRequestIdLayer handle edge cases (header overwrite policy, type-safe RequestId newtype, integration with the tower ecosystem). No reason to reimplement.
Sequential integer IDs — simpler than UUIDs but not safe across multiple gateway instances in a Kubernetes deployment. UUIDs are globally unique without coordination.
Always overwrite client-supplied IDs: this would break client correlation. SetRequestIdLayer keeps an existing header by default, which preserves client IDs and is the behavior callers expect from proxies and API gateways.
x-trace-id or x-correlation-id header name — less widely adopted than x-request-id. The sandbox inference proxy already uses x-request-id in its passthrough test, so consistency favors this name.
Agent Investigation
Explored crates/openshell-server/src/ and the workspace configuration. Key findings:
No request ID exists anywhere in the server. Grep for request_id, x-request-id, trace-id, correlation-id across all server source files returns zero matches (outside sandbox inference proxy tests).
TraceLayer is applied at multiplex.rs:66-81 to both the gRPC and HTTP inner services via ServiceBuilder. The span currently records only method and path (make_request_span at line 236). Adding a request_id field is a ~3-line change.
tower-http v0.6.8 is the workspace version (Cargo.toml line 28). Only cors and trace features are enabled. The request-id feature adds SetRequestIdLayer, PropagateRequestIdLayer, and the MakeRequestId trait — no new transitive dependencies.
uuid v1.10 with v4 feature is already a workspace dependency (Cargo.toml line 101), used throughout the server for sandbox IDs, policy IDs, auth nonces, and session tokens.
The sandbox inference proxy already passes x-request-id through — there's a test at crates/openshell-sandbox/src/l7/inference.rs:508 that verifies the header is forwarded. This means the gateway generating the ID would flow through to inference backends automatically.
The health listener has no middleware (lib.rs:195 uses health_router().into_make_service() directly). This is intentional — it's unauthenticated and needs no request ID.
MultiplexedService implements hyper::service::Service, not tower::Service, confirming that the middleware must wrap the inner services (gRPC and HTTP), not the multiplexer. This is the same constraint identified in #892.
No OpenTelemetry dependencies exist in the workspace. The tracing subscriber (tracing_bus.rs:56) uses registry() + fmt::layer() + a custom SandboxLogLayer. Request-ID middleware does not require OpenTelemetry.
#909 (metrics instrumentation) proposes a Tower middleware at the same location (multiplex.rs). The request-ID layer should be composed before the metrics layer so metrics can optionally use the request ID as a trace exemplar in the future.