refactor agent sandbox infrastructure with new agent backends + rework logprobs capture#694
refactor agent sandbox infrastructure with new agent backends + rework logprobs capture#694rycerzes wants to merge 33 commits into
Conversation
- related tests
- fix agent handling
- tests
Greptile SummaryThis PR refactors the hardwired OpenCode environment into a shared
Confidence Score: 2/5Not safe to merge as-is — the server layer imports client code through the package init, violating a hard architectural boundary, and there is a real correctness race between setup commands and the running agent. The server indirectly pulls in client.py via coding_agent_env/init.py every time CodingAgentEnvironment initializes, crossing a boundary the repo treats as non-negotiable. The _exec_with_retry bug means install retries never fire for the most common failure mode (non-empty stderr), so cold sandbox installs will fail non-deterministically in CI. The setup/agent race is explicitly noted as a known compromise but affects any task that writes input files or installs dependencies before the agent's first model call. envs/coding_agent_env/server/coding_environment.py (client-server violation + setup race), envs/coding_agent_env/init.py (root init imports client), src/openenv/core/harness/agents/cli_driver.py (retry logic bug) Important Files Changed
Sequence DiagramsequenceDiagram
participant Orch as Orchestrator
participant Env as CodingAgentEnvironment
participant Factory as SessionFactory
participant Driver as CLIAgentDriver
participant SB as SandboxBackend (E2B/Docker/HF)
participant Proxy as InterceptionProxy
Orch->>Env: run_rollout(agent, instruction, setup, verify)
Env->>Factory: create(task)
Factory->>Driver: create_session(task, config)
Driver->>SB: create(timeout_s)
SB-->>Driver: SandboxHandle
Driver->>Driver: _bootstrap_sandbox
alt "mode == transparent_proxy"
Driver->>SB: start_bg(interception.py)
SB-->>Driver: proxy BgJob
end
Driver->>SB: start_bg(agent CLI)
SB-->>Driver: agent BgJob
Driver-->>Factory: CLIAgentSession
Factory-->>Env: session (agent already running)
Note over Env: setup commands run HERE — races with agent
loop setup commands
Env->>SB: exec(cmd)
end
Env->>Factory: wait_for_completion()
loop verify commands
Env->>SB: exec(cmd)
end
Env->>SB: kill()
Env-->>Orch: RolloutResult (JSON)
|
…roxy the transparent proxy was a passive forwarder that captured logprobs by injecting logprobs=true into upstream requests. It is replaced by the interception_gate mode where the trainer owns the forward pass entirely — no proxy needed inside the sandbox.
…eneration
InterceptionServer (aiohttp) runs on the trainer host. Each rollout
registers a queue. The agent's OPENAI_BASE_URL points at
`{base_url}/rollout/{id}/v1`. When the agent makes an LLM call it
blocks at the server. The training loop dequeues the request, calls
vLLM with logprobs=True and return_token_ids=True, and delivers the
response back via deliver_response().
There was a problem hiding this comment.
Reworked logprob capture
- Removed:
transparent_proxymode andsandbox/interception.py— a FastAPI proxy that ran inside the sandbox and injectedlogprobs=trueinto upstream LLM calls. - Added:
interception_gatemode withInterceptionServerrunning on the trainer host. The agent'sOPENAI_BASE_URLpoints to{base_url}/rollout/{id}/v1. LLM calls block at the server; the training loop dequeues viaqueue.get(), runs its own vLLM forward pass, and returns the response viadeliver_response(). HMAC-signed request IDs prevent cross-rollout leakage.
Driver modes are now black_box (eval/demos) and interception_gate (RL training). CLIAgentSession gains next_request() and deliver() for the training loop. the trainer runs vLLM, the agent points OPENAI_BASE_URL at it, and the trainer owns the forward pass and logprob capture entirely.
Same pattern as AReaL, verifiers, rLLM, AWS AgentCore RL Toolkit, Atropos, OpenRLHF, and OpenClaw-RL.
Network reachability is caller-owned.
Darktex
left a comment
There was a problem hiding this comment.
Alignment Review — PR #694
Reviewed by 4 parallel alignment-reviewer agents covering: core agents, core sandbox, environment changes, and tests.
Overall verdict: request_changes
Tier 1: Bugs & Mechanical Issues (16 findings)
Critical / High
| # | File | Issue |
|---|---|---|
| 1 | interception_server.py |
Deprecated API: asyncio.get_event_loop().create_future() → use asyncio.get_running_loop().create_future() (deprecated since 3.10, may error in future versions) |
| 2 | interception_server.py |
Race condition: self.intercepts and self.active_rollouts are plain dicts modified from concurrent coroutines without self._lock. Concurrent register_rollout + _handle_chat_completions + unregister_rollout can cause RuntimeError: dictionary changed size during iteration. Protect with asyncio.Lock. |
| 3 | interception_server.py |
Memory leak: Completed intercepts in self.intercepts are never removed after deliver_response(). Over long training runs every completed LLM call leaves a dead entry. Add del self.intercepts[request_id] after future resolves. |
| 4 | cli_driver.py |
Broken import: from openenv.core.harness import (...) references openenv.core.harness (singular) but the existing package is openenv.core.harnesses (plural). This will raise ModuleNotFoundError at runtime. |
| 5 | coding_environment.py |
NoneType crash: _build_session_factory calls self._E2BSandboxBackend(**kwargs) which is None when e2b is not installed. The old code raised a sentinel _RequiresE2B class. Add a guard: if self._E2BSandboxBackend is None: raise RuntimeError(...) |
| 6 | coding_environment.py |
Type mismatch: disable_thinking: Optional[bool] flows from the MCP tool through _run_rollout_impl into _build_agent_config(disable_thinking: bool). When None is passed, behavior may be wrong. Either pass disable_thinking_resolved or change the parameter type. |
| 7 | cli_driver.py |
Fragile private access: getattr(self._agent_bg_job, "_done", None) accesses a private BgJob implementation detail. If the attribute is renamed, next_request() silently spins until timeout. BgJob should expose a public is_done() method. |
Medium
| # | File | Issue |
|---|---|---|
| 8 | opencode.py |
Shell injection risk: _build_opencode_command interpolates {home}/{workdir} into shell commands without shlex.quote. If sandbox_home contains spaces or metacharacters, the command breaks or becomes injectable. |
| 9 | cli_driver.py |
Dead-code timeout logic: The two-step timeout resolution (budget = spec.default_timeout_s then override with config.agent_timeout_s) makes the first assignment dead when config has agent_timeout_s. Simplify to a single ternary chain. |
| 10 | docker_backend.py |
Invalid construction: DockerBgJob(poll_thread=None) with # type: ignore, then immediately mutated via job._poll_thread = poll_thread. Object is in invalid state between construction and mutation. |
| 11 | docker_backend.py |
Dead code: DockerBgJob._error is checked in wait() but never set anywhere. The error-raise branch is unreachable. Either wire it to the poll thread or remove it. |
| 12 | coding_environment.py |
Non-idempotent replay: Setup commands are run twice — once via setup_shell, then replayed individually "for observability". Common setup commands (pip install, mkdir) are not idempotent and may fail on the second run. |
| 13 | harness.py |
Leaky abstraction: CodingAgentSessionFactory._bootstrap_sandbox calls CLIAgentDriver private methods (_wait_for_sandbox_ready, _agent_already_installed, _exec_with_retry). If driver internals change, this breaks silently. Expose a public bootstrap() method instead. |
Low
| # | File | Issue |
|---|---|---|
| 14 | docker_backend.py |
UUID collision: uuid4().hex[:8] (2^32 values) for marker files. Use full hex (2^128) to avoid collisions under high-concurrency training. |
| 15 | test_hf_sandbox_backend.py |
Test isolation: _FakeSandboxAPI.calls is a class-level mutable list. Only one test clears it — others may see stale entries if run order changes. Move to instance variable or add autouse fixture. |
| 16 | test_docker_sandbox_backend.py |
Tautological assertion: assert isinstance(sandbox, SandboxBackend) or hasattr(sandbox, "exec") — the or arm defeats the purpose. Remove the fallback. |
Tier 2: Alignment Flags (8 findings)
Blocking Alignment Questions
FLAG 1 — Deleted security tests with no replacement
- Invariant: "No credential exposure" (INVARIANTS.md §Security §3)
- The old
test_opencode_env.pyhad two adversarial tests:test_start_proxy_keeps_upstream_key_out_of_command(shell injection withsk-test '$(leak)) andtest_interception_cli_reads_upstream_key_from_env. These verified API keys flow through env vars, not CLI argv. The new test suite only checksbg_envs["API_KEY"] == "sk-test-key"with clean inputs — no adversarial shell metacharacter testing. - Action needed: Restore equivalent adversarial security tests in
test_cli_agent_driver.py.
FLAG 2 — transparent_proxy mode removed / logprob capture moved outside env boundary
- Invariant: "Rewards inside environment" (RFC 002, INVARIANTS.md §Architectural §3)
_collect_proxy_turnsis now a stub returning[]. Logprob capture moves to the training loop viainterception_gate. This is a significant architectural boundary shift. Themode="transparent_proxy"string is still accepted by the API but silently produces emptyproxy_turns— a silent contract break.- Action needed: RFC documenting why logprob capture belongs outside the env boundary, or deprecation warning on
transparent_proxymode.
FLAG 3 — New harness infrastructure does not integrate with RFC 005
- Invariant: "One canonical way to build environments" (PRINCIPLES.md)
- RFC 005 defines
HarnessAdapter(ABC),HarnessConfig(Pydantic),HarnessEnvironment. This PR introduces a parallelCLIAgentSpec/CLIAgentDriver/ResourceSessionstack with no connection to RFC 005. Two diverging patterns for the same domain. - Action needed: Either extend RFC 005 abstractions or supersede with a new RFC.
Non-Blocking Alignment Concerns
FLAG 4 — InterceptionServer binds 0.0.0.0 (publicly exposed trainer host). Should default to 127.0.0.1.
FLAG 5 — --dangerously-skip-permissions hardcoded in OPENCODE_SPEC.base_command. Should be opt-in, not default.
FLAG 6 — docker_args passthrough allows callers to trivially bypass container isolation (--privileged, -v /:/mnt). Consider an allowlist.
FLAG 7 — HFSandboxCreateError may expose HF tokens via SDK exception messages propagated through f"... {last_error}".
FLAG 8 — No tests verify the dual API boundary invariant (agent cannot reach reset()/step()) in the new infrastructure. 5000+ lines of new code with no negative test for the most critical invariant.
What's Good
- The spec/driver declarative pattern is architecturally elegant — adding a new agent is writing a dataclass, not modifying the driver.
- The rename (
opencode_env→coding_agent_env) is clean and consistent across imports, docs, pyproject.toml, Dockerfile, and openenv.yaml. - The sandbox protocol abstraction (
SandboxBackend/SandboxHandle/BgJob) is well-designed and provides a clean seam for swapping backends. - Test coverage is extensive for the new driver lifecycle (1006 lines for cli_driver alone).
Recommended Next Steps
- Fix the 7 critical/high bugs (broken import, InterceptionServer race + leak, NoneType crash, type mismatch, deprecated API, private attribute access)
- Restore adversarial security tests for credential-in-argv
- Add a deprecation warning or validation for
mode="transparent_proxy" - Write an RFC (or RFC addendum) for the logprob ownership shift and the CLIAgentSpec/CLIAgentDriver pattern vs RFC 005
- Add a negative test asserting agents cannot access
reset()/step()through the new infrastructure
Darktex
left a comment
There was a problem hiding this comment.
Note: This is an automated review by Claude Code, not a human review.
Alignment Review — PR #694
Tier 1: Bugs & Code Quality
Bug 1 — Deprecated asyncio.get_event_loop() in InterceptionServer
src/openenv/core/harness/agents/interception_server.py uses asyncio.get_event_loop().create_future() inside an async def. This is deprecated in Python 3.10+ and raises RuntimeError when no event loop is running in the current thread. Since the handler is a coroutine, use asyncio.get_running_loop().create_future() instead.
Bug 2 — Double bootstrap: setup commands run twice
envs/coding_agent_env/server/coding_environment.py (_run_rollout_impl): Setup commands are passed to CodingAgentTask (which runs them in _bootstrap_sandbox), then the same commands are re-run individually "for observability." Non-idempotent commands (e.g. pip install --upgrade, destructive rm -rf) will break. Either drop the re-run loop or don't pass setup_shell to the task — the two-pass design is architecturally incoherent.
Bug 3 — CodingAgentSessionFactory.create bypasses its own CLIAgentDriver
envs/coding_agent_env/harness.py: The factory creates a CLIAgentDriver in __init__, but create() never calls self._driver.create_session(). Instead it manually reproduces the bootstrap by calling private driver methods (_wait_for_sandbox_ready, _agent_already_installed, _exec_with_retry). The _driver field is dead weight. Either delegate to the driver or remove it.
Bug 4 — _build_session_factory crashes when E2B is not installed
envs/coding_agent_env/server/coding_environment.py: self._E2BSandboxBackend is set to None when e2b is not installed. Calling None(...) raises a cryptic TypeError. Add an explicit guard: if self._E2BSandboxBackend is None: raise RuntimeError("E2B not installed").
Bug 5 — Dropped security tests without replacement
tests/envs/test_coding_agent_env.py removes test_start_proxy_keeps_upstream_key_out_of_command and test_interception_cli_reads_upstream_key_from_env with no equivalent in the new test suite. These verified that API keys never appear in process argv — a real security property that should be re-tested for InterceptionServer.
Resource leak — InterceptionServer._stream_response leaks intercept entries on connection reset
When ConnectionResetError is caught, the intercept dict entry is not removed. On retries, new entries accumulate. Consider cleaning up in the except block.
Tier 2: Alignment Flags (need human sign-off)
FLAG 1 — Logprob capture removed from environment without RFC
The old design captured per-token logprobs inside the environment and returned them as proxy_turns in RolloutResult. The new design returns proxy_turns = [] always and moves logprob capture to the training loop via interception_gate. This changes a public API contract — callers reading result.proxy_turns will silently get nothing. The RolloutResult model still declares the field. This is a significant behavioral change that should go through an RFC.
FLAG 2 — InterceptionServer introduces a third API boundary
INVARIANTS.md states OpenEnv exposes exactly two APIs: WebSocket (Gym-like) and MCP (for agents). The InterceptionServer is a third HTTP server (OpenAI-compatible proxy) that sits between the agent and the LLM. Whether this is a violation or a legitimate extension of training infrastructure deserves explicit discussion. The invariants doc should be updated either way.
FLAG 3 — Two parallel harness namespaces in core
There is already src/openenv/core/harnesses/ (plural). This PR adds src/openenv/core/harness/ (singular) with a parallel structure. Developers onboarding will see two harness abstractions with no guidance on which to use, violating the "one canonical way" principle.
Verdict
Well-motivated refactor with a clean CLIAgentSpec design and good test coverage for the new driver. However, three mechanical bugs need fixing before merge (deprecated asyncio API, double bootstrap, bypassed driver abstraction), and three alignment flags need human sign-off (removed logprob contract, third API boundary, duplicate harness namespaces).
cc: @Darktex for alignment flags
Automated review by Claude Code | Learn more
Darktex
left a comment
There was a problem hiding this comment.
Note: This is an automated review by Claude Code, not a human review.
Alignment Review — PR #694
Reviewed by 4 parallel alignment-reviewer agents covering: core agents, core sandbox, environment changes, and tests.
Overall verdict: request_changes
Tier 1: Bugs & Mechanical Issues (16 findings)
Critical / High
| # | File | Issue |
|---|---|---|
| 1 | interception_server.py |
Deprecated API: asyncio.get_event_loop().create_future() → use asyncio.get_running_loop().create_future() (deprecated since 3.10, may error in future versions) |
| 2 | interception_server.py |
Race condition: self.intercepts and self.active_rollouts are plain dicts modified from concurrent coroutines without self._lock. Concurrent register_rollout + _handle_chat_completions + unregister_rollout can cause RuntimeError: dictionary changed size during iteration. Protect with asyncio.Lock. |
| 3 | interception_server.py |
Memory leak: Completed intercepts in self.intercepts are never removed after deliver_response(). Over long training runs every completed LLM call leaves a dead entry. Add del self.intercepts[request_id] after future resolves. |
| 4 | cli_driver.py |
KeyError race: next_request() does bare server.intercepts[request_id] dict access after queue.get(). If close() calls unregister_rollout() from another thread between the queue read and the dict access, this raises an unhandled KeyError. Use .get() and handle None. |
| 5 | coding_environment.py |
NoneType crash: _build_session_factory calls self._E2BSandboxBackend(**kwargs) which is None when e2b is not installed. The old code raised a sentinel _RequiresE2B class. Add a guard: if self._E2BSandboxBackend is None: raise RuntimeError(...) |
| 6 | coding_environment.py |
Type mismatch: disable_thinking: Optional[bool] flows from the MCP tool through _run_rollout_impl into _build_agent_config(disable_thinking: bool). When None is passed, behavior may be wrong. Pass disable_thinking_resolved (the guaranteed-bool value) instead. |
| 7 | opencode.py |
Shell injection: _build_opencode_command uses "$(cat {instruction_file})" inside a double-quoted shell string. If the task instruction file contains shell metacharacters ($(...), backticks, "), they will be interpreted by the shell. Pass instruction via a method that doesn't expand shell metacharacters. |
Medium
| # | File | Issue |
|---|---|---|
| 8 | opencode.py + pi.py |
Unquoted paths: Both _build_opencode_command and _build_command interpolate home/workdir/instruction_file/log_file into shell strings without shlex.quote(). Paths with spaces or metacharacters will break. |
| 9 | docker_backend.py |
Invalid construction: DockerBgJob(poll_thread=None) with # type: ignore, then immediately mutated via job._poll_thread = poll_thread. Object is in invalid state between construction and mutation. Make poll_thread optional or use a factory method. |
| 10 | docker_backend.py |
Dead code: DockerBgJob._error is checked in wait() but never set anywhere. The error-raise branch is unreachable. Either wire it to the poll thread or remove it. |
| 11 | docker_backend.py |
Infinite loop: _poll_bg_job silences all exceptions with bare except Exception: pass. If the container is externally killed, the daemon thread loops forever at 0.5s intervals issuing docker exec calls against a gone container. Add a failure counter or container existence check. |
| 12 | coding_environment.py |
Double bootstrap: Setup commands run twice — once atomically inside _bootstrap_sandbox as setup_shell, then replayed individually post-factory.create() "for observability." Non-idempotent commands (pip install git+..., git clone, createdb) will fail on the second run. |
| 13 | harness.py |
Leaky abstraction: CodingAgentSessionFactory._bootstrap_sandbox calls CLIAgentDriver private methods (_wait_for_sandbox_ready, _agent_already_installed, _exec_with_retry). Expose a public bootstrap() method instead. |
Low
| # | File | Issue |
|---|---|---|
| 14 | docker_backend.py |
UUID collision: uuid4().hex[:8] (2^32 values) for marker files. Use full hex (2^128) to avoid collisions under high-concurrency training. |
| 15 | docker_backend.py + hf_backend.py |
Duplicated code: Identical _shell_quote implementation in both files. Move to base.py or a _util.py module. |
| 16 | interception_server.py |
Info leak: str(exc) from trainer-side exceptions sent to sandboxed agent in 500 responses. Replace with generic message; log detail server-side. |
Tier 2: Alignment Flags (8 findings)
Blocking Alignment Questions
FLAG 1 — Deleted security tests with no replacement
- Invariant: "No credential exposure" (INVARIANTS.md §Security §3)
- The old
test_opencode_env.pyhad two adversarial tests:test_start_proxy_keeps_upstream_key_out_of_command(shell injection withsk-test '$(leak)) andtest_interception_cli_reads_upstream_key_from_env. These verified API keys flow through env vars, not CLI argv. The new test suite only checksbg_envs["API_KEY"] == "sk-test-key"with clean inputs — no adversarial shell metacharacter testing. - Action needed: Restore equivalent adversarial security tests in
test_cli_agent_driver.py.
FLAG 2 — transparent_proxy mode removed / logprob capture moved outside env boundary
- Invariant: "Rewards inside environment" (RFC 002, INVARIANTS.md §Architectural §3)
_collect_proxy_turnsis now a stub returning[]. Logprob capture moves to the training loop viainterception_gate. This is a significant architectural boundary shift. Themode="transparent_proxy"string is still accepted by the API but silently produces emptyproxy_turns— a silent contract break.- Action needed: RFC documenting why logprob capture belongs outside the env boundary, or deprecation warning on
transparent_proxymode.
FLAG 3 — New harness infrastructure does not integrate with RFC 005
- Invariant: "One canonical way to build environments" (PRINCIPLES.md)
- RFC 005 defines
HarnessAdapter(ABC),HarnessConfig(Pydantic),HarnessEnvironment. This PR introduces a parallelCLIAgentSpec/CLIAgentDriver/ResourceSessionstack with no connection to RFC 005. Two diverging patterns for the same domain. - Action needed: Either extend RFC 005 abstractions or supersede with a new RFC.
FLAG 4 — InterceptionServer has zero dedicated tests
- Invariant: No credential exposure + Dual API boundary
- 324 new lines of security-critical async HTTP code (HMAC auth, per-rollout request queuing, streaming SSE) with exactly one test that only checks a constructor guard. No test of: 401 rejection, rollout-not-found 404,
deliver_responsepaths, concurrent rollout isolation, orunregister_rolloutcancellation. - Action needed: Add dedicated test suite for
InterceptionServer.
Non-Blocking Alignment Concerns
FLAG 5 — InterceptionServer binds 0.0.0.0 (publicly exposed trainer host). Should default to 127.0.0.1.
FLAG 6 — --dangerously-skip-permissions hardcoded in OPENCODE_SPEC.base_command. Should be opt-in, not default.
FLAG 7 — docker_args passthrough allows callers to trivially bypass container isolation (--privileged, -v /:/mnt). Consider an allowlist.
FLAG 8 — No tests verify the dual API boundary invariant (agent cannot reach reset()/step()) in the new infrastructure. 5000+ lines of new code with no negative test for the most critical invariant.
What's Good
- The spec/driver declarative pattern is architecturally elegant — adding a new agent is writing a dataclass, not modifying the driver.
- The rename (
opencode_env→coding_agent_env) is clean and consistent across imports, docs, pyproject.toml, Dockerfile, and openenv.yaml. - The sandbox protocol abstraction (
SandboxBackend/SandboxHandle/BgJob) is well-designed and provides a clean seam for swapping backends. - Test coverage is extensive for the new driver lifecycle (1006 lines for cli_driver alone).
Recommended Next Steps
- Fix the 7 critical/high bugs (broken import, InterceptionServer race + leak, NoneType crash, type mismatch, deprecated API, shell injection)
- Restore adversarial security tests for credential-in-argv
- Add a deprecation warning or validation for
mode="transparent_proxy" - Write an RFC (or RFC addendum) for the logprob ownership shift and the CLIAgentSpec/CLIAgentDriver pattern vs RFC 005
- Add a negative test asserting agents cannot access
reset()/step()through the new infrastructure
Automated review by Claude Code | Learn more
Darktex
left a comment
There was a problem hiding this comment.
Alignment Review — PR #694
Tier 1: Bugs / Security / Correctness (6 findings)
1. interception_server.py — asyncio.Queue created from wrong event loop
register_rollout() is a sync method that creates an asyncio.Queue(). If called from a training-loop thread that doesn't share the server's event loop (common pattern: asyncio.run() in a background thread), the queue binds to the wrong loop → RuntimeError: Task attached to a different loop at runtime.
2. docker_backend.py start_bg — fragile shell quoting for marker path
The marker path is concatenated into cmd before shell_quote():
wrapped = f"bash -c {shell_quote(cmd + f'; echo $? > {marker}')} &\necho $!"If cmd contains trailing single-quotes or && sequences, the concatenation produces invalid shell. Compare with hf_backend.py which correctly quotes the marker separately.
3. coding_environment.py _run_rollout_impl — setup_results fabricates exit_code=0
All setup commands are unconditionally recorded as exit_code=0 / "executed during bootstrap", even though actual per-command exit codes are unavailable. Callers inspecting setup_results[i].exit_code to diagnose failures will always see success. Should be an empty list rather than falsely populated.
4. interception_server.py — self.secret never validated non-empty
hmac.compare_digest("", "") returns True. While secrets.token_urlsafe(32) normally produces non-empty values, if secret="" is explicitly passed, all requests are authorized. Add assert self.secret or raise in __init__.
5. harness.py CodingAgentSessionFactory.create — interception_gate silently broken
The factory calls session.start_agent() instead of routing through CLIAgentDriver.create_session(). The driver's interception wiring path (which sets OPENAI_BASE_URL to the interception server URL and OPENAI_API_KEY to server.secret) is completely bypassed. Result: in interception_gate mode, sessions always use the original config.base_url instead of the interception server URL.
6. harness.py CodingAgentSession.start_agent — bypasses spec callables
Uses legacy opencode_runtime.build_run_cmd()/build_env_vars() directly instead of OPENCODE_SPEC-declared callables. Creates two divergent code paths for starting OpenCode — one via CLIAgentDriver and one via CodingAgentSession.
Tier 2: Alignment Flags (3 flags)
ALIGNMENT FLAG 1: Agent-to-trainer boundary crossing in interception_gate
- Invariant: "Agents cannot reset" / "Dual API boundary" (INVARIANTS.md)
- Concern: The new
interception_gategives agents network egress to the trainer'sInterceptionServerHTTP port, authenticated by a bearer token in sandbox env vars. The agent can trivially read its own env vars, discover the token, and craft arbitrary POST requests. No validation that request bodies are legitimate LLM calls. An agent could enumerate rollouts, poison other rollouts' response futures, or DoS the trainer. This changes the threat model vs. the prior in-sandbox proxy design. - Suggested reviewer: @Darktex
ALIGNMENT FLAG 2: setup_results observability regression
- Principle: Honest error reporting
- Concern: Replacing real exit codes with synthetic
exit_code=0removes debugging observability without RFC documentation.
ALIGNMENT FLAG 3: Two parallel code paths for same agent
- Principle: "Minimize lifecycle deltas" (PRINCIPLES.md)
- Concern:
CodingAgentSessionFactory→CodingAgentSession.start_agent()andCLIAgentSessionFactory→CLIAgentDriver._start_agent()are two paths that will diverge asOPENCODE_SPECevolves.
Verdict: Request Changes
The interception_gate broken wiring (#5), misleading setup_results (#3), and docker start_bg quoting bug (#2) need fixes before merge. The alignment flag on sandbox-to-trainer network boundary (#1) warrants explicit sign-off given it changes the threat model of agent execution.
Overall this is a well-structured refactor with good test coverage — the CLIAgentSpec/CLIAgentDriver pattern is clean and extensible. The issues above are fixable without rearchitecting.
Darktex
left a comment
There was a problem hiding this comment.
Note: This is an automated review by Claude Code, not a human review.
Alignment Review — PR #694
Automated Checks
- Lint: deferred to CI (PR branch not checked out locally)
- Debug code: CLEAN — no new
print/breakpoint/bareTODOin added files
Tier 1: Bugs / Issues
T1-1. Fabricated setup_results with hardcoded exit_code=0
envs/coding_agent_env/server/coding_environment.py — after bootstrap, the code synthesizes fake CommandResult entries with exit_code=0, stdout="executed during bootstrap", and duration_s=0.0 for every setup command regardless of actual outcome. Any monitoring or debugging relying on setup_results will get bogus data. Either remove the fabricated entries or document that individual-command telemetry is no longer available.
T1-2. InterceptionServer not validated as started in CLIAgentDriver
src/openenv/core/harness/agents/cli_driver.py — the constructor checks that the server is not None but never verifies start() was called. If a caller constructs the driver with an unstarted InterceptionServer, the agent will get a ConnectionRefusedError with no clear error message. Add a check (e.g., _app is not None) in the constructor or at create_session time.
T1-3. asyncio.Queue created outside running event loop — potential race
src/openenv/core/harness/agents/interception_server.py — register_rollout() is synchronous and creates asyncio.Queue(). In Python 3.10+, asyncio.Queue() should be created on a running event loop. If called from a different thread than the server's event loop (e.g., training thread), this can raise RuntimeError: no running event loop or bind the queue to the wrong loop. CLIAgentSession.next_request also uses asyncio.create_task with the same issue. Needs an explicit event-loop affinity strategy.
T1-4. _write_pi_models_config unconditionally writes to /root/
src/openenv/core/harness/agents/cli_driver.py — writes to both {home}/.pi/agent/models.json and /root/.pi/agent/models.json. When the sandbox user is non-root (as in E2B templates where user is user), the /root/ write may fail silently or propagate depending on backend. Gate on effective user or remove the /root/ path.
T1-5. _SUPPORTED_AGENTS hardcoded instead of using registry
envs/coding_agent_env/server/coding_environment.py — _SUPPORTED_AGENTS = ("opencode", "pi") is hardcoded. Adding a new agent via register_agent(spec) won't update validation. Use list_agents() from the registry or add a comment about the manual sync requirement.
Tier 2: Alignment Flags
🚩 ALIGNMENT FLAG: Core infrastructure added without RFC update
- Invariant: PRINCIPLES.md — "decisions documented in RFCs and should not be changed without a new RFC"
- Concern: This PR promotes
SandboxBackend,CLIAgentDriver,InterceptionServer, andCLIAgentSpecintosrc/openenv/core/harness/as shared infrastructure. RFC 005 describes a different design (HarnessAdapterwithsend_message()multi-turn pattern). This PR implementsCLIAgentSpec+ single-rolloutCLIAgentDriverinstead. The divergence from RFC 005 is not documented. Either update RFC 005 or create a new RFC explaining the chosen design. - Reviewer: @Darktex
🚩 ALIGNMENT FLAG: interception_gate secret exposure to sandbox agents
- Invariant: Agent isolation — agents cannot access reset/simulation controls
- Concern: The
InterceptionServerauthenticates via a bearer secret injected into the sandbox asOPENAI_API_KEY/ANTHROPIC_API_KEY. An agent that exfiltrates its env vars could construct arbitrary HTTP requests to the interception server, including calling registered tool handlers. The threat model for this design needs explicit documentation. The PR logs a warning whenhost="0.0.0.0"but doesn't document mitigations for secret exposure. - Reviewer: @Darktex
🚩 ALIGNMENT FLAG: proxy_turns silently becomes dead field
- Invariant: PRINCIPLES — "Minimize lifecycle deltas"
- Concern:
RolloutResult.proxy_turnswas the primary logprob capture mechanism. It's now always[]with a comment that logprob capture moved tointerception_gate. Any training code readingproxy_turnswill silently get nothing. The field should be deprecated with a warning or removed — leaving a dead field in the wire type without documentation violates lifecycle delta minimization. - Reviewer: @Darktex
🚩 ALIGNMENT FLAG: Server-side imports from env package root
- Invariant: Client-server separation
- Concern:
coding_environment.py(inserver/) imports fromcoding_agent_env.config,coding_agent_env.harness,coding_agent_env.task— these are primitive/config modules, not client code, so this is likely fine architecturally. However, the fallback import path uses barefrom models import ...which only works when CWD isserver/. This fragility should be verified. - Reviewer: @Darktex
Verdict: Request Changes
The T1 bugs — particularly the asyncio queue race in interception_gate mode (T1-3) and the fabricated setup_results (T1-1) — should be fixed before merge. The RFC 005 alignment question (Tier 2) is the most important architectural concern: if this PR implements RFC 005, the RFC should be updated to reflect the actual design. The interception_gate threat model also needs documentation given the agent isolation invariant.
Automated review by Claude Code | Learn more
Darktex
left a comment
There was a problem hiding this comment.
Note: This is an automated review by Claude Code, not a human review.
Alignment Review — PR #694
Tier 1: Bugs / Issues to Fix
-
coding_environment.py— dead import shadowed by second import:CLIAgentSessionFactoryis imported fromopenenv.core.harness.agents.cli_driver, then immediately shadowed byCodingAgentSessionFactoryimported fromcoding_agent_env.harness. The first import is never used and should be removed (will failruff check). -
coding_environment.py—setup_resultsalways reportsexit_code=0: When setup execution moved into_bootstrap_sandbox, the per-commandCommandResultentries hardcodeexit_code=0andstdout="executed during bootstrap". The old code tracked real exit codes; the new code discards them. This is a silent data-accuracy regression. -
coding_environment.py—_build_agent_configignoresdisable_thinking,top_logprobs, andmax_tokens_capfor theopencodeagent: These parameters are accepted byrun_rolloutand forwarded into_build_agent_config, but theagent == "opencode"branch constructsCodingAgentConfigwithout passingproxy_disable_thinking,proxy_top_logprobs, orproxy_max_tokens_cap. Callers who relied ondisable_thinking=True(e.g. for Qwen3) will silently get unexpected thinking tokens. -
interception_server.py—/healthendpoint bypasses auth:_handle_healthreturns{"status":"ok"}without calling_authorized. Given this server is designed to receive LLM requests from sandboxes tunneled over the public internet (the docstring explicitly says "your responsibility" for ngrok/frp), an unauthenticated health endpoint lets adversaries enumerate whether a trainer is running. -
docker_backend.py—write_textignores return code:sandbox.write_textcallssubprocess.runformkdir -pandcat >redirect, but neither return code is checked. If the write silently fails (full disk, unwriteable path), no exception is raised. Compare tohf_backend.pywhich checksr.exit_code != 0for mkdir. -
hf_backend.py—_normalize_exec_timeout(None)returns 86400s: A 24-hour timeout effectively means only HF-side timeouts protect against hung commands. IfNoneis intended to mean "use sandbox default", this should be documented; if unintentional, fall back to a sensible default (e.g. 300s).
Tier 2: Alignment Flags for Human Review
ALIGNMENT FLAG 1: Interception gate puts trainer-side server outside the sandbox boundary
- Principle at stake: RFC 002 — Rewards inside environment. The
interception_gatemode routes every LLM forward pass through a trainer-ownedInterceptionServerrunning on the trainer host outside any sandbox. This effectively moves trajectory-shaping logic outside theMCPEnvironmentboundary. This may be the right call for RL training, but it should be explicitly acknowledged as a trade-off against RFC 002's invariant. - Suggested reviewer: @Darktex
ALIGNMENT FLAG 2: CLIAgentSession dual-role object mixes trainer API + resource session
- Principle at stake: Agents cannot reset / access simulation controls.
CLIAgentSessionhasnext_request(),deliver(), andregister_tool_handler()methods (trainer APIs) on the same object as thereset/step/verifyinterface (ResourceSession). If a future code path passesCLIAgentSessionthrough normal session dispatch, these trainer methods become callable from contexts that should only have Gym-like access. Consider whether the trainer-side control surface should be a separate companion object. - Suggested reviewer: @Darktex
ALIGNMENT FLAG 3: Server imports from client-facing env package
- Principle at stake: Client-server separation.
server/coding_environment.pyimportsCodingAgentConfig,CodingAgentSessionFactory, andCodingAgentTaskfromcoding_agent_env(the env-level package). The env package's__init__.pyre-exports from bothclient.pyandharness.py. This is a pre-existing pattern but is now more visible. - Suggested reviewer: @Darktex
ALIGNMENT FLAG 4: Pi agent install.sh runs apt-get / curl | bash — assumes root-privileged sandbox
- Principle at stake: Container isolation. The Pi spec's setup runs
apt-get installandcurl | sh, requiring root inside the sandbox. E2B sandboxes default to non-root (user), which would silently fail the install. The Pi spec conflates root-privileged Docker sandboxes with user-level E2B sandboxes. At minimum, document that Pi requires a root-privileged backend. - Suggested reviewer: @Darktex
Verdict: REQUEST_CHANGES
The 6 Tier 1 issues should be fixed before merge. The Tier 2 alignment flags — particularly the interception_gate trade-off against RFC 002 — warrant explicit acknowledgment from @Darktex even if the decision is to accept the trade-off intentionally.
Automated review by Claude Code | Learn more
Darktex
left a comment
There was a problem hiding this comment.
Note: This is an automated review by Claude Code, not a human review.
Alignment Review — PR #694
Verdict: REQUEST_CHANGES
Automated Checks
- Lint: PASS —
ruff checkpasses onsrc/andtests/. - Debug code: CLEAN — No
print(),breakpoint(), or strayTODOdebug artifacts in new core files.
Tier 1: Bugs / Mechanical Issues
T1-1 (High). CodingAgentSessionFactory.create() bypasses CLIAgentDriver.create_session() — interception_gate mode silently broken for OpenCode path
envs/coding_agent_env/harness.py ~lines 740–746
The factory manually creates a sandbox, calls self._driver.bootstrap_sandbox(), then builds a CodingAgentSession and calls session.start_agent(). This entirely skips CLIAgentDriver.create_session(), which is the only place where interception_gate wiring happens: register_rollout(), interception_rollout_id, interception_queue, and base_url_override are never set. The resulting session has _base_url_override=None, so in interception_gate mode the OpenCode process still talks to the real config.base_url instead of the interception endpoint.
Fix: Delegate to self._driver.create_session(task, self._config, verifier=self._verifier, episode_id=episode_id) or restructure the factory to not re-implement what the driver already does.
T1-2 (Medium). Fabricated setup_results hides real setup failures from callers
envs/coding_agent_env/server/coding_environment.py ~lines 1534–1545
After setup commands move into bootstrap, _run_rollout_impl synthesizes fake CommandResult records with exit_code=0 and stdout="executed during bootstrap" regardless of whether they actually succeeded. The old code stopped rollout on the first setup failure and surfaced the real exit code. Training pipelines reading setup_results can never detect that their setup script failed.
Fix: Either pass real exit codes from _bootstrap_sandbox or don't populate setup_results at all rather than fabricating false successes.
T1-3 (Low). Missing ToolHandler export from agents/__init__.py
src/openenv/core/harness/agents/__init__.py
ToolHandler is used in the public API (CLIAgentSession.register_tool_handler()) but is not re-exported from the agents package __init__. Callers have to reach into openenv.core.harness.agents.interception_server directly, breaking the clean package boundary.
Fix: Add "ToolHandler" to __all__.
T1-4 (Low). Unauthenticated /health endpoint on InterceptionServer
src/openenv/core/harness/agents/interception_server.py ~line 5096
The /health endpoint does not call _authorized(). For a server reachable from sandboxes over a public tunnel (ngrok, frp, public IP — explicitly documented), this is acceptable for a simple {"status": "ok"} response, but should carry a comment noting it's intentionally unauthenticated so future contributors don't inadvertently add state information to it.
Tier 2: Alignment Flags for Human Review
FLAG 1: New third network interface (InterceptionServer) not covered by RFC
The InterceptionServer is a trainer-side HTTP server that sandboxed agents contact for LLM inference. This is a third interface beyond the documented dual-boundary model (WebSocket for orchestration, MCP for agent tools). Per PRINCIPLES.md, the two-interface architecture is an RFC-backed decision. The PR introduces a trainer endpoint reachable from "ngrok, frp, public IP, VPN" — the security tradeoffs and implications for the "rewards inside environment" principle (when the training loop owns generation) are not discussed in any RFC.
FLAG 2: Interception gate enables trainer to inject arbitrary content into agent context
In interception_gate mode, every LLM request is routed through a trainer-controlled server. The trainer can inject arbitrary response content, potentially including environment state, reward signals, or instructions that bypass the environment's own reward system. This shifts the "rewards inside environment" invariant enforcement entirely onto the training loop, with no framework-level guardrails.
FLAG 3: transparent_proxy mode removed without deprecation path
mode="transparent_proxy" was the documented default. This PR removes it entirely, replacing it with interception_gate. Any external caller with mode="transparent_proxy" hardcoded (training scripts, deployed Spaces) will break with an opaque error. Even pre-1.0, a deprecation warning or clear error message pointing to the replacement would be appropriate.
FLAG 4: Duplicate factory patterns
CodingAgentSessionFactory in harness.py partially duplicates CLIAgentDriver.create_session() while accepting interception_server/interception_base_url params it does not propagate. Should callers use CLIAgentSessionFactory(spec=OPENCODE_SPEC, ...) directly? Having two factories for the same agent spec that behave differently under the same mode is a hidden invariant violation.
Recommendation
Fix T1-1 (interception_gate silently broken) and T1-2 (fabricated setup results) before merge. The Tier 2 flags — especially the unRFC'd third interface — should be reviewed by a human maintainer to decide whether RFC coverage is needed before this lands on main.
Suggested human reviewer: @Darktex
Automated review by Claude Code | Learn more
|
Thank you for the reviews @Darktex! I will be fixing them shortly. Also could you clean up the duplicated noise caused by claude so that the thread is cleaner? |
- Wire disable_thinking and max_tokens_cap through CodingAgentConfig - Raise RuntimeError on mkdir/cat failures in docker backend - Propagate QueueFull exceptions instead of silently swallowing - Change CommandResult.exit_code to int | None for bootstrap clarity
replace asyncio.Queue with stdlib queue.Queue for the request notification path (server → training loop). This makes both directions of the InterceptionServer cross-loop/cross-thread safe: - Request notifications: queue.Queue (inherently thread-safe) - Response delivery: asyncio.Future via _resolve_future_threadsafe (already cross-loop safe) The consumer (next_request) uses asyncio.to_thread(q.get, timeout=...) to await without blocking the event loop. This follows the same pattern used by OpenClaw-RL at scale. chunk_queue (internal SSE streaming) remains asyncio.Queue since both producer and consumer run on the server's own event loop.
…andling - soak test
There was a problem hiding this comment.
fixed all the following issues:
interception_gatewiring fixed inCodingAgentSessionFactory.create()(37e549d)_build_agent_configparam forwarding,DockerBackend.write_textreturn code checks,_put_queue_threadsafeno longer drops onQueueFull, andsetup_resultsexit codes no longer fabricated (8aa9d18, a2b4388/448f6905)- Whitespace secret bypass and hardcoded
/root/path resolved (61e5524) - Cross-loop
asyncio.Queuedeadlock fixed by replacing with stdlibqueue.Queuefor the request notification path; consumer usesasyncio.to_thread(q.get, timeout=...)to avoid blocking the event loop.chunk_queueintentionally stays asasyncio.Queuesince both producer and consumer run on the server's own event loop (a2b4388/b10a4483)
transparent_proxy migration docs are a non-issue — only one env used it and it's already been migrated.
sergiopaniego
left a comment
There was a problem hiding this comment.
Inline comments on concrete issues found during review.
Summary
The coding environment was hardwired to OpenCode. This PR pulls the sandbox and harness infrastructure into
src/openenv/core/harness/so it's shared, adds a declarativeCLIAgentSpec+CLIAgentDriverthat any agent adapter can plug into, ships OpenCode and Pi as the first two adapters, adds Docker and HF Sandbox backends alongside the existing E2B one, and renamesopencode_env→coding_agent_env.Changes
Sandbox protocols moved to core (
src/openenv/core/harness/sandbox/).SandboxBackend,SandboxHandle,BgJobwere locked insideopencode_env. They're now the shared seam.E2BSandboxBackendis unchanged and already works with CubeSandbox (E2B-compatible self-hosted MicroVM sandbox) by pointingE2B_API_URLat it.Docker sandbox backend added (
src/openenv/core/harness/sandbox/docker_backend.py). Runs sandboxes viadocker run. Suitable for local development and CI.HF Sandbox backend added (
src/openenv/core/harness/sandbox/hf_backend.py). Wraps hf-sandbox.CLIAgentSpec+CLIAgentDriver(src/openenv/core/harness/agents/). Per-agent knowledge is expressed as a declarative dataclass — install script, files to upload, env vars, MCP config format, artifacts to collect — with three small callables for the parts that can't be data (command builder, MCP config serializer, stdout event parser). The driver reads these fields and runs the lifecycle; it has no per-agent branches. Same pattern as verifiers (Prime Intellect). Adding a new agent is writing a spec, not touching the driver.OpenCode refactored onto
CLIAgentSpec.envs/coding_agent_env/harness.pyis now a thin wrapper aroundCLIAgentSessionFactory.Pi adapter added (
src/openenv/core/harness/agents/pi.py). Runs Pi in print mode via--no-session -p @/instruction.txt, matching the verifiers Pi harness approach.opencode_env→coding_agent_env. The environment is for coding tasks; the harness is a config parameter.run_rolloutnow takesagent: str = "opencode"and supports"opencode"and"pi". Gradio UI gets an agent selector. All imports, docs,pyproject.toml,openenv.yaml, andDockerfileupdated.Tests
tests/core/test_cli_agent_driver.py— driver lifecycle via_FakeSandboxtests/core/test_harness_adapters.py— per-adapter: MCP config, CLI flags, event parsing, registrytests/core/test_docker_sandbox_backend.py— Docker backendtests/core/test_hf_sandbox_backend.py— HF backendtests/envs/test_coding_agent_env.py— end-to-end env tests (renamed, extended)References
CLIHarness— declarative harness pattern this followsremote_inf_engine.py— proxy-based logprob interception patternType of Change
Alignment Checklist
Before submitting, verify:
.claude/docs/PRINCIPLES.mdand this PR aligns with our principles.claude/docs/INVARIANTS.mdand no invariants are violated/pre-submit-pr(orbash .claude/hooks/lint.shand tests) and addressed all issuesRFC Status
Claude Code Review
n/a
cc: @burtenshaw