refactor agent sandbox infrastructure with new agent backends + rework logprobs capture by rycerzes · Pull Request #694 · huggingface/OpenEnv

rycerzes · 2026-05-15T09:08:05Z

Summary

The coding environment was hardwired to OpenCode. This PR pulls the sandbox and harness infrastructure into src/openenv/core/harness/ so it's shared, adds a declarative CLIAgentSpec + CLIAgentDriver that any agent adapter can plug into, ships OpenCode and Pi as the first two adapters, adds Docker and HF Sandbox backends alongside the existing E2B one, and renames opencode_env → coding_agent_env.

Changes

Sandbox protocols moved to core (src/openenv/core/harness/sandbox/). SandboxBackend, SandboxHandle, BgJob were locked inside opencode_env. They're now the shared seam. E2BSandboxBackend is unchanged and already works with CubeSandbox (E2B-compatible self-hosted MicroVM sandbox) by pointing E2B_API_URL at it.

Docker sandbox backend added (src/openenv/core/harness/sandbox/docker_backend.py). Runs sandboxes via docker run. Suitable for local development and CI.

HF Sandbox backend added (src/openenv/core/harness/sandbox/hf_backend.py). Wraps hf-sandbox.

CLIAgentSpec + CLIAgentDriver (src/openenv/core/harness/agents/). Per-agent knowledge is expressed as a declarative dataclass — install script, files to upload, env vars, MCP config format, artifacts to collect — with three small callables for the parts that can't be data (command builder, MCP config serializer, stdout event parser). The driver reads these fields and runs the lifecycle; it has no per-agent branches. Same pattern as verifiers (Prime Intellect). Adding a new agent is writing a spec, not touching the driver.

OpenCode refactored onto CLIAgentSpec. envs/coding_agent_env/harness.py is now a thin wrapper around CLIAgentSessionFactory.

Pi adapter added (src/openenv/core/harness/agents/pi.py). Runs Pi in print mode via --no-session -p @/instruction.txt, matching the verifiers Pi harness approach.

opencode_env → coding_agent_env. The environment is for coding tasks; the harness is a config parameter. run_rollout now takes agent: str = "opencode" and supports "opencode" and "pi". Gradio UI gets an agent selector. All imports, docs, pyproject.toml, openenv.yaml, and Dockerfile updated.

Tests

tests/core/test_cli_agent_driver.py — driver lifecycle via _FakeSandbox
tests/core/test_harness_adapters.py — per-adapter: MCP config, CLI flags, event parsing, registry
tests/core/test_docker_sandbox_backend.py — Docker backend
tests/core/test_hf_sandbox_backend.py — HF backend
tests/envs/test_coding_agent_env.py — end-to-end env tests (renamed, extended)

References

verifiers CLIHarness — declarative harness pattern this follows
CubeSandbox — E2B-compatible self-hosted sandbox
HF Sandbox
Pi
Firehorse — studied for tool conflict and MCP failure patterns, not adopted (eval-only, no sandbox isolation, no training path)
AReaL remote_inf_engine.py — proxy-based logprob interception pattern

Type of Change

Alignment Checklist

Before submitting, verify:

I have read .claude/docs/PRINCIPLES.md and this PR aligns with our principles
I have checked .claude/docs/INVARIANTS.md and no invariants are violated
I have run /pre-submit-pr (or bash .claude/hooks/lint.sh and tests) and addressed all issues

RFC Status

Not required (bug fix, docs, minor refactoring)
RFC exists: #___
RFC needed (will create before merge)

Claude Code Review

n/a

cc: @burtenshaw

- related tests

- fix agent handling

- tests

greptile-apps · 2026-05-15T09:23:37Z

Greptile Summary

This PR refactors the hardwired OpenCode environment into a shared CLIAgentDriver / CLIAgentSpec infrastructure in src/openenv/core/harness/, renames opencode_env → coding_agent_env, and ships Docker and HF Sandbox backends alongside the existing E2B one. It is a substantial architectural refactor with two new agent adapters (OpenCode and Pi) and a declarative spec pattern that makes adding future agents trivial.

Core driver (cli_driver.py): introduces CLIAgentDriver / CLIAgentSessionFactory as the shared lifecycle engine; the _exec_with_retry helper has a correctness bug (early break on any non-empty stderr) that defeats the advertised 3-attempt install retry.
Server layer (coding_environment.py): lazy-imports via from coding_agent_env import ... which triggers __init__.py → client.py, violating the INVARIANTS client-server separation boundary; setup commands also run after the agent has already started, creating a correctness race for tasks that depend on setup completing first.
New backends (docker_backend.py, hf_backend.py): correctly implement the SandboxBackend protocol but are not wired into _build_session_factory, so they are unreachable via the MCP run_rollout tool.

Confidence Score: 2/5

Not safe to merge as-is — the server layer imports client code through the package init, violating a hard architectural boundary, and there is a real correctness race between setup commands and the running agent.

The server indirectly pulls in client.py via coding_agent_env/init.py every time CodingAgentEnvironment initializes, crossing a boundary the repo treats as non-negotiable. The _exec_with_retry bug means install retries never fire for the most common failure mode (non-empty stderr), so cold sandbox installs will fail non-deterministically in CI. The setup/agent race is explicitly noted as a known compromise but affects any task that writes input files or installs dependencies before the agent's first model call.

envs/coding_agent_env/server/coding_environment.py (client-server violation + setup race), envs/coding_agent_env/init.py (root init imports client), src/openenv/core/harness/agents/cli_driver.py (retry logic bug)

Important Files Changed

Filename	Overview
src/openenv/core/harness/agents/cli_driver.py	Core CLI driver for all agent harnesses; contains a retry bug in `_exec_with_retry` where any non-empty stderr immediately aborts retries, defeating 3-attempt install guarantees.
envs/coding_agent_env/server/coding_environment.py	New MCP environment server — has two issues: server lazy-imports `coding_agent_env.__init__` which pulls in `client.py` (violates client-server separation); setup commands race with the already-running agent.
src/openenv/core/harness/sandbox/docker_backend.py	New Docker sandbox backend; protocol-correct and well-structured, but not wired into the server layer so unreachable via the MCP `run_rollout` tool.
src/openenv/core/harness/sandbox/hf_backend.py	New HF Sandbox backend; `HFBgJob.wait()` uses a blocking polling loop instead of a background thread — acceptable but the calling thread is blocked for the full agent duration with no interrupt path.
envs/coding_agent_env/init.py	Root package init unconditionally imports `from .client import CodingAgentEnv`, causing server code that does `from coding_agent_env import ...` to indirectly pull in client code — violates INVARIANTS client-server separation.
src/openenv/core/harness/agents/base.py	Declarative `CLIAgentSpec` dataclass and supporting protocols — clean, well-documented, no issues.
src/openenv/core/harness/agents/opencode.py	OpenCode adapter expressed as a declarative spec; `_build_opencode_mcp_config` intentionally returns empty string (config written via `spec.files`), which is correct.
src/openenv/core/harness/agents/pi.py	Pi adapter — maps `api_key` to both `HF_TOKEN` and `OPENAI_API_KEY`, which is correct for Pi's multi-provider support.
src/openenv/core/harness/sandbox/interception.py	Transparent forwarding proxy for logprob capture; streaming error-path handling (non-2xx returning JSON instead of empty SSE) is a correctness improvement over the prior version.
envs/coding_agent_env/harness.py	Thin wrapper around `CLIAgentDriver`; calls `_driver._start_proxy()` directly (private API) which is a coupling smell but functionally correct.

Sequence Diagram

sequenceDiagram
    participant Orch as Orchestrator
    participant Env as CodingAgentEnvironment
    participant Factory as SessionFactory
    participant Driver as CLIAgentDriver
    participant SB as SandboxBackend (E2B/Docker/HF)
    participant Proxy as InterceptionProxy

    Orch->>Env: run_rollout(agent, instruction, setup, verify)
    Env->>Factory: create(task)
    Factory->>Driver: create_session(task, config)
    Driver->>SB: create(timeout_s)
    SB-->>Driver: SandboxHandle
    Driver->>Driver: _bootstrap_sandbox
    alt "mode == transparent_proxy"
        Driver->>SB: start_bg(interception.py)
        SB-->>Driver: proxy BgJob
    end
    Driver->>SB: start_bg(agent CLI)
    SB-->>Driver: agent BgJob
    Driver-->>Factory: CLIAgentSession
    Factory-->>Env: session (agent already running)
    Note over Env: setup commands run HERE — races with agent
    loop setup commands
        Env->>SB: exec(cmd)
    end
    Env->>Factory: wait_for_completion()
    loop verify commands
        Env->>SB: exec(cmd)
    end
    Env->>SB: kill()
    Env-->>Orch: RolloutResult (JSON)

Comments Outside Diff (2)

envs/coding_agent_env/server/coding_environment.py, line 693-711 (link)

Setup commands race with running agent

The agent is launched inside factory.create() via session.start_agent() before this loop runs. For tasks that require setup to complete before the agent's first tool call (e.g., installing a dependency or writing an input file the agent reads on start), this is a silent correctness failure: the agent may attempt to import pandas while the pip install pandas setup command is still running. The inline comment acknowledges the race but frames it as "fine for typical work" — that assumption breaks for any task where the agent reads setup-created files or depends on pre-installed packages.

Prompt To Fix With AI

This is a comment left during a code review.
Path: envs/coding_agent_env/server/coding_environment.py
Line: 693-711

Comment:
**Setup commands race with running agent**

The agent is launched inside `factory.create()` via `session.start_agent()` before this loop runs. For tasks that require setup to complete before the agent's first tool call (e.g., installing a dependency or writing an input file the agent reads on start), this is a silent correctness failure: the agent may attempt to import `pandas` while the `pip install pandas` setup command is still running. The inline comment acknowledges the race but frames it as "fine for typical work" — that assumption breaks for any task where the agent reads setup-created files or depends on pre-installed packages.

How can I resolve this? If you propose a fix, please make it concise.

envs/coding_agent_env/server/coding_environment.py, line 831-854 (link)

ALIGNMENT FLAG: New Docker/HF sandbox backends are unreachable via the MCP tool

_build_session_factory always creates E2BSandboxBackend. The DockerSandboxBackend and HFSandboxBackend added in this PR to src/openenv/core/harness/sandbox/ are only usable programmatically via the harness primitive, not through the deployed run_rollout MCP tool. There is no backend parameter in the tool signature to select them. The PR description bills Docker as "suitable for CI" but CI agents can only reach the environment via MCP, so Docker is effectively unused in the primary run path.

Principle at stake: "Be hands-on: Provide ready-to-use implementations, not just specs" (PRINCIPLES.md)
The concern: New backends shipped to core are disconnected from the server layer with no path to adoption
Suggested reviewer: @darktex

Context Used: .claude/docs/PRINCIPLES.md (source)

Prompt To Fix With AI

This is a comment left during a code review.
Path: envs/coding_agent_env/server/coding_environment.py
Line: 831-854

Comment:
**ALIGNMENT FLAG: New Docker/HF sandbox backends are unreachable via the MCP tool**

`_build_session_factory` always creates `E2BSandboxBackend`. The `DockerSandboxBackend` and `HFSandboxBackend` added in this PR to `src/openenv/core/harness/sandbox/` are only usable programmatically via the harness primitive, not through the deployed `run_rollout` MCP tool. There is no `backend` parameter in the tool signature to select them. The PR description bills Docker as "suitable for CI" but CI agents can only reach the environment via MCP, so Docker is effectively unused in the primary run path.

- **Principle at stake**: "Be hands-on: Provide ready-to-use implementations, not just specs" (PRINCIPLES.md)
- **The concern**: New backends shipped to core are disconnected from the server layer with no path to adoption
- **Suggested reviewer**: `@darktex`

**Context Used:** .claude/docs/PRINCIPLES.md ([source](https://app.greptile.com/review/custom-context?memory=67b96369-b31e-4918-84c9-a0a4e3c8aa97))

How can I resolve this? If you propose a fix, please make it concise.

Prompt To Fix All With AI

Fix the following 4 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 4
src/openenv/core/harness/agents/cli_driver.py:446-465
**`_exec_with_retry` breaks on first non-empty stderr, defeating retries**

The `if last_stderr.strip(): break` exits the retry loop on the very first failure that produces any stderr output — which is exactly the case for transient network errors (e.g., `curl: (6) Could not resolve host` or a flaky apt mirror). The install stage advertises 3-attempt retries with exponential backoff, but in practice retries only fire when a command exits non-zero with _empty_ stderr or throws a Python exception, which covers almost no real-world install failures.

For example, `curl -fsSL https://opencode.ai/install | bash` failing with a DNS error or a 503 will immediately raise without retry, rather than being retried after 3s and 6s as intended.

```suggestion
                last_stdout = r.stdout or ""
                last_stderr = r.stderr or ""
                last_exit = r.exit_code
```

### Issue 2 of 4
envs/coding_agent_env/server/coding_environment.py:693-711
**Setup commands race with running agent**

The agent is launched inside `factory.create()` via `session.start_agent()` before this loop runs. For tasks that require setup to complete before the agent's first tool call (e.g., installing a dependency or writing an input file the agent reads on start), this is a silent correctness failure: the agent may attempt to import `pandas` while the `pip install pandas` setup command is still running. The inline comment acknowledges the race but frames it as "fine for typical work" — that assumption breaks for any task where the agent reads setup-created files or depends on pre-installed packages.

### Issue 3 of 4
envs/coding_agent_env/server/coding_environment.py:425-432
**ALIGNMENT FLAG: Server imports client code via `coding_agent_env.__init__`**

The lazy import `from coding_agent_env import E2BSandboxBackend, CodingAgentConfig, CodingAgentSessionFactory, CodingAgentTask` triggers `coding_agent_env/__init__.py` which unconditionally executes `from .client import CodingAgentEnv`. This means `server/coding_environment.py` (server code) indirectly imports `client.py`, directly violating INVARIANTS.md §2 "Client-server separation": "Server code must never import client code."

The fix is to import from the specific sub-modules (`from coding_agent_env.config import CodingAgentConfig`, etc.) or split `__init__.py` into server-side and client-side exports.

- **Invariant at risk**: Client-server separation (INVARIANTS.md §2)
- **Suggested reviewer**: `@darktex`

### Issue 4 of 4
envs/coding_agent_env/server/coding_environment.py:831-854
**ALIGNMENT FLAG: New Docker/HF sandbox backends are unreachable via the MCP tool**

`_build_session_factory` always creates `E2BSandboxBackend`. The `DockerSandboxBackend` and `HFSandboxBackend` added in this PR to `src/openenv/core/harness/sandbox/` are only usable programmatically via the harness primitive, not through the deployed `run_rollout` MCP tool. There is no `backend` parameter in the tool signature to select them. The PR description bills Docker as "suitable for CI" but CI agents can only reach the environment via MCP, so Docker is effectively unused in the primary run path.

- **Principle at stake**: "Be hands-on: Provide ready-to-use implementations, not just specs" (PRINCIPLES.md)
- **The concern**: New backends shipped to core are disconnected from the server layer with no path to adoption
- **Suggested reviewer**: `@darktex`

_{Reviews (1): Last reviewed commit: "feat: hf sandbox backend" | Re-trigger Greptile}

…roxy the transparent proxy was a passive forwarder that captured logprobs by injecting logprobs=true into upstream requests. It is replaced by the interception_gate mode where the trainer owns the forward pass entirely — no proxy needed inside the sandbox.

…eneration InterceptionServer (aiohttp) runs on the trainer host. Each rollout registers a queue. The agent's OPENAI_BASE_URL points at `{base_url}/rollout/{id}/v1`. When the agent makes an LLM call it blocks at the server. The training loop dequeues the request, calls vLLM with logprobs=True and return_token_ids=True, and delivers the response back via deliver_response().

rycerzes

Reworked logprob capture

Removed: transparent_proxy mode and sandbox/interception.py — a FastAPI proxy that ran inside the sandbox and injected logprobs=true into upstream LLM calls.
Added: interception_gate mode with InterceptionServer running on the trainer host. The agent's OPENAI_BASE_URL points to {base_url}/rollout/{id}/v1. LLM calls block at the server; the training loop dequeues via queue.get(), runs its own vLLM forward pass, and returns the response via deliver_response(). HMAC-signed request IDs prevent cross-rollout leakage.

Driver modes are now black_box (eval/demos) and interception_gate (RL training). CLIAgentSession gains next_request() and deliver() for the training loop. the trainer runs vLLM, the agent points OPENAI_BASE_URL at it, and the trainer owns the forward pass and logprob capture entirely.

Same pattern as AReaL, verifiers, rLLM, AWS AgentCore RL Toolkit, Atropos, OpenRLHF, and OpenClaw-RL.

Network reachability is caller-owned.

Darktex

Alignment Review — PR #694

Reviewed by 4 parallel alignment-reviewer agents covering: core agents, core sandbox, environment changes, and tests.

Overall verdict: request_changes

Tier 1: Bugs & Mechanical Issues (16 findings)

Critical / High

#	File	Issue
1	`interception_server.py`	Deprecated API: `asyncio.get_event_loop().create_future()` → use `asyncio.get_running_loop().create_future()` (deprecated since 3.10, may error in future versions)
2	`interception_server.py`	Race condition: `self.intercepts` and `self.active_rollouts` are plain dicts modified from concurrent coroutines without `self._lock`. Concurrent `register_rollout` + `_handle_chat_completions` + `unregister_rollout` can cause `RuntimeError: dictionary changed size during iteration`. Protect with `asyncio.Lock`.
3	`interception_server.py`	Memory leak: Completed intercepts in `self.intercepts` are never removed after `deliver_response()`. Over long training runs every completed LLM call leaves a dead entry. Add `del self.intercepts[request_id]` after future resolves.
4	`cli_driver.py`	Broken import: `from openenv.core.harness import (...)` references `openenv.core.harness` (singular) but the existing package is `openenv.core.harnesses` (plural). This will raise `ModuleNotFoundError` at runtime.
5	`coding_environment.py`	NoneType crash: `_build_session_factory` calls `self._E2BSandboxBackend(**kwargs)` which is `None` when `e2b` is not installed. The old code raised a sentinel `_RequiresE2B` class. Add a guard: `if self._E2BSandboxBackend is None: raise RuntimeError(...)`
6	`coding_environment.py`	Type mismatch: `disable_thinking: Optional[bool]` flows from the MCP tool through `_run_rollout_impl` into `_build_agent_config(disable_thinking: bool)`. When `None` is passed, behavior may be wrong. Either pass `disable_thinking_resolved` or change the parameter type.
7	`cli_driver.py`	Fragile private access: `getattr(self._agent_bg_job, "_done", None)` accesses a private `BgJob` implementation detail. If the attribute is renamed, `next_request()` silently spins until timeout. `BgJob` should expose a public `is_done()` method.

Medium

#	File	Issue
8	`opencode.py`	Shell injection risk: `_build_opencode_command` interpolates `{home}`/`{workdir}` into shell commands without `shlex.quote`. If `sandbox_home` contains spaces or metacharacters, the command breaks or becomes injectable.
9	`cli_driver.py`	Dead-code timeout logic: The two-step timeout resolution (`budget = spec.default_timeout_s` then override with `config.agent_timeout_s`) makes the first assignment dead when config has `agent_timeout_s`. Simplify to a single ternary chain.
10	`docker_backend.py`	Invalid construction: `DockerBgJob(poll_thread=None)` with `# type: ignore`, then immediately mutated via `job._poll_thread = poll_thread`. Object is in invalid state between construction and mutation.
11	`docker_backend.py`	Dead code: `DockerBgJob._error` is checked in `wait()` but never set anywhere. The error-raise branch is unreachable. Either wire it to the poll thread or remove it.
12	`coding_environment.py`	Non-idempotent replay: Setup commands are run twice — once via `setup_shell`, then replayed individually "for observability". Common setup commands (`pip install`, `mkdir`) are not idempotent and may fail on the second run.
13	`harness.py`	Leaky abstraction: `CodingAgentSessionFactory._bootstrap_sandbox` calls `CLIAgentDriver` private methods (`_wait_for_sandbox_ready`, `_agent_already_installed`, `_exec_with_retry`). If driver internals change, this breaks silently. Expose a public `bootstrap()` method instead.

Low

#	File	Issue
14	`docker_backend.py`	UUID collision: `uuid4().hex[:8]` (2^32 values) for marker files. Use full `hex` (2^128) to avoid collisions under high-concurrency training.
15	`test_hf_sandbox_backend.py`	Test isolation: `_FakeSandboxAPI.calls` is a class-level mutable list. Only one test clears it — others may see stale entries if run order changes. Move to instance variable or add `autouse` fixture.
16	`test_docker_sandbox_backend.py`	Tautological assertion: `assert isinstance(sandbox, SandboxBackend) or hasattr(sandbox, "exec")` — the `or` arm defeats the purpose. Remove the fallback.

Tier 2: Alignment Flags (8 findings)

Blocking Alignment Questions

FLAG 1 — Deleted security tests with no replacement ⚠️

Invariant: "No credential exposure" (INVARIANTS.md §Security §3)
The old test_opencode_env.py had two adversarial tests: test_start_proxy_keeps_upstream_key_out_of_command (shell injection with sk-test '$(leak)) and test_interception_cli_reads_upstream_key_from_env. These verified API keys flow through env vars, not CLI argv. The new test suite only checks bg_envs["API_KEY"] == "sk-test-key" with clean inputs — no adversarial shell metacharacter testing.
Action needed: Restore equivalent adversarial security tests in test_cli_agent_driver.py.

FLAG 2 — transparent_proxy mode removed / logprob capture moved outside env boundary

Invariant: "Rewards inside environment" (RFC 002, INVARIANTS.md §Architectural §3)
_collect_proxy_turns is now a stub returning []. Logprob capture moves to the training loop via interception_gate. This is a significant architectural boundary shift. The mode="transparent_proxy" string is still accepted by the API but silently produces empty proxy_turns — a silent contract break.
Action needed: RFC documenting why logprob capture belongs outside the env boundary, or deprecation warning on transparent_proxy mode.

FLAG 3 — New harness infrastructure does not integrate with RFC 005

Invariant: "One canonical way to build environments" (PRINCIPLES.md)
RFC 005 defines HarnessAdapter (ABC), HarnessConfig (Pydantic), HarnessEnvironment. This PR introduces a parallel CLIAgentSpec/CLIAgentDriver/ResourceSession stack with no connection to RFC 005. Two diverging patterns for the same domain.
Action needed: Either extend RFC 005 abstractions or supersede with a new RFC.

Non-Blocking Alignment Concerns

FLAG 4 — InterceptionServer binds 0.0.0.0 (publicly exposed trainer host). Should default to 127.0.0.1.

FLAG 5 — --dangerously-skip-permissions hardcoded in OPENCODE_SPEC.base_command. Should be opt-in, not default.

FLAG 6 — docker_args passthrough allows callers to trivially bypass container isolation (--privileged, -v /:/mnt). Consider an allowlist.

FLAG 7 — HFSandboxCreateError may expose HF tokens via SDK exception messages propagated through f"... {last_error}".

FLAG 8 — No tests verify the dual API boundary invariant (agent cannot reach reset()/step()) in the new infrastructure. 5000+ lines of new code with no negative test for the most critical invariant.

What's Good

The spec/driver declarative pattern is architecturally elegant — adding a new agent is writing a dataclass, not modifying the driver.
The rename (opencode_env → coding_agent_env) is clean and consistent across imports, docs, pyproject.toml, Dockerfile, and openenv.yaml.
The sandbox protocol abstraction (SandboxBackend/SandboxHandle/BgJob) is well-designed and provides a clean seam for swapping backends.
Test coverage is extensive for the new driver lifecycle (1006 lines for cli_driver alone).

Recommended Next Steps

Fix the 7 critical/high bugs (broken import, InterceptionServer race + leak, NoneType crash, type mismatch, deprecated API, private attribute access)
Restore adversarial security tests for credential-in-argv
Add a deprecation warning or validation for mode="transparent_proxy"
Write an RFC (or RFC addendum) for the logprob ownership shift and the CLIAgentSpec/CLIAgentDriver pattern vs RFC 005
Add a negative test asserting agents cannot access reset()/step() through the new infrastructure

Darktex

Note: This is an automated review by Claude Code, not a human review.

Alignment Review — PR #694

Tier 1: Bugs & Code Quality

Bug 1 — Deprecated asyncio.get_event_loop() in InterceptionServer
src/openenv/core/harness/agents/interception_server.py uses asyncio.get_event_loop().create_future() inside an async def. This is deprecated in Python 3.10+ and raises RuntimeError when no event loop is running in the current thread. Since the handler is a coroutine, use asyncio.get_running_loop().create_future() instead.

Bug 2 — Double bootstrap: setup commands run twice
envs/coding_agent_env/server/coding_environment.py (_run_rollout_impl): Setup commands are passed to CodingAgentTask (which runs them in _bootstrap_sandbox), then the same commands are re-run individually "for observability." Non-idempotent commands (e.g. pip install --upgrade, destructive rm -rf) will break. Either drop the re-run loop or don't pass setup_shell to the task — the two-pass design is architecturally incoherent.

Bug 3 — CodingAgentSessionFactory.create bypasses its own CLIAgentDriver
envs/coding_agent_env/harness.py: The factory creates a CLIAgentDriver in __init__, but create() never calls self._driver.create_session(). Instead it manually reproduces the bootstrap by calling private driver methods (_wait_for_sandbox_ready, _agent_already_installed, _exec_with_retry). The _driver field is dead weight. Either delegate to the driver or remove it.

Bug 4 — _build_session_factory crashes when E2B is not installed
envs/coding_agent_env/server/coding_environment.py: self._E2BSandboxBackend is set to None when e2b is not installed. Calling None(...) raises a cryptic TypeError. Add an explicit guard: if self._E2BSandboxBackend is None: raise RuntimeError("E2B not installed").

Bug 5 — Dropped security tests without replacement
tests/envs/test_coding_agent_env.py removes test_start_proxy_keeps_upstream_key_out_of_command and test_interception_cli_reads_upstream_key_from_env with no equivalent in the new test suite. These verified that API keys never appear in process argv — a real security property that should be re-tested for InterceptionServer.

Resource leak — InterceptionServer._stream_response leaks intercept entries on connection reset
When ConnectionResetError is caught, the intercept dict entry is not removed. On retries, new entries accumulate. Consider cleaning up in the except block.

Tier 2: Alignment Flags (need human sign-off)

FLAG 1 — Logprob capture removed from environment without RFC
The old design captured per-token logprobs inside the environment and returned them as proxy_turns in RolloutResult. The new design returns proxy_turns = [] always and moves logprob capture to the training loop via interception_gate. This changes a public API contract — callers reading result.proxy_turns will silently get nothing. The RolloutResult model still declares the field. This is a significant behavioral change that should go through an RFC.

FLAG 2 — InterceptionServer introduces a third API boundary
INVARIANTS.md states OpenEnv exposes exactly two APIs: WebSocket (Gym-like) and MCP (for agents). The InterceptionServer is a third HTTP server (OpenAI-compatible proxy) that sits between the agent and the LLM. Whether this is a violation or a legitimate extension of training infrastructure deserves explicit discussion. The invariants doc should be updated either way.

FLAG 3 — Two parallel harness namespaces in core
There is already src/openenv/core/harnesses/ (plural). This PR adds src/openenv/core/harness/ (singular) with a parallel structure. Developers onboarding will see two harness abstractions with no guidance on which to use, violating the "one canonical way" principle.

Verdict

Well-motivated refactor with a clean CLIAgentSpec design and good test coverage for the new driver. However, three mechanical bugs need fixing before merge (deprecated asyncio API, double bootstrap, bypassed driver abstraction), and three alignment flags need human sign-off (removed logprob contract, third API boundary, duplicate harness namespaces).

cc: @Darktex for alignment flags

Automated review by Claude Code | Learn more

Darktex

Note: This is an automated review by Claude Code, not a human review.

Alignment Review — PR #694

Reviewed by 4 parallel alignment-reviewer agents covering: core agents, core sandbox, environment changes, and tests.

Overall verdict: request_changes

Tier 1: Bugs & Mechanical Issues (16 findings)

Critical / High

#	File	Issue
1	`interception_server.py`	Deprecated API: `asyncio.get_event_loop().create_future()` → use `asyncio.get_running_loop().create_future()` (deprecated since 3.10, may error in future versions)
2	`interception_server.py`	Race condition: `self.intercepts` and `self.active_rollouts` are plain dicts modified from concurrent coroutines without `self._lock`. Concurrent `register_rollout` + `_handle_chat_completions` + `unregister_rollout` can cause `RuntimeError: dictionary changed size during iteration`. Protect with `asyncio.Lock`.
3	`interception_server.py`	Memory leak: Completed intercepts in `self.intercepts` are never removed after `deliver_response()`. Over long training runs every completed LLM call leaves a dead entry. Add `del self.intercepts[request_id]` after future resolves.
4	`cli_driver.py`	KeyError race: `next_request()` does bare `server.intercepts[request_id]` dict access after `queue.get()`. If `close()` calls `unregister_rollout()` from another thread between the queue read and the dict access, this raises an unhandled `KeyError`. Use `.get()` and handle `None`.
5	`coding_environment.py`	NoneType crash: `_build_session_factory` calls `self._E2BSandboxBackend(**kwargs)` which is `None` when `e2b` is not installed. The old code raised a sentinel `_RequiresE2B` class. Add a guard: `if self._E2BSandboxBackend is None: raise RuntimeError(...)`
6	`coding_environment.py`	Type mismatch: `disable_thinking: Optional[bool]` flows from the MCP tool through `_run_rollout_impl` into `_build_agent_config(disable_thinking: bool)`. When `None` is passed, behavior may be wrong. Pass `disable_thinking_resolved` (the guaranteed-`bool` value) instead.
7	`opencode.py`	Shell injection: `_build_opencode_command` uses `"$(cat {instruction_file})"` inside a double-quoted shell string. If the task instruction file contains shell metacharacters (`$(...)`, backticks, `"`), they will be interpreted by the shell. Pass instruction via a method that doesn't expand shell metacharacters.

Medium

#	File	Issue
8	`opencode.py` + `pi.py`	Unquoted paths: Both `_build_opencode_command` and `_build_command` interpolate `home`/`workdir`/`instruction_file`/`log_file` into shell strings without `shlex.quote()`. Paths with spaces or metacharacters will break.
9	`docker_backend.py`	Invalid construction: `DockerBgJob(poll_thread=None)` with `# type: ignore`, then immediately mutated via `job._poll_thread = poll_thread`. Object is in invalid state between construction and mutation. Make `poll_thread` optional or use a factory method.
10	`docker_backend.py`	Dead code: `DockerBgJob._error` is checked in `wait()` but never set anywhere. The error-raise branch is unreachable. Either wire it to the poll thread or remove it.
11	`docker_backend.py`	Infinite loop: `_poll_bg_job` silences all exceptions with bare `except Exception: pass`. If the container is externally killed, the daemon thread loops forever at 0.5s intervals issuing `docker exec` calls against a gone container. Add a failure counter or container existence check.
12	`coding_environment.py`	Double bootstrap: Setup commands run twice — once atomically inside `_bootstrap_sandbox` as `setup_shell`, then replayed individually post-`factory.create()` "for observability." Non-idempotent commands (`pip install git+...`, `git clone`, `createdb`) will fail on the second run.
13	`harness.py`	Leaky abstraction: `CodingAgentSessionFactory._bootstrap_sandbox` calls `CLIAgentDriver` private methods (`_wait_for_sandbox_ready`, `_agent_already_installed`, `_exec_with_retry`). Expose a public `bootstrap()` method instead.

Low

#	File	Issue
14	`docker_backend.py`	UUID collision: `uuid4().hex[:8]` (2^32 values) for marker files. Use full `hex` (2^128) to avoid collisions under high-concurrency training.
15	`docker_backend.py` + `hf_backend.py`	Duplicated code: Identical `_shell_quote` implementation in both files. Move to `base.py` or a `_util.py` module.
16	`interception_server.py`	Info leak: `str(exc)` from trainer-side exceptions sent to sandboxed agent in 500 responses. Replace with generic message; log detail server-side.

Tier 2: Alignment Flags (8 findings)

Blocking Alignment Questions

FLAG 1 — Deleted security tests with no replacement ⚠️

Invariant: "No credential exposure" (INVARIANTS.md §Security §3)
The old test_opencode_env.py had two adversarial tests: test_start_proxy_keeps_upstream_key_out_of_command (shell injection with sk-test '$(leak)) and test_interception_cli_reads_upstream_key_from_env. These verified API keys flow through env vars, not CLI argv. The new test suite only checks bg_envs["API_KEY"] == "sk-test-key" with clean inputs — no adversarial shell metacharacter testing.
Action needed: Restore equivalent adversarial security tests in test_cli_agent_driver.py.

FLAG 2 — transparent_proxy mode removed / logprob capture moved outside env boundary

Invariant: "Rewards inside environment" (RFC 002, INVARIANTS.md §Architectural §3)
_collect_proxy_turns is now a stub returning []. Logprob capture moves to the training loop via interception_gate. This is a significant architectural boundary shift. The mode="transparent_proxy" string is still accepted by the API but silently produces empty proxy_turns — a silent contract break.
Action needed: RFC documenting why logprob capture belongs outside the env boundary, or deprecation warning on transparent_proxy mode.

FLAG 3 — New harness infrastructure does not integrate with RFC 005

Invariant: "One canonical way to build environments" (PRINCIPLES.md)
RFC 005 defines HarnessAdapter (ABC), HarnessConfig (Pydantic), HarnessEnvironment. This PR introduces a parallel CLIAgentSpec/CLIAgentDriver/ResourceSession stack with no connection to RFC 005. Two diverging patterns for the same domain.
Action needed: Either extend RFC 005 abstractions or supersede with a new RFC.

FLAG 4 — InterceptionServer has zero dedicated tests

Invariant: No credential exposure + Dual API boundary
324 new lines of security-critical async HTTP code (HMAC auth, per-rollout request queuing, streaming SSE) with exactly one test that only checks a constructor guard. No test of: 401 rejection, rollout-not-found 404, deliver_response paths, concurrent rollout isolation, or unregister_rollout cancellation.
Action needed: Add dedicated test suite for InterceptionServer.

Non-Blocking Alignment Concerns

FLAG 5 — InterceptionServer binds 0.0.0.0 (publicly exposed trainer host). Should default to 127.0.0.1.

FLAG 6 — --dangerously-skip-permissions hardcoded in OPENCODE_SPEC.base_command. Should be opt-in, not default.

FLAG 7 — docker_args passthrough allows callers to trivially bypass container isolation (--privileged, -v /:/mnt). Consider an allowlist.

FLAG 8 — No tests verify the dual API boundary invariant (agent cannot reach reset()/step()) in the new infrastructure. 5000+ lines of new code with no negative test for the most critical invariant.

What's Good

The spec/driver declarative pattern is architecturally elegant — adding a new agent is writing a dataclass, not modifying the driver.
The rename (opencode_env → coding_agent_env) is clean and consistent across imports, docs, pyproject.toml, Dockerfile, and openenv.yaml.
The sandbox protocol abstraction (SandboxBackend/SandboxHandle/BgJob) is well-designed and provides a clean seam for swapping backends.
Test coverage is extensive for the new driver lifecycle (1006 lines for cli_driver alone).

Recommended Next Steps

Fix the 7 critical/high bugs (broken import, InterceptionServer race + leak, NoneType crash, type mismatch, deprecated API, shell injection)
Restore adversarial security tests for credential-in-argv
Add a deprecation warning or validation for mode="transparent_proxy"
Write an RFC (or RFC addendum) for the logprob ownership shift and the CLIAgentSpec/CLIAgentDriver pattern vs RFC 005
Add a negative test asserting agents cannot access reset()/step() through the new infrastructure

Automated review by Claude Code | Learn more

Darktex

Alignment Review — PR #694

Tier 1: Bugs / Security / Correctness (6 findings)

1. interception_server.py — asyncio.Queue created from wrong event loop
register_rollout() is a sync method that creates an asyncio.Queue(). If called from a training-loop thread that doesn't share the server's event loop (common pattern: asyncio.run() in a background thread), the queue binds to the wrong loop → RuntimeError: Task attached to a different loop at runtime.

2. docker_backend.py start_bg — fragile shell quoting for marker path
The marker path is concatenated into cmd before shell_quote():

wrapped = f"bash -c {shell_quote(cmd + f'; echo $? > {marker}')} &\necho $!"

If cmd contains trailing single-quotes or && sequences, the concatenation produces invalid shell. Compare with hf_backend.py which correctly quotes the marker separately.

3. coding_environment.py _run_rollout_impl — setup_results fabricates exit_code=0
All setup commands are unconditionally recorded as exit_code=0 / "executed during bootstrap", even though actual per-command exit codes are unavailable. Callers inspecting setup_results[i].exit_code to diagnose failures will always see success. Should be an empty list rather than falsely populated.

4. interception_server.py — self.secret never validated non-empty
hmac.compare_digest("", "") returns True. While secrets.token_urlsafe(32) normally produces non-empty values, if secret="" is explicitly passed, all requests are authorized. Add assert self.secret or raise in __init__.

5. harness.py CodingAgentSessionFactory.create — interception_gate silently broken ⚠️
The factory calls session.start_agent() instead of routing through CLIAgentDriver.create_session(). The driver's interception wiring path (which sets OPENAI_BASE_URL to the interception server URL and OPENAI_API_KEY to server.secret) is completely bypassed. Result: in interception_gate mode, sessions always use the original config.base_url instead of the interception server URL.

6. harness.py CodingAgentSession.start_agent — bypasses spec callables
Uses legacy opencode_runtime.build_run_cmd()/build_env_vars() directly instead of OPENCODE_SPEC-declared callables. Creates two divergent code paths for starting OpenCode — one via CLIAgentDriver and one via CodingAgentSession.

Tier 2: Alignment Flags (3 flags)

ALIGNMENT FLAG 1: Agent-to-trainer boundary crossing in interception_gate

Invariant: "Agents cannot reset" / "Dual API boundary" (INVARIANTS.md)
Concern: The new interception_gate gives agents network egress to the trainer's InterceptionServer HTTP port, authenticated by a bearer token in sandbox env vars. The agent can trivially read its own env vars, discover the token, and craft arbitrary POST requests. No validation that request bodies are legitimate LLM calls. An agent could enumerate rollouts, poison other rollouts' response futures, or DoS the trainer. This changes the threat model vs. the prior in-sandbox proxy design.
Suggested reviewer: @Darktex

ALIGNMENT FLAG 2: setup_results observability regression

Principle: Honest error reporting
Concern: Replacing real exit codes with synthetic exit_code=0 removes debugging observability without RFC documentation.

ALIGNMENT FLAG 3: Two parallel code paths for same agent

Principle: "Minimize lifecycle deltas" (PRINCIPLES.md)
Concern: CodingAgentSessionFactory → CodingAgentSession.start_agent() and CLIAgentSessionFactory → CLIAgentDriver._start_agent() are two paths that will diverge as OPENCODE_SPEC evolves.

Verdict: Request Changes

The interception_gate broken wiring (#5), misleading setup_results (#3), and docker start_bg quoting bug (#2) need fixes before merge. The alignment flag on sandbox-to-trainer network boundary (#1) warrants explicit sign-off given it changes the threat model of agent execution.

Overall this is a well-structured refactor with good test coverage — the CLIAgentSpec/CLIAgentDriver pattern is clean and extensible. The issues above are fixable without rearchitecting.

Darktex

Note: This is an automated review by Claude Code, not a human review.

Alignment Review — PR #694

Automated Checks

Lint: deferred to CI (PR branch not checked out locally)
Debug code: CLEAN — no new print/breakpoint/bare TODO in added files

Tier 1: Bugs / Issues

T1-1. Fabricated setup_results with hardcoded exit_code=0
envs/coding_agent_env/server/coding_environment.py — after bootstrap, the code synthesizes fake CommandResult entries with exit_code=0, stdout="executed during bootstrap", and duration_s=0.0 for every setup command regardless of actual outcome. Any monitoring or debugging relying on setup_results will get bogus data. Either remove the fabricated entries or document that individual-command telemetry is no longer available.

T1-2. InterceptionServer not validated as started in CLIAgentDriver
src/openenv/core/harness/agents/cli_driver.py — the constructor checks that the server is not None but never verifies start() was called. If a caller constructs the driver with an unstarted InterceptionServer, the agent will get a ConnectionRefusedError with no clear error message. Add a check (e.g., _app is not None) in the constructor or at create_session time.

T1-3. asyncio.Queue created outside running event loop — potential race
src/openenv/core/harness/agents/interception_server.py — register_rollout() is synchronous and creates asyncio.Queue(). In Python 3.10+, asyncio.Queue() should be created on a running event loop. If called from a different thread than the server's event loop (e.g., training thread), this can raise RuntimeError: no running event loop or bind the queue to the wrong loop. CLIAgentSession.next_request also uses asyncio.create_task with the same issue. Needs an explicit event-loop affinity strategy.

T1-4. _write_pi_models_config unconditionally writes to /root/
src/openenv/core/harness/agents/cli_driver.py — writes to both {home}/.pi/agent/models.json and /root/.pi/agent/models.json. When the sandbox user is non-root (as in E2B templates where user is user), the /root/ write may fail silently or propagate depending on backend. Gate on effective user or remove the /root/ path.

T1-5. _SUPPORTED_AGENTS hardcoded instead of using registry
envs/coding_agent_env/server/coding_environment.py — _SUPPORTED_AGENTS = ("opencode", "pi") is hardcoded. Adding a new agent via register_agent(spec) won't update validation. Use list_agents() from the registry or add a comment about the manual sync requirement.

Tier 2: Alignment Flags

🚩 ALIGNMENT FLAG: Core infrastructure added without RFC update

Invariant: PRINCIPLES.md — "decisions documented in RFCs and should not be changed without a new RFC"
Concern: This PR promotes SandboxBackend, CLIAgentDriver, InterceptionServer, and CLIAgentSpec into src/openenv/core/harness/ as shared infrastructure. RFC 005 describes a different design (HarnessAdapter with send_message() multi-turn pattern). This PR implements CLIAgentSpec + single-rollout CLIAgentDriver instead. The divergence from RFC 005 is not documented. Either update RFC 005 or create a new RFC explaining the chosen design.
Reviewer: @Darktex

🚩 ALIGNMENT FLAG: interception_gate secret exposure to sandbox agents

Invariant: Agent isolation — agents cannot access reset/simulation controls
Concern: The InterceptionServer authenticates via a bearer secret injected into the sandbox as OPENAI_API_KEY/ANTHROPIC_API_KEY. An agent that exfiltrates its env vars could construct arbitrary HTTP requests to the interception server, including calling registered tool handlers. The threat model for this design needs explicit documentation. The PR logs a warning when host="0.0.0.0" but doesn't document mitigations for secret exposure.
Reviewer: @Darktex

🚩 ALIGNMENT FLAG: proxy_turns silently becomes dead field

Invariant: PRINCIPLES — "Minimize lifecycle deltas"
Concern: RolloutResult.proxy_turns was the primary logprob capture mechanism. It's now always [] with a comment that logprob capture moved to interception_gate. Any training code reading proxy_turns will silently get nothing. The field should be deprecated with a warning or removed — leaving a dead field in the wire type without documentation violates lifecycle delta minimization.
Reviewer: @Darktex

🚩 ALIGNMENT FLAG: Server-side imports from env package root

Invariant: Client-server separation
Concern: coding_environment.py (in server/) imports from coding_agent_env.config, coding_agent_env.harness, coding_agent_env.task — these are primitive/config modules, not client code, so this is likely fine architecturally. However, the fallback import path uses bare from models import ... which only works when CWD is server/. This fragility should be verified.
Reviewer: @Darktex

Verdict: Request Changes

The T1 bugs — particularly the asyncio queue race in interception_gate mode (T1-3) and the fabricated setup_results (T1-1) — should be fixed before merge. The RFC 005 alignment question (Tier 2) is the most important architectural concern: if this PR implements RFC 005, the RFC should be updated to reflect the actual design. The interception_gate threat model also needs documentation given the agent isolation invariant.

Automated review by Claude Code | Learn more

Darktex

Note: This is an automated review by Claude Code, not a human review.

Alignment Review — PR #694

Tier 1: Bugs / Issues to Fix

coding_environment.py — dead import shadowed by second import: CLIAgentSessionFactory is imported from openenv.core.harness.agents.cli_driver, then immediately shadowed by CodingAgentSessionFactory imported from coding_agent_env.harness. The first import is never used and should be removed (will fail ruff check).
coding_environment.py — setup_results always reports exit_code=0: When setup execution moved into _bootstrap_sandbox, the per-command CommandResult entries hardcode exit_code=0 and stdout="executed during bootstrap". The old code tracked real exit codes; the new code discards them. This is a silent data-accuracy regression.
coding_environment.py — _build_agent_config ignores disable_thinking, top_logprobs, and max_tokens_cap for the opencode agent: These parameters are accepted by run_rollout and forwarded into _build_agent_config, but the agent == "opencode" branch constructs CodingAgentConfig without passing proxy_disable_thinking, proxy_top_logprobs, or proxy_max_tokens_cap. Callers who relied on disable_thinking=True (e.g. for Qwen3) will silently get unexpected thinking tokens.
interception_server.py — /health endpoint bypasses auth: _handle_health returns {"status":"ok"} without calling _authorized. Given this server is designed to receive LLM requests from sandboxes tunneled over the public internet (the docstring explicitly says "your responsibility" for ngrok/frp), an unauthenticated health endpoint lets adversaries enumerate whether a trainer is running.
docker_backend.py — write_text ignores return code: sandbox.write_text calls subprocess.run for mkdir -p and cat > redirect, but neither return code is checked. If the write silently fails (full disk, unwriteable path), no exception is raised. Compare to hf_backend.py which checks r.exit_code != 0 for mkdir.
hf_backend.py — _normalize_exec_timeout(None) returns 86400s: A 24-hour timeout effectively means only HF-side timeouts protect against hung commands. If None is intended to mean "use sandbox default", this should be documented; if unintentional, fall back to a sensible default (e.g. 300s).

Tier 2: Alignment Flags for Human Review

ALIGNMENT FLAG 1: Interception gate puts trainer-side server outside the sandbox boundary

Principle at stake: RFC 002 — Rewards inside environment. The interception_gate mode routes every LLM forward pass through a trainer-owned InterceptionServer running on the trainer host outside any sandbox. This effectively moves trajectory-shaping logic outside the MCPEnvironment boundary. This may be the right call for RL training, but it should be explicitly acknowledged as a trade-off against RFC 002's invariant.
Suggested reviewer: @Darktex

ALIGNMENT FLAG 2: CLIAgentSession dual-role object mixes trainer API + resource session

Principle at stake: Agents cannot reset / access simulation controls. CLIAgentSession has next_request(), deliver(), and register_tool_handler() methods (trainer APIs) on the same object as the reset/step/verify interface (ResourceSession). If a future code path passes CLIAgentSession through normal session dispatch, these trainer methods become callable from contexts that should only have Gym-like access. Consider whether the trainer-side control surface should be a separate companion object.
Suggested reviewer: @Darktex

ALIGNMENT FLAG 3: Server imports from client-facing env package

Principle at stake: Client-server separation. server/coding_environment.py imports CodingAgentConfig, CodingAgentSessionFactory, and CodingAgentTask from coding_agent_env (the env-level package). The env package's __init__.py re-exports from both client.py and harness.py. This is a pre-existing pattern but is now more visible.
Suggested reviewer: @Darktex

ALIGNMENT FLAG 4: Pi agent install.sh runs apt-get / curl | bash — assumes root-privileged sandbox

Principle at stake: Container isolation. The Pi spec's setup runs apt-get install and curl | sh, requiring root inside the sandbox. E2B sandboxes default to non-root (user), which would silently fail the install. The Pi spec conflates root-privileged Docker sandboxes with user-level E2B sandboxes. At minimum, document that Pi requires a root-privileged backend.
Suggested reviewer: @Darktex

Verdict: REQUEST_CHANGES

The 6 Tier 1 issues should be fixed before merge. The Tier 2 alignment flags — particularly the interception_gate trade-off against RFC 002 — warrant explicit acknowledgment from @Darktex even if the decision is to accept the trade-off intentionally.

Automated review by Claude Code | Learn more

Darktex

Note: This is an automated review by Claude Code, not a human review.

Alignment Review — PR #694

Verdict: REQUEST_CHANGES

Automated Checks

Lint: PASS — ruff check passes on src/ and tests/.
Debug code: CLEAN — No print(), breakpoint(), or stray TODO debug artifacts in new core files.

Tier 1: Bugs / Mechanical Issues

T1-1 (High). CodingAgentSessionFactory.create() bypasses CLIAgentDriver.create_session() — interception_gate mode silently broken for OpenCode path

envs/coding_agent_env/harness.py ~lines 740–746

The factory manually creates a sandbox, calls self._driver.bootstrap_sandbox(), then builds a CodingAgentSession and calls session.start_agent(). This entirely skips CLIAgentDriver.create_session(), which is the only place where interception_gate wiring happens: register_rollout(), interception_rollout_id, interception_queue, and base_url_override are never set. The resulting session has _base_url_override=None, so in interception_gate mode the OpenCode process still talks to the real config.base_url instead of the interception endpoint.

Fix: Delegate to self._driver.create_session(task, self._config, verifier=self._verifier, episode_id=episode_id) or restructure the factory to not re-implement what the driver already does.

T1-2 (Medium). Fabricated setup_results hides real setup failures from callers

envs/coding_agent_env/server/coding_environment.py ~lines 1534–1545

After setup commands move into bootstrap, _run_rollout_impl synthesizes fake CommandResult records with exit_code=0 and stdout="executed during bootstrap" regardless of whether they actually succeeded. The old code stopped rollout on the first setup failure and surfaced the real exit code. Training pipelines reading setup_results can never detect that their setup script failed.

Fix: Either pass real exit codes from _bootstrap_sandbox or don't populate setup_results at all rather than fabricating false successes.

T1-3 (Low). Missing ToolHandler export from agents/__init__.py

src/openenv/core/harness/agents/__init__.py

ToolHandler is used in the public API (CLIAgentSession.register_tool_handler()) but is not re-exported from the agents package __init__. Callers have to reach into openenv.core.harness.agents.interception_server directly, breaking the clean package boundary.

Fix: Add "ToolHandler" to __all__.

T1-4 (Low). Unauthenticated /health endpoint on InterceptionServer

src/openenv/core/harness/agents/interception_server.py ~line 5096

The /health endpoint does not call _authorized(). For a server reachable from sandboxes over a public tunnel (ngrok, frp, public IP — explicitly documented), this is acceptable for a simple {"status": "ok"} response, but should carry a comment noting it's intentionally unauthenticated so future contributors don't inadvertently add state information to it.

Tier 2: Alignment Flags for Human Review

FLAG 1: New third network interface (InterceptionServer) not covered by RFC

The InterceptionServer is a trainer-side HTTP server that sandboxed agents contact for LLM inference. This is a third interface beyond the documented dual-boundary model (WebSocket for orchestration, MCP for agent tools). Per PRINCIPLES.md, the two-interface architecture is an RFC-backed decision. The PR introduces a trainer endpoint reachable from "ngrok, frp, public IP, VPN" — the security tradeoffs and implications for the "rewards inside environment" principle (when the training loop owns generation) are not discussed in any RFC.

FLAG 2: Interception gate enables trainer to inject arbitrary content into agent context

In interception_gate mode, every LLM request is routed through a trainer-controlled server. The trainer can inject arbitrary response content, potentially including environment state, reward signals, or instructions that bypass the environment's own reward system. This shifts the "rewards inside environment" invariant enforcement entirely onto the training loop, with no framework-level guardrails.

FLAG 3: transparent_proxy mode removed without deprecation path

mode="transparent_proxy" was the documented default. This PR removes it entirely, replacing it with interception_gate. Any external caller with mode="transparent_proxy" hardcoded (training scripts, deployed Spaces) will break with an opaque error. Even pre-1.0, a deprecation warning or clear error message pointing to the replacement would be appropriate.

FLAG 4: Duplicate factory patterns

CodingAgentSessionFactory in harness.py partially duplicates CLIAgentDriver.create_session() while accepting interception_server/interception_base_url params it does not propagate. Should callers use CLIAgentSessionFactory(spec=OPENCODE_SPEC, ...) directly? Having two factories for the same agent spec that behave differently under the same mode is a hidden invariant violation.

Recommendation

Fix T1-1 (interception_gate silently broken) and T1-2 (fabricated setup results) before merge. The Tier 2 flags — especially the unRFC'd third interface — should be reviewed by a human maintainer to decide whether RFC coverage is needed before this lands on main.

Suggested human reviewer: @Darktex

Automated review by Claude Code | Learn more

rycerzes · 2026-05-18T08:46:38Z

Thank you for the reviews @Darktex! I will be fixing them shortly. Also could you clean up the duplicated noise caused by claude so that the thread is cleaner?

- Wire disable_thinking and max_tokens_cap through CodingAgentConfig - Raise RuntimeError on mkdir/cat failures in docker backend - Propagate QueueFull exceptions instead of silently swallowing - Change CommandResult.exit_code to int | None for bootstrap clarity

replace asyncio.Queue with stdlib queue.Queue for the request notification path (server → training loop). This makes both directions of the InterceptionServer cross-loop/cross-thread safe: - Request notifications: queue.Queue (inherently thread-safe) - Response delivery: asyncio.Future via _resolve_future_threadsafe (already cross-loop safe) The consumer (next_request) uses asyncio.to_thread(q.get, timeout=...) to await without blocking the event loop. This follows the same pattern used by OpenClaw-RL at scale. chunk_queue (internal SSE streaming) remains asyncio.Queue since both producer and consumer run on the server's own event loop.

…andling - soak test

…ime user's $HOME

rycerzes

fixed all the following issues:

interception_gate wiring fixed in CodingAgentSessionFactory.create() (37e549d)
_build_agent_config param forwarding, DockerBackend.write_text return code checks, _put_queue_threadsafe no longer drops on QueueFull, and setup_results exit codes no longer fabricated (8aa9d18, a2b4388/448f6905)
Whitespace secret bypass and hardcoded /root/ path resolved (61e5524)
Cross-loop asyncio.Queue deadlock fixed by replacing with stdlib queue.Queue for the request notification path; consumer uses asyncio.to_thread(q.get, timeout=...) to avoid blocking the event loop. chunk_queue intentionally stays as asyncio.Queue since both producer and consumer run on the server's own event loop (a2b4388/b10a4483)

transparent_proxy migration docs are a non-issue — only one env used it and it's already been migrated.

sergiopaniego

Inline comments on concrete issues found during review.

rycerzes added 10 commits May 12, 2026 22:13

refactor: core sandbox infra to use openenv.core.harness.sandbox

57f9b70

feat: CLIAgentDriver Abstraction

024e904

feat: add tests

455b0e9

feat: impl Docker sandbox backend

e97fda0

- related tests

feat: pi agent adapter

9a35006

- fix agent handling

chore: agent specifications and improve interception module ref

06df791

feat: add tests for opencode + pi harness adapters

a3c4a3d

chore: migrate opencode_env to coding_agent_env

81e37a2

feat: hf sandbox backend

ddf1313

- tests

chore: ruff + usort format pass

9d85640

meta-cla Bot added the CLA Signed This label is managed by the Meta Open Source bot. label May 15, 2026

greptile-apps Bot reviewed May 15, 2026

View reviewed changes

Comment thread src/openenv/core/harness/agents/cli_driver.py Outdated

Comment thread envs/coding_agent_env/server/coding_environment.py

rycerzes added 5 commits May 16, 2026 02:10

refactor: wire coding_agent_env with interception_gate

171a3ea

chore: update tests for interception_gate, remove proxy test cases

52a024e

chore: address greptile review comments

4b1b707

rycerzes commented May 15, 2026

View reviewed changes

rycerzes changed the title ~~refactor agent sandbox infrastructure and impl new agent backends~~ refactor agent sandbox infrastructure with new agent backends + rework logprobs capture May 15, 2026

Darktex suggested changes May 16, 2026

View reviewed changes

rycerzes added 7 commits May 16, 2026 20:39

refactor: extract sandbox bootstrap to driver and fix interception races

a478fa8

chore: remove unsupported mode checks from CodingAgentEnvironment

b18caf2

chore: revert linting for out of scope files

bfc7305

feat: host-side tool routing and Pi gate models bootstrap

c2ca0c8

fix: support configurable Pi workdir for command and MCP config

1b1d9fb

fix: Docker host gateway mapping and per-create image override

2f52e48

feat: configurable extension directory support for CLI agents

f3fede2

Darktex suggested changes May 16, 2026

View reviewed changes

Darktex suggested changes May 17, 2026

View reviewed changes

fix: thread-safe queue handling

448f690

rycerzes added 6 commits May 18, 2026 14:18

fix: interception gate support in CodingAgentSessionFactory

37e549d

fix: whitespace secret validation + conditional /root/ write

61e5524

fix: replace asyncio.Queue with queue.Queue for thread-safe request h…

b10a448

…andling - soak test

fix: pi config discovery for CLIAgentDriver to be independent of runt…

659288b

…ime user's $HOME

rycerzes commented May 18, 2026

View reviewed changes

rycerzes mentioned this pull request May 18, 2026

feat(mini_swe_env): add SWE-Gym async GRPO environment with Pi interception and HF Space deployment #695

Open

9 tasks

sergiopaniego reviewed May 19, 2026

View reviewed changes

rycerzes added 4 commits May 20, 2026 10:48

fix: interception params and update max_tokens_cap validation

5136337

refactor: remove RolloutTurn references

8137b15

feat: add tool name allowlist validation

3c4ffa4

feat: provider-specific env var handling for Pi agent

151d1ab

Conversation

rycerzes commented May 15, 2026

Summary

Changes

Tests

References

Type of Change

Alignment Checklist

RFC Status

Claude Code Review

Uh oh!

greptile-apps Bot commented May 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 2/5

Important Files Changed

Sequence Diagram

Comments Outside Diff (2)

Uh oh!

Uh oh!

Uh oh!

rycerzes left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Darktex left a comment

Choose a reason for hiding this comment

Alignment Review — PR #694

Tier 1: Bugs & Mechanical Issues (16 findings)

Critical / High

Medium

Low

Tier 2: Alignment Flags (8 findings)

Blocking Alignment Questions

Non-Blocking Alignment Concerns

What's Good

Recommended Next Steps

Uh oh!

Darktex left a comment

Choose a reason for hiding this comment

Alignment Review — PR #694

Tier 1: Bugs & Code Quality

Tier 2: Alignment Flags (need human sign-off)

Verdict

Uh oh!

Darktex left a comment

Choose a reason for hiding this comment

Alignment Review — PR #694

Tier 1: Bugs & Mechanical Issues (16 findings)

Critical / High

Medium

Low

Tier 2: Alignment Flags (8 findings)

Blocking Alignment Questions

Non-Blocking Alignment Concerns

What's Good

Recommended Next Steps

Uh oh!

Darktex left a comment

Choose a reason for hiding this comment

Alignment Review — PR #694

Tier 1: Bugs / Security / Correctness (6 findings)

Tier 2: Alignment Flags (3 flags)

Verdict: Request Changes

Uh oh!

Darktex left a comment

Choose a reason for hiding this comment

Alignment Review — PR #694

Automated Checks

Tier 1: Bugs / Issues

Tier 2: Alignment Flags

Verdict: Request Changes

Uh oh!

Darktex left a comment

Choose a reason for hiding this comment

Alignment Review — PR #694

Tier 1: Bugs / Issues to Fix

Tier 2: Alignment Flags for Human Review

Verdict: REQUEST_CHANGES

Uh oh!

greptile-apps Bot commented May 15, 2026 •

edited

Loading

rycerzes left a comment •

edited

Loading

rycerzes commented May 18, 2026 •

edited

Loading

rycerzes left a comment •

edited

Loading