Skip to content

refactor agent sandbox infrastructure with new agent backends + rework logprobs capture#694

Open
rycerzes wants to merge 33 commits into
huggingface:mainfrom
rycerzes:feat/multi-harness
Open

refactor agent sandbox infrastructure with new agent backends + rework logprobs capture#694
rycerzes wants to merge 33 commits into
huggingface:mainfrom
rycerzes:feat/multi-harness

Conversation

@rycerzes
Copy link
Copy Markdown
Contributor

Summary

The coding environment was hardwired to OpenCode. This PR pulls the sandbox and harness infrastructure into src/openenv/core/harness/ so it's shared, adds a declarative CLIAgentSpec + CLIAgentDriver that any agent adapter can plug into, ships OpenCode and Pi as the first two adapters, adds Docker and HF Sandbox backends alongside the existing E2B one, and renames opencode_envcoding_agent_env.

Changes

Sandbox protocols moved to core (src/openenv/core/harness/sandbox/). SandboxBackend, SandboxHandle, BgJob were locked inside opencode_env. They're now the shared seam. E2BSandboxBackend is unchanged and already works with CubeSandbox (E2B-compatible self-hosted MicroVM sandbox) by pointing E2B_API_URL at it.

Docker sandbox backend added (src/openenv/core/harness/sandbox/docker_backend.py). Runs sandboxes via docker run. Suitable for local development and CI.

HF Sandbox backend added (src/openenv/core/harness/sandbox/hf_backend.py). Wraps hf-sandbox.

CLIAgentSpec + CLIAgentDriver (src/openenv/core/harness/agents/). Per-agent knowledge is expressed as a declarative dataclass — install script, files to upload, env vars, MCP config format, artifacts to collect — with three small callables for the parts that can't be data (command builder, MCP config serializer, stdout event parser). The driver reads these fields and runs the lifecycle; it has no per-agent branches. Same pattern as verifiers (Prime Intellect). Adding a new agent is writing a spec, not touching the driver.

OpenCode refactored onto CLIAgentSpec. envs/coding_agent_env/harness.py is now a thin wrapper around CLIAgentSessionFactory.

Pi adapter added (src/openenv/core/harness/agents/pi.py). Runs Pi in print mode via --no-session -p @/instruction.txt, matching the verifiers Pi harness approach.

opencode_envcoding_agent_env. The environment is for coding tasks; the harness is a config parameter. run_rollout now takes agent: str = "opencode" and supports "opencode" and "pi". Gradio UI gets an agent selector. All imports, docs, pyproject.toml, openenv.yaml, and Dockerfile updated.

Tests

tests/core/test_cli_agent_driver.py — driver lifecycle via _FakeSandbox
tests/core/test_harness_adapters.py — per-adapter: MCP config, CLI flags, event parsing, registry
tests/core/test_docker_sandbox_backend.py — Docker backend
tests/core/test_hf_sandbox_backend.py — HF backend
tests/envs/test_coding_agent_env.py — end-to-end env tests (renamed, extended)

References

Type of Change

  • Bug fix
  • New feature
  • Breaking change
  • Documentation
  • New environment
  • Refactoring

Alignment Checklist

Before submitting, verify:

  • I have read .claude/docs/PRINCIPLES.md and this PR aligns with our principles
  • I have checked .claude/docs/INVARIANTS.md and no invariants are violated
  • I have run /pre-submit-pr (or bash .claude/hooks/lint.sh and tests) and addressed all issues

RFC Status

  • Not required (bug fix, docs, minor refactoring)
  • RFC exists: #___
  • RFC needed (will create before merge)

Claude Code Review

n/a

cc: @burtenshaw

@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Meta Open Source bot. label May 15, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 15, 2026

Greptile Summary

This PR refactors the hardwired OpenCode environment into a shared CLIAgentDriver / CLIAgentSpec infrastructure in src/openenv/core/harness/, renames opencode_envcoding_agent_env, and ships Docker and HF Sandbox backends alongside the existing E2B one. It is a substantial architectural refactor with two new agent adapters (OpenCode and Pi) and a declarative spec pattern that makes adding future agents trivial.

  • Core driver (cli_driver.py): introduces CLIAgentDriver / CLIAgentSessionFactory as the shared lifecycle engine; the _exec_with_retry helper has a correctness bug (early break on any non-empty stderr) that defeats the advertised 3-attempt install retry.
  • Server layer (coding_environment.py): lazy-imports via from coding_agent_env import ... which triggers __init__.py → client.py, violating the INVARIANTS client-server separation boundary; setup commands also run after the agent has already started, creating a correctness race for tasks that depend on setup completing first.
  • New backends (docker_backend.py, hf_backend.py): correctly implement the SandboxBackend protocol but are not wired into _build_session_factory, so they are unreachable via the MCP run_rollout tool.

Confidence Score: 2/5

Not safe to merge as-is — the server layer imports client code through the package init, violating a hard architectural boundary, and there is a real correctness race between setup commands and the running agent.

The server indirectly pulls in client.py via coding_agent_env/init.py every time CodingAgentEnvironment initializes, crossing a boundary the repo treats as non-negotiable. The _exec_with_retry bug means install retries never fire for the most common failure mode (non-empty stderr), so cold sandbox installs will fail non-deterministically in CI. The setup/agent race is explicitly noted as a known compromise but affects any task that writes input files or installs dependencies before the agent's first model call.

envs/coding_agent_env/server/coding_environment.py (client-server violation + setup race), envs/coding_agent_env/init.py (root init imports client), src/openenv/core/harness/agents/cli_driver.py (retry logic bug)

Important Files Changed

Filename Overview
src/openenv/core/harness/agents/cli_driver.py Core CLI driver for all agent harnesses; contains a retry bug in _exec_with_retry where any non-empty stderr immediately aborts retries, defeating 3-attempt install guarantees.
envs/coding_agent_env/server/coding_environment.py New MCP environment server — has two issues: server lazy-imports coding_agent_env.__init__ which pulls in client.py (violates client-server separation); setup commands race with the already-running agent.
src/openenv/core/harness/sandbox/docker_backend.py New Docker sandbox backend; protocol-correct and well-structured, but not wired into the server layer so unreachable via the MCP run_rollout tool.
src/openenv/core/harness/sandbox/hf_backend.py New HF Sandbox backend; HFBgJob.wait() uses a blocking polling loop instead of a background thread — acceptable but the calling thread is blocked for the full agent duration with no interrupt path.
envs/coding_agent_env/init.py Root package init unconditionally imports from .client import CodingAgentEnv, causing server code that does from coding_agent_env import ... to indirectly pull in client code — violates INVARIANTS client-server separation.
src/openenv/core/harness/agents/base.py Declarative CLIAgentSpec dataclass and supporting protocols — clean, well-documented, no issues.
src/openenv/core/harness/agents/opencode.py OpenCode adapter expressed as a declarative spec; _build_opencode_mcp_config intentionally returns empty string (config written via spec.files), which is correct.
src/openenv/core/harness/agents/pi.py Pi adapter — maps api_key to both HF_TOKEN and OPENAI_API_KEY, which is correct for Pi's multi-provider support.
src/openenv/core/harness/sandbox/interception.py Transparent forwarding proxy for logprob capture; streaming error-path handling (non-2xx returning JSON instead of empty SSE) is a correctness improvement over the prior version.
envs/coding_agent_env/harness.py Thin wrapper around CLIAgentDriver; calls _driver._start_proxy() directly (private API) which is a coupling smell but functionally correct.

Sequence Diagram

sequenceDiagram
    participant Orch as Orchestrator
    participant Env as CodingAgentEnvironment
    participant Factory as SessionFactory
    participant Driver as CLIAgentDriver
    participant SB as SandboxBackend (E2B/Docker/HF)
    participant Proxy as InterceptionProxy

    Orch->>Env: run_rollout(agent, instruction, setup, verify)
    Env->>Factory: create(task)
    Factory->>Driver: create_session(task, config)
    Driver->>SB: create(timeout_s)
    SB-->>Driver: SandboxHandle
    Driver->>Driver: _bootstrap_sandbox
    alt "mode == transparent_proxy"
        Driver->>SB: start_bg(interception.py)
        SB-->>Driver: proxy BgJob
    end
    Driver->>SB: start_bg(agent CLI)
    SB-->>Driver: agent BgJob
    Driver-->>Factory: CLIAgentSession
    Factory-->>Env: session (agent already running)
    Note over Env: setup commands run HERE — races with agent
    loop setup commands
        Env->>SB: exec(cmd)
    end
    Env->>Factory: wait_for_completion()
    loop verify commands
        Env->>SB: exec(cmd)
    end
    Env->>SB: kill()
    Env-->>Orch: RolloutResult (JSON)
Loading

Comments Outside Diff (2)

  1. envs/coding_agent_env/server/coding_environment.py, line 693-711 (link)

    P1 Setup commands race with running agent

    The agent is launched inside factory.create() via session.start_agent() before this loop runs. For tasks that require setup to complete before the agent's first tool call (e.g., installing a dependency or writing an input file the agent reads on start), this is a silent correctness failure: the agent may attempt to import pandas while the pip install pandas setup command is still running. The inline comment acknowledges the race but frames it as "fine for typical work" — that assumption breaks for any task where the agent reads setup-created files or depends on pre-installed packages.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: envs/coding_agent_env/server/coding_environment.py
    Line: 693-711
    
    Comment:
    **Setup commands race with running agent**
    
    The agent is launched inside `factory.create()` via `session.start_agent()` before this loop runs. For tasks that require setup to complete before the agent's first tool call (e.g., installing a dependency or writing an input file the agent reads on start), this is a silent correctness failure: the agent may attempt to import `pandas` while the `pip install pandas` setup command is still running. The inline comment acknowledges the race but frames it as "fine for typical work" — that assumption breaks for any task where the agent reads setup-created files or depends on pre-installed packages.
    
    How can I resolve this? If you propose a fix, please make it concise.
  2. envs/coding_agent_env/server/coding_environment.py, line 831-854 (link)

    P2 ALIGNMENT FLAG: New Docker/HF sandbox backends are unreachable via the MCP tool

    _build_session_factory always creates E2BSandboxBackend. The DockerSandboxBackend and HFSandboxBackend added in this PR to src/openenv/core/harness/sandbox/ are only usable programmatically via the harness primitive, not through the deployed run_rollout MCP tool. There is no backend parameter in the tool signature to select them. The PR description bills Docker as "suitable for CI" but CI agents can only reach the environment via MCP, so Docker is effectively unused in the primary run path.

    • Principle at stake: "Be hands-on: Provide ready-to-use implementations, not just specs" (PRINCIPLES.md)
    • The concern: New backends shipped to core are disconnected from the server layer with no path to adoption
    • Suggested reviewer: @darktex

    Context Used: .claude/docs/PRINCIPLES.md (source)

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: envs/coding_agent_env/server/coding_environment.py
    Line: 831-854
    
    Comment:
    **ALIGNMENT FLAG: New Docker/HF sandbox backends are unreachable via the MCP tool**
    
    `_build_session_factory` always creates `E2BSandboxBackend`. The `DockerSandboxBackend` and `HFSandboxBackend` added in this PR to `src/openenv/core/harness/sandbox/` are only usable programmatically via the harness primitive, not through the deployed `run_rollout` MCP tool. There is no `backend` parameter in the tool signature to select them. The PR description bills Docker as "suitable for CI" but CI agents can only reach the environment via MCP, so Docker is effectively unused in the primary run path.
    
    - **Principle at stake**: "Be hands-on: Provide ready-to-use implementations, not just specs" (PRINCIPLES.md)
    - **The concern**: New backends shipped to core are disconnected from the server layer with no path to adoption
    - **Suggested reviewer**: `@darktex`
    
    **Context Used:** .claude/docs/PRINCIPLES.md ([source](https://app.greptile.com/review/custom-context?memory=67b96369-b31e-4918-84c9-a0a4e3c8aa97))
    
    How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
Fix the following 4 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 4
src/openenv/core/harness/agents/cli_driver.py:446-465
**`_exec_with_retry` breaks on first non-empty stderr, defeating retries**

The `if last_stderr.strip(): break` exits the retry loop on the very first failure that produces any stderr output — which is exactly the case for transient network errors (e.g., `curl: (6) Could not resolve host` or a flaky apt mirror). The install stage advertises 3-attempt retries with exponential backoff, but in practice retries only fire when a command exits non-zero with _empty_ stderr or throws a Python exception, which covers almost no real-world install failures.

For example, `curl -fsSL https://opencode.ai/install | bash` failing with a DNS error or a 503 will immediately raise without retry, rather than being retried after 3s and 6s as intended.

```suggestion
                last_stdout = r.stdout or ""
                last_stderr = r.stderr or ""
                last_exit = r.exit_code
```

### Issue 2 of 4
envs/coding_agent_env/server/coding_environment.py:693-711
**Setup commands race with running agent**

The agent is launched inside `factory.create()` via `session.start_agent()` before this loop runs. For tasks that require setup to complete before the agent's first tool call (e.g., installing a dependency or writing an input file the agent reads on start), this is a silent correctness failure: the agent may attempt to import `pandas` while the `pip install pandas` setup command is still running. The inline comment acknowledges the race but frames it as "fine for typical work" — that assumption breaks for any task where the agent reads setup-created files or depends on pre-installed packages.

### Issue 3 of 4
envs/coding_agent_env/server/coding_environment.py:425-432
**ALIGNMENT FLAG: Server imports client code via `coding_agent_env.__init__`**

The lazy import `from coding_agent_env import E2BSandboxBackend, CodingAgentConfig, CodingAgentSessionFactory, CodingAgentTask` triggers `coding_agent_env/__init__.py` which unconditionally executes `from .client import CodingAgentEnv`. This means `server/coding_environment.py` (server code) indirectly imports `client.py`, directly violating INVARIANTS.md §2 "Client-server separation": "Server code must never import client code."

The fix is to import from the specific sub-modules (`from coding_agent_env.config import CodingAgentConfig`, etc.) or split `__init__.py` into server-side and client-side exports.

- **Invariant at risk**: Client-server separation (INVARIANTS.md §2)
- **Suggested reviewer**: `@darktex`

### Issue 4 of 4
envs/coding_agent_env/server/coding_environment.py:831-854
**ALIGNMENT FLAG: New Docker/HF sandbox backends are unreachable via the MCP tool**

`_build_session_factory` always creates `E2BSandboxBackend`. The `DockerSandboxBackend` and `HFSandboxBackend` added in this PR to `src/openenv/core/harness/sandbox/` are only usable programmatically via the harness primitive, not through the deployed `run_rollout` MCP tool. There is no `backend` parameter in the tool signature to select them. The PR description bills Docker as "suitable for CI" but CI agents can only reach the environment via MCP, so Docker is effectively unused in the primary run path.

- **Principle at stake**: "Be hands-on: Provide ready-to-use implementations, not just specs" (PRINCIPLES.md)
- **The concern**: New backends shipped to core are disconnected from the server layer with no path to adoption
- **Suggested reviewer**: `@darktex`

Reviews (1): Last reviewed commit: "feat: hf sandbox backend" | Re-trigger Greptile

Comment thread src/openenv/core/harness/agents/cli_driver.py Outdated
Comment thread envs/coding_agent_env/server/coding_environment.py
rycerzes added 5 commits May 16, 2026 02:10
…roxy

the transparent proxy was a passive forwarder that captured logprobs
by injecting logprobs=true into upstream requests. It is replaced by
the interception_gate mode where the trainer owns the forward pass
entirely — no proxy needed inside the sandbox.
…eneration

InterceptionServer (aiohttp) runs on the trainer host. Each rollout
registers a queue. The agent's OPENAI_BASE_URL points at
`{base_url}/rollout/{id}/v1`. When the agent makes an LLM call it
blocks at the server. The training loop dequeues the request, calls
vLLM with logprobs=True and return_token_ids=True, and delivers the
response back via deliver_response().
Copy link
Copy Markdown
Contributor Author

@rycerzes rycerzes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reworked logprob capture

  • Removed: transparent_proxy mode and sandbox/interception.py — a FastAPI proxy that ran inside the sandbox and injected logprobs=true into upstream LLM calls.
  • Added: interception_gate mode with InterceptionServer running on the trainer host. The agent's OPENAI_BASE_URL points to {base_url}/rollout/{id}/v1. LLM calls block at the server; the training loop dequeues via queue.get(), runs its own vLLM forward pass, and returns the response via deliver_response(). HMAC-signed request IDs prevent cross-rollout leakage.

Driver modes are now black_box (eval/demos) and interception_gate (RL training). CLIAgentSession gains next_request() and deliver() for the training loop. the trainer runs vLLM, the agent points OPENAI_BASE_URL at it, and the trainer owns the forward pass and logprob capture entirely.

Same pattern as AReaL, verifiers, rLLM, AWS AgentCore RL Toolkit, Atropos, OpenRLHF, and OpenClaw-RL.

Network reachability is caller-owned.

@rycerzes rycerzes changed the title refactor agent sandbox infrastructure and impl new agent backends refactor agent sandbox infrastructure with new agent backends + rework logprobs capture May 15, 2026
Copy link
Copy Markdown
Contributor

@Darktex Darktex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alignment Review — PR #694

Reviewed by 4 parallel alignment-reviewer agents covering: core agents, core sandbox, environment changes, and tests.

Overall verdict: request_changes


Tier 1: Bugs & Mechanical Issues (16 findings)

Critical / High

# File Issue
1 interception_server.py Deprecated API: asyncio.get_event_loop().create_future() → use asyncio.get_running_loop().create_future() (deprecated since 3.10, may error in future versions)
2 interception_server.py Race condition: self.intercepts and self.active_rollouts are plain dicts modified from concurrent coroutines without self._lock. Concurrent register_rollout + _handle_chat_completions + unregister_rollout can cause RuntimeError: dictionary changed size during iteration. Protect with asyncio.Lock.
3 interception_server.py Memory leak: Completed intercepts in self.intercepts are never removed after deliver_response(). Over long training runs every completed LLM call leaves a dead entry. Add del self.intercepts[request_id] after future resolves.
4 cli_driver.py Broken import: from openenv.core.harness import (...) references openenv.core.harness (singular) but the existing package is openenv.core.harnesses (plural). This will raise ModuleNotFoundError at runtime.
5 coding_environment.py NoneType crash: _build_session_factory calls self._E2BSandboxBackend(**kwargs) which is None when e2b is not installed. The old code raised a sentinel _RequiresE2B class. Add a guard: if self._E2BSandboxBackend is None: raise RuntimeError(...)
6 coding_environment.py Type mismatch: disable_thinking: Optional[bool] flows from the MCP tool through _run_rollout_impl into _build_agent_config(disable_thinking: bool). When None is passed, behavior may be wrong. Either pass disable_thinking_resolved or change the parameter type.
7 cli_driver.py Fragile private access: getattr(self._agent_bg_job, "_done", None) accesses a private BgJob implementation detail. If the attribute is renamed, next_request() silently spins until timeout. BgJob should expose a public is_done() method.

Medium

# File Issue
8 opencode.py Shell injection risk: _build_opencode_command interpolates {home}/{workdir} into shell commands without shlex.quote. If sandbox_home contains spaces or metacharacters, the command breaks or becomes injectable.
9 cli_driver.py Dead-code timeout logic: The two-step timeout resolution (budget = spec.default_timeout_s then override with config.agent_timeout_s) makes the first assignment dead when config has agent_timeout_s. Simplify to a single ternary chain.
10 docker_backend.py Invalid construction: DockerBgJob(poll_thread=None) with # type: ignore, then immediately mutated via job._poll_thread = poll_thread. Object is in invalid state between construction and mutation.
11 docker_backend.py Dead code: DockerBgJob._error is checked in wait() but never set anywhere. The error-raise branch is unreachable. Either wire it to the poll thread or remove it.
12 coding_environment.py Non-idempotent replay: Setup commands are run twice — once via setup_shell, then replayed individually "for observability". Common setup commands (pip install, mkdir) are not idempotent and may fail on the second run.
13 harness.py Leaky abstraction: CodingAgentSessionFactory._bootstrap_sandbox calls CLIAgentDriver private methods (_wait_for_sandbox_ready, _agent_already_installed, _exec_with_retry). If driver internals change, this breaks silently. Expose a public bootstrap() method instead.

Low

# File Issue
14 docker_backend.py UUID collision: uuid4().hex[:8] (2^32 values) for marker files. Use full hex (2^128) to avoid collisions under high-concurrency training.
15 test_hf_sandbox_backend.py Test isolation: _FakeSandboxAPI.calls is a class-level mutable list. Only one test clears it — others may see stale entries if run order changes. Move to instance variable or add autouse fixture.
16 test_docker_sandbox_backend.py Tautological assertion: assert isinstance(sandbox, SandboxBackend) or hasattr(sandbox, "exec") — the or arm defeats the purpose. Remove the fallback.

Tier 2: Alignment Flags (8 findings)

Blocking Alignment Questions

FLAG 1 — Deleted security tests with no replacement ⚠️

  • Invariant: "No credential exposure" (INVARIANTS.md §Security §3)
  • The old test_opencode_env.py had two adversarial tests: test_start_proxy_keeps_upstream_key_out_of_command (shell injection with sk-test '$(leak)) and test_interception_cli_reads_upstream_key_from_env. These verified API keys flow through env vars, not CLI argv. The new test suite only checks bg_envs["API_KEY"] == "sk-test-key" with clean inputs — no adversarial shell metacharacter testing.
  • Action needed: Restore equivalent adversarial security tests in test_cli_agent_driver.py.

FLAG 2 — transparent_proxy mode removed / logprob capture moved outside env boundary

  • Invariant: "Rewards inside environment" (RFC 002, INVARIANTS.md §Architectural §3)
  • _collect_proxy_turns is now a stub returning []. Logprob capture moves to the training loop via interception_gate. This is a significant architectural boundary shift. The mode="transparent_proxy" string is still accepted by the API but silently produces empty proxy_turns — a silent contract break.
  • Action needed: RFC documenting why logprob capture belongs outside the env boundary, or deprecation warning on transparent_proxy mode.

FLAG 3 — New harness infrastructure does not integrate with RFC 005

  • Invariant: "One canonical way to build environments" (PRINCIPLES.md)
  • RFC 005 defines HarnessAdapter (ABC), HarnessConfig (Pydantic), HarnessEnvironment. This PR introduces a parallel CLIAgentSpec/CLIAgentDriver/ResourceSession stack with no connection to RFC 005. Two diverging patterns for the same domain.
  • Action needed: Either extend RFC 005 abstractions or supersede with a new RFC.

Non-Blocking Alignment Concerns

FLAG 4InterceptionServer binds 0.0.0.0 (publicly exposed trainer host). Should default to 127.0.0.1.

FLAG 5--dangerously-skip-permissions hardcoded in OPENCODE_SPEC.base_command. Should be opt-in, not default.

FLAG 6docker_args passthrough allows callers to trivially bypass container isolation (--privileged, -v /:/mnt). Consider an allowlist.

FLAG 7HFSandboxCreateError may expose HF tokens via SDK exception messages propagated through f"... {last_error}".

FLAG 8 — No tests verify the dual API boundary invariant (agent cannot reach reset()/step()) in the new infrastructure. 5000+ lines of new code with no negative test for the most critical invariant.


What's Good

  • The spec/driver declarative pattern is architecturally elegant — adding a new agent is writing a dataclass, not modifying the driver.
  • The rename (opencode_envcoding_agent_env) is clean and consistent across imports, docs, pyproject.toml, Dockerfile, and openenv.yaml.
  • The sandbox protocol abstraction (SandboxBackend/SandboxHandle/BgJob) is well-designed and provides a clean seam for swapping backends.
  • Test coverage is extensive for the new driver lifecycle (1006 lines for cli_driver alone).

Recommended Next Steps

  1. Fix the 7 critical/high bugs (broken import, InterceptionServer race + leak, NoneType crash, type mismatch, deprecated API, private attribute access)
  2. Restore adversarial security tests for credential-in-argv
  3. Add a deprecation warning or validation for mode="transparent_proxy"
  4. Write an RFC (or RFC addendum) for the logprob ownership shift and the CLIAgentSpec/CLIAgentDriver pattern vs RFC 005
  5. Add a negative test asserting agents cannot access reset()/step() through the new infrastructure

Copy link
Copy Markdown
Contributor

@Darktex Darktex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: This is an automated review by Claude Code, not a human review.


Alignment Review — PR #694

Tier 1: Bugs & Code Quality

Bug 1 — Deprecated asyncio.get_event_loop() in InterceptionServer
src/openenv/core/harness/agents/interception_server.py uses asyncio.get_event_loop().create_future() inside an async def. This is deprecated in Python 3.10+ and raises RuntimeError when no event loop is running in the current thread. Since the handler is a coroutine, use asyncio.get_running_loop().create_future() instead.

Bug 2 — Double bootstrap: setup commands run twice
envs/coding_agent_env/server/coding_environment.py (_run_rollout_impl): Setup commands are passed to CodingAgentTask (which runs them in _bootstrap_sandbox), then the same commands are re-run individually "for observability." Non-idempotent commands (e.g. pip install --upgrade, destructive rm -rf) will break. Either drop the re-run loop or don't pass setup_shell to the task — the two-pass design is architecturally incoherent.

Bug 3 — CodingAgentSessionFactory.create bypasses its own CLIAgentDriver
envs/coding_agent_env/harness.py: The factory creates a CLIAgentDriver in __init__, but create() never calls self._driver.create_session(). Instead it manually reproduces the bootstrap by calling private driver methods (_wait_for_sandbox_ready, _agent_already_installed, _exec_with_retry). The _driver field is dead weight. Either delegate to the driver or remove it.

Bug 4 — _build_session_factory crashes when E2B is not installed
envs/coding_agent_env/server/coding_environment.py: self._E2BSandboxBackend is set to None when e2b is not installed. Calling None(...) raises a cryptic TypeError. Add an explicit guard: if self._E2BSandboxBackend is None: raise RuntimeError("E2B not installed").

Bug 5 — Dropped security tests without replacement
tests/envs/test_coding_agent_env.py removes test_start_proxy_keeps_upstream_key_out_of_command and test_interception_cli_reads_upstream_key_from_env with no equivalent in the new test suite. These verified that API keys never appear in process argv — a real security property that should be re-tested for InterceptionServer.

Resource leak — InterceptionServer._stream_response leaks intercept entries on connection reset
When ConnectionResetError is caught, the intercept dict entry is not removed. On retries, new entries accumulate. Consider cleaning up in the except block.


Tier 2: Alignment Flags (need human sign-off)

FLAG 1 — Logprob capture removed from environment without RFC
The old design captured per-token logprobs inside the environment and returned them as proxy_turns in RolloutResult. The new design returns proxy_turns = [] always and moves logprob capture to the training loop via interception_gate. This changes a public API contract — callers reading result.proxy_turns will silently get nothing. The RolloutResult model still declares the field. This is a significant behavioral change that should go through an RFC.

FLAG 2 — InterceptionServer introduces a third API boundary
INVARIANTS.md states OpenEnv exposes exactly two APIs: WebSocket (Gym-like) and MCP (for agents). The InterceptionServer is a third HTTP server (OpenAI-compatible proxy) that sits between the agent and the LLM. Whether this is a violation or a legitimate extension of training infrastructure deserves explicit discussion. The invariants doc should be updated either way.

FLAG 3 — Two parallel harness namespaces in core
There is already src/openenv/core/harnesses/ (plural). This PR adds src/openenv/core/harness/ (singular) with a parallel structure. Developers onboarding will see two harness abstractions with no guidance on which to use, violating the "one canonical way" principle.


Verdict

Well-motivated refactor with a clean CLIAgentSpec design and good test coverage for the new driver. However, three mechanical bugs need fixing before merge (deprecated asyncio API, double bootstrap, bypassed driver abstraction), and three alignment flags need human sign-off (removed logprob contract, third API boundary, duplicate harness namespaces).

cc: @Darktex for alignment flags


Automated review by Claude Code | Learn more

Copy link
Copy Markdown
Contributor

@Darktex Darktex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: This is an automated review by Claude Code, not a human review.


Alignment Review — PR #694

Reviewed by 4 parallel alignment-reviewer agents covering: core agents, core sandbox, environment changes, and tests.

Overall verdict: request_changes


Tier 1: Bugs & Mechanical Issues (16 findings)

Critical / High

# File Issue
1 interception_server.py Deprecated API: asyncio.get_event_loop().create_future() → use asyncio.get_running_loop().create_future() (deprecated since 3.10, may error in future versions)
2 interception_server.py Race condition: self.intercepts and self.active_rollouts are plain dicts modified from concurrent coroutines without self._lock. Concurrent register_rollout + _handle_chat_completions + unregister_rollout can cause RuntimeError: dictionary changed size during iteration. Protect with asyncio.Lock.
3 interception_server.py Memory leak: Completed intercepts in self.intercepts are never removed after deliver_response(). Over long training runs every completed LLM call leaves a dead entry. Add del self.intercepts[request_id] after future resolves.
4 cli_driver.py KeyError race: next_request() does bare server.intercepts[request_id] dict access after queue.get(). If close() calls unregister_rollout() from another thread between the queue read and the dict access, this raises an unhandled KeyError. Use .get() and handle None.
5 coding_environment.py NoneType crash: _build_session_factory calls self._E2BSandboxBackend(**kwargs) which is None when e2b is not installed. The old code raised a sentinel _RequiresE2B class. Add a guard: if self._E2BSandboxBackend is None: raise RuntimeError(...)
6 coding_environment.py Type mismatch: disable_thinking: Optional[bool] flows from the MCP tool through _run_rollout_impl into _build_agent_config(disable_thinking: bool). When None is passed, behavior may be wrong. Pass disable_thinking_resolved (the guaranteed-bool value) instead.
7 opencode.py Shell injection: _build_opencode_command uses "$(cat {instruction_file})" inside a double-quoted shell string. If the task instruction file contains shell metacharacters ($(...), backticks, "), they will be interpreted by the shell. Pass instruction via a method that doesn't expand shell metacharacters.

Medium

# File Issue
8 opencode.py + pi.py Unquoted paths: Both _build_opencode_command and _build_command interpolate home/workdir/instruction_file/log_file into shell strings without shlex.quote(). Paths with spaces or metacharacters will break.
9 docker_backend.py Invalid construction: DockerBgJob(poll_thread=None) with # type: ignore, then immediately mutated via job._poll_thread = poll_thread. Object is in invalid state between construction and mutation. Make poll_thread optional or use a factory method.
10 docker_backend.py Dead code: DockerBgJob._error is checked in wait() but never set anywhere. The error-raise branch is unreachable. Either wire it to the poll thread or remove it.
11 docker_backend.py Infinite loop: _poll_bg_job silences all exceptions with bare except Exception: pass. If the container is externally killed, the daemon thread loops forever at 0.5s intervals issuing docker exec calls against a gone container. Add a failure counter or container existence check.
12 coding_environment.py Double bootstrap: Setup commands run twice — once atomically inside _bootstrap_sandbox as setup_shell, then replayed individually post-factory.create() "for observability." Non-idempotent commands (pip install git+..., git clone, createdb) will fail on the second run.
13 harness.py Leaky abstraction: CodingAgentSessionFactory._bootstrap_sandbox calls CLIAgentDriver private methods (_wait_for_sandbox_ready, _agent_already_installed, _exec_with_retry). Expose a public bootstrap() method instead.

Low

# File Issue
14 docker_backend.py UUID collision: uuid4().hex[:8] (2^32 values) for marker files. Use full hex (2^128) to avoid collisions under high-concurrency training.
15 docker_backend.py + hf_backend.py Duplicated code: Identical _shell_quote implementation in both files. Move to base.py or a _util.py module.
16 interception_server.py Info leak: str(exc) from trainer-side exceptions sent to sandboxed agent in 500 responses. Replace with generic message; log detail server-side.

Tier 2: Alignment Flags (8 findings)

Blocking Alignment Questions

FLAG 1 — Deleted security tests with no replacement ⚠️

  • Invariant: "No credential exposure" (INVARIANTS.md §Security §3)
  • The old test_opencode_env.py had two adversarial tests: test_start_proxy_keeps_upstream_key_out_of_command (shell injection with sk-test '$(leak)) and test_interception_cli_reads_upstream_key_from_env. These verified API keys flow through env vars, not CLI argv. The new test suite only checks bg_envs["API_KEY"] == "sk-test-key" with clean inputs — no adversarial shell metacharacter testing.
  • Action needed: Restore equivalent adversarial security tests in test_cli_agent_driver.py.

FLAG 2 — transparent_proxy mode removed / logprob capture moved outside env boundary

  • Invariant: "Rewards inside environment" (RFC 002, INVARIANTS.md §Architectural §3)
  • _collect_proxy_turns is now a stub returning []. Logprob capture moves to the training loop via interception_gate. This is a significant architectural boundary shift. The mode="transparent_proxy" string is still accepted by the API but silently produces empty proxy_turns — a silent contract break.
  • Action needed: RFC documenting why logprob capture belongs outside the env boundary, or deprecation warning on transparent_proxy mode.

FLAG 3 — New harness infrastructure does not integrate with RFC 005

  • Invariant: "One canonical way to build environments" (PRINCIPLES.md)
  • RFC 005 defines HarnessAdapter (ABC), HarnessConfig (Pydantic), HarnessEnvironment. This PR introduces a parallel CLIAgentSpec/CLIAgentDriver/ResourceSession stack with no connection to RFC 005. Two diverging patterns for the same domain.
  • Action needed: Either extend RFC 005 abstractions or supersede with a new RFC.

FLAG 4 — InterceptionServer has zero dedicated tests

  • Invariant: No credential exposure + Dual API boundary
  • 324 new lines of security-critical async HTTP code (HMAC auth, per-rollout request queuing, streaming SSE) with exactly one test that only checks a constructor guard. No test of: 401 rejection, rollout-not-found 404, deliver_response paths, concurrent rollout isolation, or unregister_rollout cancellation.
  • Action needed: Add dedicated test suite for InterceptionServer.

Non-Blocking Alignment Concerns

FLAG 5InterceptionServer binds 0.0.0.0 (publicly exposed trainer host). Should default to 127.0.0.1.

FLAG 6--dangerously-skip-permissions hardcoded in OPENCODE_SPEC.base_command. Should be opt-in, not default.

FLAG 7docker_args passthrough allows callers to trivially bypass container isolation (--privileged, -v /:/mnt). Consider an allowlist.

FLAG 8 — No tests verify the dual API boundary invariant (agent cannot reach reset()/step()) in the new infrastructure. 5000+ lines of new code with no negative test for the most critical invariant.


What's Good

  • The spec/driver declarative pattern is architecturally elegant — adding a new agent is writing a dataclass, not modifying the driver.
  • The rename (opencode_envcoding_agent_env) is clean and consistent across imports, docs, pyproject.toml, Dockerfile, and openenv.yaml.
  • The sandbox protocol abstraction (SandboxBackend/SandboxHandle/BgJob) is well-designed and provides a clean seam for swapping backends.
  • Test coverage is extensive for the new driver lifecycle (1006 lines for cli_driver alone).

Recommended Next Steps

  1. Fix the 7 critical/high bugs (broken import, InterceptionServer race + leak, NoneType crash, type mismatch, deprecated API, shell injection)
  2. Restore adversarial security tests for credential-in-argv
  3. Add a deprecation warning or validation for mode="transparent_proxy"
  4. Write an RFC (or RFC addendum) for the logprob ownership shift and the CLIAgentSpec/CLIAgentDriver pattern vs RFC 005
  5. Add a negative test asserting agents cannot access reset()/step() through the new infrastructure

Automated review by Claude Code | Learn more

Copy link
Copy Markdown
Contributor

@Darktex Darktex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alignment Review — PR #694

Tier 1: Bugs / Security / Correctness (6 findings)

1. interception_server.pyasyncio.Queue created from wrong event loop
register_rollout() is a sync method that creates an asyncio.Queue(). If called from a training-loop thread that doesn't share the server's event loop (common pattern: asyncio.run() in a background thread), the queue binds to the wrong loop → RuntimeError: Task attached to a different loop at runtime.

2. docker_backend.py start_bg — fragile shell quoting for marker path
The marker path is concatenated into cmd before shell_quote():

wrapped = f"bash -c {shell_quote(cmd + f'; echo $? > {marker}')} &\necho $!"

If cmd contains trailing single-quotes or && sequences, the concatenation produces invalid shell. Compare with hf_backend.py which correctly quotes the marker separately.

3. coding_environment.py _run_rollout_implsetup_results fabricates exit_code=0
All setup commands are unconditionally recorded as exit_code=0 / "executed during bootstrap", even though actual per-command exit codes are unavailable. Callers inspecting setup_results[i].exit_code to diagnose failures will always see success. Should be an empty list rather than falsely populated.

4. interception_server.pyself.secret never validated non-empty
hmac.compare_digest("", "") returns True. While secrets.token_urlsafe(32) normally produces non-empty values, if secret="" is explicitly passed, all requests are authorized. Add assert self.secret or raise in __init__.

5. harness.py CodingAgentSessionFactory.createinterception_gate silently broken ⚠️
The factory calls session.start_agent() instead of routing through CLIAgentDriver.create_session(). The driver's interception wiring path (which sets OPENAI_BASE_URL to the interception server URL and OPENAI_API_KEY to server.secret) is completely bypassed. Result: in interception_gate mode, sessions always use the original config.base_url instead of the interception server URL.

6. harness.py CodingAgentSession.start_agent — bypasses spec callables
Uses legacy opencode_runtime.build_run_cmd()/build_env_vars() directly instead of OPENCODE_SPEC-declared callables. Creates two divergent code paths for starting OpenCode — one via CLIAgentDriver and one via CodingAgentSession.


Tier 2: Alignment Flags (3 flags)

ALIGNMENT FLAG 1: Agent-to-trainer boundary crossing in interception_gate

  • Invariant: "Agents cannot reset" / "Dual API boundary" (INVARIANTS.md)
  • Concern: The new interception_gate gives agents network egress to the trainer's InterceptionServer HTTP port, authenticated by a bearer token in sandbox env vars. The agent can trivially read its own env vars, discover the token, and craft arbitrary POST requests. No validation that request bodies are legitimate LLM calls. An agent could enumerate rollouts, poison other rollouts' response futures, or DoS the trainer. This changes the threat model vs. the prior in-sandbox proxy design.
  • Suggested reviewer: @Darktex

ALIGNMENT FLAG 2: setup_results observability regression

  • Principle: Honest error reporting
  • Concern: Replacing real exit codes with synthetic exit_code=0 removes debugging observability without RFC documentation.

ALIGNMENT FLAG 3: Two parallel code paths for same agent

  • Principle: "Minimize lifecycle deltas" (PRINCIPLES.md)
  • Concern: CodingAgentSessionFactoryCodingAgentSession.start_agent() and CLIAgentSessionFactoryCLIAgentDriver._start_agent() are two paths that will diverge as OPENCODE_SPEC evolves.

Verdict: Request Changes

The interception_gate broken wiring (#5), misleading setup_results (#3), and docker start_bg quoting bug (#2) need fixes before merge. The alignment flag on sandbox-to-trainer network boundary (#1) warrants explicit sign-off given it changes the threat model of agent execution.

Overall this is a well-structured refactor with good test coverage — the CLIAgentSpec/CLIAgentDriver pattern is clean and extensible. The issues above are fixable without rearchitecting.

Copy link
Copy Markdown
Contributor

@Darktex Darktex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: This is an automated review by Claude Code, not a human review.


Alignment Review — PR #694

Automated Checks

  • Lint: deferred to CI (PR branch not checked out locally)
  • Debug code: CLEAN — no new print/breakpoint/bare TODO in added files

Tier 1: Bugs / Issues

T1-1. Fabricated setup_results with hardcoded exit_code=0
envs/coding_agent_env/server/coding_environment.py — after bootstrap, the code synthesizes fake CommandResult entries with exit_code=0, stdout="executed during bootstrap", and duration_s=0.0 for every setup command regardless of actual outcome. Any monitoring or debugging relying on setup_results will get bogus data. Either remove the fabricated entries or document that individual-command telemetry is no longer available.

T1-2. InterceptionServer not validated as started in CLIAgentDriver
src/openenv/core/harness/agents/cli_driver.py — the constructor checks that the server is not None but never verifies start() was called. If a caller constructs the driver with an unstarted InterceptionServer, the agent will get a ConnectionRefusedError with no clear error message. Add a check (e.g., _app is not None) in the constructor or at create_session time.

T1-3. asyncio.Queue created outside running event loop — potential race
src/openenv/core/harness/agents/interception_server.pyregister_rollout() is synchronous and creates asyncio.Queue(). In Python 3.10+, asyncio.Queue() should be created on a running event loop. If called from a different thread than the server's event loop (e.g., training thread), this can raise RuntimeError: no running event loop or bind the queue to the wrong loop. CLIAgentSession.next_request also uses asyncio.create_task with the same issue. Needs an explicit event-loop affinity strategy.

T1-4. _write_pi_models_config unconditionally writes to /root/
src/openenv/core/harness/agents/cli_driver.py — writes to both {home}/.pi/agent/models.json and /root/.pi/agent/models.json. When the sandbox user is non-root (as in E2B templates where user is user), the /root/ write may fail silently or propagate depending on backend. Gate on effective user or remove the /root/ path.

T1-5. _SUPPORTED_AGENTS hardcoded instead of using registry
envs/coding_agent_env/server/coding_environment.py_SUPPORTED_AGENTS = ("opencode", "pi") is hardcoded. Adding a new agent via register_agent(spec) won't update validation. Use list_agents() from the registry or add a comment about the manual sync requirement.


Tier 2: Alignment Flags

🚩 ALIGNMENT FLAG: Core infrastructure added without RFC update

  • Invariant: PRINCIPLES.md — "decisions documented in RFCs and should not be changed without a new RFC"
  • Concern: This PR promotes SandboxBackend, CLIAgentDriver, InterceptionServer, and CLIAgentSpec into src/openenv/core/harness/ as shared infrastructure. RFC 005 describes a different design (HarnessAdapter with send_message() multi-turn pattern). This PR implements CLIAgentSpec + single-rollout CLIAgentDriver instead. The divergence from RFC 005 is not documented. Either update RFC 005 or create a new RFC explaining the chosen design.
  • Reviewer: @Darktex

🚩 ALIGNMENT FLAG: interception_gate secret exposure to sandbox agents

  • Invariant: Agent isolation — agents cannot access reset/simulation controls
  • Concern: The InterceptionServer authenticates via a bearer secret injected into the sandbox as OPENAI_API_KEY/ANTHROPIC_API_KEY. An agent that exfiltrates its env vars could construct arbitrary HTTP requests to the interception server, including calling registered tool handlers. The threat model for this design needs explicit documentation. The PR logs a warning when host="0.0.0.0" but doesn't document mitigations for secret exposure.
  • Reviewer: @Darktex

🚩 ALIGNMENT FLAG: proxy_turns silently becomes dead field

  • Invariant: PRINCIPLES — "Minimize lifecycle deltas"
  • Concern: RolloutResult.proxy_turns was the primary logprob capture mechanism. It's now always [] with a comment that logprob capture moved to interception_gate. Any training code reading proxy_turns will silently get nothing. The field should be deprecated with a warning or removed — leaving a dead field in the wire type without documentation violates lifecycle delta minimization.
  • Reviewer: @Darktex

🚩 ALIGNMENT FLAG: Server-side imports from env package root

  • Invariant: Client-server separation
  • Concern: coding_environment.py (in server/) imports from coding_agent_env.config, coding_agent_env.harness, coding_agent_env.task — these are primitive/config modules, not client code, so this is likely fine architecturally. However, the fallback import path uses bare from models import ... which only works when CWD is server/. This fragility should be verified.
  • Reviewer: @Darktex

Verdict: Request Changes

The T1 bugs — particularly the asyncio queue race in interception_gate mode (T1-3) and the fabricated setup_results (T1-1) — should be fixed before merge. The RFC 005 alignment question (Tier 2) is the most important architectural concern: if this PR implements RFC 005, the RFC should be updated to reflect the actual design. The interception_gate threat model also needs documentation given the agent isolation invariant.


Automated review by Claude Code | Learn more

Copy link
Copy Markdown
Contributor

@Darktex Darktex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: This is an automated review by Claude Code, not a human review.


Alignment Review — PR #694

Tier 1: Bugs / Issues to Fix

  1. coding_environment.py — dead import shadowed by second import: CLIAgentSessionFactory is imported from openenv.core.harness.agents.cli_driver, then immediately shadowed by CodingAgentSessionFactory imported from coding_agent_env.harness. The first import is never used and should be removed (will fail ruff check).

  2. coding_environment.pysetup_results always reports exit_code=0: When setup execution moved into _bootstrap_sandbox, the per-command CommandResult entries hardcode exit_code=0 and stdout="executed during bootstrap". The old code tracked real exit codes; the new code discards them. This is a silent data-accuracy regression.

  3. coding_environment.py_build_agent_config ignores disable_thinking, top_logprobs, and max_tokens_cap for the opencode agent: These parameters are accepted by run_rollout and forwarded into _build_agent_config, but the agent == "opencode" branch constructs CodingAgentConfig without passing proxy_disable_thinking, proxy_top_logprobs, or proxy_max_tokens_cap. Callers who relied on disable_thinking=True (e.g. for Qwen3) will silently get unexpected thinking tokens.

  4. interception_server.py/health endpoint bypasses auth: _handle_health returns {"status":"ok"} without calling _authorized. Given this server is designed to receive LLM requests from sandboxes tunneled over the public internet (the docstring explicitly says "your responsibility" for ngrok/frp), an unauthenticated health endpoint lets adversaries enumerate whether a trainer is running.

  5. docker_backend.pywrite_text ignores return code: sandbox.write_text calls subprocess.run for mkdir -p and cat > redirect, but neither return code is checked. If the write silently fails (full disk, unwriteable path), no exception is raised. Compare to hf_backend.py which checks r.exit_code != 0 for mkdir.

  6. hf_backend.py_normalize_exec_timeout(None) returns 86400s: A 24-hour timeout effectively means only HF-side timeouts protect against hung commands. If None is intended to mean "use sandbox default", this should be documented; if unintentional, fall back to a sensible default (e.g. 300s).


Tier 2: Alignment Flags for Human Review

ALIGNMENT FLAG 1: Interception gate puts trainer-side server outside the sandbox boundary

  • Principle at stake: RFC 002 — Rewards inside environment. The interception_gate mode routes every LLM forward pass through a trainer-owned InterceptionServer running on the trainer host outside any sandbox. This effectively moves trajectory-shaping logic outside the MCPEnvironment boundary. This may be the right call for RL training, but it should be explicitly acknowledged as a trade-off against RFC 002's invariant.
  • Suggested reviewer: @Darktex

ALIGNMENT FLAG 2: CLIAgentSession dual-role object mixes trainer API + resource session

  • Principle at stake: Agents cannot reset / access simulation controls. CLIAgentSession has next_request(), deliver(), and register_tool_handler() methods (trainer APIs) on the same object as the reset/step/verify interface (ResourceSession). If a future code path passes CLIAgentSession through normal session dispatch, these trainer methods become callable from contexts that should only have Gym-like access. Consider whether the trainer-side control surface should be a separate companion object.
  • Suggested reviewer: @Darktex

ALIGNMENT FLAG 3: Server imports from client-facing env package

  • Principle at stake: Client-server separation. server/coding_environment.py imports CodingAgentConfig, CodingAgentSessionFactory, and CodingAgentTask from coding_agent_env (the env-level package). The env package's __init__.py re-exports from both client.py and harness.py. This is a pre-existing pattern but is now more visible.
  • Suggested reviewer: @Darktex

ALIGNMENT FLAG 4: Pi agent install.sh runs apt-get / curl | bash — assumes root-privileged sandbox

  • Principle at stake: Container isolation. The Pi spec's setup runs apt-get install and curl | sh, requiring root inside the sandbox. E2B sandboxes default to non-root (user), which would silently fail the install. The Pi spec conflates root-privileged Docker sandboxes with user-level E2B sandboxes. At minimum, document that Pi requires a root-privileged backend.
  • Suggested reviewer: @Darktex

Verdict: REQUEST_CHANGES

The 6 Tier 1 issues should be fixed before merge. The Tier 2 alignment flags — particularly the interception_gate trade-off against RFC 002 — warrant explicit acknowledgment from @Darktex even if the decision is to accept the trade-off intentionally.


Automated review by Claude Code | Learn more

Copy link
Copy Markdown
Contributor

@Darktex Darktex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: This is an automated review by Claude Code, not a human review.


Alignment Review — PR #694

Verdict: REQUEST_CHANGES


Automated Checks

  • Lint: PASS — ruff check passes on src/ and tests/.
  • Debug code: CLEAN — No print(), breakpoint(), or stray TODO debug artifacts in new core files.

Tier 1: Bugs / Mechanical Issues

T1-1 (High). CodingAgentSessionFactory.create() bypasses CLIAgentDriver.create_session() — interception_gate mode silently broken for OpenCode path

envs/coding_agent_env/harness.py ~lines 740–746

The factory manually creates a sandbox, calls self._driver.bootstrap_sandbox(), then builds a CodingAgentSession and calls session.start_agent(). This entirely skips CLIAgentDriver.create_session(), which is the only place where interception_gate wiring happens: register_rollout(), interception_rollout_id, interception_queue, and base_url_override are never set. The resulting session has _base_url_override=None, so in interception_gate mode the OpenCode process still talks to the real config.base_url instead of the interception endpoint.

Fix: Delegate to self._driver.create_session(task, self._config, verifier=self._verifier, episode_id=episode_id) or restructure the factory to not re-implement what the driver already does.


T1-2 (Medium). Fabricated setup_results hides real setup failures from callers

envs/coding_agent_env/server/coding_environment.py ~lines 1534–1545

After setup commands move into bootstrap, _run_rollout_impl synthesizes fake CommandResult records with exit_code=0 and stdout="executed during bootstrap" regardless of whether they actually succeeded. The old code stopped rollout on the first setup failure and surfaced the real exit code. Training pipelines reading setup_results can never detect that their setup script failed.

Fix: Either pass real exit codes from _bootstrap_sandbox or don't populate setup_results at all rather than fabricating false successes.


T1-3 (Low). Missing ToolHandler export from agents/__init__.py

src/openenv/core/harness/agents/__init__.py

ToolHandler is used in the public API (CLIAgentSession.register_tool_handler()) but is not re-exported from the agents package __init__. Callers have to reach into openenv.core.harness.agents.interception_server directly, breaking the clean package boundary.

Fix: Add "ToolHandler" to __all__.


T1-4 (Low). Unauthenticated /health endpoint on InterceptionServer

src/openenv/core/harness/agents/interception_server.py ~line 5096

The /health endpoint does not call _authorized(). For a server reachable from sandboxes over a public tunnel (ngrok, frp, public IP — explicitly documented), this is acceptable for a simple {"status": "ok"} response, but should carry a comment noting it's intentionally unauthenticated so future contributors don't inadvertently add state information to it.


Tier 2: Alignment Flags for Human Review

FLAG 1: New third network interface (InterceptionServer) not covered by RFC

The InterceptionServer is a trainer-side HTTP server that sandboxed agents contact for LLM inference. This is a third interface beyond the documented dual-boundary model (WebSocket for orchestration, MCP for agent tools). Per PRINCIPLES.md, the two-interface architecture is an RFC-backed decision. The PR introduces a trainer endpoint reachable from "ngrok, frp, public IP, VPN" — the security tradeoffs and implications for the "rewards inside environment" principle (when the training loop owns generation) are not discussed in any RFC.

FLAG 2: Interception gate enables trainer to inject arbitrary content into agent context

In interception_gate mode, every LLM request is routed through a trainer-controlled server. The trainer can inject arbitrary response content, potentially including environment state, reward signals, or instructions that bypass the environment's own reward system. This shifts the "rewards inside environment" invariant enforcement entirely onto the training loop, with no framework-level guardrails.

FLAG 3: transparent_proxy mode removed without deprecation path

mode="transparent_proxy" was the documented default. This PR removes it entirely, replacing it with interception_gate. Any external caller with mode="transparent_proxy" hardcoded (training scripts, deployed Spaces) will break with an opaque error. Even pre-1.0, a deprecation warning or clear error message pointing to the replacement would be appropriate.

FLAG 4: Duplicate factory patterns

CodingAgentSessionFactory in harness.py partially duplicates CLIAgentDriver.create_session() while accepting interception_server/interception_base_url params it does not propagate. Should callers use CLIAgentSessionFactory(spec=OPENCODE_SPEC, ...) directly? Having two factories for the same agent spec that behave differently under the same mode is a hidden invariant violation.


Recommendation

Fix T1-1 (interception_gate silently broken) and T1-2 (fabricated setup results) before merge. The Tier 2 flags — especially the unRFC'd third interface — should be reviewed by a human maintainer to decide whether RFC coverage is needed before this lands on main.

Suggested human reviewer: @Darktex


Automated review by Claude Code | Learn more

@rycerzes
Copy link
Copy Markdown
Contributor Author

rycerzes commented May 18, 2026

Thank you for the reviews @Darktex! I will be fixing them shortly. Also could you clean up the duplicated noise caused by claude so that the thread is cleaner?

rycerzes added 6 commits May 18, 2026 14:18
- Wire disable_thinking and max_tokens_cap through CodingAgentConfig
- Raise RuntimeError on mkdir/cat failures in docker backend
- Propagate QueueFull exceptions instead of silently swallowing
- Change CommandResult.exit_code to int | None for bootstrap clarity
replace asyncio.Queue with stdlib queue.Queue for the request
notification path (server → training loop). This makes both
directions of the InterceptionServer cross-loop/cross-thread safe:

- Request notifications: queue.Queue (inherently thread-safe)
- Response delivery: asyncio.Future via _resolve_future_threadsafe
  (already cross-loop safe)

The consumer (next_request) uses asyncio.to_thread(q.get, timeout=...)
to await without blocking the event loop. This follows the same
pattern used by OpenClaw-RL at scale.

chunk_queue (internal SSE streaming) remains asyncio.Queue since both
producer and consumer run on the server's own event loop.
Copy link
Copy Markdown
Contributor Author

@rycerzes rycerzes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed all the following issues:

  • interception_gate wiring fixed in CodingAgentSessionFactory.create() (37e549d)
  • _build_agent_config param forwarding, DockerBackend.write_text return code checks, _put_queue_threadsafe no longer drops on QueueFull, and setup_results exit codes no longer fabricated (8aa9d18, a2b4388/448f6905)
  • Whitespace secret bypass and hardcoded /root/ path resolved (61e5524)
  • Cross-loop asyncio.Queue deadlock fixed by replacing with stdlib queue.Queue for the request notification path; consumer uses asyncio.to_thread(q.get, timeout=...) to avoid blocking the event loop. chunk_queue intentionally stays as asyncio.Queue since both producer and consumer run on the server's own event loop (a2b4388/b10a4483)

transparent_proxy migration docs are a non-issue — only one env used it and it's already been migrated.

Copy link
Copy Markdown
Member

@sergiopaniego sergiopaniego left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inline comments on concrete issues found during review.

Comment thread envs/coding_agent_env/harness.py Outdated
Comment thread envs/coding_agent_env/harness.py Outdated
Comment thread envs/coding_agent_env/server/coding_environment.py Outdated
Comment thread envs/coding_agent_env/server/coding_environment.py Outdated
Comment thread src/openenv/core/harness/agents/interception_server.py
Comment thread src/openenv/core/harness/agents/pi.py Outdated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Meta Open Source bot.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants