SWE-1.6 triggers unbounded worker spawning → load >1100, ~42GB RAM, SSH instability, ECONNREFUSED (stable with GPT-5.4 Mini) #322

## Title
SWE-1.6 triggers unbounded worker spawning → load >1100, ~42GB RAM, ECONNREFUSED (stable with GPT-5.4 Mini)

---

## Summary
With SWE-1.6 selected in Windsurf, sending a prompt causes extreme resource usage. The system becomes unstable and the Windsurf backend enters a reconnect loop with ECONNREFUSED. Switching to GPT-5.4 Mini on the same setup resolves the issue entirely.

---

## Environment
- OS: Debian (Trixie)
- CPU: AMD 5950X (16C/32T)
- Container CPU allocation: 12 threads (6 cores)
- RAM: 64 GB (48 GB assigned to the container)
- Runtime: Incus container
- Filesystem: BTRFS
- Concurrent workload: CI runner in another container

---

## Steps to Reproduce
1. Start Windsurf with SWE-1.6 selected
2. Open a moderately sized project
3. Send a prompt (e.g., code query / analysis)
4. Observe process spawning and system metrics
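
For step 4, a small sampler can make the process count and load average easier to record. The following is a minimal sketch, assuming a Linux host with `/proc` available; the function names (`count_matching`, `read_cmdlines`, `snapshot`, `watch`) are illustrative and not part of Windsurf.

```python
import os
import time
from pathlib import Path

TARGET = "language_server_linux_x64"  # process name taken from this report


def count_matching(cmdlines, needle=TARGET):
    """Count command lines that mention the given process name."""
    return sum(1 for cmd in cmdlines if needle in cmd)


def read_cmdlines(proc_root="/proc"):
    """Yield the command line of every process visible under /proc (Linux only)."""
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # skip non-PID entries such as /proc/meminfo
        try:
            raw = (entry / "cmdline").read_bytes()
        except OSError:
            continue  # process exited or is not readable
        yield raw.replace(b"\0", b" ").decode(errors="replace").strip()


def snapshot():
    """Return (matching process count, 1-minute load average)."""
    return count_matching(read_cmdlines()), os.getloadavg()[0]


def watch(samples=5, interval=2.0):
    """Print a few timestamped samples; call this right after sending the prompt."""
    for _ in range(samples):
        procs, load1 = snapshot()
        print(f"{time.strftime('%H:%M:%S')} language_servers={procs} load1={load1:.1f}")
        time.sleep(interval)
```

Calling `watch(samples=30)` in a separate terminal while the prompt runs should show whether the process count keeps growing or levels off.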

---

## Observed Behavior
- Dozens of processes spawn:
  ```
  language_server_linux_x64
  --enable_index_service
  --enable_local_search
  ```
- Load average spikes dramatically (observed >1100 on 32-thread CPU)
- Memory usage grows to ~42 GB (container assigned 48 GB)
- Windsurf logs:
  ```
  Connection to server got closed. Server will restart.
  windsurf client: couldn't create connection to server.
  Error: connect ECONNREFUSED 127.0.0.1:<port>
  Restarting server failed
  ```
- Requires manual window reload to recover
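
To confirm from outside the editor that the backend port is actually refusing connections, a one-shot TCP probe is enough. This is a hedged sketch: the port is not stated in the logs above, so it stays a parameter, and `probe` is a hypothetical helper name.

```python
import errno
import socket


def probe(host, port, timeout=1.0):
    """Try one TCP connect and classify the result."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"  # ECONNREFUSED: nothing is listening on the port
    except socket.timeout:
        return "timeout"  # no answer within the timeout
    except OSError as exc:
        # Any other socket error, reported by its symbolic errno name when known.
        return f"error:{errno.errorcode.get(exc.errno, exc.errno)}"
```

A result of `"refused"` means the language server process is not listening, which matches the `ECONNREFUSED` entries in the log; `"timeout"` would point more toward a process that exists but is starved.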

---

## Comparison (Same System, Same Project)

| Model        | Load      | RAM Usage   | Stability   |
|--------------|-----------|-------------|-------------|
| SWE-1.6      | 100–1100+ | up to 42 GB | ❌ unstable |
| GPT-5.4 Mini | <10       | ~7 GB       | ✅ stable   |

---

## Additional Observations
- Trigger occurs after prompt, not at startup
- Behavior resembles unbounded parallelism / worker spawning
- System instability correlates with process explosion
- Increasing RAM improves stability but does not fix the root cause
- Limiting container CPU/processes mitigates but does not eliminate the issue
- Tuning the SSH keepalive settings on client and server stopped the connection drops, which suggests the earlier disconnects were caused by system starvation rather than network failure

---

## Expected Behavior
- Bounded worker pool
- Graceful degradation under load
- Stable backend connection
- Resource usage proportional to workload
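
To illustrate what a bounded worker pool with backpressure could look like, here is a generic sketch. It is not Windsurf's implementation, and the class name `BoundedPool` and its parameters are assumptions made for this example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class BoundedPool:
    """Run tasks on a fixed number of workers and cap the number of queued tasks."""

    def __init__(self, max_workers, max_pending):
        self._executor = ThreadPoolExecutor(max_workers=max_workers)
        # One slot per task that is running or waiting; submit() blocks when none are free.
        self._slots = threading.BoundedSemaphore(max_workers + max_pending)

    def submit(self, fn, *args):
        self._slots.acquire()  # backpressure: wait instead of growing without limit
        try:
            future = self._executor.submit(fn, *args)
        except BaseException:
            self._slots.release()
            raise
        # Free the slot once the task finishes, whether it succeeded or failed.
        future.add_done_callback(lambda _: self._slots.release())
        return future

    def shutdown(self):
        self._executor.shutdown(wait=True)
```

With this shape, a burst of work slows the caller down rather than multiplying workers, so resource usage stays tied to `max_workers` regardless of how many tasks a prompt generates.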

---

## Actual Behavior
- Unbounded worker spawning
- Extreme CPU and memory usage
- Backend becomes unreachable (ECONNREFUSED)
- Requires manual recovery

---

## Related Issues
- ECONNREFUSED restart loop (#141)
- Language server memory explosion (#300)

This report identifies a likely root cause: resource explosion triggered by SWE-1.6 after a prompt.

---

## Workarounds
- Switching to GPT-5.4 Mini (fully stable)
- Increasing available RAM (partial mitigation)
- Limiting container CPU / processes (partial mitigation)
- Updating the SSH keepalive configuration (prevents disconnects but does not address the root cause)
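
The container limits mentioned above can be expressed with Incus instance options. The sketch below only builds and prints the commands, it does not run them. The instance name `windsurf-dev` and the process cap of 4096 are placeholders chosen for illustration; the CPU and memory values mirror the Environment section.

```python
import shlex


def incus_limit_commands(instance, cpu_threads, memory, max_processes):
    """Build (but do not run) `incus config set` commands for resource caps."""
    settings = {
        "limits.cpu": str(cpu_threads),          # number of CPU threads exposed
        "limits.memory": memory,                 # e.g. "48GiB"
        "limits.processes": str(max_processes),  # cap on processes in the container
    }
    return [
        ["incus", "config", "set", instance, f"{key}={value}"]
        for key, value in settings.items()
    ]


# "windsurf-dev" and 4096 are hypothetical values, not taken from the report.
for cmd in incus_limit_commands("windsurf-dev", 12, "48GiB", 4096):
    print(shlex.join(cmd))
```

A process cap turns runaway spawning into fork failures inside the container instead of host-wide starvation, which is consistent with this being a mitigation rather than a fix.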

---

## SSH Mitigations Applied
- Client (`~/.ssh/config`):
  ```
  Host *
      ServerAliveInterval 10
      ServerAliveCountMax 10
      TCPKeepAlive yes
      ConnectTimeout 10
  ```
- Server (`/etc/ssh/sshd_config`):
  ```
  ClientAliveInterval 15
  ClientAliveCountMax 10
  TCPKeepAlive yes
  UseDNS no
  MaxStartups 10:30:60
  LoginGraceTime 30
  MaxSessions 4
  ```
- Result: SSH disconnects no longer occur under load; prior disconnects were due to scheduler starvation rather than network issues
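
As a rough check on why these settings help, the time an SSH peer tolerates silence is approximately the keepalive interval multiplied by the allowed number of unanswered probes. The helper name below is illustrative.

```python
def keepalive_tolerance(interval_s, count_max):
    """Approximate seconds of unresponsiveness before the connection is dropped."""
    return interval_s * count_max


# Values from the client and server settings listed above.
client_s = keepalive_tolerance(10, 10)  # ServerAliveInterval x ServerAliveCountMax
server_s = keepalive_tolerance(15, 10)  # ClientAliveInterval x ClientAliveCountMax

print(f"client gives up after ~{client_s} s, server after ~{server_s} s")
```

So the session now survives stalls of roughly 100 seconds on the client side and 150 seconds on the server side, long enough to ride out short periods of starvation.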

---

## Notes
This appears to be a scaling/control issue specific to SWE-1.6 runtime behavior. Systems with high core counts amplify the problem, but the lack of concurrency limits likely affects general use as well.

