[ENHANCEMENT] Extract shared OpenAI-compatible chunk processing from RouterProvider subclasses #338

## Problem (one or two sentences)

`RouterProvider` subclasses (`OpencodeGoHandler`, `VercelAiGatewayHandler`, `LiteLLMHandler`) each re-implement both the OpenAI streaming chunk processing loop and `completePrompt` inline, duplicating logic that already exists — in a more complete form — in `BaseOpenAiCompatibleProvider`. Each copy is a subset of the original and diverges in subtle ways.

## Context (who is affected and when)

Affects contributors adding or maintaining dynamic-model-list providers (those that extend `RouterProvider`). The two class hierarchies — `RouterProvider` for dynamic model fetching, `BaseOpenAiCompatibleProvider` for static model lists — have no common base for stream processing, so every new `RouterProvider` subclass is forced to reinvent it.

### `createMessage` divergences

| Feature | `BaseOpenAiCompatibleProvider` | `OpencodeGoHandler` | `VercelAiGatewayHandler` |
|---|---|---|---|
| `<think>` tag support via `TagMatcher` | ✅ | ❌ | ❌ |
| Checks both `reasoning_content` and `reasoning` keys | ✅ | ❌ (only `reasoning_content`) | ❌ |
| `tool_call_end` on `finish_reason === "tool_calls"` | ✅ | ❌ | ❌ |
| Cost calculation in usage chunk | ✅ | ❌ | ❌ |
| `handleOpenAIError` wrapper | ✅ | ❌ | ❌ |
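
To make two of the rows above concrete, here is a minimal sketch of per-chunk handling that accepts either reasoning key and closes open tool calls when `finish_reason === "tool_calls"`. It is not the code in `BaseOpenAiCompatibleProvider`: `ChunkSketch`, `EventSketch` and `eventsForChunk` are hypothetical names, the chunk type is a trimmed stand-in for `OpenAI.Chat.ChatCompletionChunk`, and `<think>` tag parsing, cost calculation and error wrapping are left out.

```ts
// Simplified stand-in for OpenAI.Chat.ChatCompletionChunk; only the fields this sketch reads.
interface ChunkSketch {
  choices: Array<{
    delta?: {
      content?: string | null
      reasoning_content?: string | null
      reasoning?: string | null
      tool_calls?: Array<{ index: number; id?: string }>
    }
    finish_reason?: string | null
  }>
}

// Hypothetical event shapes; the real ApiStream chunk types may differ.
type EventSketch =
  | { type: "text"; text: string }
  | { type: "reasoning"; text: string }
  | { type: "tool_call_end"; id: string }

// Translates one chunk into events. `openToolCallIds` persists across chunks so that
// every tool call seen so far can be closed once finish_reason is "tool_calls".
export function eventsForChunk(chunk: ChunkSketch, openToolCallIds: string[]): EventSketch[] {
  const events: EventSketch[] = []
  const choice = chunk.choices[0]
  const delta = choice?.delta

  if (delta?.content) events.push({ type: "text", text: delta.content })

  // Accept either key, preferring `reasoning_content` when both are present.
  const reasoning = delta?.reasoning_content || delta?.reasoning
  if (reasoning) events.push({ type: "reasoning", text: reasoning })

  // Remember each tool call id the first time it appears.
  for (const call of delta?.tool_calls ?? []) {
    if (call.id && openToolCallIds.indexOf(call.id) === -1) openToolCallIds.push(call.id)
  }

  // splice(0) empties the shared list and returns the ids that were open.
  if (choice?.finish_reason === "tool_calls") {
    for (const id of openToolCallIds.splice(0)) events.push({ type: "tool_call_end", id })
  }

  return events
}
```

A handler that only reads `reasoning_content` would silently drop the reasoning text of a provider that sends `reasoning`, which is the kind of divergence the table records.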

### `completePrompt` divergences

All three subclasses also re-implement `completePrompt` with minor variations. `LiteLLMHandler` has a legitimate divergence (`max_tokens` vs `max_completion_tokens` depending on GPT-5 model detection) that would need to be preserved in any extraction.
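
One way to preserve the LiteLLM branching is for the shared `completePrompt` helper to take the token-limit choice as an optional hook. The sketch below is illustrative only: every name in it is hypothetical, the regex-based GPT-5 check is an assumption rather than the detection `LiteLLMHandler` actually uses, and it assumes GPT-5 models are the ones that take `max_completion_tokens`.

```ts
// Hypothetical shapes for this sketch; not the project's real types.
interface TokenLimitSketch {
  max_tokens?: number
  max_completion_tokens?: number
}

interface CompletionParamsSketch extends TokenLimitSketch {
  model: string
  messages: Array<{ role: "user"; content: string }>
}

type TokenLimitPicker = (modelId: string, maxTokens: number) => TokenLimitSketch

// Default used by providers with no special case.
const defaultTokenLimit: TokenLimitPicker = (_modelId, maxTokens) => ({ max_tokens: maxTokens })

// Assumption: GPT-5 models are recognised from the model id and take `max_completion_tokens`.
export const liteLlmTokenLimit: TokenLimitPicker = (modelId, maxTokens) =>
  /gpt-?5/i.test(modelId) ? { max_completion_tokens: maxTokens } : { max_tokens: maxTokens }

// Builds the non-streaming request body; the picker is the only provider-specific part.
export function buildCompletionParamsSketch(
  modelId: string,
  prompt: string,
  maxTokens: number | undefined,
  pickTokenLimit: TokenLimitPicker = defaultTokenLimit,
): CompletionParamsSketch {
  return {
    model: modelId,
    messages: [{ role: "user", content: prompt }],
    ...(maxTokens === undefined ? {} : pickTokenLimit(modelId, maxTokens)),
  }
}
```

With this shape, `LiteLLMHandler` would pass `liteLlmTokenLimit` and the other providers would rely on the default, so the shared helper never needs to know about GPT-5.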

## Desired behavior (conceptual, not technical)

The chunk-processing and `completePrompt` logic each live in one place. All OpenAI-compatible providers — whether they use static or dynamic model lists — process streaming chunks and non-streaming completions consistently.

## Constraints / preferences

- `RouterProvider` subclasses need `fetchModel()` (dynamic model loading), which `BaseOpenAiCompatibleProvider` doesn't have, so a simple inheritance change won't work.
- The solution should not require duplicating model-fetching logic into `BaseOpenAiCompatibleProvider`.
- `LiteLLMHandler`'s GPT-5 `max_tokens`/`max_completion_tokens` branching must be preserved.
- Behaviour should be unchanged for all existing providers.

## Proposed approach

Extract the streaming loop and `completePrompt` from `BaseOpenAiCompatibleProvider` into standalone utility functions, e.g.:

```ts
// src/api/providers/utils/openai-stream.ts
export async function* processOpenAIStream(
  stream: AsyncIterable<OpenAI.Chat.ChatCompletionChunk>,
  modelInfo?: ModelInfo,
): ApiStream { ... }
```

`BaseOpenAiCompatibleProvider` and all `RouterProvider` subclasses then delegate to these utilities. The three affected `RouterProvider` subclass files are:
- `src/api/providers/opencode-go.ts`
- `src/api/providers/vercel-ai-gateway.ts`
- `src/api/providers/lite-llm.ts`
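
The delegation could look roughly like the sketch below, which uses composition so that neither hierarchy has to inherit from the other. The two classes are stand-ins, not the real `BaseOpenAiCompatibleProvider` or `RouterProvider`, and all names, constructor arguments and chunk shapes are hypothetical and heavily simplified.

```ts
// Hypothetical, trimmed-down shapes; the real classes and chunk types carry far more.
interface ModelSketch {
  id: string
}
type TextChunkSketch = { text?: string }
type TextEventSketch = { type: "text"; text: string }
type OpenStreamSketch = (modelId: string) => AsyncIterable<TextChunkSketch>

// Shared utility: the only place that knows how to turn chunks into events.
export async function* processStreamSketch(
  stream: AsyncIterable<TextChunkSketch>,
): AsyncGenerator<TextEventSketch> {
  for await (const chunk of stream) {
    if (chunk.text) yield { type: "text", text: chunk.text }
  }
}

// Stand-in for a static-model provider: the model is known up front.
export class StaticProviderSketch {
  private readonly openStream: OpenStreamSketch
  private readonly model: ModelSketch

  constructor(openStream: OpenStreamSketch, model: ModelSketch) {
    this.openStream = openStream
    this.model = model
  }

  async *createMessage(): AsyncGenerator<TextEventSketch> {
    yield* processStreamSketch(this.openStream(this.model.id))
  }
}

// Stand-in for a dynamic-model provider: it keeps fetchModel() and delegates the same loop.
export class RouterProviderSketch {
  private readonly openStream: OpenStreamSketch
  private readonly fetchModel: () => Promise<ModelSketch>

  constructor(openStream: OpenStreamSketch, fetchModel: () => Promise<ModelSketch>) {
    this.openStream = openStream
    this.fetchModel = fetchModel
  }

  async *createMessage(): AsyncGenerator<TextEventSketch> {
    const model = await this.fetchModel()
    yield* processStreamSketch(this.openStream(model.id))
  }
}
```

Provider-specific work that happens before the loop, such as building the request, stays in each `createMessage`; only the loop itself moves.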

## Trade-offs / risks

- Small risk of behavioural change if the extraction misses any provider-specific nuance (e.g. Vercel's prompt-caching logic that runs before the loop, LiteLLM's GPT-5 token parameter branching).
- Best done with the existing provider test suites as a safety net — all three affected providers have unit tests covering streaming output.
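
As a complement to the existing suites, a parity check could replay the same recorded chunks through the old inline loop and the extracted utility and report where the outputs first differ. This is a hypothetical helper, not something in the repository. It compares events with `JSON.stringify`, so it is sensitive to property order.

```ts
// Hypothetical helper types: a processor turns a chunk stream into an event stream.
type ProcessorSketch<C, E> = (stream: AsyncIterable<C>) => AsyncIterable<E>

// Replays a fixed list of chunks as an async stream.
async function* replay<C>(chunks: readonly C[]): AsyncGenerator<C> {
  for (const chunk of chunks) yield chunk
}

// Drains an event stream into an array.
async function collect<E>(events: AsyncIterable<E>): Promise<E[]> {
  const out: E[] = []
  for await (const event of events) out.push(event)
  return out
}

// Returns the index of the first differing event, or -1 when both outputs match.
export async function firstDivergence<C, E>(
  chunks: readonly C[],
  legacy: ProcessorSketch<C, E>,
  extracted: ProcessorSketch<C, E>,
): Promise<number> {
  const [before, after] = await Promise.all([
    collect(legacy(replay(chunks))),
    collect(extracted(replay(chunks))),
  ])
  const length = Math.max(before.length, after.length)
  for (let i = 0; i < length; i++) {
    // A missing event stringifies to undefined, so length mismatches are caught too.
    if (JSON.stringify(before[i]) !== JSON.stringify(after[i])) return i
  }
  return -1
}
```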
