
fix: add Cerebras models zai-glm-4.7 #616

Open: github-actions[bot] wants to merge 1 commit into main from chore/autofix-issue-590



@github-actions (Contributor)

fix: add Cerebras models zai-glm-4.7

Closes #590

Source issue: #590

Summary

| Field | Value |
| --- | --- |
| Provider | cerebras |
| Primary model | zai-glm-4.7 |
| Changed models | zai-glm-4.7 |
| Added models | zai-glm-4.7 |
| Updated models | None |
| Verification sources | 1, 2, 3, 4 |

Verified metadata

| Model | Display name | Parent | Providers | Format | Flavor | Token limits | Pricing | Lifecycle |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| zai-glm-4.7 | Z.ai GLM 4.7 | | cerebras | openai | chat | input=131072, output=40960 | in/out=2.25/2.75 per 1M | reasoning=true |
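As a rough illustration of how the verified limits and prices combine, the sketch below estimates the cost of a single request. The dictionary layout, its key names, and the `estimate_cost_usd` helper are hypothetical and are not the catalog's actual schema; only the numbers come from the table above.

```python
from decimal import Decimal

# Numbers copied from the verified metadata row. The dict layout and key
# names are assumptions made for this example, not the catalog's schema.
ZAI_GLM_47 = {
    "max_input_tokens": 131072,
    "max_output_tokens": 40960,
    "input_usd_per_million": Decimal("2.25"),
    "output_usd_per_million": Decimal("2.75"),
}


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      model: dict = ZAI_GLM_47) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper)."""
    # Reject requests that exceed the verified token limits.
    if input_tokens > model["max_input_tokens"]:
        raise ValueError("input exceeds max_input_tokens")
    if output_tokens > model["max_output_tokens"]:
        raise ValueError("output exceeds max_output_tokens")
    # Prices are quoted per 1M tokens, so scale the token counts down.
    total = (input_tokens * model["input_usd_per_million"]
             + output_tokens * model["output_usd_per_million"])
    return total / Decimal(1_000_000)


# 100k input tokens and 10k output tokens
print(estimate_cost_usd(100_000, 10_000))
```

Using `Decimal` rather than `float` keeps the per-million prices exact, which matters if estimates are later summed across many requests.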

Verification notes

Verification

Official sources consulted

  1. Cerebras Models Overview: https://inference-docs.cerebras.ai/models/overview

    • Verified: both model IDs exist as Preview models on Cerebras inference
    • Verified: model parameter counts (355B for GLM 4.7, 235B for Qwen 3)
  2. Cerebras Pricing: https://cerebras.ai/pricing

    • Verified: zai-glm-4.7 input $2.25/M, output $2.75/M
    • Verified: qwen-3-235b-a22b-instruct-2507 input $0.60/M, output $1.20/M
  3. Cerebras Z.ai GLM 4.7 Model Page: https://inference-docs.cerebras.ai/models/zai-glm-47

    • Verified: context window 131k tokens (paid), max output 40k tokens
    • Verified: reasoning model (reasoning enabled by default, reasoning_effort parameter)
    • Verified: text-only input/output (not multimodal)
    • Verified: supports streaming, structured outputs, tool calling, prompt caching
    • No deprecation date listed
  4. Cerebras Qwen 3 235B Model Page: https://inference-docs.cerebras.ai/models/qwen-3-235b-2507

    • Verified: context window 131k tokens (paid), max output 40k tokens (paid)
    • Verified: non-thinking mode only ("This model supports only non-thinking mode")
    • Verified: deprecation date May 27, 2026
    • Verified: text-only input/output (not multimodal)
    • Verified: supports streaming, structured outputs, tool calling, prompt caching

sync_models (LiteLLM) cross-check

Neither cerebras/zai-glm-4.7 nor cerebras/qwen-3-235b-a22b-instruct-2507 exists in the LiteLLM model_prices_and_context_window_backup.json catalog. All proposed values are sourced directly from official Cerebras documentation. No deviations to report because sync_models has no entries for comparison.

Token limit interpretation

Cerebras docs report token limits as approximate values ("131k", "40k"). Based on the existing catalog entry for gpt-oss-120b on Cerebras (max_input_tokens: 131072, max_output_tokens: 32768) and standard LLM conventions where "32k" = 32768 (32 * 1024), the values are interpreted as:

  • "131k" input = 131072 tokens (128 * 1024, matching existing Cerebras models)
  • "40k" output = 40960 tokens (40 * 1024, following the same convention)
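The arithmetic behind this interpretation can be checked directly. One thing the check makes visible is that the two labels are read slightly differently: 131072 is 128 * 1024, which rounds to "131k" in decimal thousands, while "40k" is taken as 40 * 1024 (which would round to "41k" in decimal thousands). The helper names below are hypothetical and exist only for this sketch.

```python
KIB = 1024  # binary "k", as in "32k" = 32768


def binary_k(n: int) -> int:
    """Return n * 1024 (hypothetical helper for this example)."""
    return n * KIB


def decimal_label(tokens: int) -> str:
    """Render a token count as a label rounded to decimal thousands."""
    return f"{round(tokens / 1000)}k"


max_input_tokens = binary_k(128)   # 131072
max_output_tokens = binary_k(40)   # 40960

print(max_input_tokens, decimal_label(max_input_tokens))
print(max_output_tokens, decimal_label(max_output_tokens))
```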

Fields not published or not applicable

  • parent: No stable alias exists for either model on Cerebras; not applicable
  • input_cache_read_cost_per_mil_tokens / input_cache_write_cost_per_mil_tokens: Cerebras docs mention prompt caching support but do not publish separate cache pricing; omitted
  • supported_regions: Not applicable (Cerebras is not a Vertex provider)
  • locations: Not applicable (Cerebras models do not require explicit location metadata)
  • multimodal: Both models are text-only; omitted (defaults to falsy)

Deprecation note

qwen-3-235b-a22b-instruct-2507 has deprecation_date 2026-05-27, which is within the resolver's 90-day deprecation window. The resolver will automatically skip this model as near-deprecation and proceed with only zai-glm-4.7. This is expected and correct behavior.
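A minimal sketch of such a window check follows. The resolver's actual implementation is not shown in this PR, so the function name, its signature, and the choice to treat already-passed dates as near-deprecation are assumptions; the 90-day window and the 2026-05-27 date come from the note above, and the reference date of 2026-05-21 is only illustrative.

```python
from datetime import date, timedelta
from typing import Optional

# Window length taken from the deprecation note; everything else here
# is an assumption about how a resolver might apply it.
DEPRECATION_WINDOW = timedelta(days=90)


def is_near_deprecation(deprecation_date: Optional[date], today: date) -> bool:
    """Return True if the model is deprecated within the window (or already)."""
    if deprecation_date is None:
        # No deprecation date published, e.g. zai-glm-4.7.
        return False
    return deprecation_date - today <= DEPRECATION_WINDOW


today = date(2026, 5, 21)  # illustrative reference date
print(is_near_deprecation(date(2026, 5, 27), today))  # qwen-3-235b-a22b-instruct-2507
print(is_near_deprecation(None, today))               # zai-glm-4.7
```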

sync_models vs proposed update

sync_models cross-check found differences. Official provider verification was used for the applied values, and sync_models discrepancies are listed below for review.

| Model | Field | Proposed update | sync_models | sync_models source models |
| --- | --- | --- | --- | --- |
| zai-glm-4.7 | max_input_tokens | 131072 | 128000 | cerebras/zai-glm-4.7 |
| zai-glm-4.7 | max_output_tokens | 40960 | 128000 | cerebras/zai-glm-4.7 |
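For reviewers who want to reproduce the comparison, a field-by-field diff along these lines yields the same two rows. This is a sketch only: `diff_fields` is a hypothetical helper, not part of sync_models, and the two dictionaries simply restate the values from the table above.

```python
# Values restated from the discrepancy table; the dict shape is assumed.
proposed = {"max_input_tokens": 131072, "max_output_tokens": 40960}
sync_models = {"max_input_tokens": 128000, "max_output_tokens": 128000}


def diff_fields(proposed: dict, reference: dict) -> list:
    """List (field, proposed, reference) for fields present in both that differ."""
    return [
        (field, proposed[field], reference[field])
        for field in sorted(proposed)
        if field in reference and proposed[field] != reference[field]
    ]


for field, ours, theirs in diff_fields(proposed, sync_models):
    print(f"{field}: proposed={ours} sync_models={theirs}")
```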

@vercel

vercel Bot commented May 21, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| ai-proxy | Ready | Preview, Comment | May 21, 2026 8:45pm |

Successfully merging this pull request may close these issues.

[BOT ISSUE] Cerebras: add missing zai-glm-4.7, qwen-3-235b-a22b-instruct-2507
