fix: add Cerebras models zai-glm-4.7 by github-actions[bot] · Pull Request #616 · braintrustdata/braintrust-proxy

github-actions · 2026-05-21T20:43:56Z

fix: add Cerebras models zai-glm-4.7

Closes #590

Source issue: #590

Summary

Field	Value
Provider	cerebras
Primary model	zai-glm-4.7
Changed models	`zai-glm-4.7`
Added models	`zai-glm-4.7`
Updated models	None
Verification sources	1 2 3 4

Verified metadata

Model	Display name	Parent	Providers	Format	Flavor	Token limits	Pricing	Lifecycle
zai-glm-4.7	Z.ai GLM 4.7		cerebras	openai	chat	input=131072, output=40960	in/out=2.25/2.75 per 1M	reasoning=true
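For a sense of what the per-million pricing above means for one request, here is a minimal sketch. The helper `estimate_cost_usd` and the constant names are hypothetical and not part of the proxy codebase; only the 2.25 / 2.75 rates and the token limits come from the table.

```python
# Rates from the verified metadata table (USD per 1M tokens).
INPUT_COST_PER_MIL = 2.25
OUTPUT_COST_PER_MIL = 2.75


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request (hypothetical helper, illustration only)."""
    return (
        input_tokens / 1_000_000 * INPUT_COST_PER_MIL
        + output_tokens / 1_000_000 * OUTPUT_COST_PER_MIL
    )


# Upper bound: a request that fills the input window and the output budget.
max_cost = estimate_cost_usd(131072, 40960)
print(f"{max_cost:.6f}")
```

This ignores prompt caching, for which no separate pricing is published.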

Verification notes

Verification

Official sources consulted

Cerebras Models Overview — https://inference-docs.cerebras.ai/models/overview
- Verified: both model IDs from issue #590 (zai-glm-4.7 and qwen-3-235b-a22b-instruct-2507) exist as Preview models on Cerebras inference
- Verified: model parameter counts (355B for GLM 4.7, 235B for Qwen 3)
Cerebras Pricing — https://cerebras.ai/pricing
- Verified: zai-glm-4.7 input $2.25/M, output $2.75/M
- Verified: qwen-3-235b-a22b-instruct-2507 input $0.60/M, output $1.20/M
Cerebras Z.ai GLM 4.7 Model Page — https://inference-docs.cerebras.ai/models/zai-glm-47
- Verified: context window 131k tokens (paid), max output 40k tokens
- Verified: reasoning model (reasoning enabled by default, reasoning_effort parameter)
- Verified: text-only input/output (not multimodal)
- Verified: supports streaming, structured outputs, tool calling, prompt caching
- No deprecation date listed
Cerebras Qwen 3 235B Model Page — https://inference-docs.cerebras.ai/models/qwen-3-235b-2507
- Verified: context window 131k tokens (paid), max output 40k tokens (paid)
- Verified: non-thinking mode only ("This model supports only non-thinking mode")
- Verified: deprecation date May 27, 2026
- Verified: text-only input/output (not multimodal)
- Verified: supports streaming, structured outputs, tool calling, prompt caching

sync_models (LiteLLM) cross-check

Neither cerebras/zai-glm-4.7 nor cerebras/qwen-3-235b-a22b-instruct-2507 exists in the LiteLLM model_prices_and_context_window_backup.json catalog. All proposed values are sourced directly from official Cerebras documentation. No deviations to report because sync_models has no entries for comparison.

Token limit interpretation

Cerebras docs report token limits as approximate values ("131k", "40k"). Based on the existing catalog entry for gpt-oss-120b on Cerebras (max_input_tokens: 131072, max_output_tokens: 32768) and standard LLM conventions where "32k" = 32768 (32 * 1024), the values are interpreted as:

"131k" input = 131072 tokens (128 * 1024, matching existing Cerebras models)
"40k" output = 40960 tokens (40 * 1024, following the same convention)
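The arithmetic behind this interpretation can be checked with a short sketch. The variable names are illustrative only. Note that the two labels are read slightly differently: "131k" is taken as the decimal rounding of 128 * 1024, while "40k" is taken directly as 40 * 1024.

```python
KIB = 1024  # "k" read as 1024, the convention assumed in the note above

# "131k" is treated as the rounded decimal form of 128 * 1024.
max_input_tokens = 128 * KIB
# "40k" is treated as 40 * 1024.
max_output_tokens = 40 * KIB

# Existing catalog values cited above for gpt-oss-120b on Cerebras.
gpt_oss_120b = {"max_input_tokens": 131072, "max_output_tokens": 32768}

print(max_input_tokens, max_output_tokens)
# The input limit matches the existing Cerebras entry.
print(max_input_tokens == gpt_oss_120b["max_input_tokens"])
# Dividing by 1000 and rounding recovers the "131k" label.
print(round(max_input_tokens / 1000))
```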

Fields not published or not applicable

parent: No stable alias exists for either model on Cerebras; not applicable
input_cache_read_cost_per_mil_tokens / input_cache_write_cost_per_mil_tokens: Cerebras docs mention prompt caching support but do not publish separate cache pricing; omitted
supported_regions: Not applicable (Cerebras is not a Vertex provider)
locations: Not applicable (Cerebras models do not require explicit location metadata)
multimodal: Both models are text-only; omitted (defaults to falsy)

Deprecation note

qwen-3-235b-a22b-instruct-2507 has deprecation_date 2026-05-27, which is within the resolver's 90-day deprecation window. The resolver will automatically skip this model as near-deprecation and proceed with only zai-glm-4.7. This is expected and correct behavior.
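A minimal sketch of the kind of check described here, assuming the resolver compares the deprecation date against the current date. The function `is_near_deprecation` is hypothetical and the resolver's real logic is not shown in this PR; only the 90-day window and the dates come from the text.

```python
from datetime import date, timedelta
from typing import Optional

DEPRECATION_WINDOW = timedelta(days=90)  # window size stated in the note above


def is_near_deprecation(deprecation_date: Optional[date], today: date) -> bool:
    """Return True when a model is deprecated within the window (assumed logic)."""
    if deprecation_date is None:
        return False  # no deprecation date listed
    return deprecation_date - today <= DEPRECATION_WINDOW


pr_date = date(2026, 5, 21)  # date this PR was opened

# qwen-3-235b-a22b-instruct-2507: deprecated 2026-05-27, six days after the PR.
skip_qwen = is_near_deprecation(date(2026, 5, 27), pr_date)
# zai-glm-4.7: no deprecation date listed.
skip_glm = is_near_deprecation(None, pr_date)

print(skip_qwen, skip_glm)
```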

sync_models vs proposed update

The sync_models cross-check found differences. The applied values come from official provider verification; the sync_models discrepancies are listed below for review.

Model	Field	Proposed update	sync_models	sync_models source models
zai-glm-4.7	max_input_tokens	131072	128000	cerebras/zai-glm-4.7
zai-glm-4.7	max_output_tokens	40960	128000	cerebras/zai-glm-4.7
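The comparison in the table can be reproduced with a small field-by-field diff. This is a sketch only: `diff_fields` is a hypothetical helper and does not reflect how sync_models is implemented. The values are copied from the table above.

```python
# Values copied from the discrepancy table for zai-glm-4.7.
proposed = {"max_input_tokens": 131072, "max_output_tokens": 40960}
sync_models = {"max_input_tokens": 128000, "max_output_tokens": 128000}


def diff_fields(proposed: dict, reference: dict) -> list:
    """List (field, proposed, reference) for fields present in both that differ."""
    return [
        (field, value, reference[field])
        for field, value in proposed.items()
        if field in reference and reference[field] != value
    ]


discrepancies = diff_fields(proposed, sync_models)
for field, ours, theirs in discrepancies:
    print(f"{field}: proposed={ours} sync_models={theirs}")
```

The sync_models input value of 128000 is the decimal reading of "128k", which is why it differs from the 128 * 1024 value applied here.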

vercel · 2026-05-21T20:43:59Z

The latest updates on your projects.

Project	Deployment	Actions	Updated (UTC)
ai-proxy	Ready	Preview, Comment	May 21, 2026 8:45pm

fix: add Cerebras models zai-glm-4.7

9bfebcb

github-actions Bot added the auto-sync label May 21, 2026

github-actions Bot requested review from Alex Z (CLowbrow), aswink, Caitlin Pinn (cpinn), Erin McNulty (erin2722) and Ken Jiang (knjiang) May 21, 2026 20:43

github-actions Bot mentioned this pull request May 21, 2026

[BOT ISSUE] Cerebras: add missing zai-glm-4.7, qwen-3-235b-a22b-instruct-2507 #590

Open

4 tasks

vercel Bot deployed to Preview May 21, 2026 20:45

github-actions[bot] wants to merge 1 commit into main from chore/autofix-issue-590
