fix: add Cerebras models zai-glm-4.7 #616
Open
github-actions[bot] wants to merge 1 commit into
fix: add Cerebras models zai-glm-4.7
Closes #590
Source issue: #590
Summary
zai-glm-4.7zai-glm-4.72
3
4
Verified metadata
Verification notes
Verification
Official sources consulted
Cerebras Models Overview: https://inference-docs.cerebras.ai/models/overview
Cerebras Pricing: https://cerebras.ai/pricing
  zai-glm-4.7: input $2.25/M, output $2.75/M
  qwen-3-235b-a22b-instruct-2507: input $0.60/M, output $1.20/M
Cerebras Z.ai GLM 4.7 Model Page: https://inference-docs.cerebras.ai/models/zai-glm-47
  reasoning_effort parameter)
Cerebras Qwen 3 235B Model Page: https://inference-docs.cerebras.ai/models/qwen-3-235b-2507
sync_models (LiteLLM) cross-check
Neither cerebras/zai-glm-4.7 nor cerebras/qwen-3-235b-a22b-instruct-2507 exists in the LiteLLM model_prices_and_context_window_backup.json catalog. All proposed values are sourced directly from official Cerebras documentation. No deviations to report because sync_models has no entries for comparison.
Token limit interpretation
Cerebras docs report token limits as approximate values ("131k", "40k"). Based on the existing catalog entry for gpt-oss-120b on Cerebras (max_input_tokens: 131072, max_output_tokens: 32768) and standard LLM conventions where "32k" = 32768 (32 * 1024), the values are interpreted as:
Fields not published or not applicable
parent: No stable alias exists for either model on Cerebras; not applicable
input_cache_read_cost_per_mil_tokens / input_cache_write_cost_per_mil_tokens: Cerebras docs mention prompt caching support but do not publish separate cache pricing; omitted
supported_regions: Not applicable (Cerebras is not a Vertex provider)
locations: Not applicable (Cerebras models do not require explicit location metadata)
multimodal: Both models are text-only; omitted (defaults to falsy)
Deprecation note
qwen-3-235b-a22b-instruct-2507 has deprecation_date 2026-05-27, which is within the resolver's 90-day deprecation window. The resolver will automatically skip this model as near-deprecation and proceed with only zai-glm-4.7. This is expected and correct behavior.
sync_models vs proposed update
sync_models cross-check found differences. Official provider verification was used for the applied values, and sync_models discrepancies are listed below for review.
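For reviewers who want to sanity-check the deprecation note, here is a minimal Python sketch of a 90-day near-deprecation filter. It is not the resolver's actual code, which this PR does not show. The function name `is_near_deprecation`, the reference date, and the assumption that zai-glm-4.7 has no deprecation date are all hypothetical and used only for illustration.

```python
from datetime import date, timedelta

# Assumption: the 90-day window comes from the deprecation note in this PR;
# the resolver's real implementation may differ.
DEPRECATION_WINDOW = timedelta(days=90)


def is_near_deprecation(deprecation_date, today, window=DEPRECATION_WINDOW):
    """Return True if the model is already deprecated or will be within `window`."""
    if deprecation_date is None:
        # No published deprecation date: never treated as near-deprecation.
        return False
    return deprecation_date - today <= window


# Hypothetical inputs: the None entry for zai-glm-4.7 is an assumption.
candidates = {
    "zai-glm-4.7": None,
    "qwen-3-235b-a22b-instruct-2507": date(2026, 5, 27),
}

# Hypothetical "today" chosen so that 2026-05-27 falls inside the window.
today = date(2026, 3, 15)

# Keep only the models that are not near deprecation.
selected = [
    name
    for name, deprecation_date in candidates.items()
    if not is_near_deprecation(deprecation_date, today)
]
print(selected)
```

With these inputs only zai-glm-4.7 is kept, which matches the behavior described in the deprecation note. A reference date more than 90 days before 2026-05-27 would keep both models.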