wip : Change gated model to tinyllama#292
wip : Change gated model to tinyllama#292arekay-nv wants to merge 1 commit intofeat/alicheng-pubsub-integrationfrom
Conversation
Signed-off-by: Rashid Kaleem <230885705+arekay-nv@users.noreply.github.com>
|
MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅ |
There was a problem hiding this comment.
Code Review
This pull request updates the default placeholder model to TinyLlama across various templates and scripts. It also introduces more robust handling for gated Hugging Face repositories by checking for access permissions before attempting to load tokenizers, allowing the system to fall back gracefully if access is denied. Review feedback suggests that the string-based error matching for gated repositories may be fragile and recommends ensuring that the huggingface_hub dependency is updated to at least version 0.23.0 to support the auth_check method.
| except (GatedRepoError, OSError) as e: | ||
| # transformers re-raises GatedRepoError as OSError containing "gated" | ||
| if isinstance(e, GatedRepoError) or "gated" in str(e).lower(): | ||
| logger.warning( | ||
| "Tokenizer '%s' is a gated HuggingFace repo and this " | ||
| "environment has no access. Set HF_TOKEN or use an " | ||
| "ungated model to enable token metrics. Continuing " | ||
| "without token metrics (ISL/OSL/TPOT unavailable).", | ||
| args.tokenizer, | ||
| ) | ||
| else: | ||
| logger.warning( | ||
| "Failed to load tokenizer '%s': %s. Continuing without " | ||
| "token metrics (ISL/OSL/TPOT unavailable).", | ||
| args.tokenizer, | ||
| e, | ||
| ) | ||
| pool_cm = nullcontext() |
There was a problem hiding this comment.
The error handling for TokenizePool initialization is robust, but the string matching on OSError (line 104) is slightly fragile as it depends on the internal error message format of the transformers library. While this is a known pattern for gated repo errors in transformers, consider if there's a more reliable way to detect this, or ensure that any other OSError that might occur during initialization is also handled correctly (which the else block currently does).
| if info.gated: | ||
| try: | ||
| HfApi().auth_check(repo_id=model_name) | ||
| except GatedRepoError: | ||
| logger.warning( | ||
| f"Model '{model_name}' is a gated HuggingFace repo and " | ||
| "this environment has no access. Set HF_TOKEN or use an " | ||
| "ungated model to enable token metrics. Continuing " | ||
| "without token metrics (ISL/OSL/TPOT unavailable)." | ||
| ) | ||
| return False |
There was a problem hiding this comment.
The addition of HfApi().auth_check is a good proactive measure to verify access to gated repositories before launching the aggregator subprocess. However, note that auth_check was introduced in huggingface_hub version 0.23.0. If the environment uses an older version, this call will raise an AttributeError. While the general except Exception block at line 222 will catch this and allow the benchmark to continue without token metrics, it might be worth ensuring the project's dependencies are updated to reflect this requirement.
What does this PR do?
Type of change
Related issues
Testing
Checklist