`AnthropicLlm` only maps basic input/output token counts into `usage_metadata`. When extended thinking is enabled, thinking-block tokens are included in `output_tokens` but never broken out separately.
## Current behaviour
`message_to_generate_content_response` and the streaming final response both produce:
```python
usage_metadata = types.GenerateContentResponseUsageMetadata(
    prompt_token_count=message.usage.input_tokens,
    candidates_token_count=message.usage.output_tokens,
    total_token_count=message.usage.input_tokens + message.usage.output_tokens,
)
```
Cache token counts (`cache_creation_input_tokens`, `cache_read_input_tokens`) are also missing, but those are tracked separately in #5395.
## Expected behaviour
When extended thinking is enabled, populate `usage_metadata.thoughts_token_count` with the token count of the thinking blocks. This is derivable from the thinking-block content (via a supplemental tokenizer API call) or from a future dedicated API field (ref: anthropic-python-sdk).
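A minimal sketch of the first option, assuming the Anthropic message's `content` is a list of blocks where thinking blocks have `type == "thinking"` and a `thinking` text field. The `count_tokens` callable here is a placeholder for whatever tokenizer (e.g. the supplemental API call) is ultimately used; `estimate_thinking_tokens` is a hypothetical helper, not existing SDK or ADK code:

```python
from typing import Callable


def estimate_thinking_tokens(
    message,
    count_tokens: Callable[[str], int],
) -> int:
    """Sum token counts over the thinking blocks of an Anthropic message.

    `message.content` is assumed to be a list of content blocks;
    `count_tokens` is any text-to-token-count callable (for example,
    a supplemental tokenizer API call).
    """
    return sum(
        count_tokens(block.thinking)
        for block in message.content
        if getattr(block, "type", None) == "thinking"
    )
```

The resulting value would feed `usage_metadata.thoughts_token_count`; whether `candidates_token_count` then stays inclusive of thinking tokens (matching the raw `output_tokens`) or excludes them is a separate design decision.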
## Reference
This is particularly relevant now that extended thinking is supported via PR #5392.