
docs(examples): add chat reasoning_effort examples#511

Open
andreaonofrei01 wants to merge 4 commits into main from andreaonofrei/reasoning-examples

Conversation

Contributor

@andreaonofrei01 andreaonofrei01 commented May 4, 2026

Summary

Adds four runnable examples in examples/mistral/chat/ demonstrating reasoning_effort on mistral-medium-3-5. The SDK already supports the parameter and the ThinkChunk / TextChunk types, but there were no examples showing how to use them.

Files

  • reasoning_response_shape.py : first stop. Calls the API once with reasoning_effort="high" and once with "none", then dumps the raw message.content so the reader can see the ThinkChunk / TextChunk JSON before consuming it.
  • reasoning.py : single-turn call. Iterates message.content and branches on isinstance(chunk, ThinkChunk | TextChunk). Notes that reasoning_effort="none" returns a plain str.
  • reasoning_with_streaming.py: the streaming version. Handles the three shapes that arrive on delta.content: ThinkChunk lists during the thinking phase, a transition list containing both a closing ThinkChunk and the first TextChunk, and plain string fragments after thinking ends.
  • reasoning_multi_turn.py : 3-turn math chain (17 × 23 → × 3 → −100) run with two replay strategies (keep ThinkChunks vs drop them) and prints per-turn token usage so the cost difference is visible.
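The branching that reasoning.py performs over message.content can be sketched as follows. The ThinkChunk / TextChunk dataclasses below are simplified offline stand-ins for the SDK's types (the real classes live in the mistralai package), modeling only the fields the example touches; the assumption that ThinkChunk carries its trace as a list of text fragments in a `thinking` field follows the response shape described above.

```python
from dataclasses import dataclass


# Simplified stand-ins for the SDK's chunk types; only the fields this
# sketch touches are modeled.
@dataclass
class TextChunk:
    text: str


@dataclass
class ThinkChunk:
    thinking: list  # assumed: text fragments holding the reasoning trace


def split_content(content):
    """Separate the reasoning trace from the final answer.

    With reasoning_effort="none" the API returns a plain str, so that
    shape is handled first; otherwise content is a list of chunks.
    """
    if isinstance(content, str):
        return "", content
    thinking_parts, answer_parts = [], []
    for chunk in content:
        if isinstance(chunk, ThinkChunk):
            thinking_parts.extend(t.text for t in chunk.thinking)
        elif isinstance(chunk, TextChunk):
            answer_parts.append(chunk.text)
    return "".join(thinking_parts), "".join(answer_parts)
```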

Notes

  • All examples set httpx.Timeout(300.0) because reasoning runs routinely exceed the default 60s read timeout.
  • Model is pinned to mistral-medium-3-5 (the model the docs explicitly list as supporting reasoning_effort).

Test plan

  • python examples/mistral/chat/reasoning_response_shape.py : verified output matches the documented ThinkChunk / TextChunk schema; effort="none" returns str.
  • python examples/mistral/chat/reasoning.py : verified the John puzzle returns John is 22 years old.
  • python examples/mistral/chat/reasoning_with_streaming.py : verified the trains problem returns 11:16:40 am and the thinking → final-answer transition renders cleanly.
  • python examples/mistral/chat/reasoning_multi_turn.py : verified both strategies arrive at 1073; observed totals keep≈1656 vs drop≈637 tokens (varies with temperature=0.7).
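The math chain the multi-turn test replays is simple enough to verify locally; the intermediate values below match the per-turn results reported later in this thread:

```python
def math_chain():
    """The 3-turn chain: 17 x 23, then x 3, then - 100."""
    a = 17 * 23   # turn 1 -> 391
    b = a * 3     # turn 2 -> 1173
    c = b - 100   # turn 3 -> 1073
    return a, b, c
```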

Add four examples in examples/mistral/chat/ demonstrating reasoning_effort
on mistral-medium-3-5:

- reasoning_response_shape.py: dump the raw response shape for
  reasoning_effort="high" vs "none" so users see the ThinkChunk /
  TextChunk JSON before consuming it.
- reasoning.py: single-turn call, iterate ThinkChunk and TextChunk in
  message.content.
- reasoning_with_streaming.py: handle streaming deltas where chunks
  arrive as ThinkChunk lists during thinking and as plain string
  fragments after thinking ends.
- reasoning_multi_turn.py: 3-turn math chain run with two replay
  strategies (keep vs drop ThinkChunks) and prints token usage so
  the cost difference is visible.
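The three delta shapes described above can be folded into a single accumulator along these lines. As before, the chunk dataclasses are offline mocks of the mistralai types, and the `thinking` field layout is an assumption drawn from the response shape the examples document:

```python
from dataclasses import dataclass


@dataclass
class TextChunk:
    text: str


@dataclass
class ThinkChunk:
    thinking: list  # assumed: text fragments of the reasoning trace


def consume_stream(deltas):
    """Fold a sequence of delta.content payloads into (thinking, answer).

    Three shapes arrive: lists of ThinkChunk during the thinking phase,
    a mixed list at the transition (closing ThinkChunk plus the first
    TextChunk), and plain str fragments after thinking ends.
    """
    thinking, answer = [], []
    for content in deltas:
        if isinstance(content, str):      # post-thinking fragment
            answer.append(content)
            continue
        for chunk in content:             # list during/at end of thinking
            if isinstance(chunk, ThinkChunk):
                thinking.extend(t.text for t in chunk.thinking)
            elif isinstance(chunk, TextChunk):
                answer.append(chunk.text)
    return "".join(thinking), "".join(answer)
```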
- reasoning_multi_turn.py: drop the keep-vs-drop comparison and
  recommend keeping ThinkChunks across turns. Per reviewer feedback,
  dropping the reasoning trace degrades MM3.5 performance.
- All four files: replace httpx.Client(timeout=...) with the SDK's
  timeout_ms parameter; remove the httpx import.
Avoids the unnecessary AssistantMessage(content=content) re-wrap and
forwards any future fields on AssistantMessage automatically.

Verified end-to-end: 3/3 runs of the math chain produce 391 -> 1173 ->
1073, and history inspection confirms each AssistantMessage slot
preserves [ThinkChunk, TextChunk].
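The replay pattern this commit lands on can be sketched as below. The AssistantMessage dataclass is a mock of the SDK's class, and the hypothetical `append_turn` helper takes the assistant message as an argument so the sketch runs offline; in the real examples it comes from the chat completion response.

```python
from dataclasses import dataclass


# Mock of the SDK's AssistantMessage; the real class comes from mistralai.
@dataclass
class AssistantMessage:
    content: object          # str, or a [ThinkChunk, TextChunk] list
    role: str = "assistant"


def append_turn(history, user_text, assistant_message):
    """Forward the assistant message into history as-is.

    Appending the object itself, rather than re-wrapping with
    AssistantMessage(content=...), keeps ThinkChunks in the replayed
    history and carries along any future fields the SDK adds.
    """
    history.append({"role": "user", "content": user_text})
    history.append(assistant_message)
    return history
```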
