Fix/5315 agent engine sandbox code executor gemini 2.x incompatibility#5498

Closed
JesserHamdaoui wants to merge 11 commits into google:main from JesserHamdaoui:fix/5315-AgentEngineSandboxCodeExecutor-Gemini-2.x-incompatibility
Conversation

@JesserHamdaoui
Contributor


Problem:

AgentEngineSandboxCodeExecutor fails with Gemini 2.x models when the agent is asked to execute Python code. The observed errors are UNEXPECTED_TOOL_CALL (when no other tools are registered) or MALFORMED_FUNCTION_CALL (when other tools are present).

The failure is a contract mismatch across three layers that must agree for code execution to succeed:

| Layer | What it expects |
| --- | --- |
| AgentEngineSandboxCodeExecutor | Code returned by the model in `` ```python `` / `` ```tool_code `` markdown fences inside text parts |
| extract_code_and_truncate_content | Already handles native executable_code parts correctly; this layer is not the problem |
| Vertex AI Gemini API server | A response containing an executable_code part is only valid if the request explicitly declared Tool(code_execution=ToolCodeExecution()); otherwise the response is rejected before ADK ever sees it |

Gemini 2.x models appear to be post-trained to satisfy "execute Python" requests by emitting structured native executable_code parts rather than markdown text. Because AgentEngineSandboxCodeExecutor does not declare the code_execution tool in the outgoing request (it was designed to receive markdown), the Vertex AI API validator rejects the model's response and returns content=null. The ADK post-processor therefore never receives anything to route to the sandbox.

This is why passing code_executor=AgentEngineSandboxCodeExecutor(...) to the agent constructor does not help: code_executor is an ADK-side construct only. It tells ADK where to send code once a response arrives; it does not communicate anything to the Vertex AI API server, which has no knowledge of the attached sandbox and enforces the tool-declaration contract at response validation time.
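
The contract described above can be modelled with a small toy sketch. This is not the Vertex AI validator: `Request`, `Response`, and `validate` are hypothetical stand-ins, and the rule for picking the error code simply mirrors the behaviour reported in this PR.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Request:
    # Names of the tools declared on the outgoing request (hypothetical model).
    declared_tools: list[str] = field(default_factory=list)


@dataclass
class Response:
    content: Optional[str] = None
    error_code: Optional[str] = None


def validate(request: Request, emitted_part: str, payload: str) -> Response:
    """Toy stand-in for the server-side check described in this PR."""
    undeclared = "code_execution" not in request.declared_tools
    if emitted_part == "executable_code" and undeclared:
        # Reported behaviour: the code depends on whether other tools exist.
        if request.declared_tools:
            return Response(error_code="MALFORMED_FUNCTION_CALL")
        return Response(error_code="UNEXPECTED_TOOL_CALL")
    # Text parts, or executable_code with the tool declared, pass through.
    return Response(content=payload)
```

In this model, attaching a sandbox executor on the client side changes nothing, because only `declared_tools` is visible to `validate`.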

Solution:

Two complementary fixes are applied entirely within google/adk/flows/llm_flows/_code_execution.py, with no changes to any other ADK file and no public API changes:

Layer 1: Pre-processor steering (_run_pre_processor). When the configured executor is a BaseCodeExecutor but not a BuiltInCodeExecutor, append a system instruction (_NON_BUILTIN_EXECUTOR_INSTRUCTION) to every outgoing request. The instruction explicitly tells the model to wrap Python code in `` ```tool_code `` markdown fences and forbids native executable_code emission, so the model trips the API validator less often.
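
A minimal sketch of the Layer 1 check, assuming simplified stand-in classes rather than the real ADK types. The class names mirror those mentioned above, but `SandboxExecutor`, `LlmRequest`, `steer_request`, and the instruction wording are hypothetical; the PR's actual instruction text is not reproduced here.

```python
from dataclasses import dataclass


class BaseCodeExecutor:
    """Hypothetical stand-in for ADK's BaseCodeExecutor."""


class BuiltInCodeExecutor(BaseCodeExecutor):
    """Hypothetical stand-in for ADK's BuiltInCodeExecutor."""


class SandboxExecutor(BaseCodeExecutor):
    """Hypothetical stand-in for any non-built-in executor."""


# Placeholder wording, assumed for illustration only.
NON_BUILTIN_EXECUTOR_INSTRUCTION = (
    "Wrap Python code in tool_code markdown fences. "
    "Do not emit native executable_code parts."
)


@dataclass
class LlmRequest:
    system_instruction: str = ""


def steer_request(executor: object, request: LlmRequest) -> None:
    """Append the steering instruction for non-built-in executors only."""
    if not isinstance(executor, BaseCodeExecutor):
        return  # No code executor configured: nothing to steer.
    if isinstance(executor, BuiltInCodeExecutor):
        return  # The built-in executor relies on native parts.
    separator = "\n\n" if request.system_instruction else ""
    request.system_instruction += separator + NON_BUILTIN_EXECUTOR_INSTRUCTION
```

The instruction is appended rather than substituted, so whatever system instruction the agent already carries is preserved.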

Layer 2: Response-processor recovery (_run_post_processor). When the API still rejects the response with UNEXPECTED_TOOL_CALL or MALFORMED_FUNCTION_CALL, parse the rejected code out of error_message via _extract_code_from_error_message, reconstruct a synthetic executable_code part on llm_response.content, clear the error fields, and let the existing extract_code_and_truncate_content → code_executor.execute_code pipeline handle it exactly as if the model had emitted the part cleanly. Because extract_code_and_truncate_content already supports executable_code parts, this recovery path simply gives it the chance to run.
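
The recovery idea can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: the real error message layout and the real regex are not shown in this conversation, so `REJECTED_CODE_RE` targets a made-up layout, and `LlmResponse`, `extract_code`, and `maybe_recover` are hypothetical simplifications.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Error codes named in this PR as signalling a rejected native code call.
RECOVERABLE_ERROR_CODES = frozenset(
    {"UNEXPECTED_TOOL_CALL", "MALFORMED_FUNCTION_CALL"}
)

# ASSUMPTION: invented message layout "... rejected code: <code>".
REJECTED_CODE_RE = re.compile(r"rejected code:\s*(?P<code>.+)", re.DOTALL)


@dataclass
class LlmResponse:
    content: Optional[dict] = None
    error_code: Optional[str] = None
    error_message: Optional[str] = None


def extract_code(error_message: Optional[str]) -> Optional[str]:
    """Return the code payload, or None if the message does not match."""
    if not error_message:
        return None
    match = REJECTED_CODE_RE.search(error_message)
    return match.group("code").strip() if match else None


def maybe_recover(response: LlmResponse) -> bool:
    """Rebuild an executable_code part in place; report whether it worked."""
    if response.error_code not in RECOVERABLE_ERROR_CODES:
        return False
    code = extract_code(response.error_message)
    if code is None:
        return False
    # Synthetic part, shaped here as a plain dict for illustration.
    response.content = {"executable_code": {"language": "PYTHON", "code": code}}
    response.error_code = None
    response.error_message = None
    return True
```

When recovery fails, the response is left untouched, so unrelated errors still surface to the caller unchanged.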

Together, Layer 1 stops most rejections at the request side and Layer 2 rescues the cases where the model still emits a native tool call despite the steering, ensuring the full User → Model → Executor → Sandbox flow completes across Gemini 2.x.


Testing Plan

Unit Tests:

  • I have added or updated unit tests for my change.
  • All unit tests pass locally.

Four new test groups were added to tests/unittests/flows/llm_flows/test_code_execution.py:

| Test group | What it covers |
| --- | --- |
| test_extract_code_from_error_message_* | Valid single-line payload, multiline payload, None input, non-matching error message |
| test_maybe_recover_from_api_rejection_* | UNEXPECTED_TOOL_CALL recovery, MALFORMED_FUNCTION_CALL recovery, unrecognised error code, missing error code, unparseable message |
| test_pre_processor_injects_instruction_* | Instruction appended for non-built-in executor; not appended for BuiltInCodeExecutor |
| test_post_processor_recovery_* | Rejected response with no content is recovered and routed to the sandbox executor; BuiltInCodeExecutor skips the recovery path entirely |

Manual End-to-End (E2E) Tests:

Please refer to #5315 for the full reproduction script and setup steps. Using the reproduction case from that issue, the fix was verified against a live Vertex AI Agent Engine sandbox.

JesserHamdaoui and others added 11 commits April 26, 2026 20:54
Steering prompt that tells Gemini 2.x to wrap code in tool_code fences
instead of emitting native executable_code parts when no code_execution
tool is declared on the request.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…L_CALL_RE constants

Frozenset of Gemini 2.x error codes that indicate a native
code_execution tool call was rejected, and a regex to extract the
code payload from the error message.

Parses the code payload out of a Gemini UNEXPECTED_TOOL_CALL rejection
error message using _UNEXPECTED_TOOL_CALL_RE.

Reconstructs the executable_code part that Gemini 2.x intended to emit
when the API rejected the response with UNEXPECTED_TOOL_CALL or
MALFORMED_FUNCTION_CALL, allowing the sandbox executor pipeline to
proceed normally.
…-in executors

Appends _NON_BUILTIN_EXECUTOR_INSTRUCTION to every LLM request that
uses a non-BuiltInCodeExecutor, steering Gemini 2.x to output code in
tool_code markdown fences rather than native executable_code parts
which the API rejects as UNEXPECTED_TOOL_CALL.
…ost-processor

When Gemini 2.x emits a native code_execution call and the API rejects
it, llm_response.content is empty. For non-built-in executors, attempt
to reconstruct the executable_code part from the error message via
_maybe_recover_from_api_rejection so the sandbox executor pipeline can
still run the code.
@google-cla

google-cla Bot commented Apr 26, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@adk-bot added the "core" label ([Component] This issue is related to the core interface and implementation) Apr 26, 2026
@adk-bot
Collaborator

adk-bot commented Apr 26, 2026

Response from ADK Triaging Agent

Hello @JesserHamdaoui, thank you for your contribution!

Before we can review your pull request, you'll need to sign our Contributor License Agreement (CLA). It looks like the CLA check is currently failing.

You can find more information and the link to sign the agreement in the "cla/google" check at the bottom of this PR.

Thank you!

@JesserHamdaoui JesserHamdaoui deleted the fix/5315-AgentEngineSandboxCodeExecutor-Gemini-2.x-incompatibility branch April 26, 2026 20:37
@JesserHamdaoui
Contributor Author

Closed this because one commit listed Claude Code as a co-author, which caused the Google CLA check to fail.
I opened a new PR: #5499

Development

Successfully merging this pull request may close these issues.

AgentEngineSandboxCodeExecutor incompatible with Gemini 2.x models (MALFORMED_FUNCTION_CALL)
