
feat: Adding tool call into response api #413

Draft

ilblackdragon wants to merge 2 commits into main from tools-support

Conversation

@ilblackdragon
Member

No description provided.

@gemini-code-assist
Contributor

Summary of Changes

Hello @ilblackdragon, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly extends the API's capabilities by integrating support for external function calls. It allows the language model to dynamically request the execution of client-defined functions, effectively pausing the conversation until the client provides the necessary results. This mechanism facilitates a more interactive and extensible tool-use experience, enabling the LLM to leverage external services and data sources managed by the client application.

Highlights

  • External Function Call Support: Introduced a new FunctionCall enum variant in ResponseOutputItem and ConversationItem to represent external function calls requested by the LLM, enabling client-side execution of tools.
  • Client-Provided Function Output: Added FunctionCallOutput to ResponseInputItem, allowing clients to submit the results of their external function executions back to the API to continue the conversation flow.
  • New Function Tool Executor: Implemented a FunctionToolExecutor that identifies and processes external function tools. Instead of executing them, it signals FunctionCallRequired errors, pausing the agent loop until the client provides the output.
  • Agent Loop Management for Function Calls: Updated the ResponseServiceImpl to handle FunctionCallOutput items from client requests and to manage the agent loop's state when FunctionCallsRequired are encountered, supporting both single and parallel function calls.
  • Enhanced Error Handling: Added new error types, FunctionCallRequired and FunctionCallNotFound, to provide specific feedback related to the lifecycle and state of external function calls.
  • Comprehensive End-to-End Testing: Included new E2E tests (e2e_function_tools.rs) to validate the complete flow of external function calls, covering scenarios like single calls, parallel calls, and proper error handling for invalid function outputs.
Changelog
  • crates/api/src/models.rs
    • Added FunctionCall variant to ResponseOutputItem enum with fields for id, response_id, previous_response_id, next_response_ids, created_at, call_id, name, arguments, status, and model (a sketch of the new variants follows this changelog).
    • Added FunctionCall variant to ConversationItem enum with similar fields to ResponseOutputItem.
  • crates/api/src/routes/conversations.rs
    • Updated convert_output_item_to_conversation_item to map ResponseOutputItem::FunctionCall to ConversationItem::FunctionCall.
    • Modified impl ConversationItem to include FunctionCall in the get_id method.
  • crates/api/src/routes/responses.rs
    • Added FunctionCallRequired and FunctionCallNotFound error mappings to map_response_error_to_status.
    • Implemented From<ServiceResponseError> for ErrorResponse to handle new function call related errors.
  • crates/api/tests/e2e_conversations.rs
    • Extended test_conversation_items_pagination, test_conversation_items_include_response_metadata, and test_conversation_items_include_model to account for the new FunctionCall conversation item.
  • crates/api/tests/e2e_function_tools.rs
    • Added a new test file for end-to-end testing of function tools, including test_function_tool_single_call, test_function_tool_parallel_calls, test_function_output_without_previous_response_fails, and test_function_tool_coexists_with_builtin_tools.
  • crates/database/src/repositories/response_item.rs
    • Updated PgResponseItemsRepository to correctly handle and update ResponseOutputItem::FunctionCall variants.
  • crates/services/src/id_prefixes.rs
    • Introduced PREFIX_FC constant for function call IDs.
    • Added PREFIX_FC to the ALL_PREFIXES array.
  • crates/services/src/responses/errors.rs
    • Defined new ResponseError variants: FunctionCallRequired { name: String, call_id: String } and FunctionCallNotFound(String).
  • crates/services/src/responses/models.rs
    • Added FunctionCallOutput variant to ResponseInputItem for client-provided function results, including call_id and output.
    • Introduced FunctionCallOutputType enum as the type marker for function call output input items.
    • Added FunctionCall variant to ResponseOutputItem for server-requested function calls.
    • Implemented is_function_call_output and as_function_call_output helper methods for ResponseInputItem.
  • crates/services/src/responses/service.rs
    • Modified AgentLoopResult enum to include FunctionCallsRequired(Vec<tools::FunctionCallInfo>).
    • Updated ResponseServiceImpl::process_response_stream to initialize FunctionToolExecutor and process FunctionCallOutput items from the request input.
    • Adjusted agent loop logic to return AgentLoopResult::FunctionCallsRequired when external function calls are pending.
    • Added process_function_call_outputs asynchronous function to handle and verify client-provided function results.
    • Modified store_input_as_response_items to skip FunctionCallOutput items as they are processed separately.
  • crates/services/src/responses/tools/executor.rs
    • Added FunctionCallPending(FunctionCallInfo) variant to ToolExecutionResult.
    • Introduced FunctionCallInfo struct to hold details about pending external function calls (call_id, name, arguments).
  • crates/services/src/responses/tools/function.rs
    • Created new module function.rs implementing FunctionToolExecutor for handling external function tools.
    • The FunctionToolExecutor identifies function tools and returns ResponseError::FunctionCallRequired to signal client execution.
  • crates/services/src/responses/tools/mod.rs
    • Exported FunctionCallInfo and FunctionToolExecutor from the tools module.
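
For orientation, here is a minimal sketch of what the new variants listed above might look like. The field types, the shape of FunctionCallOutputType, and the elided variants are assumptions inferred from the changelog, not the actual definitions in models.rs.

// Hypothetical sketch only; anything not named in the changelog above is an assumption.

/// Assumed marker enum used to tag function-call output items during deserialization.
pub enum FunctionCallOutputType {
    FunctionCallOutput,
}

/// Server-requested external function call, as it might appear among output items.
pub enum ResponseOutputItem {
    FunctionCall {
        id: String,
        response_id: String,
        previous_response_id: Option<String>,
        next_response_ids: Vec<String>,
        created_at: i64,   // assumed Unix timestamp
        call_id: String,   // correlates this call with the client's later output
        name: String,      // function the client is asked to execute
        arguments: String, // JSON-encoded arguments produced by the LLM
        status: String,
        model: String,
    },
    // ... other variants elided ...
}

/// Client-provided result that resumes the paused agent loop.
pub enum ResponseInputItem {
    FunctionCallOutput {
        r#type: FunctionCallOutputType,
        call_id: String, // must match a FunctionCall previously emitted by the server
        output: String,  // serialized result of the client-side execution
    },
    // ... other variants elided ...
}
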
Activity
  • The pull request introduces a new feature to support external function calls within the API.
  • The author ilblackdragon has implemented changes across multiple crates, including api, database, and services.
  • New data models and error types have been defined to accommodate the function call workflow.
  • A dedicated FunctionToolExecutor has been developed to manage the lifecycle of external function calls.
  • Extensive end-to-end tests have been added to ensure the robustness and correctness of the new functionality.

@gemini-code-assist bot left a comment

Code Review

This pull request introduces support for external function calls (tool calls) in the response API, a significant and well-implemented feature. However, the implementation of process_function_call_outputs contains an authorization bypass and an IDOR vulnerability: tool call ID verification can be skipped with a malformed previous_response_id, allowing unauthorized injection of tool results, and response items are loaded without ownership verification, enabling potential information leaks. Additionally, a high-severity issue was identified regarding inconsistent tool_call_id generation, which could disrupt the tool call flow if the LLM does not provide an ID. There are also a couple of medium-severity issues to improve robustness and test clarity. Addressing the critical security and functional issues is essential before merging.

// The service layer will handle this error and create the FunctionCall output item
Err(ResponseError::FunctionCallRequired {
    name: tool_call.tool_type.clone(),
    call_id: tool_call.id.clone().unwrap_or_default(),
})

high

There is an inconsistency in how fallback tool_call_ids are generated. Here, if tool_call.id is None, it defaults to an empty string. However, in service.rs, when constructing the assistant message with tool_calls, a new UUID-based ID is generated as a fallback.

This will cause a mismatch. The FunctionCall item sent to the client will have an empty call_id, but the assistant message in the history will have a generated ID. When the client responds with the function output, the resulting tool message will have an empty tool_call_id, which won't match the one in the assistant message, likely causing an error from the LLM provider.

The ID should be generated once and reused. A potential fix is to populate the ID on ToolCallInfo immediately after it's created if it's missing.
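
A minimal sketch of the kind of fix suggested here, assuming a hypothetical ToolCallInfo shape (the real struct and its construction site in the tools module may differ):

use uuid::Uuid;

// Assumed shape for illustration; the real ToolCallInfo lives in the tools module.
struct ToolCallInfo {
    id: Option<String>,
    tool_type: String,
    arguments: String,
}

/// Populate a stable ID as soon as the tool call is created, so the FunctionCall
/// item sent to the client and the assistant message stored in history share the
/// same call_id (prefixing, e.g. via PREFIX_FC, is left to the real code).
fn ensure_call_id(tool_call: &mut ToolCallInfo) -> &str {
    if tool_call.id.is_none() {
        tool_call.id = Some(Uuid::new_v4().to_string());
    }
    tool_call.id.as_deref().unwrap()
}

Both the FunctionCall item and the assistant message construction in service.rs would then read the ID from the same place instead of applying separate fallbacks.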

Comment on lines +1542 to +1569
    if let Ok(uuid) = uuid::Uuid::parse_str(uuid_str) {
        let response_id = models::ResponseId(uuid);

        // List all items from the previous response
        let items = response_items_repository
            .list_by_response(response_id)
            .await
            .map_err(|e| {
                errors::ResponseError::InternalError(format!(
                    "Failed to fetch response items: {e}"
                ))
            })?;

        // Find the matching FunctionCall item
        let found = items.iter().any(|item| {
            matches!(item, models::ResponseOutputItem::FunctionCall {
                call_id: item_call_id,
                ..
            } if item_call_id == call_id)
        });

        if !found {
            return Err(errors::ResponseError::FunctionCallNotFound(
                call_id.to_string(),
            ));
        }
    }
}

medium (security)

The verification logic for call_id can be bypassed by providing a malformed previous_response_id. If uuid::Uuid::parse_str(uuid_str) fails, the entire verification block is skipped, and the function proceeds to add the unverified tool output to the conversation history. This allows an attacker to inject arbitrary tool results into the LLM context, potentially leading to indirect prompt injection.
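
A sketch of the stricter behavior, as a stand-alone illustration rather than the actual code: a previous_response_id that does not parse as a UUID is rejected instead of silently skipping the call_id verification (the InvalidParams variant mirrors the one used elsewhere in this review; the stand-in error type is assumed).

use uuid::Uuid;

// Stand-in error type for illustration; the real variants live in
// crates/services/src/responses/errors.rs.
#[derive(Debug)]
enum ResponseError {
    InvalidParams(String),
}

/// Reject malformed IDs up front so the FunctionCallNotFound check always runs;
/// skipping verification on a parse failure is what enables the bypass.
fn parse_previous_response_id(uuid_str: &str) -> Result<Uuid, ResponseError> {
    Uuid::parse_str(uuid_str).map_err(|_| {
        ResponseError::InvalidParams(format!("Invalid previous_response_id: {uuid_str}"))
    })
}

The list_by_response lookup and the FunctionCallNotFound check would then run unconditionally for every FunctionCallOutput item.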

Comment on lines +1547 to +1549
    .list_by_response(response_id)
    .await
    .map_err(|e| {

medium (security)

This call to list_by_response uses a response_id derived from user input without verifying that the response belongs to the current workspace or user. While the items are not directly returned to the user, they are loaded into memory, and the attacker can use the FunctionCallNotFound error as a side-channel to verify the existence of specific call_ids in other users' responses. The repository method list_by_response should be updated to enforce ownership checks.
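
One possible shape for the ownership-aware lookup, sketched with stand-in types; WorkspaceId, the error type, and the method name are assumptions, and only list_by_response and PgResponseItemsRepository come from the PR.

use uuid::Uuid;

pub struct ResponseId(pub Uuid);
pub struct WorkspaceId(pub Uuid);       // stand-in for the real tenant/owner identifier
pub struct ResponseOutputItem;          // stand-in for the real output item type
#[derive(Debug)]
pub struct RepositoryError(pub String); // stand-in error type

pub struct PgResponseItemsRepository;   // stand-in; the real struct holds a DB pool

impl PgResponseItemsRepository {
    /// Ownership-aware variant of list_by_response: the query is scoped to the
    /// caller's workspace, so probing another tenant's response_id yields nothing
    /// instead of leaking whether a given call_id exists there.
    pub async fn list_by_response_scoped(
        &self,
        workspace_id: WorkspaceId,
        response_id: ResponseId,
    ) -> Result<Vec<ResponseOutputItem>, RepositoryError> {
        // e.g. SELECT ... WHERE response_id = $1 AND workspace_id = $2
        let _ = (workspace_id, response_id);
        Ok(Vec::new()) // placeholder body; the real query filters on both columns
    }
}

Passing the workspace explicitly keeps the ownership check inside the repository, so every caller gets it for free.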

Comment on lines +129 to +130
let turn2_with_tool_result_prompt =
    mock_prompts::build_prompt(&format!("{} {}", turn1_user, function_output));

medium

The mocking for the second turn of the conversation does not appear to accurately represent the conversation history. The prompt for the LLM should be a structured list of messages: the user's initial query, the assistant's first response with the tool call, and a new message with role: "tool" containing the function output. Simply concatenating the user message and the function output into a single string doesn't reflect the actual API call to the LLM provider.

While this test may pass due to how mock_prompts::build_prompt is implemented, it makes the test brittle and harder to understand. The mock should be updated to reflect the actual message structure. This feedback also applies to test_function_tool_parallel_calls.
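
For comparison, the structured history the second-turn mock could assert against might look roughly like this. The field names follow the conventional tool-calling chat format this comment describes; the function name and arguments are purely illustrative, and mock_prompts may expect a different shape.

use serde_json::{json, Value};

// Illustrative only: conventional user / assistant-with-tool_calls / tool sequence;
// "get_weather" and its arguments are made-up example values.
fn expected_turn2_messages(turn1_user: &str, call_id: &str, function_output: &str) -> Value {
    json!([
        { "role": "user", "content": turn1_user },
        {
            "role": "assistant",
            "content": null,
            "tool_calls": [{
                "id": call_id,
                "type": "function",
                "function": { "name": "get_weather", "arguments": "{\"city\":\"Paris\"}" }
            }]
        },
        { "role": "tool", "tool_call_id": call_id, "content": function_output }
    ])
}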


if function_outputs.is_empty() {
    return Ok(messages);
}

medium

For robustness, it's better to explicitly validate that FunctionCallOutput items are only provided when resuming a response. If function_outputs is not empty, request.previous_response_id must be present. Without this check, the request might proceed and fail later at the LLM provider level with a less clear error message.

Please add a check after this block:

if request.previous_response_id.is_none() {
    return Err(errors::ResponseError::InvalidParams(
        "FunctionCallOutput is only allowed when resuming a response with previous_response_id.".to_string(),
    ));
}

This also makes the behavior consistent with the test test_function_output_without_previous_response_fails, which expects a client error in this scenario.

Custom function tools were failing with a "missing query field" error
because convert_tool_calls only allowed specific tools to skip the
'query' requirement. This also affected the code_interpreter and computer
tools, which had no executor.

Changes:
- Update get_function_tool_names() to include code_interpreter/computer
- Update FunctionToolExecutor to handle code_interpreter/computer
- Add function_tool_names parameter to convert_tool_calls
- Add tests for all new functionality

Now all tool types are properly handled (a conceptual sketch follows this commit message):
- Function: client-executed via FunctionToolExecutor
- WebSearch: server-executed via WebSearchToolExecutor
- FileSearch: server-executed via FileSearchToolExecutor
- CodeInterpreter: client-executed via FunctionToolExecutor
- Computer: client-executed via FunctionToolExecutor
- Mcp: server-executed via McpToolExecutor

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
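
A conceptual sketch of the dispatch this commit describes; the enum and helper below are illustrative only and do not reproduce the actual wiring in the tools module.

// Illustrative only: mirrors the tool-type list in the commit message above.
enum ToolType {
    Function,
    WebSearch,
    FileSearch,
    CodeInterpreter,
    Computer,
    Mcp,
}

/// Function, CodeInterpreter, and Computer round-trip through the client via
/// FunctionToolExecutor and a FunctionCall item; WebSearch, FileSearch, and Mcp
/// are executed server-side by their dedicated executors.
fn is_client_executed(tool: &ToolType) -> bool {
    matches!(
        tool,
        ToolType::Function | ToolType::CodeInterpreter | ToolType::Computer
    )
}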

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant