[Frontend] Add MCP tool streaming support to Responses API #30192
Signed-off-by: Daniel Salib <[email protected]>
Code Review
This pull request adds support for Model Context Protocol (MCP) tools to the Responses API. It introduces McpCall as a new response output item type, together with the corresponding streaming events (response.mcp_call_arguments.delta, response.mcp_call_arguments.done, response.mcp_call.in_progress, response.mcp_call.completed).
The harmony_utils.py parser has been refactored to treat any non-built-in, non-function recipient in Harmony messages as an MCP call, replacing the ValueError previously raised for unknown recipients. New utility functions _parse_mcp_recipient and _parse_mcp_call handle the parsing of MCP recipients and the creation of McpCall objects, including dotted recipients (e.g., repo_browser.list). Built-in tools such as 'python', 'browser', and 'container' are now explicitly handled as reasoning output rather than generic MCP calls. The serving_responses.py module now emits the new MCP streaming events and distinguishes between function calls, built-in tools, and generic MCP tools during streaming.
Test cases were expanded to cover basic MCP call parsing, dotted recipients, differentiation between MCP and function/built-in calls, and multi-turn streaming interactions with MCP tools, including code_interpreter via MCP. A review comment flags two issues in the new test case test_mcp_tool_calling_streaming_types, an overly strict assertion and incorrect conditional logic in event handling, and suggests a fix so that the MCP streaming event sequence is validated properly.
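The dotted-recipient handling mentioned above can be sketched roughly as follows. This is an illustration only, not the PR's _parse_mcp_recipient: the helper name split_mcp_recipient, the choice to split on the first dot, and the fallback for undotted recipients are all assumptions.

```python
def split_mcp_recipient(recipient: str) -> tuple:
    """Split a Harmony recipient such as 'repo_browser.list' into
    (server_label, tool_name).

    Hypothetical helper: splitting on the first dot is an assumption,
    and so is reusing the whole recipient for both fields when there
    is no dot at all.
    """
    server_label, sep, tool_name = recipient.partition(".")
    if not sep:
        # No dot present: fall back to the recipient for both fields.
        return recipient, recipient
    return server_label, tool_name


if __name__ == "__main__":
    print(split_mcp_recipient("repo_browser.list"))
```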
async for event in stream_response:
    assert "mcp_call" in event.type

    if event.type == "response.created":
        stack_of_event_types.append(event.type)
    elif event.type == "response.completed":
        assert stack_of_event_types[-1] == pairs_of_event_types[event.type]
        stack_of_event_types.pop()
    if (
        event.type.endswith("added")
        or event.type == "response.mcp_call.in_progress"
    ):
        stack_of_event_types.append(event.type)
    elif event.type.endswith("delta"):
        if stack_of_event_types[-1] == event.type:
            continue
        stack_of_event_types.append(event.type)
    elif event.type.endswith("done") or event.type == "response.mcp_call.completed":
        assert stack_of_event_types[-1] == pairs_of_event_types[event.type]
        stack_of_event_types.pop()

assert len(stack_of_event_types) == 0
The logic in this test has a couple of issues:
- The assertion assert "mcp_call" in event.type on line 245 is too strict. The event stream includes many events unrelated to MCP calls (e.g., response.created, response.reasoning_text.delta), which will cause this assertion to fail.
- The event handling logic uses an if followed by an elif chain, and then another if (line 252). This second if should be an elif to form a single conditional block. Otherwise, an event matching endswith("added") will be processed by the second if block, and then the subsequent elif blocks for delta and done will be skipped for that event, which is not the intended logic for pairing events.
I've suggested a fix that addresses both points by introducing a flag to check for MCP events and correcting the conditional logic.
mcp_event_seen = False
stack_of_event_types = []
async for event in stream_response:
    if "mcp_call" in event.type:
        mcp_event_seen = True
    if event.type == "response.created":
        stack_of_event_types.append(event.type)
    elif event.type == "response.completed":
        assert stack_of_event_types.pop() == pairs_of_event_types[event.type]
    elif (
        event.type.endswith("added")
        or event.type == "response.mcp_call.in_progress"
    ):
        stack_of_event_types.append(event.type)
    elif event.type.endswith("delta"):
        if not stack_of_event_types or stack_of_event_types[-1] != event.type:
            stack_of_event_types.append(event.type)
    elif event.type.endswith("done") or event.type == "response.mcp_call.completed":
        assert stack_of_event_types.pop() == pairs_of_event_types[event.type]
assert mcp_event_seen, "No MCP call events were observed in the stream."
assert len(stack_of_event_types) == 0
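The pairing logic in this suggestion can be tried outside the test suite with a plain list of event type strings. The sketch below is a minimal stand-alone version under stated assumptions: PAIRS is a hypothetical stand-in for pairs_of_event_types (which is not shown in this thread) and lists only the pairs needed for an MCP call, and check_event_pairing is an illustrative name that returns a boolean instead of asserting.

```python
# Hypothetical closing -> opening map; the real pairs_of_event_types
# in the test file may contain more entries.
PAIRS = {
    "response.completed": "response.created",
    "response.output_item.done": "response.output_item.added",
    "response.mcp_call_arguments.done": "response.mcp_call_arguments.delta",
    "response.mcp_call.completed": "response.mcp_call.in_progress",
}


def check_event_pairing(event_types):
    """Return True if every opening event is closed in stack order
    and at least one MCP event was seen."""
    stack = []
    mcp_seen = False
    for t in event_types:
        if "mcp_call" in t:
            mcp_seen = True
        if t == "response.created":
            stack.append(t)
        elif t.endswith("added") or t == "response.mcp_call.in_progress":
            stack.append(t)
        elif t.endswith("delta"):
            # Consecutive deltas of the same type collapse into one entry.
            if not stack or stack[-1] != t:
                stack.append(t)
        elif (
            t == "response.completed"
            or t.endswith("done")
            or t == "response.mcp_call.completed"
        ):
            if not stack or stack.pop() != PAIRS.get(t):
                return False
        # Other events (e.g. response.in_progress) are ignored.
    return mcp_seen and not stack


if __name__ == "__main__":
    stream = [
        "response.created",
        "response.in_progress",
        "response.output_item.added",
        "response.mcp_call.in_progress",
        "response.mcp_call_arguments.delta",
        "response.mcp_call_arguments.delta",
        "response.mcp_call_arguments.done",
        "response.mcp_call.completed",
        "response.output_item.done",
        "response.completed",
    ]
    print(check_event_pairing(stream))
```

The sample stream mirrors the order of MCP-related events in the Test Result section of the PR description.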
💡 Codex Review
Here are some automated review suggestions for this pull request.
stack_of_event_types = []
async for event in stream_response:
    assert "mcp_call" in event.type
Fix blanket mcp_call assertion in streaming test
In test_mcp_tool_calling_streaming_types every streamed event is immediately asserted to contain the substring "mcp_call" before any dispatch logic runs. The Responses API stream always begins with response.created (and response.in_progress) which do not include that substring, so this assertion fails on the first event and the rest of the test logic never executes. As written the test cannot pass even when the streaming implementation is correct; the assertion needs to be limited to the MCP events it is intended to check.
        return False

    # Function calls have "functions." prefix
    # Everything else is an MCP tool
    return not recipient.startswith("functions.")
Don’t classify built‑in tools as MCP during streaming
The helper _is_mcp_tool_by_namespace now treats any recipient that is not functions.* as an MCP tool. In the Harmony streaming path this flag is used to route tool calls into the MCP branch, so built‑in recipients such as "python", "browser", or "container" now emit response.mcp_call* events instead of the expected response.code_interpreter_call*/built‑in events. A request streaming with tools=[{"type": "code_interpreter"}] will therefore deliver only MCP events, breaking API semantics and diverging from the non‑streaming parsing logic that keeps built‑ins separate from MCP connectors.
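One way to keep built-ins out of the MCP branch is to check the recipient against a set of built-in names before falling through to MCP. The sketch below is illustrative only: classify_recipient and BUILTIN_RECIPIENTS are hypothetical names, the built-in set is taken from the recipients named in this comment, and the sketch ignores how the tool was declared in the request (for example, code_interpreter requested through an MCP tool entry).

```python
from typing import Optional

# Built-in recipients named in the review comment; assumed, not exhaustive.
BUILTIN_RECIPIENTS = {"python", "browser", "container"}


def classify_recipient(recipient: Optional[str]) -> str:
    """Classify a Harmony recipient as 'none', 'function', 'builtin' or 'mcp'.

    Hypothetical helper: matches built-ins on the part of the recipient
    before the first dot, so 'browser.search' counts as built-in.
    """
    if not recipient:
        return "none"
    if recipient.startswith("functions."):
        return "function"
    root = recipient.split(".", 1)[0]
    if root in BUILTIN_RECIPIENTS:
        return "builtin"
    # Anything else is treated as an MCP tool.
    return "mcp"


if __name__ == "__main__":
    for name in ("functions.get_weather", "python", "repo_browser.list"):
        print(name, "->", classify_recipient(name))
```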
Hi @daniel-salib, the pre-commit checks have failed. Please run:
uv pip install pre-commit
pre-commit install
pre-commit run --all-files
Then, commit the changes and push to your branch. For future commits,
cc @robertgshaw2-redhat @chaunceyjiang Can you help review this PR? Thanks!
Purpose
This change enables streaming support for MCP tools when using GPT OSS. It extends the harmony utilities and response serving infrastructure to handle tool streaming, allowing tool calls and their results to be incrementally streamed back to clients rather than returned as a single batch.
Test Plan
curl -X POST "http://localhost:8000/v1/responses" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer dummy-api-key" \
  -d '{
    "model": "default",
    "input": "Multiply 123*456 using the mcp.code_interpreter tool.",
    "tools": [{
      "type": "mcp",
      "server_label": "code_interpreter",
      "headers": {"test": "test"},
      "server_url": "IGNORED"
    }],
    "stream": true,
    "enable_response_messages": true
  }'
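A client consuming the stream returned by a request like this could rebuild each tool call's arguments from the response.mcp_call_arguments.delta events. The following is a minimal sketch, assuming every SSE data: line carries one complete JSON payload; collect_mcp_arguments is a hypothetical helper and not part of vLLM.

```python
import json


def collect_mcp_arguments(sse_lines):
    """Concatenate MCP argument deltas per item_id from raw SSE lines.

    Assumes each 'data:' line holds a single JSON object with a 'type'
    field, as in the Responses API stream.
    """
    arguments = {}
    for line in sse_lines:
        if not line.startswith("data:"):
            continue  # skip 'event:' lines and blank separators
        payload = json.loads(line[len("data:"):].strip())
        if payload.get("type") == "response.mcp_call_arguments.delta":
            item_id = payload["item_id"]
            arguments[item_id] = arguments.get(item_id, "") + payload["delta"]
    return arguments


if __name__ == "__main__":
    sample = [
        "event: response.mcp_call_arguments.delta",
        'data: {"delta":"123","item_id":"mcp_1","type":"response.mcp_call_arguments.delta"}',
        "event: response.mcp_call_arguments.delta",
        'data: {"delta":"*456","item_id":"mcp_1","type":"response.mcp_call_arguments.delta"}',
    ]
    print(collect_mcp_arguments(sample))
```

The item_id values in the sample are made up for illustration; the concatenated result should match the arguments field of the final response.mcp_call_arguments.done event.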
Test Result
event: response.created
data: {"response":{"id":"resp_634aa3735d374e609c59128a4ca4c9ff","created_at":1762333234,"incomplete_details":null,"instructions":null,"metadata":null,"model":"default","object":"response","output":[],"parallel_tool_calls":true,"temperature":1.0,"tool_choice":"auto","tools":[{"server_label":"code_interpreter","type":"mcp","allowed_tools":null,"authorization":null,"connector_id":null,"headers":{"test":"test"},"require_approval":null,"server_description":null,"server_url":"IGNORED"}],"top_p":1.0,"background":false,"max_output_tokens":130895,"max_tool_calls":null,"previous_response_id":null,"prompt":null,"reasoning":null,"service_tier":"auto","status":"in_progress","text":null,"top_logprobs":null,"truncation":"disabled","usage":null,"user":null,"input_messages":null,"output_messages":null},"sequence_number":0,"type":"response.created"}
event: response.in_progress
data: {"response":{"id":"resp_634aa3735d374e609c59128a4ca4c9ff","created_at":1762333234,"incomplete_details":null,"instructions":null,"metadata":null,"model":"default","object":"response","output":[],"parallel_tool_calls":true,"temperature":1.0,"tool_choice":"auto","tools":[{"server_label":"code_interpreter","type":"mcp","allowed_tools":null,"authorization":null,"connector_id":null,"headers":{"test":"test"},"require_approval":null,"server_description":null,"server_url":"IGNORED"}],"top_p":1.0,"background":false,"max_output_tokens":130895,"max_tool_calls":null,"previous_response_id":null,"prompt":null,"reasoning":null,"service_tier":"auto","status":"in_progress","text":null,"top_logprobs":null,"truncation":"disabled","usage":null,"user":null,"input_messages":null,"output_messages":null},"sequence_number":1,"type":"response.in_progress"}
event: response.output_item.added
data: {"item":{"id":"msg_91e0b5be583e4ac38cfe7d55f025def7","summary":[],"type":"reasoning","content":null,"encrypted_content":null,"status":"in_progress"},"output_index":0,"sequence_number":2,"type":"response.output_item.added"}
event: response.reasoning_part.added
data: {"content_index":0,"item_id":"msg_91e0b5be583e4ac38cfe7d55f025def7","output_index":0,"part":{"text":"","type":"reasoning_text"},"sequence_number":3,"type":"response.reasoning_part.added"}
event: response.reasoning_text.delta
data: {"content_index":0,"delta":"We","item_id":"msg_91e0b5be583e4ac38cfe7d55f025def7","output_index":0,"sequence_number":4,"type":"response.reasoning_text.delta"}
event: response.reasoning_text.delta
data: {"content_index":0,"delta":" need","item_id":"msg_91e0b5be583e4ac38cfe7d55f025def7","output_index":0,"sequence_number":5,"type":"response.reasoning_text.delta"}
event: response.reasoning_text.delta
data: {"content_index":0,"delta":" to","item_id":"msg_91e0b5be583e4ac38cfe7d55f025def7","output_index":0,"sequence_number":6,"type":"response.reasoning_text.delta"}
event: response.reasoning_text.delta
data: {"content_index":0,"delta":" compute","item_id":"msg_91e0b5be583e4ac38cfe7d55f025def7","output_index":0,"sequence_number":7,"type":"response.reasoning_text.delta"}
event: response.reasoning_text.delta
data: {"content_index":0,"delta":" ","item_id":"msg_91e0b5be583e4ac38cfe7d55f025def7","output_index":0,"sequence_number":8,"type":"response.reasoning_text.delta"}
event: response.reasoning_text.delta
data: {"content_index":0,"delta":"123","item_id":"msg_91e0b5be583e4ac38cfe7d55f025def7","output_index":0,"sequence_number":9,"type":"response.reasoning_text.delta"}
event: response.reasoning_text.delta
data: {"content_index":0,"delta":"*","item_id":"msg_91e0b5be583e4ac38cfe7d55f025def7","output_index":0,"sequence_number":10,"type":"response.reasoning_text.delta"}
event: response.reasoning_text.delta
data: {"content_index":0,"delta":"456","item_id":"msg_91e0b5be583e4ac38cfe7d55f025def7","output_index":0,"sequence_number":11,"type":"response.reasoning_text.delta"}
event: response.reasoning_text.delta
data: {"content_index":0,"delta":".","item_id":"msg_91e0b5be583e4ac38cfe7d55f025def7","output_index":0,"sequence_number":12,"type":"response.reasoning_text.delta"}
event: response.reasoning_text.delta
data: {"content_index":0,"delta":" Use","item_id":"msg_91e0b5be583e4ac38cfe7d55f025def7","output_index":0,"sequence_number":13,"type":"response.reasoning_text.delta"}
event: response.reasoning_text.delta
data: {"content_index":0,"delta":" python","item_id":"msg_91e0b5be583e4ac38cfe7d55f025def7","output_index":0,"sequence_number":14,"type":"response.reasoning_text.delta"}
event: response.reasoning_text.delta
data: {"content_index":0,"delta":".","item_id":"msg_91e0b5be583e4ac38cfe7d55f025def7","output_index":0,"sequence_number":15,"type":"response.reasoning_text.delta"}
event: response.reasoning_text.done
data: {"content_index":-1,"item_id":"msg_91e0b5be583e4ac38cfe7d55f025def7","output_index":0,"sequence_number":16,"text":"We need to compute 123*456. Use python.","type":"response.reasoning_text.done"}
event: response.reasoning_part.done
data: {"content_index":-1,"item_id":"msg_91e0b5be583e4ac38cfe7d55f025def7","output_index":0,"part":{"text":"We need to compute 123*456. Use python.","type":"reasoning_text"},"sequence_number":17,"type":"response.reasoning_part.done"}
event: response.output_item.done
data: {"item":{"id":"msg_91e0b5be583e4ac38cfe7d55f025def7","summary":[],"type":"reasoning","content":[{"text":"We need to compute 123*456. Use python.","type":"reasoning_text"}],"encrypted_content":null,"status":"completed"},"output_index":0,"sequence_number":18,"type":"response.output_item.done"}
event: response.output_item.added
data: {"item":{"id":"mcp_4e15766739ed49a1860e8d7b377348d7","arguments":"","name":"python","server_label":"code_interpreter","type":"mcp_call","approval_request_id":null,"error":null,"output":null,"status":"in_progress","call_id":"mcp_f38d222820be4db7ba44e4b7e63b0c0f"},"output_index":1,"sequence_number":19,"type":"response.output_item.added"}
event: response.mcp_call.in_progress
data: {"item_id":"mcp_4e15766739ed49a1860e8d7b377348d7","output_index":1,"sequence_number":20,"type":"response.mcp_call.in_progress"}
event: response.mcp_call_arguments.delta
data: {"delta":"123","item_id":"mcp_4e15766739ed49a1860e8d7b377348d7","output_index":1,"sequence_number":21,"type":"response.mcp_call_arguments.delta"}
event: response.mcp_call_arguments.delta
data: {"delta":"*","item_id":"mcp_4e15766739ed49a1860e8d7b377348d7","output_index":1,"sequence_number":22,"type":"response.mcp_call_arguments.delta"}
event: response.mcp_call_arguments.delta
data: {"delta":"456","item_id":"mcp_4e15766739ed49a1860e8d7b377348d7","output_index":1,"sequence_number":23,"type":"response.mcp_call_arguments.delta"}
event: response.mcp_call_arguments.delta
data: {"delta":"\n","item_id":"mcp_4e15766739ed49a1860e8d7b377348d7","output_index":1,"sequence_number":24,"type":"response.mcp_call_arguments.delta"}
event: response.mcp_call_arguments.done
data: {"arguments":"123*456\n","item_id":"mcp_4e15766739ed49a1860e8d7b377348d7","output_index":1,"sequence_number":25,"type":"response.mcp_call_arguments.done","name":"python"}
event: response.mcp_call.completed
data: {"item_id":"mcp_4e15766739ed49a1860e8d7b377348d7","output_index":1,"sequence_number":26,"type":"response.mcp_call.completed"}
event: response.output_item.done
data: {"item":{"id":"mcp_4e15766739ed49a1860e8d7b377348d7","arguments":"123*456\n","name":"python","server_label":"code_interpreter","type":"mcp_call","approval_request_id":null,"error":null,"output":null,"status":"completed","call_id":"mcp_13e054b550474cd5aa66c71aefcebf00"},"output_index":1,"sequence_number":27,"type":"response.output_item.done"}
event: response.output_item.added
data: {"item":{"id":"msg_5e2d50b2c1704e9eb848d78929716445","content":[],"role":"assistant","status":"in_progress","type":"message"},"output_index":2,"sequence_number":28,"type":"response.output_item.added"}
event: response.content_part.added
data: {"content_index":0,"item_id":"msg_5e2d50b2c1704e9eb848d78929716445","output_index":2,"part":{"annotations":[],"text":"","type":"output_text","logprobs":[]},"sequence_number":29,"type":"response.content_part.added"}
event: response.output_text.delta
data: {"content_index":0,"delta":"The","item_id":"msg_5e2d50b2c1704e9eb848d78929716445","logprobs":[],"output_index":2,"sequence_number":30,"type":"response.output_text.delta"}
event: response.output_text.delta
data: {"content_index":0,"delta":" product","item_id":"msg_5e2d50b2c1704e9eb848d78929716445","logprobs":[],"output_index":2,"sequence_number":31,"type":"response.output_text.delta"}
event: response.output_text.delta
data: {"content_index":0,"delta":" of","item_id":"msg_5e2d50b2c1704e9eb848d78929716445","logprobs":[],"output_index":2,"sequence_number":32,"type":"response.output_text.delta"}
event: response.output_text.delta
data: {"content_index":0,"delta":" \\(","item_id":"msg_5e2d50b2c1704e9eb848d78929716445","logprobs":[],"output_index":2,"sequence_number":33,"type":"response.output_text.delta"}
event: response.output_text.delta
data: {"content_index":0,"delta":"123","item_id":"msg_5e2d50b2c1704e9eb848d78929716445","logprobs":[],"output_index":2,"sequence_number":34,"type":"response.output_text.delta"}
event: response.output_text.delta
data: {"content_index":0,"delta":" \\","item_id":"msg_5e2d50b2c1704e9eb848d78929716445","logprobs":[],"output_index":2,"sequence_number":35,"type":"response.output_text.delta"}
event: response.output_text.delta
data: {"content_index":0,"delta":"times","item_id":"msg_5e2d50b2c1704e9eb848d78929716445","logprobs":[],"output_index":2,"sequence_number":36,"type":"response.output_text.delta"}
event: response.output_text.delta
data: {"content_index":0,"delta":" ","item_id":"msg_5e2d50b2c1704e9eb848d78929716445","logprobs":[],"output_index":2,"sequence_number":37,"type":"response.output_text.delta"}
event: response.output_text.delta
data: {"content_index":0,"delta":"456","item_id":"msg_5e2d50b2c1704e9eb848d78929716445","logprobs":[],"output_index":2,"sequence_number":38,"type":"response.output_text.delta"}
event: response.output_text.delta
data: {"content_index":0,"delta":"\\","item_id":"msg_5e2d50b2c1704e9eb848d78929716445","logprobs":[],"output_index":2,"sequence_number":39,"type":"response.output_text.delta"}
event: response.output_text.delta
data: {"content_index":0,"delta":")","item_id":"msg_5e2d50b2c1704e9eb848d78929716445","logprobs":[],"output_index":2,"sequence_number":40,"type":"response.output_text.delta"}
event: response.output_text.delta
data: {"content_index":0,"delta":" is","item_id":"msg_5e2d50b2c1704e9eb848d78929716445","logprobs":[],"output_index":2,"sequence_number":41,"type":"response.output_text.delta"}
event: response.output_text.delta
data: {"content_index":0,"delta":" **","item_id":"msg_5e2d50b2c1704e9eb848d78929716445","logprobs":[],"output_index":2,"sequence_number":42,"type":"response.output_text.delta"}
event: response.output_text.delta
data: {"content_index":0,"delta":"56","item_id":"msg_5e2d50b2c1704e9eb848d78929716445","logprobs":[],"output_index":2,"sequence_number":43,"type":"response.output_text.delta"}
event: response.output_text.delta
data: {"content_index":0,"delta":",","item_id":"msg_5e2d50b2c1704e9eb848d78929716445","logprobs":[],"output_index":2,"sequence_number":44,"type":"response.output_text.delta"}
event: response.output_text.delta
data: {"content_index":0,"delta":"088","item_id":"msg_5e2d50b2c1704e9eb848d78929716445","logprobs":[],"output_index":2,"sequence_number":45,"type":"response.output_text.delta"}
event: response.output_text.delta
data: {"content_index":0,"delta":"**","item_id":"msg_5e2d50b2c1704e9eb848d78929716445","logprobs":[],"output_index":2,"sequence_number":46,"type":"response.output_text.delta"}
event: response.output_text.delta
data: {"content_index":0,"delta":".","item_id":"msg_5e2d50b2c1704e9eb848d78929716445","logprobs":[],"output_index":2,"sequence_number":47,"type":"response.output_text.delta"}
event: response.output_text.done
data: {"content_index":-1,"item_id":"msg_5e2d50b2c1704e9eb848d78929716445","logprobs":[],"output_index":2,"sequence_number":48,"text":"The product of \\(123 \\times 456\\) is **56,088**.","type":"response.output_text.done"}
event: response.content_part.done
data: {"content_index":-1,"item_id":"msg_5e2d50b2c1704e9eb848d78929716445","output_index":2,"part":{"annotations":[],"text":"The product of \\(123 \\times 456\\) is **56,088**.","type":"output_text","logprobs":null},"sequence_number":49,"type":"response.content_part.done"}
event: response.output_item.done
data: {"item":{"id":"msg_5e2d50b2c1704e9eb848d78929716445","content":[{"annotations":[],"text":"The product of \\(123 \\times 456\\) is **56,088**.","type":"output_text","logprobs":null}],"role":"assistant","status":"completed","type":"message"},"output_index":2,"sequence_number":50,"type":"response.output_item.done"}
event: response.completed
data: {"response":{"id":"resp_634aa3735d374e609c59128a4ca4c9ff","created_at":1762333234,"incomplete_details":null,