Conversation


@qandrew qandrew commented Dec 5, 2025

Purpose

  • Add input/output messages to ResponsesParser. For a two-turn conversation (user input 1, output 1, MCP call, input 2, output 2), user input 1 lands in response.input_messages, and [output 1, input 2, output 2] in response.output_messages.
  • Refactor make_response_output_items_from_parsable_context into a method of ResponsesParser.
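The split described in the first bullet can be sketched as follows (illustrative dicts only, not the actual ResponsesParser message types; here the MCP tool turn is grouped with the outputs):

```python
# Two-turn conversation from the PR description, in arrival order.
turns = [
    {"role": "user", "content": "input 1"},          # first user turn
    {"role": "assistant", "content": "output 1"},
    {"role": "tool", "content": "MCP call result"},
    {"role": "user", "content": "input 2"},
    {"role": "assistant", "content": "output 2"},
]

# Everything up to the first model output goes to input_messages;
# everything generated after it (including later user turns and
# tool traffic) goes to output_messages.
input_messages = turns[:1]
output_messages = turns[1:]

print([m["content"] for m in output_messages])
```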

NOTE: This PR must be merged after #30230; it builds on top of #30115.

Test Plan

 CUDA_VISIBLE_DEVICES=4,5,6,7 VLLM_USE_EXPERIMENTAL_PARSER_CONTEXT=1 vllm serve MiniMaxAI/MiniMax-M2 --tensor-parallel-size 4 --tool-call-parser minimax_m2 --reasoning-parser minimax_m2 --enable-auto-tool-choice --trust-remote-code --tool-server=localhost:8081/container,localhost:8081/browser,localhost:8081/python
 ./vllm/fb/scripts/gptoss/run_mcp_server.sh
 curl -X POST "http://localhost:8000/v1/responses"   -H "Content-Type: application/json"   -H "Authorization: Bearer dummy-api-key"   -d '{
        "model": "MiniMaxAI/MiniMax-M2",
        "input": "Multiply 64548*15151 using the python tool.",
        "tools": [
          {
            "type": "mcp",
            "server_label": "code_interpreter",
            "headers": {"test": "test"},
            "server_url": "IGNORED"
          }
        ],
        "enable_response_messages": true
      }'
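The same request can be built from Python (a sketch mirroring the curl command above; it assumes the vllm serve process from the test plan is running on localhost:8000, and `enable_response_messages` is the extra body field this PR reads):

```python
import json

# Payload mirroring the curl command; `enable_response_messages`
# asks the server to include raw input/output messages in the response.
payload = {
    "model": "MiniMaxAI/MiniMax-M2",
    "input": "Multiply 64548*15151 using the python tool.",
    "tools": [
        {
            "type": "mcp",
            "server_label": "code_interpreter",
            "headers": {"test": "test"},
            "server_url": "IGNORED",
        }
    ],
    "enable_response_messages": True,
}

body = json.dumps(payload)

# To actually send it (requires the running server from the test plan):
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:8000/v1/responses",
#     data=body.encode(),
#     headers={"Content-Type": "application/json",
#              "Authorization": "Bearer dummy-api-key"},
# )
# print(urllib.request.urlopen(req).read())
```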

Test Result

{"id":"resp_a83cc9fdcf5438d0","created_at":1765170748,"incomplete_details":null,"instructions":null,"metadata":null,"model":"MiniMaxAI/MiniMax-M2","object":"response","output":[{"id
":"rs_8885082003365dd4","summary":[],"type":"reasoning","content":[{"text":"The user explicitly asked to \"Multiply 64548*15151 using the python tool.\" The `code_interpreter` tool
is available and perfectly suited for this task, as it can execute Python code. So, I'm calling that tool with the multiplication operation.\n","type":"reasoning_text"}],"encrypted_
content":null,"status":null},{"id":"mcp_a065ac97a78a55a8","arguments":"{\"code\": \"64548 * 15151\"}","name":"code_interpreter","server_label":"code_interpreter","type":"mcp_call","
error":null,"output":"977966748\n","status":"completed"},{"id":"rs_9a378eb6060f3abe","summary":[],"type":"reasoning","content":[{"text":"Okay, so the user wants to multiply 64548 by
 15151 using the Python tool. This is a straightforward calculation, but they specifically asked to use the Python tool rather than just calculating it mentally or explaining how to
 do it.\n\nI see that I have access to a code_interpreter tool which can execute Python code. This is the perfect tool for this task since it will allow me to run the multiplication
 calculation accurately.\n\nLooking at the tool parameters, I need to provide:\n- code: The Python code to execute\n\nThe multiplication operation in Python is simple: I just need t
o use the asterisk (*) operator between the two numbers. So I'll write a simple line of Python code that calculates 64548 * 15151.\n\nI don't need to do anything fancy here - just a straightforward multiplication. The code will be:\n64548 * 15151\n\nI've called the code_interpreter tool with this code, and it has returned the result: 977966748.\n\nSo the answer to the multiplication problem 64548 * 15151 is 977966748. I should provide this result to the user in a clear and concise way. I don't think I need to show the Python code I used since the user just wanted the result of the calculation.\n\nLet me present the final answer to the user.\n","type":"reasoning_text"}],"encrypted_content":null,"status":null},{"id":"msg_8957bb63510b6ea2","content":[{"annotations":[],"text":"\n\nThe result of the multiplication is 977966748.","type":"output_text","logprobs":null}],"role":"assistant","status":"completed","type":"message"}],"parallel_tool_calls":true,"temperature":1.0,"tool_choice":"auto","tools":[{"server_label":"code_interpreter","type":"mcp","allowed_tools":null,"authorization":null,"connector_id":null,"headers":{"test":"test"},"require_approval":null,"server_description":null,"server_url":"IGNORED"}],"top_p":0.95,"background":false,"max_output_tokens":196304,"max_tool_calls":null,"previous_response_id":null,"prompt":null,"reasoning":null,"service_tier":"auto","status":"completed","text":null,"top_logprobs":null,"truncation":"disabled","usage":{"input_tokens":304,"input_tokens_details":{"cached_tokens":240,"input_tokens_per_turn":[],"cached_tokens_per_turn":[]},"output_tokens":362,"output_tokens_details":{"reasoning_tokens":0,"tool_output_tokens":0,"output_tokens_per_turn":[],"tool_output_tokens_per_turn":[]},"total_tokens":666},"user":null,"input_messages":[{"message":"]~!b[]~b]system\nYou are a helpful assistant.\n\n# Tools\nYou may call one or more tools to assist with the user query.\nHere are the tools available in JSONSchema 
format:\n\n<tools>\n<tool>{\"server_label\": \"code_interpreter\", \"type\": \"mcp\", \"allowed_tools\": null, \"authorization\": null, \"connector_id\": null, \"headers\": {\"test\": \"test\"}, \"require_approval\": null, \"server_description\": null, \"server_url\": \"IGNORED\"}</tool>\n</tools>\n\nWhen making tool calls, use XML format to invoke tools and pass parameters:\n\n<minimax:tool_call>\n<invoke name=\"tool-name-1\">\n<parameter name=\"param-key-1\">param-value-1</parameter>\n<parameter name=\"param-key-2\">param-value-2</parameter>\n...\n</invoke>\n</minimax:tool_call>[e~[\n]~b]user\nMultiply 64548*15151 using the python tool.[e~[\n]~b]ai\n<think>\n","tokens":[200034,200019,28463,10,2985,457,258,12473,23413,634,35,31455,10,2985,992,1728,841,436,812,6629,301,7202,418,275,3100,9560,320,9806,457,275,6629,3136,296,19535,19357,6634,3728,60,32650,1100,60,45382,62,16673,11882,29534,2803,494,3689,35580,73413,1164,494,4467,2803,494,109,17051,1164,494,33374,159706,2803,3065,44,494,86312,2803,3065,44,494,102992,3893,2803,3065,44,494,29924,2803,18396,4500,2803,494,4500,47488,494,16525,95,81351,2803,3065,44,494,11882,57254,2803,3065,44,494,11882,17902,2803,494,18320,178244,22089,1579,45382,1100,1579,32650,6983,4625,3527,3994,9885,44,1223,27190,6634,301,48687,6629,306,1916,7729,3728,200052,10,60,88638,1925,1139,45382,28145,45,49,6143,60,24662,1925,1139,2949,38158,45,49,3361,2949,27306,45,49,1579,24662,1100,60,24662,1925,1139,2949,38158,45,50,3361,2949,27306,45,50,1579,24662,1100,4563,1579,88638,1100,200053,200020,10,200019,3995,10,180665,32,48634,3740,42,18331,7257,1818,275,26636,3994,46,200020,10,200019,1361,10,200050,10],"type":"raw_message_tokens"}],"output_messages":[{"message":"The user explicitly asked to \"Multiply 64548*15151 using the python tool.\" The `code_interpreter` tool is available and perfectly suited for this task, as it can execute Python code. 
So, I'm calling that tool with the multiplication operation.\n</think>\n\n\n<minimax:tool_call>\n<invoke name=\"code_interpreter\">\n<parameter name=\"code\">64548 * 15151</parameter>\n</invoke>\n</minimax:tool_call>","tokens":[758,3100,24668,5316,301,494,180665,32,48634,3740,42,18331,7257,1818,275,26636,3994,1511,517,2033,3689,35580,73413,96,3994,355,3136,306,18376,31685,360,546,7201,44,401,412,566,18731,18296,3817,46,2021,44,4227,12554,389,3994,418,275,45271,7311,320,200051,4368,200052,10,60,88638,1925,1139,3689,35580,73413,6143,60,24662,1925,1139,3689,3361,48634,3740,653,32,18331,7257,1579,24662,1100,1579,88638,1100,200053,200020],"type":"raw_message_tokens"},{"message":"]~!b[]~b]system\nYou are a helpful assistant.\n\n# Tools\nYou may call one or more tools to assist with the user query.\nHere are the tools available in JSONSchema format:\n\n<tools>\n<tool>{\"server_label\": \"code_interpreter\", \"type\": \"mcp\", \"allowed_tools\": null, \"authorization\": null, \"connector_id\": null, \"headers\": {\"test\": \"test\"}, \"require_approval\": null, \"server_description\": null, \"server_url\": \"IGNORED\"}</tool>\n</tools>\n\nWhen making tool calls, use XML format to invoke tools and pass parameters:\n\n<minimax:tool_call>\n<invoke name=\"tool-name-1\">\n<parameter name=\"param-key-1\">param-value-1</parameter>\n<parameter name=\"param-key-2\">param-value-2</parameter>\n...\n</invoke>\n</minimax:tool_call>[e~[\n]~b]user\nMultiply 64548*15151 using the python tool.[e~[\n]~b]ai\n<think>\nThe user explicitly asked to \"Multiply 64548*15151 using the python tool.\" The `code_interpreter` tool is available and perfectly suited for this task, as it can execute Python code. 
So, I'm calling that tool with the multiplication operation.\n\n</think>\n\n\n<minimax:tool_call>\n<invoke name=\"code_interpreter\">\n<parameter name=\"code\">64548 * 15151</parameter>\n</invoke>\n</minimax:tool_call>[e~[\n]~b]tool\n<response>977966748\n\n</response>[e~[\n]~b]ai\n<think>\n","tokens":[200034,200019,28463,10,2985,457,258,12473,23413,634,35,31455,10,2985,992,1728,841,436,812,6629,301,7202,418,275,3100,9560,320,9806,457,275,6629,3136,296,19535,19357,6634,3728,60,32650,1100,60,45382,62,16673,11882,29534,2803,494,3689,35580,73413,1164,494,4467,2803,494,109,17051,1164,494,33374,159706,2803,3065,44,494,86312,2803,3065,44,494,102992,3893,2803,3065,44,494,29924,2803,18396,4500,2803,494,4500,47488,494,16525,95,81351,2803,3065,44,494,11882,57254,2803,3065,44,494,11882,17902,2803,494,18320,178244,22089,1579,45382,1100,1579,32650,6983,4625,3527,3994,9885,44,1223,27190,6634,301,48687,6629,306,1916,7729,3728,200052,...



@gemini-code-assist gemini-code-assist bot left a comment


Code Review

The pull request introduces input/output message tracking for ResponsesParser and enables enable_response_messages for ParsableContext. However, several issues were identified, including incorrect population of input_messages in multi-turn conversations, unresolved TODOs that indicate known parsing shortcomings, and a debugger breakpoint left in the code. These could lead to incorrect API responses and should be addressed.

@qandrew qandrew changed the title [responsesAPI][6] input/output messages for ResponsesParser [responsesAPI][8] input/output messages for ResponsesParser Dec 8, 2025
@qandrew qandrew force-pushed the all-input-output-parsable branch from 2d1d3e6 to 40fe6cb Compare December 8, 2025 04:11

mergify bot commented Dec 8, 2025

Documentation preview: https://vllm--30158.org.readthedocs.build/en/30158/

@mergify mergify bot added the documentation Improvements or additions to documentation label Dec 8, 2025

mergify bot commented Dec 8, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @qandrew.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Dec 8, 2025
@qandrew qandrew force-pushed the all-input-output-parsable branch from 40fe6cb to 76cad23 Compare December 8, 2025 05:24
@mergify mergify bot removed the needs-rebase label Dec 8, 2025
Andrew Xia added 2 commits December 7, 2025 21:38
Signed-off-by: Andrew Xia <[email protected]>
fix
Signed-off-by: Andrew Xia <[email protected]>
@qandrew qandrew marked this pull request as ready for review December 8, 2025 05:39

qandrew commented Dec 8, 2025


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.


Comment on lines +6 to +10
from openai.types.responses import ResponseFunctionToolCall, ResponseOutputItem
from openai.types.responses.response_function_tool_call_output_item import (
ResponseFunctionToolCallOutputItem,
)
from openai.types.responses.response_input_item import McpCall


P1: Import McpCall output schema for parsed tool calls

The parser now converts MCP tool call outputs into McpCall objects inside make_response_output_items_from_parsable_context, but the import here pulls McpCall from response_input_item. That class models request inputs and (unlike the output version used elsewhere, e.g. harmony_utils) does not carry the output field this method populates, so parsing an MCP call will either raise validation/TypeError or drop the tool output when assembling the response. This should import the output-item McpCall to correctly serialize tool results for parsable contexts.
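The failure mode the review describes can be illustrated with a minimal sketch (hypothetical stand-in dataclasses, not the real openai SDK types: the request-input McpCall lacks an `output` field, while the response-output variant carries one):

```python
from dataclasses import dataclass


@dataclass
class InputMcpCall:  # stand-in for the request-input McpCall: no `output`
    id: str
    name: str
    arguments: str


@dataclass
class OutputMcpCall:  # stand-in for the output-item McpCall: has `output`
    id: str
    name: str
    arguments: str
    output: str = ""


# The output-item class serializes the tool result as intended.
ok = OutputMcpCall(
    id="mcp_1",
    name="code_interpreter",
    arguments='{"code": "64548 * 15151"}',
    output="977966748\n",
)

# Passing `output` to the input-item class fails at construction time,
# which is the validation/TypeError the review warns about.
try:
    InputMcpCall(id="mcp_1", name="code_interpreter",
                 arguments="{}", output="977966748\n")
    rejected = False
except TypeError:
    rejected = True
```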



return self

def make_response_output_items_from_parsable_context(
qandrew (PR author):

just a refactor, no functional changes.


@chaunceyjiang chaunceyjiang left a comment


Thanks~

@github-project-automation github-project-automation bot moved this from To Triage to Ready in gpt-oss Issues & Enhancements Dec 10, 2025
@chaunceyjiang chaunceyjiang added the ready ONLY add when PR is ready to merge/full CI is needed label Dec 10, 2025