Conversation


@qandrew qandrew commented Dec 5, 2025

Purpose

  • Add input/output messages to ResponsesParser. For a two-turn conversation (user input 1, output 1, MCP call, input 2, output 2), user input 1 lands in response.input_messages, and [output 1, input 2, output 2] in response.output_messages.
  • Refactor make_response_output_items_from_parsable_context into a method of ResponsesParser.
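The split described in the first bullet can be sketched as follows (illustrative dicts only, not the actual ResponsesParser message types; here the MCP tool turn is grouped with the outputs):

```python
# Two-turn conversation from the PR description, in arrival order.
turns = [
    {"role": "user", "content": "input 1"},          # first user turn
    {"role": "assistant", "content": "output 1"},
    {"role": "tool", "content": "MCP call result"},
    {"role": "user", "content": "input 2"},
    {"role": "assistant", "content": "output 2"},
]

# Everything up to the first model output goes to input_messages;
# everything generated after it (including later user turns and
# tool traffic) goes to output_messages.
input_messages = turns[:1]
output_messages = turns[1:]

print([m["content"] for m in output_messages])
```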

NOTE: This PR must be merged after #30230; it builds on top of #30115.

Test Plan

 CUDA_VISIBLE_DEVICES=4,5,6,7 VLLM_USE_EXPERIMENTAL_PARSER_CONTEXT=1 vllm serve MiniMaxAI/MiniMax-M2 --tensor-parallel-size 4 --tool-call-parser minimax_m2 --reasoning-parser minimax_m2 --enable-auto-tool-choice --trust-remote-code --tool-server=localhost:8081/container,localhost:8081/browser,localhost:8081/python
 ./vllm/fb/scripts/gptoss/run_mcp_server.sh
 curl -X POST "http://localhost:8000/v1/responses"   -H "Content-Type: application/json"   -H "Authorization: Bearer dummy-api-key"   -d '{
        "model": "MiniMaxAI/MiniMax-M2",
        "input": "Multiply 64548*15151 using the python tool.",
        "tools": [
          {
            "type": "mcp",
            "server_label": "code_interpreter",
            "headers": {"test": "test"},
            "server_url": "IGNORED"
          }
        ],
        "enable_response_messages": true
      }'
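The same request can be built from Python (a sketch mirroring the curl command above; it assumes the vllm serve process from the test plan is running on localhost:8000, and `enable_response_messages` is the extra body field this PR reads):

```python
import json

# Payload mirroring the curl command; `enable_response_messages`
# asks the server to include raw input/output messages in the response.
payload = {
    "model": "MiniMaxAI/MiniMax-M2",
    "input": "Multiply 64548*15151 using the python tool.",
    "tools": [
        {
            "type": "mcp",
            "server_label": "code_interpreter",
            "headers": {"test": "test"},
            "server_url": "IGNORED",
        }
    ],
    "enable_response_messages": True,
}

body = json.dumps(payload)

# To actually send it (requires the running server from the test plan):
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:8000/v1/responses",
#     data=body.encode(),
#     headers={"Content-Type": "application/json",
#              "Authorization": "Bearer dummy-api-key"},
# )
# print(urllib.request.urlopen(req).read())
```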

Test Result

{"id":"resp_a83cc9fdcf5438d0","created_at":1765170748,"incomplete_details":null,"instructions":null,"metadata":null,"model":"MiniMaxAI/MiniMax-M2","object":"response","output":[{"id
":"rs_8885082003365dd4","summary":[],"type":"reasoning","content":[{"text":"The user explicitly asked to \"Multiply 64548*15151 using the python tool.\" The `code_interpreter` tool
is available and perfectly suited for this task, as it can execute Python code. So, I'm calling that tool with the multiplication operation.\n","type":"reasoning_text"}],"encrypted_
content":null,"status":null},{"id":"mcp_a065ac97a78a55a8","arguments":"{\"code\": \"64548 * 15151\"}","name":"code_interpreter","server_label":"code_interpreter","type":"mcp_call","
error":null,"output":"977966748\n","status":"completed"},{"id":"rs_9a378eb6060f3abe","summary":[],"type":"reasoning","content":[{"text":"Okay, so the user wants to multiply 64548 by
 15151 using the Python tool. This is a straightforward calculation, but they specifically asked to use the Python tool rather than just calculating it mentally or explaining how to
 do it.\n\nI see that I have access to a code_interpreter tool which can execute Python code. This is the perfect tool for this task since it will allow me to run the multiplication
 calculation accurately.\n\nLooking at the tool parameters, I need to provide:\n- code: The Python code to execute\n\nThe multiplication operation in Python is simple: I just need t
o use the asterisk (*) operator between the two numbers. So I'll write a simple line of Python code that calculates 64548 * 15151.\n\nI don't need to do anything fancy here - just a straightforward multiplication. The code will be:\n64548 * 15151\n\nI've called the code_interpreter tool with this code, and it has returned the result: 977966748.\n\nSo the answer to the multiplication problem 64548 * 15151 is 977966748. I should provide this result to the user in a clear and concise way. I don't think I need to show the Python code I used since the user just wanted the result of the calculation.\n\nLet me present the final answer to the user.\n","type":"reasoning_text"}],"encrypted_content":null,"status":null},{"id":"msg_8957bb63510b6ea2","content":[{"annotations":[],"text":"\n\nThe result of the multiplication is 977966748.","type":"output_text","logprobs":null}],"role":"assistant","status":"completed","type":"message"}],"parallel_tool_calls":true,"temperature":1.0,"tool_choice":"auto","tools":[{"server_label":"code_interpreter","type":"mcp","allowed_tools":null,"authorization":null,"connector_id":null,"headers":{"test":"test"},"require_approval":null,"server_description":null,"server_url":"IGNORED"}],"top_p":0.95,"background":false,"max_output_tokens":196304,"max_tool_calls":null,"previous_response_id":null,"prompt":null,"reasoning":null,"service_tier":"auto","status":"completed","text":null,"top_logprobs":null,"truncation":"disabled","usage":{"input_tokens":304,"input_tokens_details":{"cached_tokens":240,"input_tokens_per_turn":[],"cached_tokens_per_turn":[]},"output_tokens":362,"output_tokens_details":{"reasoning_tokens":0,"tool_output_tokens":0,"output_tokens_per_turn":[],"tool_output_tokens_per_turn":[]},"total_tokens":666},"user":null,"input_messages":[{"message":"]~!b[]~b]system\nYou are a helpful assistant.\n\n# Tools\nYou may call one or more tools to assist with the user query.\nHere are the tools available in JSONSchema 
format:\n\n<tools>\n<tool>{\"server_label\": \"code_interpreter\", \"type\": \"mcp\", \"allowed_tools\": null, \"authorization\": null, \"connector_id\": null, \"headers\": {\"test\": \"test\"}, \"require_approval\": null, \"server_description\": null, \"server_url\": \"IGNORED\"}</tool>\n</tools>\n\nWhen making tool calls, use XML format to invoke tools and pass parameters:\n\n<minimax:tool_call>\n<invoke name=\"tool-name-1\">\n<parameter name=\"param-key-1\">param-value-1</parameter>\n<parameter name=\"param-key-2\">param-value-2</parameter>\n...\n</invoke>\n</minimax:tool_call>[e~[\n]~b]user\nMultiply 64548*15151 using the python tool.[e~[\n]~b]ai\n<think>\n","tokens":[200034,200019,28463,10,2985,457,258,12473,23413,634,35,31455,10,2985,992,1728,841,436,812,6629,301,7202,418,275,3100,9560,320,9806,457,275,6629,3136,296,19535,19357,6634,3728,60,32650,1100,60,45382,62,16673,11882,29534,2803,494,3689,35580,73413,1164,494,4467,2803,494,109,17051,1164,494,33374,159706,2803,3065,44,494,86312,2803,3065,44,494,102992,3893,2803,3065,44,494,29924,2803,18396,4500,2803,494,4500,47488,494,16525,95,81351,2803,3065,44,494,11882,57254,2803,3065,44,494,11882,17902,2803,494,18320,178244,22089,1579,45382,1100,1579,32650,6983,4625,3527,3994,9885,44,1223,27190,6634,301,48687,6629,306,1916,7729,3728,200052,10,60,88638,1925,1139,45382,28145,45,49,6143,60,24662,1925,1139,2949,38158,45,49,3361,2949,27306,45,49,1579,24662,1100,60,24662,1925,1139,2949,38158,45,50,3361,2949,27306,45,50,1579,24662,1100,4563,1579,88638,1100,200053,200020,10,200019,3995,10,180665,32,48634,3740,42,18331,7257,1818,275,26636,3994,46,200020,10,200019,1361,10,200050,10],"type":"raw_message_tokens"}],"output_messages":[{"message":"The user explicitly asked to \"Multiply 64548*15151 using the python tool.\" The `code_interpreter` tool is available and perfectly suited for this task, as it can execute Python code. 
So, I'm calling that tool with the multiplication operation.\n</think>\n\n\n<minimax:tool_call>\n<invoke name=\"code_interpreter\">\n<parameter name=\"code\">64548 * 15151</parameter>\n</invoke>\n</minimax:tool_call>","tokens":[758,3100,24668,5316,301,494,180665,32,48634,3740,42,18331,7257,1818,275,26636,3994,1511,517,2033,3689,35580,73413,96,3994,355,3136,306,18376,31685,360,546,7201,44,401,412,566,18731,18296,3817,46,2021,44,4227,12554,389,3994,418,275,45271,7311,320,200051,4368,200052,10,60,88638,1925,1139,3689,35580,73413,6143,60,24662,1925,1139,3689,3361,48634,3740,653,32,18331,7257,1579,24662,1100,1579,88638,1100,200053,200020],"type":"raw_message_tokens"},{"message":"]~!b[]~b]system\nYou are a helpful assistant.\n\n# Tools\nYou may call one or more tools to assist with the user query.\nHere are the tools available in JSONSchema format:\n\n<tools>\n<tool>{\"server_label\": \"code_interpreter\", \"type\": \"mcp\", \"allowed_tools\": null, \"authorization\": null, \"connector_id\": null, \"headers\": {\"test\": \"test\"}, \"require_approval\": null, \"server_description\": null, \"server_url\": \"IGNORED\"}</tool>\n</tools>\n\nWhen making tool calls, use XML format to invoke tools and pass parameters:\n\n<minimax:tool_call>\n<invoke name=\"tool-name-1\">\n<parameter name=\"param-key-1\">param-value-1</parameter>\n<parameter name=\"param-key-2\">param-value-2</parameter>\n...\n</invoke>\n</minimax:tool_call>[e~[\n]~b]user\nMultiply 64548*15151 using the python tool.[e~[\n]~b]ai\n<think>\nThe user explicitly asked to \"Multiply 64548*15151 using the python tool.\" The `code_interpreter` tool is available and perfectly suited for this task, as it can execute Python code. 
So, I'm calling that tool with the multiplication operation.\n\n</think>\n\n\n<minimax:tool_call>\n<invoke name=\"code_interpreter\">\n<parameter name=\"code\">64548 * 15151</parameter>\n</invoke>\n</minimax:tool_call>[e~[\n]~b]tool\n<response>977966748\n\n</response>[e~[\n]~b]ai\n<think>\n","tokens":[200034,200019,28463,10,2985,457,258,12473,23413,634,35,31455,10,2985,992,1728,841,436,812,6629,301,7202,418,275,3100,9560,320,9806,457,275,6629,3136,296,19535,19357,6634,3728,60,32650,1100,60,45382,62,16673,11882,29534,2803,494,3689,35580,73413,1164,494,4467,2803,494,109,17051,1164,494,33374,159706,2803,3065,44,494,86312,2803,3065,44,494,102992,3893,2803,3065,44,494,29924,2803,18396,4500,2803,494,4500,47488,494,16525,95,81351,2803,3065,44,494,11882,57254,2803,3065,44,494,11882,17902,2803,494,18320,178244,22089,1579,45382,1100,1579,32650,6983,4625,3527,3994,9885,44,1223,27190,6634,301,48687,6629,306,1916,7729,3728,200052,...



@gemini-code-assist gemini-code-assist bot left a comment


Code Review

The pull request introduces input/output message tracking for ResponsesParser and enables enable_response_messages for ParsableContext. However, several issues were identified, including incorrect population of input_messages in multi-turn conversations, unresolved TODOs that indicate known parsing shortcomings, and a debugger breakpoint left in the code. These could lead to incorrect API responses and should be addressed.

@qandrew qandrew changed the title [responsesAPI][6] input/output messages for ResponsesParser [responsesAPI][8] input/output messages for ResponsesParser Dec 8, 2025
@qandrew qandrew force-pushed the all-input-output-parsable branch from 2d1d3e6 to 40fe6cb Compare December 8, 2025 04:11

mergify bot commented Dec 8, 2025

Documentation preview: https://vllm--30158.org.readthedocs.build/en/30158/

@mergify mergify bot added the documentation Improvements or additions to documentation label Dec 8, 2025

mergify bot commented Dec 8, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @qandrew.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Dec 8, 2025
@qandrew qandrew force-pushed the all-input-output-parsable branch from 40fe6cb to 76cad23 Compare December 8, 2025 05:24
@mergify mergify bot removed the needs-rebase label Dec 8, 2025
Andrew Xia added 2 commits December 7, 2025 21:38
Signed-off-by: Andrew Xia <[email protected]>
fix
Signed-off-by: Andrew Xia <[email protected]>
@qandrew qandrew marked this pull request as ready for review December 8, 2025 05:39

qandrew commented Dec 8, 2025


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.


Comment on lines +6 to +10
from openai.types.responses import ResponseFunctionToolCall, ResponseOutputItem
from openai.types.responses.response_function_tool_call_output_item import (
ResponseFunctionToolCallOutputItem,
)
from openai.types.responses.response_input_item import McpCall


P1: Import McpCall output schema for parsed tool calls

The parser now converts MCP tool call outputs into McpCall objects inside make_response_output_items_from_parsable_context, but the import here pulls McpCall from response_input_item. That class models request inputs and (unlike the output version used elsewhere, e.g. harmony_utils) does not carry the output field this method populates, so parsing an MCP call will either raise validation/TypeError or drop the tool output when assembling the response. This should import the output-item McpCall to correctly serialize tool results for parsable contexts.
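The failure mode the review describes can be illustrated with a minimal sketch (hypothetical stand-in dataclasses, not the real openai SDK types: the request-input McpCall lacks an `output` field, while the response-output variant carries one):

```python
from dataclasses import dataclass


@dataclass
class InputMcpCall:  # stand-in for the request-input McpCall: no `output`
    id: str
    name: str
    arguments: str


@dataclass
class OutputMcpCall:  # stand-in for the output-item McpCall: has `output`
    id: str
    name: str
    arguments: str
    output: str = ""


# The output-item class serializes the tool result as intended.
ok = OutputMcpCall(
    id="mcp_1",
    name="code_interpreter",
    arguments='{"code": "64548 * 15151"}',
    output="977966748\n",
)

# Passing `output` to the input-item class fails at construction time,
# which is the validation/TypeError the review warns about.
try:
    InputMcpCall(id="mcp_1", name="code_interpreter",
                 arguments="{}", output="977966748\n")
    rejected = False
except TypeError:
    rejected = True
```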



return self

def make_response_output_items_from_parsable_context(
qandrew (PR author):

just a refactor, no functional changes.


@chaunceyjiang chaunceyjiang left a comment


Thanks~

@github-project-automation github-project-automation bot moved this from To Triage to Ready in gpt-oss Issues & Enhancements Dec 10, 2025
@chaunceyjiang chaunceyjiang added the ready ONLY add when PR is ready to merge/full CI is needed label Dec 10, 2025