Description
Checked other resources
- This is a feature request, not a bug report or usage question.
- I added a clear and descriptive title that summarizes the feature request.
- I used the GitHub search to find a similar feature request and didn't find it.
- I checked the LangChain documentation and API reference to see if this feature already exists.
- This is not related to the langchain-community package.
Package (Required)
- langchain
- langchain-openai
- langchain-anthropic
- langchain-classic
- langchain-core
- langchain-cli
- langchain-model-profiles
- langchain-tests
- langchain-text-splitters
- langchain-chroma
- langchain-deepseek
- langchain-exa
- langchain-fireworks
- langchain-groq
- langchain-huggingface
- langchain-mistralai
- langchain-nomic
- langchain-ollama
- langchain-perplexity
- langchain-prompty
- langchain-qdrant
- langchain-xai
- Other / not sure / general
Feature Description
I am requesting native support for Programmatic Tool Calling (Code Execution) with Model Context Protocol (MCP) integration.
Currently, LangChain agents typically operate in a "chatty" loop:
- LLM predicts a tool call (JSON).
- Runtime executes the tool.
- Full result is appended to the context.
- LLM reads the result and decides the next step.
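For reference, that loop looks roughly like this today (a minimal sketch; `query_db`, its return value, and the model name are illustrative placeholders, not part of any specific agent):

```python
from langchain_core.messages import HumanMessage, ToolMessage
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI


@tool
def query_db(sql: str) -> str:
    """Run a SQL query and return every row as text (hypothetical example tool)."""
    return "id,total\n1,10\n2,20\n... (500 rows)"


llm = ChatOpenAI(model="gpt-4o").bind_tools([query_db])
messages = [HumanMessage("Find the 5 largest customers")]

while True:
    ai_msg = llm.invoke(messages)          # 1. LLM predicts a tool call (JSON)
    messages.append(ai_msg)
    if not ai_msg.tool_calls:
        break                              # 4. no more tool calls -> final answer
    for call in ai_msg.tool_calls:
        result = query_db.invoke(call["args"])   # 2. runtime executes the tool
        # 3. the FULL result is appended to the context before the next LLM turn
        messages.append(ToolMessage(result, tool_call_id=call["id"]))
```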
The requested feature changes this paradigm to:
- The Agent is aware of available tools as importable libraries.
- The Agent generates a script (e.g., Python or TypeScript) that orchestrates multiple tool calls, loops, and data filtering logic in one go.
- The script runs in a sandbox (e.g., Docker, Daytona, E2B).
- Crucially: Intermediate results stay in the sandbox's variable state and are not automatically dumped into the LLM context unless explicitly requested or returned.
Key capabilities needed:
- Batching via Code: The ability for the LLM to write a script calling `tool_a()` and `tool_b()` sequentially or in loops without context round-trips (see the sketch after this list).
- Lazy Result Loading: The ability to return a reference/ID of a tool result to the LLM, allowing the model to decide whether to fetch the full payload (expensive) or just use the reference in the next code step.
- Direct Return: A mechanism to bypass the final LLM generation step if the tool result (e.g., a generated file or structured dataset) is the final answer.
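To make this concrete, here is a self-contained sketch of the kind of script the agent might generate and run inside the sandbox. `tool_a` and `tool_b` are stand-ins for MCP tools that the proposed bridge would expose as importable, typed Python functions; none of these names are an existing API.

```python
# Stand-ins for MCP tools exposed inside the sandbox by the proposed bridge.
def tool_a(query: str) -> list[dict]:
    """Hypothetical MCP tool, e.g. a SQL query returning many rows."""
    return [{"customer_id": i, "total": i * 10} for i in range(500)]


def tool_b(customer_id: int) -> dict:
    """Hypothetical MCP tool, e.g. a CRM lookup for one customer."""
    return {"customer_id": customer_id, "name": f"Customer {customer_id}"}


# Batching via code: many tool calls plus the filtering logic run in one pass,
# with no LLM round-trip between steps.
rows = tool_a(query="SELECT * FROM orders")      # 500 rows stay in sandbox memory
top5 = sorted(rows, key=lambda r: r["total"], reverse=True)[:5]
enriched = [tool_b(customer_id=r["customer_id"]) for r in top5]

# Direct return / lazy loading: only this small final object (or a reference to it)
# re-enters the LLM context, never the 500 intermediate rows.
final_answer = {"top_customers": enriched}
print(final_answer)
```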
Use Case
Problem: When building complex agents connected to data-heavy MCP servers (e.g., databases, CRM, extensive file systems), the standard tool-calling loop is inefficient.
- Context Pollution: Returning 500 rows of SQL data just to filter them down to 5 rows consumes massive amounts of tokens.
- Latency: Sequential round-trips for multi-step tasks take too long.
- Reliability: LLMs often hallucinate when manually copying data from a tool result string into a new tool call argument.
Solution Value: By shifting orchestration logic to a code execution environment, we can achieve a 60-98% token reduction (per Anthropic and Cloudflare research) and significantly lower latency. This would allow LangChain agents to handle real-world data tasks that are currently cost-prohibitive.
Proposed Solution
The implementation could involve:
- New `CodeExecutionAgent` Architecture: An agent optimized for writing orchestration scripts rather than single JSON tool calls.
- MCP-to-Native Bridge: A utility that converts MCP tool definitions into importable Python/TS functions (with type hints) that can be injected into a sandbox environment.
- Result Primitives (a rough sketch follows this list):
  - `ToolResultReference`: A lightweight object returned to the context containing metadata (e.g., `id: "123"`, `preview: "Dataframe 1000x5..."`).
  - `fetch_result(id)`: A tool exposed to the Agent to retrieve the full content if truly needed.
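A minimal sketch of what these primitives could look like. The `store_result` helper and the in-memory store are assumptions for illustration, not an existing LangChain API:

```python
import uuid
from dataclasses import dataclass

from langchain_core.tools import tool

# Assumed sandbox-side store that keeps full payloads out of the LLM context.
_RESULT_STORE: dict[str, object] = {}


@dataclass
class ToolResultReference:
    """Lightweight stand-in for a large tool result; only this enters the context."""
    id: str
    preview: str


def store_result(payload: object, preview: str) -> ToolResultReference:
    """Park a large payload in the sandbox and hand the LLM a small reference."""
    ref = ToolResultReference(id=str(uuid.uuid4()), preview=preview)
    _RESULT_STORE[ref.id] = payload
    return ref


@tool
def fetch_result(result_id: str) -> str:
    """Retrieve the full payload behind a ToolResultReference, only if truly needed."""
    return str(_RESULT_STORE.get(result_id, f"No result stored under id {result_id}"))
```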
```mermaid
sequenceDiagram
    participant Agent as LLM Context
    participant Sandbox as Code Runtime
    participant Tools as MCP Tools

    rect rgb(140, 140, 140)
        Note over Agent, Tools: Standard Approach (Current)
        Agent->>Tools: Call Tool A
        Tools-->>Agent: Return HUGE Payload (Tokens $$)
        Agent->>Tools: Call Tool B (using data from A)
        Tools-->>Agent: Return Result
    end

    rect rgb(120, 155, 120)
        Note over Agent, Tools: Proposed Code Execution Approach
        Agent->>Sandbox: Send Script (Call A, filter data, Call B)
        activate Sandbox
        Sandbox->>Tools: Execute A
        Sandbox->>Sandbox: Filter/Process Data (in memory)
        Sandbox->>Tools: Execute B
        deactivate Sandbox
        Sandbox-->>Agent: Return Final Answer OR Result Reference (Low Tokens)
    end
```
Alternatives Considered
- Current Standard: Chaining `Runnable` objects or using `LangGraph` with standard tool nodes. This works for simple tasks but fails to scale with data volume due to context window limits.
- Custom Sandboxes: Manually implementing a Python REPL tool. While possible, it lacks the standardized discovery of MCP tools and requires heavy lifting to safely bridge the tool definitions into the REPL scope.
Additional Context
There is already significant community momentum around this pattern. LangChain should aim to support this natively to remain the standard for efficient agents.
Existing implementations proving the concept:
- Open PTC Agent (by Chen-zexi):
- GitHub Repo, Reddit Post
- Built on top of LangChain DeepAgent.
- Implements Anthropic's "Programmatic Tool Calling".
- Demonstrates distinct advantages: tools are discovered on-demand, and code executes in a Daytona sandbox.
- Code-Mode (by Juanviera23):
- GitHub Repo, Reddit Post
- Demonstrates a >60% token reduction.
- Uses a TypeScript sandbox where the agent writes a single script to replace 4-6 sequential tool calls.
- Anthropic's "Code Execution With MCP" post