Expose support for "custom" functions #6985

@stephentoub

OpenAI has recently added support for "custom" tools, where the input for the function isn't split across multiple schematized parameters but instead is just free-form text. These also support using a context-free grammar, instead of a schema, to guide the model on what that input can be.

We need to figure out the right way to expose this concept in M.E.AI. There are a few things we'll need to handle. Today:

  • AIFunction/AIFunctionDeclaration bake in the notion that inputs are guided by a JSON schema.
  • FunctionCallContent expects a collection of parameters.
  • Neither FunctionCallContent nor FunctionResultContent carries an indication of whether it's tied to a normal function tool or a "custom" tool, but OpenAI differentiates, and would expect a FunctionResultContent produced in response to a custom tool to be on the wire differently from one produced for a regular function tool.

We have a variety of options for how we could expose this, e.g.

  1. We could add a string? FunctionContract or string? Grammar or something like that on AIFunctionDeclaration, which could be used by a leaf client to indicate it's a custom thing. AIFunctionFactoryCreateOptions could be augmented with one as well, and the resulting AIFunction it produces would likely behave differently in its InvokeAsync, such as by only allowing a single string/TextContent/etc. input parameter and sourcing it from a single parameter in the FunctionCallContent, regardless of the name of that entry. FunctionCallContent and FunctionResultContent could be augmented with an additional property that indicates what kind of tool it's associated with. FunctionInvokingChatClient would be updated to know about the special-ness of certain AIFunctions (e.g. whether that property was set) and would tweak how it creates FunctionResultContent, or maybe it would just propagate the kind property from the FCC to the FRC, regardless of what AIFunction is used.
  2. We could introduce a new hierarchy of tools, e.g. AICustomFunction : AICustomFunctionDeclaration : AITool. FunctionInvokingChatClient would be updated to be aware of it and know how to invoke it. We would still need updates to FunctionCallContent/FunctionResultContent as in (1).
  3. Something so cool I don't know about it.

There are variations on all of these, too, e.g. we could utilize AdditionalProperties on the various content types rather than strongly-typed properties (but as everyone throughout the pipeline would need to agree on the meaning of things, strongly-typed properties are probably better).
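
To make the "propagate the kind property from the FCC to the FRC" part of option (1) concrete, here is a rough sketch. Everything in it is an assumption for illustration: SketchToolKind, SketchCallContent, SketchResultContent, and SketchInvoker are hypothetical stand-ins, not the real Microsoft.Extensions.AI types, and the actual shape would come out of the prototyping.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for illustration; these are NOT the real Microsoft.Extensions.AI types.

// Assumed discriminator between schema-based functions and free-form "custom" tools.
public enum SketchToolKind { Function, Custom }

// Stand-in for FunctionCallContent with the proposed additional kind property.
public sealed record SketchCallContent(
    string CallId,
    string Name,
    IReadOnlyDictionary<string, object?> Arguments,
    SketchToolKind Kind);

// Stand-in for FunctionResultContent carrying the same kind.
public sealed record SketchResultContent(string CallId, object? Result, SketchToolKind Kind);

public static class SketchInvoker
{
    // A custom tool takes one free-form text input; read it from the single
    // argument entry, whatever that entry happens to be named.
    public static string GetCustomInput(SketchCallContent call)
    {
        if (call.Arguments.Count != 1)
        {
            throw new ArgumentException("Expected exactly one input for a custom tool call.", nameof(call));
        }

        return call.Arguments.Values.Single()?.ToString() ?? string.Empty;
    }

    // Runs the handler and copies the kind from the call onto the result, so a
    // leaf client could later choose the matching wire representation.
    public static SketchResultContent InvokeCustom(SketchCallContent call, Func<string, string> handler)
    {
        if (call.Kind != SketchToolKind.Custom)
        {
            throw new InvalidOperationException("This sketch only handles custom tool calls.");
        }

        string output = handler(GetCustomInput(call));
        return new SketchResultContent(call.CallId, output, call.Kind);
    }
}

public static class Program
{
    public static void Main()
    {
        var call = new SketchCallContent(
            "call_1",
            "run_query",
            new Dictionary<string, object?> { ["anything"] = "SELECT 1" },
            SketchToolKind.Custom);

        SketchResultContent result = SketchInvoker.InvokeCustom(call, text => $"ran: {text}");
        Console.WriteLine($"{result.CallId} [{result.Kind}] {result.Result}");
    }
}
```

The point of the sketch is only that the result inherits its kind from the call rather than from whichever AIFunction handled it; whether that belongs in FunctionInvokingChatClient or elsewhere is still open.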

We need to prototype this out end-to-end to determine the right course, and then do it. The .NET OpenAI library doesn't yet expose types for this, but it will in the near future. We should also investigate how other abstractions are handling this, and if other services like Gemini and Anthropic are introducing similar capabilities.

Labels: area-ai (Microsoft.Extensions.AI libraries), untriaged