diff --git a/rfcs/0001-workflow-modal-actions.md b/rfcs/0001-workflow-modal-actions.md
new file mode 100644
index 00000000..952cebc7
--- /dev/null
+++ b/rfcs/0001-workflow-modal-actions.md
@@ -0,0 +1,576 @@
+# RFC: Workflow-Based Modal Actions with `"use step"`
+
+**Status:** Draft
+**Author:** v0
+**Date:** 2026-02-15
+
+## Summary
+
+Replace the current `bot.onAction` / `bot.onModalSubmit` / `bot.onModalClose` string-matcher pattern with an inline, awaitable modal API powered by [Workflow DevKit](https://useworkflow.dev). Instead of scattering handler registration across the file and correlating them through `callbackId` strings, modals become a single awaitable expression inside a `"use workflow"` function. The workflow suspends when a modal is opened, and resumes with the user's submitted values when they interact with it.
+
+## Motivation
+
+### The Problem
+
+Today, handling a modal interaction in chat-sdk requires **three separate registrations** connected by opaque string identifiers:
+
+```tsx
+// 1. Register a button handler
+bot.onAction("feedback", async (event) => {
+  await event.openModal(
+
+
+
+    ,
+  );
+});
+
+// 2. Register a submit handler (matched by callbackId string)
+bot.onModalSubmit("feedback_form", async (event) => {
+  if (!event.values.message || event.values.message.length < 5) {
+    return { action: "errors", errors: { message: "Too short" } };
+  }
+  await event.relatedThread?.post(`Feedback: ${event.values.message}`);
+});
+
+// 3. Register a close handler (matched by callbackId string)
+bot.onModalClose("feedback_form", async (event) => {
+  console.log(`${event.user.userName} cancelled feedback`);
+});
+```
+
+This has several issues:
+
+1. **Scattered logic** -- A single user interaction (click button, fill form, handle result) is split across 3 disconnected handler registrations. You have to mentally trace `callbackId` strings to understand the flow.
+
+2. **String coupling** -- The `"feedback"` action ID and `"feedback_form"` callback ID are magic strings that connect the handlers. Renaming one without the other silently breaks the flow. There's no compile-time safety.
+
+3. **No shared scope** -- The action handler and submit handler can't share local variables. Context must be threaded through `privateMetadata` (serialized JSON strings) or the state adapter, adding boilerplate and another source of bugs.
+
+4. **Linear flows are hard to express** -- Multi-step wizards (modal A -> modal B -> confirmation) require chaining multiple `onModalSubmit` handlers with increasingly complex `privateMetadata` passing. What should be a simple sequential flow becomes a state machine.
+
+5. **No built-in timeout or cancellation** -- If a user opens a modal and walks away, the modal context sits in Redis for 24 hours. There's no ergonomic way to add a timeout or cleanup logic.
+
+### The Vision
+
+What if a modal interaction was just an `await`?
+
+```tsx
+bot.onAction("feedback", async (event) => {
+  "use workflow";
+
+  const result = await event.openModal(
+
+
+
+    ,
+  );
+
+  // This code runs after the user submits -- same scope, same function
+  await event.thread.post(`Feedback (${result.values.category}): ${result.values.message}`);
+});
+```
+
+No `callbackId`. No `onModalSubmit`. No `privateMetadata`. The workflow suspends when the modal opens and resumes with the form values when the user submits. Cancellation is just a try/catch.
+
+## Design
+
+### Core Primitive: `openModal` Returns a Promise
+
+Today `event.openModal()` returns `Promise<{ viewId: string } | undefined>` -- it fires and forgets. Under the workflow model, it returns `Promise<ModalResult>` -- a promise that **suspends the workflow** until the user submits or closes the modal.
+
+Under the hood, this maps directly to Workflow DevKit's `createWebhook()` pattern.
+Per the WDK docs, `createWebhook()` returns a `Webhook` object where `await webhook` resolves to a standard `Request` (or `RequestWithResponse` for dynamic responses). The adapter POSTs the modal event data as JSON to `webhook.url`, and the workflow reads it via `request.json()`.
+
+The implementation is split into orchestration (workflow function) and actual work (step functions), following WDK best practices -- workflow functions orchestrate, step functions have full Node.js access:
+
+```ts
+import { createWebhook, type RequestWithResponse } from "workflow";
+
+// Step function: opens the modal on the platform (needs Node.js / adapter access)
+async function platformOpenModal(
+  adapter: Adapter,
+  triggerId: string,
+  modalElement: ModalElement,
+  webhookUrl: string,
+): Promise<{ viewId: string }> {
+  "use step";
+  return adapter.openModal(triggerId, modalElement, undefined, { webhookUrl });
+}
+
+// Step function: parses the webhook request (respondWith must be in a step)
+async function parseModalWebhook(
+  request: RequestWithResponse,
+): Promise {
+  "use step";
+  const data = await request.json();
+
+  // For validation flows: send a synchronous response back to the platform
+  // (e.g., Slack expects `response_action: "errors"` in the HTTP response)
+  // This is handled later via respondWith() -- see Validation Loop section
+  return data;
+}
+
+// Conceptual implementation inside the Chat class
+async openModal(modal: ModalElement | CardJSXElement): Promise<ModalResult> {
+  "use workflow";
+
+  // createWebhook() -- no type parameter; always resolves to Request
+  const webhook = createWebhook({ respondWith: "manual" });
+
+  // Step: open the modal on the platform, passing webhook.url as the callback
+  await platformOpenModal(adapter, triggerId, modalElement, webhook.url);
+
+  // Workflow suspends here -- no compute consumed while user fills the form
+  const request = await webhook;
+
+  // Step: parse the request body
+  const data = await parseModalWebhook(request);
+
+  if (data.type === "submit") {
+    return { action: "submit", values: data.values, user: data.user };
+  }
+
+  throw new ModalClosedError(data.user);
+}
+```
+
+Key WDK patterns used here:
+
+- **`createWebhook()`** -- no generic type parameter (unlike `createHook()`). Always resolves to `Request`.
+- **`respondWith: "manual"`** -- enables dynamic HTTP responses from step functions, critical for the validation loop (Slack needs `{ response_action: "errors" }` in the synchronous HTTP response).
+- **Step functions for all "real work"** -- `platformOpenModal` and `parseModalWebhook` are `"use step"` functions with full Node.js access. The workflow function only orchestrates.
+- **`respondWith()` called from step functions** -- per WDK docs, `request.respondWith()` must be called inside a `"use step"` function.
+
+The workflow **suspends** at `await webhook`. When the platform sends the modal submission back to chat-sdk, instead of routing to `onModalSubmit` handlers, the adapter POSTs to `webhook.url` to resume the workflow with the submitted data.
+
+### Inline `onAction` on Button Components
+
+Currently, buttons use an `id` prop and action handlers are registered separately via `bot.onAction("id", handler)`. This RFC proposes an additional `onAction` prop that binds the handler inline:
+
+```tsx
+// Current pattern -- string coupling
+
+
+bot.onAction("approve", async (event) => { /* ... */ });
+
+// Proposed pattern -- inline binding
+
+```
+
+**How it works:**
+
+1. When the JSX is rendered, `onAction` closures are registered in a per-render handler map keyed by an auto-generated action ID.
+2. The `id` prop is auto-generated (e.g., `action_`) and embedded in the platform payload.
+3. When the platform sends back the action event, the Chat class looks up the closure by auto-generated ID and invokes it.
+4. Since the closure is a `"use workflow"` function, it becomes a durable workflow run that can suspend/resume.
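
The registration and lookup described in steps 1-3 could be sketched roughly as follows. This is only an illustration under stated assumptions, not chat-sdk's implementation: `ActionHandlerRegistry`, `register`, `has`, `dispatch`, the trimmed-down `ActionEvent` shape, and the counter-based ID scheme are all hypothetical names and choices made for the example. Step 4 (running the closure as a durable workflow) is not modeled here.

```typescript
// Hypothetical sketch: none of these names come from chat-sdk.
interface ActionEvent {
  actionId: string;
  userId: string;
}

type ActionHandler = (event: ActionEvent) => Promise<void>;

class ActionHandlerRegistry {
  private readonly handlers = new Map<string, ActionHandler>();
  private nextId = 0;

  // Steps 1-2: store the closure and return the generated ID that the
  // renderer would embed in the platform payload. The counter-based ID
  // scheme is an assumption made for this sketch.
  register(handler: ActionHandler): string {
    const actionId = `action_${this.nextId++}`;
    this.handlers.set(actionId, handler);
    return actionId;
  }

  // Reports whether an inline handler exists for the given ID.
  has(actionId: string): boolean {
    return this.handlers.has(actionId);
  }

  // Step 3: look up the closure by ID and invoke it. Resolves to false when
  // no inline handler is registered, so a caller could fall back to the
  // existing bot.onAction() matchers.
  async dispatch(event: ActionEvent): Promise<boolean> {
    const handler = this.handlers.get(event.actionId);
    if (handler === undefined) {
      return false;
    }
    await handler(event);
    return true;
  }
}

// Usage: register a closure at render time, then dispatch the incoming event.
const registry = new ActionHandlerRegistry();
const approved: string[] = [];
const approveId = registry.register(async (event) => {
  approved.push(event.userId);
});
void registry.dispatch({ actionId: approveId, userId: "U123" });
```

An in-memory map like this one would not survive a process restart, so a real registry would presumably need to keep the handler reachable some other way; the sketch only shows the ID-to-closure lookup.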
+
+The existing `id` + `bot.onAction()` pattern continues to work -- `onAction` is purely additive.
+
+### Type-Safe Modal Results
+
+The `openModal` return type encodes the form field IDs and types from the modal definition:
+
+```ts
+interface ModalResult<TValues extends Record<string, string> = Record<string, string>> {
+  action: "submit";
+  values: TValues;
+  user: Author;
+  viewId: string;
+  raw: unknown;
+}
+```
+
+With generics on the Modal component, we can infer the shape:
+
+```tsx
+const result = await event.openModal(
+   title="Feedback">
+
+
+  ,
+);
+
+result.values.message; // string -- type-safe
+result.values.category; // string -- type-safe
+result.values.typo; // TypeScript error
+```
+
+### Validation Loop
+
+Server-side validation that sends error messages back to the modal (Slack's `response_action: "errors"` pattern) becomes a simple loop:
+
+```tsx
+bot.onAction("report", async (event) => {
+  "use workflow";
+
+  let result: ModalResult;
+  let errors: Record<string, string> | null = null;
+
+  do {
+    result = await event.openModal(
+
+
+
+
+      ,
+    );
+
+    errors = null;
+    if (result.values.title.length < 3) {
+      errors = { title: "Title must be at least 3 characters" };
+    }
+  } while (errors);
+
+  await event.thread.post(`Bug filed: ${result.values.title} (${result.values.severity})`);
+});
+```
+
+Internally, when `errors` is set, the next `openModal` call uses `request.respondWith()` (from a `"use step"` function, per WDK requirements) to send a `response_action: "errors"` response back to the platform synchronously, then creates a new webhook and suspends again for the next submission. This leverages `createWebhook({ respondWith: "manual" })` so that each submission can receive a dynamic response before the workflow re-suspends.
+
+### Cancellation via Try/Catch
+
+When a user closes a modal (clicks Cancel or the X button), the webhook resolves with a `close` event.
+The `openModal` implementation throws a `ModalClosedError`:
+
+```tsx
+bot.onAction("feedback", async (event) => {
+  "use workflow";
+
+  try {
+    const result = await event.openModal(
+
+
+      ,
+    );
+    await event.thread.post(`Thanks for the feedback: ${result.values.message}`);
+  } catch (err) {
+    if (err instanceof ModalClosedError) {
+      console.log(`${err.user.userName} cancelled the feedback form`);
+      // Optionally notify the user
+    }
+  }
+});
+```
+
+This replaces `bot.onModalClose()` entirely for workflows. The error is caught in the same scope where the modal was opened, with full access to the surrounding closure.
+
+### Timeout Pattern
+
+Using Workflow DevKit's `sleep()` and `Promise.race`:
+
+```tsx
+bot.onAction("approval", async (event) => {
+  "use workflow";
+
+  const modalPromise = event.openModal(
+
+
+    ,
+  );
+
+  const result = await Promise.race([
+    modalPromise,
+    sleep("1h").then(() => "timeout" as const),
+  ]);
+
+  if (result === "timeout") {
+    await event.thread.post("Approval request expired after 1h.");
+    return;
+  }
+
+  await event.thread.post(`Approved: ${result.values.reason}`);
+});
+```
+
+No compute resources are consumed during the sleep or while waiting for the modal -- the workflow is fully suspended.
+
+### Multi-Step Wizard
+
+Sequential modals that would currently require chaining multiple `onModalSubmit` handlers with `privateMetadata` become a simple linear flow:
+
+```tsx
+bot.onAction("onboarding", async (event) => {
+  "use workflow";
+
+  // Step 1: Basic info
+  const step1 = await event.openModal(
+
+
+
+    ,
+  );
+
+  // Step 2: Preferences (has access to step1 values in scope!)
+  const step2 = await event.openModal(
+
+
+
+    ,
+  );
+
+  // Step 3: Confirmation
+  const step3 = await event.openModal(
+
+
+    ,
+  );
+
+  // All values available in one scope -- no privateMetadata gymnastics
+  await event.thread.post(
+    `Onboarded ${step1.values.name} (${step1.values.email}) to ${step2.values.team} in ${step2.values.timezone}`,
+  );
+});
+```
+
+### Parallel Modal Collection
+
+Using `Promise.all` with webhooks to collect responses from multiple users:
+
+```tsx
+async function collectVotes(thread: Thread, voters: string[]) {
+  "use workflow";
+
+  const results = await Promise.all(
+    voters.map(async (userId) => {
+      const dmThread = await bot.openDM(userId);
+      await dmThread.post(
+
+        Please submit your vote.
+
+
+
+        ,
+      );
+    }),
+  );
+
+  return results;
+}
+```
+
+## Implementation
+
+### Architecture
+
+```
+                                    ┌──────────────────────────────────────┐
+                                    │           Workflow Runtime           │
+                                    │                                      │
+  User clicks                       │  bot.onAction("feedback", async () { │
+  [Feedback] button                 │    "use workflow";                   │
+       │                            │                                      │
+       ▼                            │    // Step 1: open modal             │
+  ┌──────────┐  processAction()     │    const webhook = createWebhook()   │
+  │ Platform ├─────────────────────►│    adapter.openModal(triggerId,      │
+  │ (Slack)  │                      │      modal, webhook.url)             │
+  └──────────┘                      │                                      │
+       │                            │    ──── workflow suspends ────       │
+       │  User fills form           │         (no compute)                 │
+       │  and clicks Submit         │                                      │
+       ▼                            │    ──── webhook fires ────           │
+  ┌──────────┐  POST webhook.url    │                                      │
+  │ Platform ├─────────────────────►│    const result = await webhook      │
+  │ (Slack)  │                      │    // { values, user, viewId }       │
+  └──────────┘                      │                                      │
+                                    │    // Step 2: handle result          │
+                                    │    await thread.post(...)            │
+                                    │  });                                 │
+                                    └──────────────────────────────────────┘
+```
+
+### Key Implementation Details
+
+#### 1. Webhook-Based Resumption
+
+The core mechanism uses `createWebhook()` from Workflow DevKit. When `openModal()` is called inside a `"use workflow"` function:
+
+1. A webhook is created via `createWebhook({ respondWith: "manual" })` -- `respondWith: "manual"` is needed so step functions can send dynamic HTTP responses (e.g., validation errors) back to the platform
+2. The webhook URL is passed to the adapter's `openModal()` method (new `webhookUrl` parameter)
+3. The adapter stores the webhook URL alongside the modal's platform-specific metadata
+4. When the platform sends a submission/close event, the adapter POSTs to the webhook URL instead of calling `processModalSubmit()`
+5. The workflow resumes -- `await webhook` resolves to a standard `Request` object (per WDK docs, `createWebhook` always resolves to `Request`, unlike `createHook` which resolves to `T`)
+6. A step function parses the request via `request.json()` and optionally calls `request.respondWith()` for validation errors
+
+#### 2. Adapter Changes
+
+The `Adapter.openModal()` signature gains an optional `webhookUrl` parameter:
+
+```ts
+interface Adapter {
+  openModal?(
+    triggerId: string,
+    modal: ModalElement,
+    contextId?: string,
+    options?: { webhookUrl?: string },
+  ): Promise<{ viewId: string }>;
+}
+```
+
+When `webhookUrl` is present, the adapter stores it in the modal metadata (e.g., Slack's `private_metadata`). On submission/close, if a webhook URL is found in the metadata, the adapter POSTs to it instead of calling `processModalSubmit()` / `processModalClose()`.
+
+#### 3. Serialization
+
+chat-sdk already has full `@workflow/serde` integration:
+
+- `ThreadImpl` has `WORKFLOW_SERIALIZE` and `WORKFLOW_DESERIALIZE` static methods
+- `Message` has the same
+- `chat.registerSingleton()` enables lazy adapter resolution after deserialization
+
+The `ActionEvent` and `ModalResult` types will need similar serde support so they can cross the workflow suspension boundary.
+
+#### 4. `onAction` Prop Handler Registry
+
+For inline `onAction` props on `