fix: support PPT generation without GEMINI_API_KEY by adding Path B fallback #1171

Open
Jennifer-00 wants to merge 4 commits into bytedance:main from Jennifer-00:fix-issue-424-ppt

Conversation

@Jennifer-00

Description

Fixes #424

Summary

This PR fixes PPT generation failures when API-based image generation is unavailable, and improves stability of PPT output across multi-turn conversations.

Root cause

Previously, PPT generation effectively depended on a single path (Path A), which required GEMINI_API_KEY (image-based generation flow).
When GEMINI_API_KEY was not configured (or Path A failed), the run could end without a generated .pptx artifact.

What this PR changes

  1. Introduces Path B (text-based generation via python-pptx) that does not require external image API keys.
  2. Adds a built-in generate_ppt tool flow with stronger subprocess/output validation.
  3. Adds PptEnforcementMiddleware fallback to ensure a .pptx artifact is produced for PPT requests.
  4. Improves fallback trigger logic for multi-turn/option-style follow-ups (e.g., user replies like 1).
  5. Adds MessageSizeGuardMiddleware to reduce failures caused by oversized historical messages.
  6. Sanitizes leaked model artifact tags (<think>, <system>) in frontend message rendering.
  7. Removes the right-side artifact preview panel in chat (keeps artifact generation and download flow).
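
The fallback between Path A and Path B described in the list above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `generate_with_fallback`, the `Generator` callable type, and the `"A"`/`"B"` return values are hypothetical names; only the `GEMINI_API_KEY` variable and the two-path idea come from the description.

```python
import os
from typing import Callable, Mapping, Optional

# Hypothetical type: a generator takes an output path and returns True
# when it managed to write the presentation file.
Generator = Callable[[str], bool]


def generate_with_fallback(
    output_file: str,
    path_a: Generator,
    path_b: Generator,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Try Path A only when GEMINI_API_KEY is set; otherwise, or on failure, use Path B.

    Returns "A" or "B" for the path that succeeded and raises if both fail.
    """
    env = os.environ if env is None else env
    # A missing or blank key is treated as "not configured".
    has_key = bool(env.get("GEMINI_API_KEY", "").strip())

    if has_key:
        try:
            if path_a(output_file):
                return "A"
        except Exception:
            # Any Path A error falls through to the text-based path.
            pass

    if path_b(output_file):
        return "B"
    raise RuntimeError("neither Path A nor Path B produced a .pptx artifact")
```

Passing the two generators as callables keeps the selection logic independent of how each path is implemented, which is one possible way to keep skill details out of the calling code.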

Validation

  1. Python compile checks passed for updated backend files.
  2. Local runs confirmed .pptx artifacts are generated under thread outputs.
  3. Frontend lint check passed for modified chat layout file.
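
A check like the second validation step could be automated with a small script. The sketch below is an assumption about what "a generated .pptx artifact" might be taken to mean, not the validation used in this PR: `looks_like_pptx` is a hypothetical helper, and it only confirms that the file is a non-empty ZIP package containing `[Content_Types].xml`, which every Office Open XML package carries. It does not verify that the slides render.

```python
import zipfile
from pathlib import Path


def looks_like_pptx(path: str) -> bool:
    """Cheap structural check for a generated .pptx artifact (hypothetical helper)."""
    p = Path(path)
    # The file must exist, carry the expected suffix, and be non-empty.
    if p.suffix.lower() != ".pptx" or not p.is_file() or p.stat().st_size == 0:
        return False
    # A .pptx file is a ZIP-based package.
    if not zipfile.is_zipfile(str(p)):
        return False
    with zipfile.ZipFile(str(p)) as zf:
        # Every Office Open XML package contains this content-types part.
        return "[Content_Types].xml" in zf.namelist()
```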

Notes

  1. Branch is rebased on latest main.
  2. Runtime config should use deerflow.* module paths (not legacy src.*) after package migration.

@CLAassistant

CLAassistant commented Mar 17, 2026

CLA assistant check
All committers have signed the CLA.

Collaborator

@Jennifer-00 Why is this change needed? It doesn't seem related to the topic of this PR.

middlewares.append(todo_list_middleware)

# Add PPT enforcement middleware (ensures PPTX artifact exists for PPT requests)
middlewares.append(PptEnforcementMiddleware())
Collaborator

It's better to leave the PPT-related change to the skills to keep the extensions clearer.

Author

Thanks for the review, this makes sense. I'll narrow the fix to the ppt-generation skill itself and remove the unrelated UI, middleware, and built-in tool changes from this PR.


@staticmethod
def _run_generate_text(plan_file: str, output_file: str) -> tuple[bool, str]:
    script = get_skills_root_path() / "public" / "ppt-generation" / "scripts" / "generate_text.py"
Collaborator

@WillemJiang WillemJiang Mar 17, 2026

@Jennifer-00 The middleware should not know about skill details; we should keep the middleware as simple as possible.

@@ -0,0 +1,144 @@
"""Built-in tool: generate_ppt

Generates a PowerPoint presentation directly from a slide plan, bypassing
Collaborator

@WillemJiang WillemJiang Mar 17, 2026

@Jennifer-00 What's the difference between the generate_ppt_tool and the ppt-generations tools?
Please don't introduce the same tool in different places. This kind of tool can be used in skills.

jjjojoj added a commit to jjjojoj/deer-flow that referenced this pull request Mar 27, 2026
Fix issue bytedance#424 - agent does not load ppt-generation skill proactively

Root cause (Lntanohuang's analysis):
- Agent's strong CLARIFY → PLAN → ACT system prompt causes it to ask
  clarifying questions before loading the ppt-generation skill
- Skill matching is fully passive (prompt-guided) — the agent must
  recognize and read_file SKILL.md on its own, which weaker models fail to do
- PR bytedance#1171 only addressed Path B fallback for missing GEMINI_API_KEY

Changes:
- Add SKILL-FIRST PRIORITY section in get_skills_prompt_section() with
  explicit PPT keyword table and immediate skill loading directive
- Add 'Skill-First Priority' reminder in critical_reminders section
- Specify exact skill path /mnt/skills/public/ppt-generation/SKILL.md
- Instruct agent to NOT ask clarification (style/slide count) — skill has
  reasonable defaults and handles parameters itself
- Bilingual keywords: English + Chinese (幻灯片, 生成PPT, 制作PPT)

This makes PPT generation requests recognized and acted upon immediately,
bypassing the CLARIFY step that caused re-planning loops.
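
The keyword-based recognition described in this commit message could look roughly like the sketch below. It is an illustration under stated assumptions, not the commit's actual code: `is_ppt_request` and `PPT_KEYWORDS` are hypothetical names, the Chinese keywords are the ones listed above, and the English keywords are assumed examples. Plain substring matching is deliberately naive and can produce false positives.

```python
# Chinese keywords come from the commit message (lower-cased for matching);
# the English entries are assumed examples, not the commit's keyword table.
PPT_KEYWORDS = (
    "ppt",
    "powerpoint",
    "slide deck",
    "幻灯片",
    "生成ppt",
    "制作ppt",
)


def is_ppt_request(message: str) -> bool:
    """Return True when the message contains any PPT-related keyword."""
    # casefold() makes the match case-insensitive ("PPT" and "ppt" both match).
    text = message.casefold()
    return any(keyword in text for keyword in PPT_KEYWORDS)
```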
Development

Successfully merging this pull request may close these issues.

PPT generation feature does not work (original issue title: 生成PPT功能无法使用)

3 participants