fix: categorization uses configured LLM; MCP returns str; Unicode JSON preserved #3429
Description
Fix: Categorization now respects the configured LLM provider/model (DeepSeek or OpenAI). Previously the categorization model was hardcoded to OpenAI, so even when both the LLM and embeddings were configured with non‑OpenAI providers, categorization still forced OpenAI. It now follows the configured provider/model for both supported backends.
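The shape of this fix can be sketched as follows. This is an illustrative minimal sketch, not the project's actual code: `LLMConfig` and `resolve_categorization_model` are hypothetical names standing in for the real config object and the categorization path.

```python
# Hypothetical sketch of the fix: derive the categorization provider/model
# from configuration instead of hardcoding an OpenAI model.
from dataclasses import dataclass


@dataclass
class LLMConfig:
    provider: str  # "openai" or "deepseek"
    model: str     # e.g. "gpt-4o-mini" or "deepseek-chat"


def resolve_categorization_model(cfg: LLMConfig) -> tuple[str, str]:
    """Return (provider, model) for categorization, honoring the configured LLM."""
    supported = {"openai", "deepseek"}
    if cfg.provider not in supported:
        raise ValueError(f"categorization supports {supported}, got {cfg.provider!r}")
    # Before the fix this path effectively ignored cfg and always used OpenAI;
    # after the fix it mirrors the configured provider/model.
    return cfg.provider, cfg.model
```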
Fix: Normalize MCP server tool-handler return type to str per the MCP interface specification (some handlers previously returned a Python dict).
Enhancement: Preserve Unicode in JSON responses by disabling ASCII escaping (ensure_ascii=False), so non‑ASCII characters are returned verbatim instead of as \uXXXX escapes.
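The two handler-side changes above can be sketched together. This is a minimal illustration, assuming a Python MCP server; `normalize_tool_result` is a hypothetical helper name, not the project's actual function.

```python
import json
from typing import Any


def normalize_tool_result(result: Any) -> str:
    """Coerce an MCP tool-handler result to str, keeping Unicode unescaped."""
    if isinstance(result, str):
        return result
    # dicts/lists are serialized to JSON; ensure_ascii=False keeps non-ASCII
    # characters verbatim rather than emitting \uXXXX escapes.
    return json.dumps(result, ensure_ascii=False)
```

With this in place, a handler that previously returned a Python dict now returns a JSON string, and a payload like `{"msg": "héllo"}` serializes with the accented character intact.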
Type of change
How Has This Been Tested?
Manual test setup:
Environment variables:
Models:
Note: At this time, categorization supports DeepSeek and OpenAI only; this run validates the DeepSeek path with Qwen3 embeddings.
Steps and expected results:
Categorization respects configured provider/model (DeepSeek)
Unicode JSON responses
MCP tool call return type is normalized to string
Lint sanity (F811 fix)
Checklist:
Maintainer Checklist