Create Fine-tune with traces & tools #781
base: main
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
libs/core/kiln_ai/adapters/fine_tune/dataset_formatter.py (2)
58-114: Remove unused `tool_definitions` parameter.

The `tool_definitions` parameter is declared but never used within the `build_training_chat` function body. The tool definitions are only passed to the generator functions in `dump_to_file`, not used during training chat construction.

Apply this diff to remove the unused parameter:

```diff
 def build_training_chat(
     task_run: TaskRun,
     system_message: str,
     data_strategy: ChatStrategy,
     thinking_instructions: str | None = None,
-    tool_definitions: list[ToolCallDefinition] | None = None,
 ) -> list[ChatMessage]:
```
139-179: Fix type annotation to match actual return type.

The function signature declares `list[dict[str, str | None]]` as the return type, but the function actually returns dictionaries that can contain `tool_calls` (list) and `tool_call_id` (str), which don't match the annotation. The internal typing as `list[dict[str, Any]]` is correct.

Apply this diff to fix the type annotation:

```diff
 def generate_chat_message_list(
     training_chat: list[ChatMessage],
-) -> list[dict[str, str | None]]:
+) -> list[dict[str, Any]]:
     """Generate OpenAI chat list. Not the full OpenAI body, just the list of messages."""
```

app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte (1)
598-652: Bug: `update_data_strategies_supported` leaves Reasoning select empty for some download formats

For download formats in `r1_disabled_for_downloads` (`download_huggingface_chat_template_toolcall`, `download_jsonl_toolcall`, `download_vertex_gemini`), the function hits:

```js
if (r1_disabled_for_downloads.includes(model_provider)) {
  return ["final_only", "two_message_cot"]
}
```

This early `return` just yields an array and exits without updating `data_strategy_select_options` or `data_strategy`, so the "Reasoning" dropdown can remain stale or empty when switching into these options.

You can fold the special-case into `compatible_data_strategies` instead, so the function always updates the reactive state:

```diff
-  const r1_disabled_for_downloads = [
-    // R1 data strategy currently disabled for toolcall downloads
-    // because unclear how to use in the best way
-    "download_huggingface_chat_template_toolcall",
-    "download_jsonl_toolcall",
-
-    // R1 currently not supported by Vertex models
-    "download_vertex_gemini",
-  ]
-  if (r1_disabled_for_downloads.includes(model_provider)) {
-    return ["final_only", "two_message_cot"]
-  }
-
-  const compatible_data_strategies: ChatStrategy[] = is_download
-    ? [
-        "final_only",
-        "two_message_cot",
-        "final_and_intermediate_r1_compatible",
-      ]
-    : available_models
+  const r1_disabled_for_downloads = [
+    // R1 data strategy currently disabled for toolcall downloads
+    // because unclear how to use in the best way
+    "download_huggingface_chat_template_toolcall",
+    "download_jsonl_toolcall",
+
+    // R1 currently not supported by Vertex models
+    "download_vertex_gemini",
+  ]
+
+  const compatible_data_strategies: ChatStrategy[] = is_download
+    ? (r1_disabled_for_downloads.includes(model_provider)
+        ? ["final_only", "two_message_cot"]
+        : [
+            "final_only",
+            "two_message_cot",
+            "final_and_intermediate_r1_compatible",
+          ])
+    : available_models
         ?.map((model) => model.models)
         .flat()
         .find((model) => model.id === base_model_id)
         ?.data_strategies_supported ?? []
```

This keeps the R1 strategy hidden where needed but still drives `data_strategy_select_options` and `data_strategy` for the UI.
♻️ Duplicate comments (1)
libs/core/kiln_ai/adapters/fine_tune/dataset_formatter.py (1)
410-450: Past review concern is addressed.

The error handling for tool definition lookup is properly implemented. The try-except block (lines 438-448) catches all exceptions including `ValueError` from `tool_from_id` and wraps them in a `RuntimeError` with clear context. This prevents a single bad task run from crashing the entire formatting operation, addressing the concern raised in the previous review.
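The wrap-and-re-raise pattern described above can be sketched in isolation. All names here are illustrative stand-ins, not the project's actual API: a stub lookup raises `ValueError` for unknown ids, and the caller converts any failure into a `RuntimeError` carrying context, so one bad task run cannot crash the whole formatting pass.

```python
# Hypothetical stand-in for a tool lookup that may raise (e.g. ValueError).
def tool_from_id_stub(tool_id: str) -> dict:
    if tool_id != "known_tool":
        raise ValueError(f"Unknown tool id: {tool_id}")
    return {"id": tool_id}


def fetch_tool_definition(tool_id: str, task_run_id: str) -> dict:
    try:
        return tool_from_id_stub(tool_id)
    except Exception as e:
        # Re-raise with context; the original error stays attached as __cause__.
        raise RuntimeError(
            f"Failed to resolve tool '{tool_id}' for task run '{task_run_id}'"
        ) from e
```

Callers can then catch `RuntimeError` per task run and skip or report the bad record instead of aborting the whole dump.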
🧹 Nitpick comments (2)
app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte (2)
92-201: State persistence and reset flow look solid; consider future-proofing in-memory resets

The IndexedDB-backed `SavedFinetuneState` plus `state_initialized` guard now correctly prevent `clear_saved_state()` from being immediately overwritten by the reactive writer, and the `saved_dataset_id` restoration into `SelectFinetuneDataset` looks consistent.

One thing to keep in mind: `clear_saved_state()` resets only the persisted state, not the local form fields (`finetune_name`, `finetune_description`, `selected_dataset`, etc.). That's fine for the current flows (Reset does a full `window.location.reload()`; post-create you show the `Completed` view), but if you later add a "Create another fine-tune" path on the same component instance you'll also want to clear the in-memory fields and explicitly flip `state_initialized` back to `true` after re-initializing them so subsequent edits are re-persisted.

```diff
 function clear_saved_state() {
   // Prevent the reactive block from immediately repopulating the store
   state_initialized = false

   // Clear the saved state in IndexedDB by saving the defaults
   saved_state.update((s) => ({
     ...s,
@@
     tools: undefined,
   }))
+
+  // (Optional for future "create another fine-tune" flows)
+  // Reset in-memory fields as well so the UI matches the cleared state.
+  // finetune_name = ""
+  // finetune_description = ""
+  // selected_dataset = null
+  // selected_tool_ids = []
+  // data_strategy = "final_only"
+  // system_prompt_method = "simple_prompt_builder"
 }
```

Also applies to: 217-225, 229-244, 826-851
853-855: Minor UX nit: Step 4 header is empty for download-only flows

When `is_download` is true and a dataset is selected, `step_4_visible` is true so the "Step 4: Advanced Options" header renders, but the inner content is wrapped in `{#if !is_download}` and therefore empty. Then Step 5 ("Download JSONL") appears below.

That's a little confusing visually (a numbered step with no controls). You could either hide Step 4 entirely for download flows, or repurpose the label so the sequence stays contiguous (e.g., make downloads the new Step 4 when `is_download`).

```diff
-  $: step_4_visible =
-    $model_provider && $model_provider !== disabled_header && !!selected_dataset
+  $: step_4_visible =
+    $model_provider &&
+    $model_provider !== disabled_header &&
+    !!selected_dataset &&
+    !is_download
@@
   {#if step_4_visible}
     <div class="text-xl font-bold">Step 4: Advanced Options</div>
     {#if !is_download}
       <div>
         <Collapse title="Advanced Options">
@@
   {#if step_5_download_visible}
     <div>
-      <div class="text-xl font-bold">Step 5: Download JSONL</div>
+      <div class="text-xl font-bold">
+        {step_4_visible ? "Step 5: Download JSONL" : "Step 4: Download JSONL"}
+      </div>
```

Also applies to: 856-901, 905-937
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte (12 hunks)
libs/core/kiln_ai/adapters/fine_tune/dataset_formatter.py (11 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-10-22T04:31:20.500Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 739
File: app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/synth/+page.svelte:170-177
Timestamp: 2025-10-22T04:31:20.500Z
Learning: In Svelte components for the Kiln project, leonardmq prefers tracking previous query parameter values and comparing them in reactive blocks (rather than just checking for existence) to avoid duplicate execution on initial load while still reacting to subsequent navigation changes. This pattern supports both current behavior and future feature additions like mode switching.
Applied to files:
app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte
📚 Learning: 2025-10-07T00:21:33.537Z
Learnt from: sfierro
Repo: Kiln-AI/Kiln PR: 698
File: app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/+page.svelte:800-812
Timestamp: 2025-10-07T00:21:33.537Z
Learning: In the Kiln web UI codebase, the `save_new_task_run_config` function in `app/web_ui/src/lib/stores/run_configs_store.ts` automatically refreshes the run configs store by calling `load_task_run_configs(project_id, task_id, true)` with force_refresh=true after successfully creating a new run config. This means callers of save_new_task_run_config do not need to manually refresh the store afterward, as the store will already contain the newly created config.
Applied to files:
app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte
🧬 Code graph analysis (1)
libs/core/kiln_ai/adapters/fine_tune/dataset_formatter.py (5)
libs/core/kiln_ai/adapters/chat/chat_formatter.py (6)
BasicChatMessage (16-18), ToolCallMessage (22-27), ToolResponseMessage (31-36), get_chat_formatter (239-259), append_messages (74-76), messages (71-72)
libs/core/kiln_ai/adapters/chat/chat_utils.py (1)
build_tool_call_messages (10-61)
libs/core/kiln_ai/adapters/fine_tune/vertex_formatter.py (1)
generate_vertex_gemini (18-235)
libs/core/kiln_ai/tools/base_tool.py (1)
ToolCallDefinition (19-23)
libs/core/kiln_ai/tools/tool_registry.py (1)
tool_from_id (25-117)
🪛 GitHub Actions: Debug Detector
libs/core/kiln_ai/adapters/fine_tune/dataset_formatter.py
[error] 435-435: Python print statement found. Remove or replace with proper logging.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: Generate Coverage Report
- GitHub Check: Build Desktop Apps (ubuntu-22.04)
- GitHub Check: Build Desktop Apps (windows-latest)
- GitHub Check: Build Desktop Apps (macos-15-intel)
- GitHub Check: Build Desktop Apps (ubuntu-22.04-arm)
- GitHub Check: Build Desktop Apps (macos-latest)
- GitHub Check: Build, Typecheck, and Test Python (3.13)
- GitHub Check: Build, Typecheck, and Test Python (3.12)
- GitHub Check: Build, Typecheck, and Test Python (3.10)
- GitHub Check: Build, Typecheck, and Test Python (3.11)
🔇 Additional comments (4)
libs/core/kiln_ai/adapters/fine_tune/dataset_formatter.py (3)
8-21: LGTM! Clean import additions for tool integration.

The new imports are well-organized and all necessary for the tool call functionality being added to dataset formatting.

87-90: LGTM! Tool call extraction is correctly implemented.

The tool call messages are properly extracted from traces and inserted into the training chat, enabling tool-based fine-tuning data generation.

339-406: LGTM! Async conversion is necessary and well-implemented.

The method is correctly converted to async to support fetching tool definitions from potentially async sources (like MCP servers). Tool definitions are properly fetched and propagated through the formatting pipeline.
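The async shape this refers to can be sketched with illustrative names (these are not the project's real signatures): the dump step awaits one up-front fetch of tool definitions from an async source such as an MCP server, then threads the result through formatting.

```python
import asyncio


async def fetch_tool_definitions() -> list[dict]:
    # Stand-in for an async MCP server call.
    await asyncio.sleep(0)
    return [{"name": "add", "description": "Add two numbers"}]


async def dump_to_file(task_runs: list[str]) -> list[dict]:
    # Fetched once, up front, then reused for every task run.
    tools = await fetch_tool_definitions()
    return [{"run": run, "tools": tools} for run in task_runs]


rows = asyncio.run(dump_to_file(["run_1", "run_2"]))
```

Fetching once before the per-run loop is what avoids repeated round-trips to the tool source during formatting.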
app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte (1)
447-458: Run-config and tools wiring into finetune creation looks coherent

The way you:

- Extract `model_name` from `base_model_id`,
- Build `run_config_properties` from `run_config_component.run_options_as_run_config_properties()` plus `model_name` and `model_provider_name`,
- Bind `tools` into `RunConfigComponent` and reuse `selected_tool_ids` both for dataset filtering (`required_tool_ids`) and analytics (`tool_count`),
- Gate tools via `supports_function_calling` and clear them reactively for unsupported models,

results in a clean, self-contained payload (`run_config_properties`) and consistent telemetry (`supports_tools` / `tool_count`).

The null-checks around `run_config_component`, `model_name`, and `provider_id` also make this robust against partial state. No changes needed here.

Also applies to: 487-487, 501-503, 806-824, 844-847
Actionable comments posted: 0
🧹 Nitpick comments (1)
libs/core/kiln_ai/adapters/fine_tune/dataset_formatter.py (1)
58-114: Remove unused parameter.

The `tool_definitions` parameter added at line 63 is never used within the function body. Tool call messages are extracted from traces (lines 87-89), but the tool definitions themselves aren't needed during chat construction—they're only passed to the formatters later.

Apply this diff to remove the unused parameter:

```diff
 def build_training_chat(
     task_run: TaskRun,
     system_message: str,
     data_strategy: ChatStrategy,
     thinking_instructions: str | None = None,
-    tool_definitions: list[ToolCallDefinition] | None = None,
 ) -> list[ChatMessage]:
```

And update the call site at line 393:

```diff
 training_chat = build_training_chat(
     task_run=task_run,
     system_message=self.system_message,
     data_strategy=data_strategy,
     thinking_instructions=self.thinking_instructions,
-    tool_definitions=tools,
 )
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
libs/core/kiln_ai/adapters/fine_tune/dataset_formatter.py (11 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-08T16:13:26.526Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:1875-1890
Timestamp: 2025-08-08T16:13:26.526Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py (Python), do not blanket-add r1_thinking/optional_r1_thinking parsers for R1-style models. Parser usage is provider-specific and must be based on observed responses in tests. For PR Kiln-AI/Kiln#418, deepseek_r1_0528_distill_qwen3_8b providers were validated without needing a parser.
Applied to files:
libs/core/kiln_ai/adapters/fine_tune/dataset_formatter.py
🧬 Code graph analysis (1)
libs/core/kiln_ai/adapters/fine_tune/dataset_formatter.py (5)
libs/core/kiln_ai/adapters/chat/chat_formatter.py (6)
BasicChatMessage (16-18), ToolCallMessage (22-27), ToolResponseMessage (31-36), get_chat_formatter (239-259), append_messages (74-76), messages (71-72)
libs/core/kiln_ai/adapters/chat/chat_utils.py (1)
build_tool_call_messages (10-61)
libs/core/kiln_ai/adapters/fine_tune/vertex_formatter.py (1)
generate_vertex_gemini (18-235)
libs/core/kiln_ai/tools/base_tool.py (1)
ToolCallDefinition (19-23)
libs/core/kiln_ai/tools/tool_registry.py (1)
tool_from_id (25-117)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Build Desktop Apps (macos-15-intel)
- GitHub Check: Build Desktop Apps (macos-latest)
- GitHub Check: Build Desktop Apps (ubuntu-22.04)
- GitHub Check: Build Desktop Apps (ubuntu-22.04-arm)
- GitHub Check: Build Desktop Apps (windows-latest)
🔇 Additional comments (6)
libs/core/kiln_ai/adapters/fine_tune/dataset_formatter.py (6)
8-20: LGTM!

The new imports for chat message types (BasicChatMessage, ToolCallMessage, ToolResponseMessage), tool utilities (build_tool_call_messages), and tool registry functions (tool_from_id, ToolCallDefinition) are all used appropriately throughout the file to support tool-based fine-tuning.

48-55: LGTM!

The protocol extension to accept optional tool definitions is appropriate and all format generator implementations have been updated to match this signature.

139-179: LGTM!

The function correctly handles the three message types (BasicChatMessage, ToolCallMessage, ToolResponseMessage) with appropriate type guards and error handling. The type annotation change to `Any` is necessary to accommodate tool_calls arrays, and the empty string fallback for None content (line 162) is well-documented for Fireworks compatibility.
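Why `Any` is needed can be shown with illustrative message shapes (field names follow the common OpenAI chat convention; this is not the project's exact output): a tool-call message carries a list under `"tool_calls"`, which a `dict[str, str | None]` value type would not admit.

```python
from typing import Any


def example_chat() -> list[dict[str, Any]]:
    # A minimal chat with a tool call round-trip; values include None,
    # str, and a list of dicts, hence dict[str, Any].
    return [
        {"role": "user", "content": "What is 2 + 2?"},
        {
            "role": "assistant",
            "content": None,  # content is None when the turn is a tool call
            "tool_calls": [
                {
                    "id": "call_1",
                    "type": "function",
                    "function": {"name": "add", "arguments": '{"a": 2, "b": 2}'},
                }
            ],
        },
        {"role": "tool", "tool_call_id": "call_1", "content": "4"},
        {"role": "assistant", "content": "2 + 2 is 4."},
    ]


chat = example_chat()
```

Any static checker enforcing the narrower annotation would flag the `"tool_calls"` list, which is exactly the mismatch the review called out.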
410-449: LGTM!

The helper method correctly fetches tool definitions with appropriate caching to avoid repeated MCP server calls. The error handling wraps any exceptions with context, addressing the previous review concern about tool lookup failures. Early returns for missing prerequisites (run_config, tools_config, task) are handled cleanly.

Based on past review comments (now resolved).

339-406: LGTM!

The async conversion of `dump_to_file` is correct, properly awaiting the tool definition fetch (line 390) and threading tool definitions through to both `build_training_chat` and the format generator (lines 398, 400). UTF-8 encoding with `ensure_ascii=False` appropriately supports international content.

267-275: No issues found—HuggingFace format correctly embeds tool information in conversation structure.

Verification confirms that both HuggingFace generator functions intentionally do not use the `tools` parameter. The `generate_huggingface_chat_template_toolcall` function embeds tool call information directly within the conversation structure (via the `"tool_calls"` key in messages), whereas the OpenAI format includes tool definitions separately. This is the correct design pattern for HuggingFace chat templates, which do not use a separate tools field.
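The format distinction this verification describes can be sketched as two row shapes. Field names follow common conventions for the two ecosystems and are illustrative, not the project's exact schemas: the OpenAI-style body carries tool definitions in a separate top-level `"tools"` field, while a HuggingFace chat-template row embeds tool-call information only inside the conversation itself.

```python
conversation = [
    {"role": "user", "content": "weather in Paris?"},
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": "c1",
                "type": "function",
                "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
            }
        ],
    },
    {"role": "tool", "tool_call_id": "c1", "content": "18C, sunny"},
]

# OpenAI-style training row: tool definitions live in a separate field.
openai_row = {
    "messages": conversation,
    "tools": [
        {
            "type": "function",
            "function": {"name": "get_weather", "parameters": {"type": "object"}},
        }
    ],
}

# HuggingFace chat-template row: no separate tools field; the tool call
# is recoverable only from the conversation turns themselves.
huggingface_row = {"conversations": conversation}
```

This is why a `tools` parameter can be legitimately unused by the HuggingFace generators while still being required by the OpenAI-format ones.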
Actionable comments posted: 2
🧹 Nitpick comments (2)
app/web_ui/src/lib/ui/run_config_component/tools_selector.svelte (1)
176-194: Consider simplifying redundant nullish coalescing.

The `common_props` object is a good refactor for reducing duplication. However, the fallback on lines 184-186 is redundant since `tools_selector_settings.empty_label` is guaranteed to have a value from the defaults merge on lines 31-34.

Apply this diff to simplify:

```diff
-    empty_label:
-      tools_selector_settings.empty_label ??
-      default_tools_selector_settings.empty_label,
+    empty_label: tools_selector_settings.empty_label,
```

Note: The current code works correctly; this is just a minor simplification.
app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte (1)
182-201: Consider simplifying state clearing logic.

The `clear_saved_state` function uses `update()` to set each field to `undefined` (lines 187-200), which is verbose and maintains the object structure in IndexedDB with undefined values. Since `state_initialized` is already set to `false` to prevent reactive repopulation, a simpler approach would be to clear the entire object:

```diff
 function clear_saved_state() {
   // Prevent the reactive block from immediately repopulating the store
   state_initialized = false
-
-  // Clear the saved state in IndexedDB by saving the defaults
-  saved_state.update((s) => ({
-    ...s,
-    name: undefined,
-    description: undefined,
-    provider: undefined,
-    base_model_id: undefined,
-    dataset_split_id: undefined,
-    parameters: undefined,
-    system_message: undefined,
-    thinking_instructions: undefined,
-    data_strategy: "final_only",
-    system_prompt_method: "simple_prompt_builder",
-    tools: undefined,
-  }))
+  saved_state.set({})
 }
```

This achieves the same result with less code and more clearly expresses the intent to clear the state.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
app/web_ui/src/lib/ui/run_config_component/tools_selector.svelte (6 hunks)
app/web_ui/src/lib/utils/form_element.svelte (1 hunk)
app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte (12 hunks)
libs/core/kiln_ai/adapters/fine_tune/dataset_formatter.py (10 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- libs/core/kiln_ai/adapters/fine_tune/dataset_formatter.py
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-10-27T04:44:58.745Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 751
File: app/web_ui/src/routes/(app)/docs/rag_configs/[project_id]/create_rag_config/edit_rag_config_form.svelte:49-50
Timestamp: 2025-10-27T04:44:58.745Z
Learning: In the Kiln web UI codebase (Svelte), the FormElement component uses an `optional` boolean prop that defaults to `false`, rather than a `required` prop. FormElement also has built-in validation capabilities through its `validator` prop.
Applied to files:
app/web_ui/src/lib/utils/form_element.svelte
📚 Learning: 2025-10-22T04:31:20.500Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 739
File: app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/synth/+page.svelte:170-177
Timestamp: 2025-10-22T04:31:20.500Z
Learning: In Svelte components for the Kiln project, leonardmq prefers tracking previous query parameter values and comparing them in reactive blocks (rather than just checking for existence) to avoid duplicate execution on initial load while still reacting to subsequent navigation changes. This pattern supports both current behavior and future feature additions like mode switching.
Applied to files:
app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte
📚 Learning: 2025-10-07T00:21:33.537Z
Learnt from: sfierro
Repo: Kiln-AI/Kiln PR: 698
File: app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/+page.svelte:800-812
Timestamp: 2025-10-07T00:21:33.537Z
Learning: In the Kiln web UI codebase, the `save_new_task_run_config` function in `app/web_ui/src/lib/stores/run_configs_store.ts` automatically refreshes the run configs store by calling `load_task_run_configs(project_id, task_id, true)` with force_refresh=true after successfully creating a new run config. This means callers of save_new_task_run_config do not need to manually refresh the store afterward, as the store will already contain the newly created config.
Applied to files:
app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte
📚 Learning: 2025-11-11T00:07:51.535Z
Learnt from: sfierro
Repo: Kiln-AI/Kiln PR: 756
File: app/web_ui/src/lib/ui/run_config_component/tools_selector.svelte:19-20
Timestamp: 2025-11-11T00:07:51.535Z
Learning: In app/web_ui/src/lib/ui/run_config_component/tools_selector.svelte, the single_select_selected_tool prop should NOT be synchronized with the tools_store. Only the multi-select tools array should persist to the store. The single_select mode is intended to use ephemeral component-local state.
Applied to files:
app/web_ui/src/lib/ui/run_config_component/tools_selector.svelte
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: Build Desktop Apps (macos-latest)
- GitHub Check: Build Desktop Apps (ubuntu-22.04-arm)
- GitHub Check: Build Desktop Apps (ubuntu-22.04)
- GitHub Check: Build Desktop Apps (windows-latest)
- GitHub Check: Build Desktop Apps (macos-15-intel)
- GitHub Check: Build, Typecheck, and Test Python (3.12)
- GitHub Check: Build, Typecheck, and Test Python (3.10)
- GitHub Check: Build, Typecheck, and Test Python (3.11)
🔇 Additional comments (11)
app/web_ui/src/lib/ui/run_config_component/tools_selector.svelte (8)
1-10: LGTM: Clean imports supporting settings-driven refactor.

The imports are well-organized and appropriate for the settings-driven architecture introduced in this refactor.

11-18: LGTM: Props correctly implement settings pattern and respect store synchronization.

The new `settings` prop enables a clean, settings-driven API. The `single_select_selected_tool` correctly remains component-local state and is not synchronized with `tools_store`, which aligns with the intended ephemeral behavior for single-select mode. Based on learnings.

22-34: LGTM: Well-implemented settings merge pattern.

The settings-driven approach is correctly implemented with sensible defaults and proper reactive merging. This reduces prop proliferation and provides a cleaner API surface.

43-66: LGTM: Mandatory tools correctly integrated into load flow.

The function properly combines mandatory tools from settings with existing task tools, preserving order and removing duplicates. The async initialization is handled correctly.

69-77: LGTM: Store synchronization correctly implements documented behavior.

The reactive statement properly synchronizes only the multi-select `tools` array with the store, while keeping `single_select_selected_tool` ephemeral. This matches the intended design. Based on learnings.

79-109: LGTM: Robust filtering of unavailable tools.

The function correctly filters out tools that are no longer available (e.g., server offline, removed tools) with appropriate console warnings for debugging.

113-174: LGTM: Settings correctly control tool option generation.

The function properly uses settings to control UI behavior (hiding create button, disabling mandatory tools) while building option groups for the selector.

198-210: LGTM: Clean render implementation leveraging common props.

The conditional rendering correctly uses settings to determine single vs. multi-select mode, and the `common_props` spread effectively eliminates duplication while maintaining proper value bindings for each mode.

app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte (3)

151-169: State persistence implementation looks good.

The reactive block correctly persists form state to IndexedDB when `state_initialized` is true, and the guard prevents premature execution. This works well with the `clear_saved_state` function (lines 182-201) which sets `state_initialized = false` to prevent the reactive writer from immediately repopulating the store—addressing the previous review concern.

806-823: RunConfigComponent integration looks well-structured.

The component binding (line 807) enables the parent to call `clear_tools()` (line 80), and the bidirectional tools binding (line 808) keeps `selected_tool_ids` in sync. The configuration properly adapts the UI for the fine-tuning context, hiding unnecessary elements and providing clear feedback when tools are disabled.

839-849: Good safeguard: dataset selector waits for state initialization.

Conditionally rendering `SelectFinetuneDataset` only after `state_initialized` is true (line 839) ensures that `saved_dataset_id` is properly loaded from IndexedDB before the component attempts to restore the dataset selection. This prevents race conditions where the dataset selector might render before the persisted dataset ID is available.
Actionable comments posted: 1
🧹 Nitpick comments (7)
app/desktop/studio_server/eval_api.py (1)
325-345: Avoid mutating the `task.run_configs()` list when augmenting with finetune configs

You're appending finetune-derived configs directly to the list returned by `task.run_configs()`. If that method ever returns a cached/shared list, this can mutate internal state and potentially accumulate duplicates across calls.

Consider copying before augmentation:

```diff
-async def get_run_configs(project_id: str, task_id: str) -> list[TaskRunConfig]:
-    # Returns all run configs of a given task.
-    task = task_from_id(project_id, task_id)
-    configs = task.run_configs()
-
-    # Get run configs from finetunes
-    finetunes = task.finetunes()
-    for finetune in finetunes:
+async def get_run_configs(project_id: str, task_id: str) -> list[TaskRunConfig]:
+    # Returns all run configs of a given task, plus finetune-derived run configs.
+    task = task_from_id(project_id, task_id)
+    # Use a shallow copy to avoid mutating the Task's internal collection.
+    configs = list(task.run_configs())
+
+    # Get run configs from finetunes
+    for finetune in task.finetunes():
         if finetune.run_config is not None:
             configs.append(
                 TaskRunConfig(
                     id=f"finetune_run_config::{finetune.id}",  # this id is not persisted
                     name=finetune.name,
                     description=finetune.description,
                     run_config_properties=finetune.run_config,
                 )
             )
```

This keeps the endpoint behavior the same while removing any reliance on `run_configs()` returning a fresh list each time.

app/web_ui/src/lib/api_schema.d.ts (3)
1578-1594: New `/run_configs/` path looks correct; consider trailing-slash consistency if not intentional

The new GET `/api/projects/{project_id}/tasks/{task_id}/run_configs/` endpoint wiring and typing match the corresponding operation and `TaskRunConfig[]` response, so the schema surface looks consistent. The only nit is the trailing `/` in the path, which differs from nearby task-scoped routes (`task_run_configs`, `evals`, etc.). If consistency across routes matters, you may want to normalize this in the FastAPI/OpenAPI definition and regenerate; otherwise this is fine as-is. Based on learnings, any change should be done in the backend schema, not in this generated file.

3899-3901: `FinetuneDatasetInfo` eligibility fields give a clear filtered view

Adding `eligible_datasets` and `eligible_finetune_tags` alongside the existing collections is a nice way to distinguish "all known" vs "eligible under current constraints" in the UI. Just ensure the backend docs/comments clearly describe how eligibility is computed (e.g., tool support, data strategy) so future consumers don't conflate these with the existing fields. Any doc tweaks must go in the backend schema and then be regenerated here.

9158-9189: New `get_run_configs` operation mirrors the existing task run-configs API

`get_run_configs_api_projects__project_id__tasks__task_id__run_configs__get` cleanly returns `TaskRunConfig[]`, matching the earlier `get_task_run_configs...` operation. Having both endpoints is fine if you're standardizing naming for new consumers while keeping the old route for compatibility; if not, it might be worth documenting the intended distinction between the two in the backend/OpenAPI comments. The generated TypeScript wiring itself looks correct.

app/web_ui/src/lib/ui/run_config_component/saved_run_configs_dropdown.svelte (1)
134-151: Check whether default finetune configs should appear in two groups

Right now a finetune run config that is also the task's `default_run_config_id` will:

- Be added to `saved_configuration_options` as `"... (Default)"` under Saved Configurations.
- Also appear under the Fine-Tune Configuration group via `finetune_run_configs`.

If duplication isn't intended, you may want to either:

- Skip pushing finetune defaults into `saved_configuration_options`, or
- Mark default status only in the finetune group (and keep them excluded from "Saved Configurations").

Also, the comment at line 180 ("only if selected_finetune_id is set") doesn't match the current logic, which purely checks whether any finetune run configs exist. Consider updating it to reflect the actual behavior to avoid confusion.

Also applies to: 180-197
app/web_ui/src/lib/ui/run_config_component/run_config_component.svelte (2)

40-52: New props for tools selector settings, tools, and model visibility are consistent

- `tools_selector_settings: Partial<ToolsSelectorSettings> = {}` cleanly exposes configuration for `ToolsSelector` while allowing call sites to omit fields.
- Exporting `tools: string[] = []` matches the `bind:tools` usage and lets parents query/reset tools via the new helpers.
- `hide_model_selector` provides a clear toggle for hiding the model dropdown (useful for finetune flows or constrained UIs).

These additions expand the public API in a coherent way; just ensure call sites document how/when to rely on `hide_model_selector` and `tools_selector_settings`.

61-61: Finetune run-config handling and is_updating_from_saved_config guard look correct

- `is_updating_from_saved_config` is set before applying a saved config and cleared only after an `await tick()`, which correctly prevents `reset_to_custom_options_if_needed` from flipping `selected_run_config_id` back to `"custom"` while you're programmatically updating state.
- Finetune IDs are derived from `selected_run_config.id` assuming the `finetune_run_config::finetune_id` convention, and you consistently synthesize:
  - Model: `kiln_fine_tune/${project_id}::${current_task.id}::${finetune_id}`
  - Prompt: `fine_tune_prompt::${project_id}::${current_task.id}::${finetune_id}`
- The finetune branch in `reset_to_custom_options_if_needed` compares `model`, `prompt_method`, and `provider` against those canonical values, plus temperature/top_p/structured_output_mode/tools via `arrays_equal`, which should reliably detect manual edits and switch back to `"custom"`.

Minor cleanups you might consider:

- The explicit `provider = selected_run_config.run_config_properties.model_provider_name` inside the finetune branch is effectively overridden by the reactive `$: provider = model ? model.split("/")[0] : ""` and can likely be removed to reduce confusion.
- As with the other file, the `'finetune_run_config::'` prefix is hard-coded in two places here; centralizing it in a shared constant would help keep this in sync if the prefix ever changes.

Also applies to: 117-149, 160-165, 175-205
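The array comparison described above can be sketched in isolation. This is a minimal, order-sensitive implementation assumed to match the spirit of the `arrays_equal` utility the review references; it is not the project's actual code.

```typescript
// Hypothetical sketch of an order-sensitive array equality check,
// in the spirit of the arrays_equal utility mentioned in the review.
function arraysEqual<T>(a: readonly T[], b: readonly T[]): boolean {
  if (a.length !== b.length) return false
  // Compare element-by-element with strict equality.
  return a.every((value, index) => value === b[index])
}

// Example: detecting whether a tools selection diverged from a saved config.
const savedTools = ["search", "calculator"]
const currentTools = ["search", "calculator"]
const edited = !arraysEqual(savedTools, currentTools) // false: no manual edits
```

An order-sensitive check is appropriate here only if the tools array is kept in a stable order; if selection order can vary, a sorted or set-based comparison would be safer.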
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- `app/desktop/studio_server/eval_api.py` (1 hunks)
- `app/web_ui/src/lib/api_schema.d.ts` (7 hunks)
- `app/web_ui/src/lib/stores/run_configs_store.ts` (1 hunks)
- `app/web_ui/src/lib/ui/run_config_component/run_config_component.svelte` (8 hunks)
- `app/web_ui/src/lib/ui/run_config_component/saved_run_configs_dropdown.svelte` (3 hunks)
🧰 Additional context used
📓 Path-based instructions (9)
app/web_ui/**/*.svelte
📄 CodeRabbit inference engine (.cursor/rules/project.mdc)
app/web_ui/**/*.svelte: Use Svelte v4 (not v5) for the web frontend UI
Use DaisyUI for UI components in the web frontend
Use Svelte v4 (not v5) for the web UI framework
Files:
app/web_ui/src/lib/ui/run_config_component/saved_run_configs_dropdown.svelte
app/web_ui/src/lib/ui/run_config_component/run_config_component.svelte
app/web_ui/**/*.{svelte,ts}
📄 CodeRabbit inference engine (.cursor/rules/project.mdc)
Use Tailwind CSS for styling in the web frontend
Files:
app/web_ui/src/lib/ui/run_config_component/saved_run_configs_dropdown.svelte
app/web_ui/src/lib/stores/run_configs_store.ts
app/web_ui/src/lib/ui/run_config_component/run_config_component.svelte
app/web_ui/src/lib/api_schema.d.ts
app/web_ui/**/*.{svelte,css,ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
Use Tailwind and DaisyUI for styling the frontend
Files:
app/web_ui/src/lib/ui/run_config_component/saved_run_configs_dropdown.svelte
app/web_ui/src/lib/stores/run_configs_store.ts
app/web_ui/src/lib/ui/run_config_component/run_config_component.svelte
app/web_ui/src/lib/api_schema.d.ts
app/web_ui/**/*.{svelte,ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
Make FastAPI backend calls from the Svelte web app to communicate with backend servers
Files:
app/web_ui/src/lib/ui/run_config_component/saved_run_configs_dropdown.svelte
app/web_ui/src/lib/stores/run_configs_store.ts
app/web_ui/src/lib/ui/run_config_component/run_config_component.svelte
app/web_ui/src/lib/api_schema.d.ts
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/project.mdc)
**/*.py: Use asyncio for asynchronous operations in Python
Use Pydantic v2 (not v1) for data validation
Files:
app/desktop/studio_server/eval_api.py
app/desktop/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Use python 3.13 for the desktop app
Files:
app/desktop/studio_server/eval_api.py
app/desktop/studio_server/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Use FastAPI for the desktop app server that extends libs/server with web app-specific APIs
Files:
app/desktop/studio_server/eval_api.py
app/web_ui/**/*.ts
📄 CodeRabbit inference engine (.cursor/rules/project.mdc)
Use TypeScript for frontend type safety
Files:
app/web_ui/src/lib/stores/run_configs_store.ts
app/web_ui/src/lib/api_schema.d.ts
app/web_ui/**/*.{ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
Use TypeScript for frontend development
Files:
app/web_ui/src/lib/stores/run_configs_store.ts
app/web_ui/src/lib/api_schema.d.ts
🧠 Learnings (6)
📚 Learning: 2025-10-07T00:21:33.537Z
Learnt from: sfierro
Repo: Kiln-AI/Kiln PR: 698
File: app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/+page.svelte:800-812
Timestamp: 2025-10-07T00:21:33.537Z
Learning: In the Kiln web UI codebase, the `save_new_task_run_config` function in `app/web_ui/src/lib/stores/run_configs_store.ts` automatically refreshes the run configs store by calling `load_task_run_configs(project_id, task_id, true)` with force_refresh=true after successfully creating a new run config. This means callers of save_new_task_run_config do not need to manually refresh the store afterward, as the store will already contain the newly created config.
Applied to files:
app/web_ui/src/lib/ui/run_config_component/saved_run_configs_dropdown.svelte
app/web_ui/src/lib/stores/run_configs_store.ts
app/web_ui/src/lib/ui/run_config_component/run_config_component.svelte
📚 Learning: 2025-11-11T00:07:51.535Z
Learnt from: sfierro
Repo: Kiln-AI/Kiln PR: 756
File: app/web_ui/src/lib/ui/run_config_component/tools_selector.svelte:19-20
Timestamp: 2025-11-11T00:07:51.535Z
Learning: In app/web_ui/src/lib/ui/run_config_component/tools_selector.svelte, the single_select_selected_tool prop should NOT be synchronized with the tools_store. Only the multi-select tools array should persist to the store. The single_select mode is intended to use ephemeral component-local state.
Applied to files:
app/web_ui/src/lib/ui/run_config_component/run_config_component.svelte
📚 Learning: 2025-08-08T15:50:45.334Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:221-228
Timestamp: 2025-08-08T15:50:45.334Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.KilnModelProvider, naming policy: keep cross-provider feature flags unprefixed (e.g., reasoning_optional_for_structured_output) to allow reuse across providers; prefix provider-specific toggles with the provider name (e.g., siliconflow_enable_thinking) when the implementation is specific and potentially unsafe to generalize.
Applied to files:
app/web_ui/src/lib/ui/run_config_component/run_config_component.svelte
📚 Learning: 2025-10-29T06:59:55.635Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 736
File: app/web_ui/src/lib/api_schema.d.ts:8143-8180
Timestamp: 2025-10-29T06:59:55.635Z
Learning: app/web_ui/src/lib/api_schema.d.ts is generated by openapi-typescript; do not propose manual edits. Schema changes should be made in the FastAPI backend (e.g., app/desktop/studio_server/data_gen_api.py or libs/server/kiln_server/*), then re-generate the TS types.
Applied to files:
app/web_ui/src/lib/api_schema.d.ts
📚 Learning: 2025-10-29T11:19:31.059Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 759
File: app/web_ui/src/lib/api_schema.d.ts:2485-2490
Timestamp: 2025-10-29T11:19:31.059Z
Learning: In libs/server/kiln_server/document_api.py and the generated app/web_ui/src/lib/api_schema.d.ts, CreateVectorStoreConfigRequest.properties is now a union of three distinct public schemas: LanceDBConfigFTSPropertiesPublic, LanceDBConfigVectorPropertiesPublic, and LanceDBConfigHybridPropertiesPublic. These are intentionally split to allow independent evolution; do not suggest merging them.
Applied to files:
app/web_ui/src/lib/api_schema.d.ts
📚 Learning: 2025-09-04T06:27:58.852Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 487
File: app/web_ui/src/lib/api_schema.d.ts:3315-3320
Timestamp: 2025-09-04T06:27:58.852Z
Learning: The file app/web_ui/src/lib/api_schema.d.ts is auto-generated by openapi-typescript from the backend API schema and should never be manually edited. Changes to API types should be made in the backend, which will then be reflected in the regenerated TypeScript definitions.
Applied to files:
app/web_ui/src/lib/api_schema.d.ts
🧬 Code graph analysis (1)
app/desktop/studio_server/eval_api.py (3)
libs/server/kiln_server/task_api.py (1)
`task_from_id` (16-25)
libs/core/kiln_ai/datamodel/eval.py (1)
`configs` (357-358)
libs/core/kiln_ai/datamodel/task.py (1)
`run_configs` (172-173)
🪛 GitHub Actions: Format and Lint
app/web_ui/src/lib/ui/run_config_component/saved_run_configs_dropdown.svelte
[error] 152-152: "Exlcude" is a misspelling of "Exclude".
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: Build Desktop Apps (ubuntu-22.04-arm)
- GitHub Check: Build Desktop Apps (macos-15-intel)
- GitHub Check: Build Desktop Apps (macos-latest)
- GitHub Check: Build Desktop Apps (windows-latest)
- GitHub Check: Build Desktop Apps (ubuntu-22.04)
- GitHub Check: Build, Typecheck, and Test Python (3.13)
- GitHub Check: Build, Typecheck, and Test Python (3.11)
- GitHub Check: Build, Typecheck, and Test Python (3.10)
- GitHub Check: Build, Typecheck, and Test Python (3.12)
🔇 Additional comments (10)

app/web_ui/src/lib/stores/run_configs_store.ts (1)

58-69: Path update correctly aligned with new run_configs endpoint

Switching `load_task_run_configs` to call `/tasks/{task_id}/run_configs/` matches the new backend route and keeps all existing loading/error semantics intact. No further changes needed here.

app/web_ui/src/lib/api_schema.d.ts (4)

2634-2634: Optional `run_config_properties` on `CreateFinetuneRequest` is a reasonable, backwards-compatible extension

Adding `run_config_properties?: RunConfigProperties | null;` cleanly reuses the shared run-config shape and keeps the field optional for callers that don't yet send it. Frontend code consuming this should treat it as nullable/optional and handle the "no run config attached" case explicitly, but the schema change itself looks good. Any future tweaks should be made in the backend model and regenerated here.

3882-3883: Persisted `Finetune.run_config` aligns with the new creation request

`run_config?: RunConfigProperties | null;` on `Finetune` mirrors the creation request and will let the UI reason about which run configuration a fine-tune was built from. Keeping it optional preserves compatibility with older records. This addition looks consistent with the rest of the RunConfig wiring.

3945-3948: `supports_function_calling` on finetune models is appropriate; required boolean is fine

The new `supports_function_calling: boolean;` flag on `FinetuneProviderModel` matches how `ModelDetails` already exposes this capability, and the `@default true` comment makes sense for providers where function-calling is the norm. Since the field is required in the TS type, be sure all provider entries (including any persisted config) populate it to avoid surprises at runtime for older data. Any change to required/optional-ness should be done in the backend model and regenerated.

9019-9021: `tool_ids` query param on `finetune_dataset_info` is a sensible filter hook

Allowing `tool_ids?: string[] | null;` on the `finetune_dataset_info` GET parameters is a straightforward way to drive the newly added `eligible_*` fields from the UI. The typing looks correct; just ensure callers always pass an array (even a singleton) rather than a comma-separated string, per this schema.

app/web_ui/src/lib/ui/run_config_component/saved_run_configs_dropdown.svelte (1)
32-32: provider_name_from_id import is appropriate

Importing `provider_name_from_id` from `$lib/stores` matches how you resolve provider display names elsewhere; no issues here.

app/web_ui/src/lib/ui/run_config_component/run_config_component.svelte (4)

31-32: Using shared arrays_equal and typed ToolsSelectorSettings looks good

Importing `arrays_equal` from `$lib/utils/collections` and `ToolsSelectorSettings` for the new settings prop is a solid move toward centralizing utilities and tightening types.

303-309: get_tools returns a defensive copy and clear_tools resets correctly

- `get_tools()` returning `[...tools]` avoids accidental external mutation of internal component state.
- `clear_tools()` centralizes tools clearing logic and works with the exported `tools` prop.

This is a good, minimal public surface for interacting with tools state.
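As a standalone illustration of the defensive-copy pattern described here (names mirror the review, but this is a simplified sketch rather than the component's actual code):

```typescript
// Internal component state (illustrative stand-in for the Svelte component's `tools`).
let tools: string[] = ["tool_a", "tool_b"]

// Returns a shallow copy so callers cannot mutate internal state.
function get_tools(): string[] {
  return [...tools]
}

// Centralized reset, e.g. when the selected model drops tool support.
function clear_tools(): void {
  tools = []
}

const snapshot = get_tools()
snapshot.push("tool_c") // mutates only the copy; internal `tools` is unchanged
```

The spread copy is shallow, which is sufficient here because the array holds immutable string IDs.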
313-321: hide_model_selector conditional around AvailableModelsDropdown is appropriate

Wrapping `<AvailableModelsDropdown>` in `{#if !hide_model_selector}` cleanly supports flows where the model is fixed or implied (e.g., finetunes) without changing the underlying `model` state logic. Just ensure callers that rely on `get_selected_model()` are aware it can now return `null` when the selector is hidden.

330-336: Passing tools_selector_settings into ToolsSelector in both locations aligns with the new API

Using:

`<ToolsSelector bind:tools {project_id} task_id={current_task?.id ?? null} settings={tools_selector_settings} />`

in both the main and "Advanced Options" branches keeps ToolsSelector configuration consistent regardless of layout, while still only binding the `tools` array.

Based on learnings, this preserves the intended contract where only the multi-select `tools` array persists to the store and any single-select mode stays ephemeral within `ToolsSelector`.

Also applies to: 350-356
Actionable comments posted: 1
♻️ Duplicate comments (1)
app/web_ui/src/lib/ui/run_config_component/saved_run_configs_dropdown.svelte (1)
152-161: Centralize the `finetune_run_config::` prefix instead of hard-coding

The filter correctly excludes finetune configs from the generic "other task run configs" list, but the `"finetune_run_config::"` prefix is now hard-coded here and again below in the finetune group builder. To avoid drift if the prefix ever changes, consider introducing a shared constant (e.g. `FINETUNE_RUN_CONFIG_PREFIX`) in a common module and using it in both filters.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- `app/web_ui/src/lib/ui/run_config_component/saved_run_configs_dropdown.svelte` (3 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
app/web_ui/**/*.svelte
📄 CodeRabbit inference engine (.cursor/rules/project.mdc)
app/web_ui/**/*.svelte: Use Svelte v4 (not v5) for the web frontend UI
Use DaisyUI for UI components in the web frontend
Use Svelte v4 (not v5) for the web UI framework
Files:
app/web_ui/src/lib/ui/run_config_component/saved_run_configs_dropdown.svelte
app/web_ui/**/*.{svelte,ts}
📄 CodeRabbit inference engine (.cursor/rules/project.mdc)
Use Tailwind CSS for styling in the web frontend
Files:
app/web_ui/src/lib/ui/run_config_component/saved_run_configs_dropdown.svelte
app/web_ui/**/*.{svelte,css,ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
Use Tailwind and DaisyUI for styling the frontend
Files:
app/web_ui/src/lib/ui/run_config_component/saved_run_configs_dropdown.svelte
app/web_ui/**/*.{svelte,ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
Make FastAPI backend calls from the Svelte web app to communicate with backend servers
Files:
app/web_ui/src/lib/ui/run_config_component/saved_run_configs_dropdown.svelte
🧠 Learnings (2)
📚 Learning: 2025-10-07T00:21:33.537Z
Learnt from: sfierro
Repo: Kiln-AI/Kiln PR: 698
File: app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/+page.svelte:800-812
Timestamp: 2025-10-07T00:21:33.537Z
Learning: In the Kiln web UI codebase, the `save_new_task_run_config` function in `app/web_ui/src/lib/stores/run_configs_store.ts` automatically refreshes the run configs store by calling `load_task_run_configs(project_id, task_id, true)` with force_refresh=true after successfully creating a new run config. This means callers of save_new_task_run_config do not need to manually refresh the store afterward, as the store will already contain the newly created config.
Applied to files:
app/web_ui/src/lib/ui/run_config_component/saved_run_configs_dropdown.svelte
📚 Learning: 2025-09-25T07:20:11.459Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 650
File: app/web_ui/src/routes/(app)/docs/library/[project_id]/[document_id]/+page.svelte:19-19
Timestamp: 2025-09-25T07:20:11.459Z
Learning: When analyzing relative import paths in SvelteKit route structures, be more careful to verify the actual directory structure exists before suggesting corrections. The import path ../../../../generate/[project_id]/[task_id]/table_button.svelte from app/web_ui/src/routes/(app)/docs/library/[project_id]/[document_id]/+page.svelte correctly resolves to the existing file at app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/table_button.svelte.
Applied to files:
app/web_ui/src/lib/ui/run_config_component/saved_run_configs_dropdown.svelte
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Build Desktop Apps (windows-latest)
- GitHub Check: Build Desktop Apps (ubuntu-22.04)
- GitHub Check: Build Desktop Apps (ubuntu-22.04-arm)
- GitHub Check: Build Desktop Apps (macos-15-intel)
- GitHub Check: Build Desktop Apps (macos-latest)
🔇 Additional comments (1)
app/web_ui/src/lib/ui/run_config_component/saved_run_configs_dropdown.svelte (1)
32-32: Provider label import looks good

Using `provider_name_from_id` here to enrich finetune labels with provider names is a clean reuse of existing store logic; no issues from this change alone.
```ts
// Add finetune run configs only if selected_finetune_id is set
const finetune_run_configs = (
  run_configs_by_task_composite_id[
    get_task_composite_id(project_id, current_task.id ?? "")
  ] ?? []
).filter((config) => config.id?.startsWith("finetune_run_config::"))
if (finetune_run_configs.length > 0) {
  options.push({
    label: "Fine-Tune Configuration",
    options: finetune_run_configs.map((config) => ({
      value: config.id ?? "",
      label: `${config.name} (${provider_name_from_id(config.run_config_properties.model_provider_name)})`,
      description: `Base Model: ${config.run_config_properties.model_name},
        Prompt: ${getRunConfigPromptDisplayName(config, current_task_prompts)}`,
    })),
  })
}
```
Fix misleading comment and tighten finetune section semantics
- The comment `// Add finetune run configs only if selected_finetune_id is set` is stale: there is no `selected_finetune_id`, and the block actually runs whenever any finetune configs exist. Suggest updating it to something like `// Add finetune run configs if any exist` to match behavior.
- (Optional) The group label `"Fine-Tune Configuration"` is singular but may contain multiple configs; consider `"Fine-Tune Configurations"` for clarity.
🤖 Prompt for AI Agents
In app/web_ui/src/lib/ui/run_config_component/saved_run_configs_dropdown.svelte
around lines 180 to 197, the inline comment is stale and misleading — it says
"// Add finetune run configs only if selected_finetune_id is set" but there is
no selected_finetune_id and the block actually adds finetune configs whenever
any exist; update the comment to accurately reflect behavior (e.g., "// Add
finetune run configs if any exist"). Also consider changing the group label from
the singular "Fine-Tune Configuration" to the plural "Fine-Tune Configurations"
to match that the options array may contain multiple entries; update the label
string accordingly.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte (1)
619-659: Bug: R1-disabled download options never populate `data_strategy_select_options`

In `update_data_strategies_supported`, the special-case branch for R1-disabled downloads:

```ts
if (r1_disabled_for_downloads.includes(model_provider)) {
  return ["final_only", "two_message_cot"]
}
```

returns an array that is never used by the caller. As a result, for those download options (`download_huggingface_chat_template_toolcall`, `download_jsonl_toolcall`, `download_vertex_gemini`), `data_strategy_select_options` and `data_strategy` never get updated and will stay at their previous values (or empty on first selection).

You can fix this by assigning to `compatible_data_strategies` instead of returning early, e.g.:

```diff
  const r1_disabled_for_downloads = [
    "download_huggingface_chat_template_toolcall",
    "download_jsonl_toolcall",
    "download_vertex_gemini",
  ]
- if (r1_disabled_for_downloads.includes(model_provider)) {
-   return ["final_only", "two_message_cot"]
- }
-
- const compatible_data_strategies: ChatStrategy[] = is_download
-   ? [
-       "final_only",
-       "two_message_cot",
-       "final_and_intermediate_r1_compatible",
-     ]
-   : available_models
+ let compatible_data_strategies: ChatStrategy[]
+
+ if (is_download && r1_disabled_for_downloads.includes(model_provider)) {
+   compatible_data_strategies = ["final_only", "two_message_cot"]
+ } else if (is_download) {
+   compatible_data_strategies = [
+     "final_only",
+     "two_message_cot",
+     "final_and_intermediate_r1_compatible",
+   ]
+ } else {
+   compatible_data_strategies =
+     available_models
      ?.map((model) => model.models)
      .flat()
      .find((model) => model.id === base_model_id)
-     ?.data_strategies_supported ?? []
+     ?.data_strategies_supported ?? []
+ }
```

Optionally, you might also only reset `data_strategy` to the first compatible option if the current `data_strategy` is not in `compatible_data_strategies`, to better respect restored state:

```diff
- data_strategy = compatible_data_strategies[0]
+ if (!compatible_data_strategies.includes(data_strategy)) {
+   data_strategy = compatible_data_strategies[0]
+ }
```
♻️ Duplicate comments (1)
app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte (1)
74-85: Dataset selection can remain stale when tools are auto-cleared

When a user switches to a model without tool-calling support, the reactive block at Lines 74-81 clears tools via `run_config_component.clear_tools()`, but any previously selected dataset (Lines 836-845) is left unchanged, including when it was chosen under a specific `required_tool_ids` filter.

This can leave the form in a state where no tools are selected, but `selected_dataset` is still a tools-filtered dataset that may no longer match the current configuration.

Consider also clearing or re-validating `selected_dataset` when tools are cleared, for example:

```diff
  $: if (
    selected_model &&
    !selected_model.model.supports_function_calling &&
    run_config_component
  ) {
    // Clear tools if the model doesn't support function calling
    run_config_component.clear_tools()
+
+   // Clear dataset if it was selected while tools were required
+   if (selected_dataset && selected_tool_ids.length === 0) {
+     selected_dataset = null
+   }
  }
```

This keeps dataset and tool configuration in sync with the selected model.
Also applies to: 836-845
🧹 Nitpick comments (2)
app/desktop/studio_server/test_eval_api.py (1)
1889-1975: Consider adding test coverage for `pending` status.

The test covers `completed`, `running`, `unknown`, and `failed` statuses, but `FineTuneStatusType.pending` is not tested. Based on the implementation in `eval_api.py`, finetunes with `pending` status would be included in the results (since only `unknown` and `failed` are excluded). Adding a test case for `pending` would ensure this behavior is verified.

```diff
  Finetune(
+     id="ft_pending",
+     name="Pending Finetune",
+     provider="openai",
+     base_model_id="model6",
+     dataset_split_id="split6",
+     system_message="System message",
+     latest_status=FineTuneStatusType.pending,
+     run_config=run_config_props,
+     parent=mock_task,
+ ),
+ Finetune(
      id="ft_no_run_config",
```

And add the assertion:

```diff
  assert "finetune_run_config::ft_running" in config_ids
+ assert "finetune_run_config::ft_pending" in config_ids
  assert "finetune_run_config::ft_unknown" not in config_ids
```

app/desktop/studio_server/eval_api.py (1)
326-350: LGTM overall. Minor suggestions for consistency.

The endpoint correctly filters out finetunes with `unknown` or `failed` status and those without a `run_config`. A couple of observations:

- The endpoint path has a trailing slash (`/run_configs/`) while similar endpoints like `/task_run_configs` (line 319) don't. Consider removing the trailing slash for consistency unless this is intentional.
- The condition could be slightly more readable using a set membership check:

```diff
- @app.get("/api/projects/{project_id}/tasks/{task_id}/run_configs/")
+ @app.get("/api/projects/{project_id}/tasks/{task_id}/run_configs")
  async def get_run_configs(project_id: str, task_id: str) -> list[TaskRunConfig]:
      # Returns all run configs of a given task.
      task = task_from_id(project_id, task_id)
      configs = task.run_configs()

      # Get run configs from finetunes
      finetunes = task.finetunes()
      for finetune in finetunes:
          # Only include if the finetune is not unknown or failed
          if (
-             finetune.latest_status != FineTuneStatusType.unknown
-             and finetune.latest_status != FineTuneStatusType.failed
+             finetune.latest_status not in (FineTuneStatusType.unknown, FineTuneStatusType.failed)
              and finetune.run_config is not None
          ):
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- `app/desktop/studio_server/eval_api.py` (2 hunks)
- `app/desktop/studio_server/test_eval_api.py` (3 hunks)
- `app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte` (12 hunks)
🧰 Additional context used
📓 Path-based instructions (8)
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/project.mdc)
**/*.py: Use asyncio for asynchronous operations in Python
Use Pydantic v2 (not v1) for data validation
Files:
app/desktop/studio_server/eval_api.py
app/desktop/studio_server/test_eval_api.py
app/desktop/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Use python 3.13 for the desktop app
Files:
app/desktop/studio_server/eval_api.py
app/desktop/studio_server/test_eval_api.py
app/desktop/studio_server/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Use FastAPI for the desktop app server that extends libs/server with web app-specific APIs
Files:
app/desktop/studio_server/eval_api.py
app/desktop/studio_server/test_eval_api.py
**/*test*.py
📄 CodeRabbit inference engine (.cursor/rules/project.mdc)
Use pytest for testing Python code
Use pytest for testing in python projects
Files:
app/desktop/studio_server/test_eval_api.py
app/web_ui/**/*.svelte
📄 CodeRabbit inference engine (.cursor/rules/project.mdc)
app/web_ui/**/*.svelte: Use Svelte v4 (not v5) for the web frontend UI
Use DaisyUI for UI components in the web frontend
Use Svelte v4 (not v5) for the web UI framework
Files:
app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte
app/web_ui/**/*.{svelte,ts}
📄 CodeRabbit inference engine (.cursor/rules/project.mdc)
Use Tailwind CSS for styling in the web frontend
Files:
app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte
app/web_ui/**/*.{svelte,css,ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
Use Tailwind and DaisyUI for styling the frontend
Files:
app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte
app/web_ui/**/*.{svelte,ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
Make FastAPI backend calls from the Svelte web app to communicate with backend servers
Files:
app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte
🧠 Learnings (3)
📚 Learning: 2025-10-22T04:31:20.500Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 739
File: app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/synth/+page.svelte:170-177
Timestamp: 2025-10-22T04:31:20.500Z
Learning: In Svelte components for the Kiln project, leonardmq prefers tracking previous query parameter values and comparing them in reactive blocks (rather than just checking for existence) to avoid duplicate execution on initial load while still reacting to subsequent navigation changes. This pattern supports both current behavior and future feature additions like mode switching.
Applied to files:
app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte
📚 Learning: 2025-11-11T00:07:51.535Z
Learnt from: sfierro
Repo: Kiln-AI/Kiln PR: 756
File: app/web_ui/src/lib/ui/run_config_component/tools_selector.svelte:19-20
Timestamp: 2025-11-11T00:07:51.535Z
Learning: In app/web_ui/src/lib/ui/run_config_component/tools_selector.svelte, the single_select_selected_tool prop should NOT be synchronized with the tools_store. Only the multi-select tools array should persist to the store. The single_select mode is intended to use ephemeral component-local state.
Applied to files:
app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte
📚 Learning: 2025-10-07T00:21:33.537Z
Learnt from: sfierro
Repo: Kiln-AI/Kiln PR: 698
File: app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/+page.svelte:800-812
Timestamp: 2025-10-07T00:21:33.537Z
Learning: In the Kiln web UI codebase, the `save_new_task_run_config` function in `app/web_ui/src/lib/stores/run_configs_store.ts` automatically refreshes the run configs store by calling `load_task_run_configs(project_id, task_id, true)` with force_refresh=true after successfully creating a new run config. This means callers of save_new_task_run_config do not need to manually refresh the store afterward, as the store will already contain the newly created config.
Applied to files:
app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte
🧬 Code graph analysis (2)
app/desktop/studio_server/eval_api.py (4)
libs/core/kiln_ai/datamodel/datamodel_enums.py (1)
`FineTuneStatusType` (52-61)
libs/server/kiln_server/task_api.py (1)
`task_from_id` (16-25)
libs/core/kiln_ai/datamodel/eval.py (1)
`configs` (357-358)
libs/core/kiln_ai/datamodel/task.py (1)
`run_configs` (172-173)
app/desktop/studio_server/test_eval_api.py (1)
libs/core/kiln_ai/datamodel/datamodel_enums.py (1)
`FineTuneStatusType` (52-61)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: Build, Typecheck, and Test Python (3.10)
- GitHub Check: Build Desktop Apps (macos-15-intel)
- GitHub Check: Build Desktop Apps (ubuntu-22.04-arm)
- GitHub Check: Build Desktop Apps (ubuntu-22.04)
- GitHub Check: Build Desktop Apps (macos-latest)
- GitHub Check: Build, Typecheck, and Test Python (3.11)
- GitHub Check: Build, Typecheck, and Test Python (3.12)
- GitHub Check: Build Desktop Apps (windows-latest)
- GitHub Check: Build, Typecheck, and Test Python (3.13)
🔇 Additional comments (3)
app/desktop/studio_server/eval_api.py (1)
11-11: Import addition looks correct.

The `FineTuneStatusType` import is appropriately added and used for the finetune status filtering logic.

app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte (2)
92-113: SavedFinetuneState + `state_initialized` gating now correctly prevent repopulating cleared drafts

The IndexedDB-backed `SavedFinetuneState` plus the `state_initialized` guard (Lines 92-113, 151-201, 229-244) correctly avoid re-writing cleared state:

- The reactive writer only runs when `state_initialized` is `true`.
- `clear_saved_state()` flips `state_initialized` to `false` before updating `saved_state`, so the `$:` block no longer repopulates it.
- After mount, you restore from `$saved_state` (if any) and only then set `state_initialized = true`, so initialization and persistence are nicely separated.
- Calling `clear_saved_state()` after a successful finetune (Lines 514-516) clears the persisted draft without needing to reload.

This addresses the earlier issue where clearing was immediately undone by the reactive writer while preserving a straightforward path to re-enable autosave on a fresh page load.
Also applies to: 151-201, 229-244, 514-516
52-57: RunConfigComponent integration and tool plumbing look consistent with existing patterns

The new `RunConfigComponent` wiring and tool plumbing are coherent:

- `bind:tools={selected_tool_ids}` keeps the multi-select tools array as the single source of truth on this page (Lines 83-87, 803-820), aligning with the prior guidance that only the multi-select tools array should persist, not any single-select state.
- `disabled_tools_selector` plus the reactive `clear_tools()` call correctly enforce "no tools" when the selected model doesn't support function calling (Lines 74-85, 803-820).
- `run_config_properties` is assembled from `run_config_component.run_options_as_run_config_properties()` and enriched with `model_name` (derived from `base_model_id`) and `model_provider_name` before being sent to the finetune API (Lines 447-457, 487).
- PostHog metrics (`supports_tools`, `tool_count`) and the post-create `load_task_run_configs(...)` refresh (Lines 501-513) keep analytics and the run-config store aligned with the new finetune.

This setup cleanly connects the shared RunConfigComponent to finetune creation and tool-aware analytics.
Also applies to: 83-87, 447-457, 501-503, 503-513, 803-820
Actionable comments posted: 0
🧹 Nitpick comments (1)
libs/core/kiln_ai/adapters/fine_tune/base_finetune.py (1)
8-9: Run-config propagation into finetunes is wired correctly

- Adding `run_config: RunConfigProperties | None` to `create_and_start` and threading it into `FinetuneModel(run_config=run_config)` cleanly persists the originating run configuration with the finetune.
- Setting `run_config.model_provider_name = ModelProviderName.kiln_fine_tune` and `run_config.model_name = datamodel.model_id()` after `_start` ensures downstream runs will correctly target the new fine-tuned model rather than the base model, and the provider enum aligns with the existing `ModelProviderName.kiln_fine_tune` value.
- Note that when `run_config` is provided, this now implicitly requires the finetune to have a valid Project/Task parent chain (otherwise `model_id()` will raise), which matches the new `Finetune.model_id()` contract and seems appropriate for persisted fine-tunes.

Based on learnings, this usage of `ModelProviderName` on run-config properties matches how provider enums are used elsewhere in the codebase.

Also applies to: 67-68, 96-110, 115-119
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
app/desktop/studio_server/provider_api.py (1 hunks)
app/desktop/studio_server/test_provider_api.py (1 hunks)
libs/core/kiln_ai/adapters/fine_tune/base_finetune.py (3 hunks)
libs/core/kiln_ai/adapters/fine_tune/test_base_finetune.py (4 hunks)
libs/core/kiln_ai/datamodel/finetune.py (2 hunks)
libs/core/kiln_ai/datamodel/test_models.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- libs/core/kiln_ai/datamodel/finetune.py
🧰 Additional context used
📓 Path-based instructions (6)
**/*test*.py
📄 CodeRabbit inference engine (.cursor/rules/project.mdc)
Use pytest for testing Python code
Use pytest for testing in python projects
Files:
app/desktop/studio_server/test_provider_api.py
libs/core/kiln_ai/datamodel/test_models.py
libs/core/kiln_ai/adapters/fine_tune/test_base_finetune.py
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/project.mdc)
**/*.py: Use asyncio for asynchronous operations in Python
Use Pydantic v2 (not v1) for data validation
Files:
app/desktop/studio_server/test_provider_api.py
app/desktop/studio_server/provider_api.py
libs/core/kiln_ai/datamodel/test_models.py
libs/core/kiln_ai/adapters/fine_tune/base_finetune.py
libs/core/kiln_ai/adapters/fine_tune/test_base_finetune.py
app/desktop/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Use python 3.13 for the desktop app
Files:
app/desktop/studio_server/test_provider_api.py
app/desktop/studio_server/provider_api.py
app/desktop/studio_server/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Use FastAPI for the desktop app server that extends libs/server with web app-specific APIs
Files:
app/desktop/studio_server/test_provider_api.py
app/desktop/studio_server/provider_api.py
libs/core/**/*.py
📄 CodeRabbit inference engine (.cursor/rules/project.mdc)
Use Python 3.10+ for the core library (libs/core) and Python 3.13 for the desktop app
Use python 3.10+ for the core library (libs/core) and python 3.13 for the desktop app
Files:
libs/core/kiln_ai/datamodel/test_models.py
libs/core/kiln_ai/adapters/fine_tune/base_finetune.py
libs/core/kiln_ai/adapters/fine_tune/test_base_finetune.py
libs/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Use Pydantic v2 (not v1) for data validation and serialization
Files:
libs/core/kiln_ai/datamodel/test_models.py
libs/core/kiln_ai/adapters/fine_tune/base_finetune.py
libs/core/kiln_ai/adapters/fine_tune/test_base_finetune.py
🧠 Learnings (4)
📓 Common learnings
Learnt from: chiang-daniel
Repo: Kiln-AI/Kiln PR: 781
File: app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte:74-86
Timestamp: 2025-11-26T05:40:24.416Z
Learning: In the fine-tune creation flow (app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte), it is acceptable for a dataset containing tool data to be used when training a model that doesn't support function calling. The training will proceed without using the tool call data from the dataset. This allows for dataset reuse across different model types.
📚 Learning: 2025-08-08T16:13:26.526Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:1875-1890
Timestamp: 2025-08-08T16:13:26.526Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py (Python), do not blanket-add r1_thinking/optional_r1_thinking parsers for R1-style models. Parser usage is provider-specific and must be based on observed responses in tests. For PR Kiln-AI/Kiln#418, deepseek_r1_0528_distill_qwen3_8b providers were validated without needing a parser.
Applied to files:
libs/core/kiln_ai/adapters/fine_tune/base_finetune.py
libs/core/kiln_ai/adapters/fine_tune/test_base_finetune.py
📚 Learning: 2025-08-22T11:15:53.584Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 413
File: libs/core/kiln_ai/adapters/embedding/embedding_registry.py:14-20
Timestamp: 2025-08-22T11:15:53.584Z
Learning: In the Kiln AI codebase, model_provider_name fields in datamodels like EmbeddingConfig are typed as ModelProviderName enum (which inherits from str, Enum), not as plain strings. Therefore, accessing .value on these fields is valid and returns the string representation of the enum value.
Applied to files:
libs/core/kiln_ai/adapters/fine_tune/base_finetune.py
📚 Learning: 2025-08-08T15:50:45.334Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:221-228
Timestamp: 2025-08-08T15:50:45.334Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.KilnModelProvider, naming policy: keep cross-provider feature flags unprefixed (e.g., reasoning_optional_for_structured_output) to allow reuse across providers; prefix provider-specific toggles with the provider name (e.g., siliconflow_enable_thinking) when the implementation is specific and potentially unsafe to generalize.
Applied to files:
libs/core/kiln_ai/adapters/fine_tune/base_finetune.py
🧬 Code graph analysis (4)
app/desktop/studio_server/test_provider_api.py (3)
  libs/core/kiln_ai/datamodel/finetune.py (1)
    model_id (99-109)
  app/web_ui/src/lib/types.ts (1)
    ModelProviderName (65-65)
  libs/core/kiln_ai/datamodel/datamodel_enums.py (1)
    ModelProviderName (84-106)
app/desktop/studio_server/provider_api.py (1)
  libs/core/kiln_ai/datamodel/finetune.py (1)
    model_id (99-109)
libs/core/kiln_ai/datamodel/test_models.py (2)
  libs/core/kiln_ai/tools/rag_tools.py (1)
    project (110-114)
  libs/core/kiln_ai/datamodel/finetune.py (2)
    Finetune (23-127)
    model_id (99-109)
libs/core/kiln_ai/adapters/fine_tune/test_base_finetune.py (1)
  libs/core/kiln_ai/datamodel/finetune.py (2)
    Finetune (23-127)
    model_id (99-109)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: Generate Coverage Report
- GitHub Check: Build, Typecheck, and Test Python (3.12)
- GitHub Check: Build Desktop Apps (ubuntu-22.04-arm)
- GitHub Check: Build, Typecheck, and Test Python (3.10)
- GitHub Check: Build Desktop Apps (macos-15-intel)
- GitHub Check: Build, Typecheck, and Test Python (3.11)
- GitHub Check: Build Desktop Apps (ubuntu-22.04)
- GitHub Check: Build Desktop Apps (macos-latest)
- GitHub Check: Build Desktop Apps (windows-latest)
🔇 Additional comments (5)
app/desktop/studio_server/provider_api.py (1)
1452-1452: LGTM! Good refactoring to centralize ID construction.

Replacing inline string formatting with the centralized `model_id()` method improves maintainability and ensures consistent ID generation across the codebase. The method also validates parent relationships and fails fast with a clear error if they're not properly set up, which is better than potentially creating invalid IDs.

app/desktop/studio_server/test_provider_api.py (1)
1267-1282: Mocking `model_id()` to match the new composite ID format looks correct.

Stubbing `mock_fine_tune1.model_id()` / `mock_fine_tune2.model_id()` with `projX::taskY::ftZ` aligns the test with the new `Finetune.model_id()` behavior while avoiding real parent relationships on the mocks. Assertions on `result.models[*].id` remain deterministic and consistent with the datamodel.

libs/core/kiln_ai/datamodel/test_models.py (1)
334-378: `Finetune.model_id()` coverage is solid and matches the datamodel contract.

This test exercises the composite ID generation and both error branches (missing task, missing project) with the exact messages used in `Finetune.model_id()`. It gives good confidence that downstream callers relying on the `project_id::task_id::finetune_id` format will behave correctly.

libs/core/kiln_ai/adapters/fine_tune/test_base_finetune.py (2)
11-19: Project/Task fixture setup correctly supports finetune `model_id()` and run-config tests.

Importing `Project`/`Task` and updating `sample_task` to create a saved Project with a Task parent ensures `dataset.parent_task()` has a path and `task.parent_project()` is non-null. That matches the requirements of `Finetune.model_id()` and makes the new run-config tests realistic relative to how finetunes are created in the app.

Also applies to: 48-52, 57-60
189-190: Run-config behavior for `create_and_start` is well covered.

- The extra assertion in `test_create_and_start_success` that `datamodel.run_config is None` confirms the default path isn't leaking any config.
- `test_create_and_start_with_run_config` thoroughly checks that:
  - a provided `RunConfigProperties` is attached to the finetune,
  - its `model_provider_name` is rewritten to `kiln_fine_tune`,
  - its `model_name` is set to the composite finetune ID (`datamodel.model_id()`),
  - and all other fields (e.g., `prompt_id`, `temperature`, `top_p`) survive the round-trip through persistence.

This aligns the tests with the new adapter behavior and with the enum-typed `model_provider_name`/`structured_output_mode` design.
Actionable comments posted: 0
🧹 Nitpick comments (2)
libs/core/kiln_ai/adapters/fine_tune/base_finetune.py (1)
58-72: run_config propagation is sound; consider minor DRY and signature cleanup

The run_config threading into `FinetuneModel` and the post-`_start` mutation (provider, model_name, prompt_id, structured_output_mode) look logically correct and match the finetune `model_id()` behavior from `libs/core/kiln_ai/datamodel/finetune.py`. The tests exercise this well.

Two optional improvements:

1. Avoid a mutable default for `parameters`

You can eliminate the shared `{}` default and keep the public API the same:

```diff
-    parameters: dict[str, str | int | float | bool] = {},
+    parameters: dict[str, str | int | float | bool] | None = None,
@@
-    cls.validate_parameters(parameters)
+    if parameters is None:
+        parameters = {}
+    cls.validate_parameters(parameters)
@@
     parameters=parameters,
```

2. Reuse `model_id()` when building `prompt_id`

To avoid re-implementing the project/task/id logic that already exists in `FinetuneModel.model_id()`, you can derive both `model_name` and `prompt_id` from a single call:

```diff
-    run_config.model_provider_name = ModelProviderName.kiln_fine_tune
-    run_config.model_name = datamodel.model_id()
-    # Build the fine-tune prompt ID
-    task = datamodel.parent_task()
-    if task is None:
-        raise ValueError("Finetune must have a parent task")
-    project = task.parent_project()
-    if project is None:
-        raise ValueError("Task must have a parent project")
-    run_config.prompt_id = (
-        f"fine_tune_prompt::{project.id}::{task.id}::{datamodel.id}"
-    )
+    run_config.model_provider_name = ModelProviderName.kiln_fine_tune
+    model_id = datamodel.model_id()
+    run_config.model_name = model_id
+    run_config.prompt_id = f"fine_tune_prompt::{model_id}"
```

This keeps the ID construction logic centralized in the datamodel. Based on relevant code in `libs/core/kiln_ai/datamodel/finetune.py`.

Also applies to: 100-114, 119-138
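The shared-default pitfall that the first suggestion avoids is general Python behavior, not specific to this codebase; a minimal standalone demonstration:

```python
# Demonstrates why a mutable default argument is shared across calls:
# the default list is created once, at function definition time.
def append_bad(item, items=[]):  # one list object, reused by every call
    items.append(item)
    return items


def append_good(item, items=None):  # fresh list per call
    if items is None:
        items = []
    items.append(item)
    return items


print(append_bad("a"))   # ['a']
print(append_bad("b"))   # ['a', 'b']  <- state leaked between calls
print(append_good("a"))  # ['a']
print(append_good("b"))  # ['b']      <- independent calls
```

With a shared `{}` default on `create_and_start`, any code path that mutated `parameters` in place would leak those mutations into later calls; the `None` sentinel pattern removes that risk without changing the public API.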
libs/core/kiln_ai/adapters/fine_tune/test_base_finetune.py (1)
243-290: run_config mutation and persistence test is thorough

`test_create_and_start_with_run_config` does a solid job:

- Verifies that `model_provider_name` is rewritten to `kiln_fine_tune`.
- Asserts `model_name` matches `datamodel.model_id()`.
- Checks the `prompt_id` format using project/task/datamodel IDs.
- Confirms temperature/top_p passthrough and JSON-instructions defaulting.
- Confirms run_config is persisted via `FinetuneModel.load_from_file`.

One optional enhancement would be an additional negative test where a Task lacks a parent Project but a `run_config` is provided, asserting the expected `ValueError`, to lock in that failure mode as intentional.
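A rough shape for that suggested negative test, using pytest as the repo's test guidelines require. The `Task`/`Finetune` classes below are simplified stand-ins mirroring only the missing-parent error path, not the real datamodel:

```python
# Hypothetical shape of the suggested negative test; the stand-in classes
# mirror the "missing parent project" error path of the real Finetune.
import pytest


class Task:
    def __init__(self, parent_project=None):
        self._project = parent_project

    def parent_project(self):
        return self._project


class Finetune:
    def __init__(self, task):
        self._task = task

    def model_id(self):
        if self._task is None:
            raise ValueError("Finetune must have a parent task")
        if self._task.parent_project() is None:
            raise ValueError("Task must have a parent project")
        return "proj::task::ft"


def test_model_id_requires_parent_project():
    finetune = Finetune(task=Task(parent_project=None))
    with pytest.raises(ValueError, match="parent project"):
        finetune.model_id()
```

In the real suite, the equivalent test would build a Task without a saved parent Project and call `create_and_start` with a `run_config`, asserting the same `ValueError`.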
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
libs/core/kiln_ai/adapters/fine_tune/base_finetune.py (3 hunks)
libs/core/kiln_ai/adapters/fine_tune/test_base_finetune.py (4 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
libs/core/**/*.py
📄 CodeRabbit inference engine (.cursor/rules/project.mdc)
Use Python 3.10+ for the core library (libs/core) and Python 3.13 for the desktop app
Use python 3.10+ for the core library (libs/core) and python 3.13 for the desktop app
Files:
libs/core/kiln_ai/adapters/fine_tune/base_finetune.py
libs/core/kiln_ai/adapters/fine_tune/test_base_finetune.py
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/project.mdc)
**/*.py: Use asyncio for asynchronous operations in Python
Use Pydantic v2 (not v1) for data validation
Files:
libs/core/kiln_ai/adapters/fine_tune/base_finetune.py
libs/core/kiln_ai/adapters/fine_tune/test_base_finetune.py
libs/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Use Pydantic v2 (not v1) for data validation and serialization
Files:
libs/core/kiln_ai/adapters/fine_tune/base_finetune.py
libs/core/kiln_ai/adapters/fine_tune/test_base_finetune.py
**/*test*.py
📄 CodeRabbit inference engine (.cursor/rules/project.mdc)
Use pytest for testing Python code
Use pytest for testing in python projects
Files:
libs/core/kiln_ai/adapters/fine_tune/test_base_finetune.py
🧠 Learnings (4)
📓 Common learnings
Learnt from: chiang-daniel
Repo: Kiln-AI/Kiln PR: 781
File: app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte:74-86
Timestamp: 2025-11-26T05:40:24.416Z
Learning: In the fine-tune creation flow (app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte), it is acceptable for a dataset containing tool data to be used when training a model that doesn't support function calling. The training will proceed without using the tool call data from the dataset. This allows for dataset reuse across different model types.
📚 Learning: 2025-08-08T16:13:26.526Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:1875-1890
Timestamp: 2025-08-08T16:13:26.526Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py (Python), do not blanket-add r1_thinking/optional_r1_thinking parsers for R1-style models. Parser usage is provider-specific and must be based on observed responses in tests. For PR Kiln-AI/Kiln#418, deepseek_r1_0528_distill_qwen3_8b providers were validated without needing a parser.
Applied to files:
libs/core/kiln_ai/adapters/fine_tune/base_finetune.py
libs/core/kiln_ai/adapters/fine_tune/test_base_finetune.py
📚 Learning: 2025-08-08T15:50:45.334Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:221-228
Timestamp: 2025-08-08T15:50:45.334Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.KilnModelProvider, naming policy: keep cross-provider feature flags unprefixed (e.g., reasoning_optional_for_structured_output) to allow reuse across providers; prefix provider-specific toggles with the provider name (e.g., siliconflow_enable_thinking) when the implementation is specific and potentially unsafe to generalize.
Applied to files:
libs/core/kiln_ai/adapters/fine_tune/base_finetune.py
📚 Learning: 2025-08-08T16:14:54.346Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:1979-1983
Timestamp: 2025-08-08T16:14:54.346Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py, for ModelName.deepseek_r1_distill_qwen_32b on provider siliconflow_cn (model_id "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"), do not set a parser (r1_thinking/optional_r1_thinking). Verified via testing that SiliconFlow returns final outputs that parse without an R1 parser, and reasoning may be omitted under structured output; rely on reasoning_optional_for_structured_output=True.
Applied to files:
libs/core/kiln_ai/adapters/fine_tune/base_finetune.py
libs/core/kiln_ai/adapters/fine_tune/test_base_finetune.py
🧬 Code graph analysis (1)
libs/core/kiln_ai/adapters/fine_tune/base_finetune.py (4)
  libs/core/kiln_ai/adapters/fine_tune/test_base_finetune.py (2)
    _start (24-25)
    _start (50-51)
  app/web_ui/src/lib/stores.ts (1)
    model_name (466-479)
  libs/core/kiln_ai/datamodel/finetune.py (2)
    model_id (99-109)
    parent_task (94-97)
  libs/core/kiln_ai/tools/rag_tools.py (1)
    project (110-114)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: Generate Coverage Report
- GitHub Check: Build Desktop Apps (ubuntu-22.04-arm)
- GitHub Check: Build, Typecheck, and Test Python (3.11)
- GitHub Check: Build Desktop Apps (macos-latest)
- GitHub Check: Build, Typecheck, and Test Python (3.12)
- GitHub Check: Build Desktop Apps (windows-latest)
- GitHub Check: Build, Typecheck, and Test Python (3.10)
- GitHub Check: Build, Typecheck, and Test Python (3.13)
- GitHub Check: Build Desktop Apps (macos-15-intel)
- GitHub Check: Build Desktop Apps (ubuntu-22.04)
🔇 Additional comments (6)
libs/core/kiln_ai/adapters/fine_tune/base_finetune.py (1)
8-13: Enums and RunConfigProperties imports look correct. Imports align with the new run_config behavior and structured output handling; no issues here.
libs/core/kiln_ai/adapters/fine_tune/test_base_finetune.py (5)
11-18: New datamodel and enum imports are appropriate. Bringing in `Project`, `ModelProviderName`, `StructuredOutputMode`, and `RunConfigProperties` matches the new run_config behavior under test; no issues.
47-66: MockFinetuneWithStructuredOutput correctly simulates adapter behavior. This mock neatly exercises the case where `_start` sets `structured_output_mode` (as Fireworks does), allowing you to verify that run_config picks up the adapter-defined mode.
70-79: Project-backed `sample_task` fixture aligns with finetune model_id/prompt_id requirements. Creating and persisting a `Project` and using it as the Task parent ensures that `parent_project()` is non-null, which is required by `FinetuneModel.model_id()` and the new run_config prompt_id logic. Good fixture hardening.
210-210: Default run_config assertion is useful. Asserting `datamodel.run_config is None` in the minimal create-and-start path clearly documents the default and guards against regressions where a run_config might be auto-created.
292-320: Structured output mode interaction test covers adapter-driven behavior

`test_create_and_start_with_datamodel_structured_output_mode` nicely validates that:

- The adapter can set `datamodel.structured_output_mode` in `_start`.
- The final `run_config.structured_output_mode` mirrors that value (overriding the initial `default`).

This confirms the intended precedence between adapter behavior and the incoming run_config.
…ncy fixes to make it easier to work with.

- UI code was overly aware of FineTune run configs. They are just run configs; treat them as such. The exception is one UI-centric behaviour: when I select a FineTune, select the FT run config in the UI. Made this work with a model_specific_run_config concept in the API.
- This UI code had weird circular reactivity (see example below). It was solved by fragile `await tick()` calls and guard vars, but that means it has implicit deps on call order, and on whether certain awaits were actually async or not. Fixed by making all reactivity one big method, with deterministic order and debouncing.
- Remove guard vars for a deterministic run order.
- Fix a bug where legacy FT models with data_mode=unknown couldn't use the rest of their run config because the data mode never matched.
- Only show a fine-tune run config if the user has selected that fine-tune. No need to show them all all the time; they don't have meaning or value outside the fine-tune.
- Show a warning if using a fine-tune, but not its suggested run_config.

Example of circular reactivity: changing the run config will update the model, but selecting a model will set the run config (jump back to "custom" or set the fine-tune run config). These are legit desired behaviours: respect the user's last selection, and make the rest consistent. But it makes updating the state a bit tricky with many async reactive statements.
…config_ui_updates_v2
… is specified (bug previously that reloaded every time)
Clean fine-tuning UI for fine-tune-run-configs.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
app/web_ui/src/lib/api_schema.d.ts (1)
4973-5007: `RunConfigProperties.prompt_id` field has a documented/type mismatch that requires backend schema correction

The auto-generated TypeScript schema correctly reflects the backend Pydantic model, but the backend model itself has an inconsistency: `prompt_id` is defined as a required `PromptId` field in `libs/core/kiln_ai/datamodel/run_config.py` (lines 35-37), yet its description states "Defaults to building a simple prompt from the task if not provided", implying optional behavior.

This mismatch is evident in usage patterns: `app/desktop/studio_server/data_gen_api.py` (lines 144-146, 174-176) explicitly overrides `prompt_id` with a comment noting "in case we change the default in the UI in the future," suggesting the backend team recognizes `prompt_id` should be optional.

To fix: update the backend Pydantic model to make `prompt_id` optional (`prompt_id: PromptId | None = None`), update its description to clarify the default behavior, and regenerate the OpenAPI schema; the frontend types will then sync automatically via `openapi-typescript`.
🧹 Nitpick comments (5)
app/web_ui/src/lib/ui/run_config_component/run_config_component.svelte (5)
85-103: Structured output mode auto-selection is solid; consider dropping the unused return

Using `available_model_details` to derive `structured_output_mode` when `requires_structured_output` is true is a good centralization of logic. The `return true` on Line 100 is currently unused and could be removed to avoid implying the function's result is meaningful.
105-132: Run-config → current run options mapping guarded by `prior_selected_run_config_id` looks correct

The `prior_selected_run_config_id` check prevents re-applying the same config on every reactive pass, while still ensuring that the latest `selected_run_config_id` wins even under rapid user changes. Mapping all relevant fields (model, prompt, tools, temperature, top_p, structured_output_mode) from the selected run config is complete and matches `run_options_as_run_config_properties`.

One thing to keep in mind: if `current_task` changes without resetting `selected_run_config_id`, you might want to explicitly reset `prior_selected_run_config_id` as well so the new task's selected config is always applied once.
134-198: Debounced reactive pipeline is well-structured; consider catching errors inside `debounce_update_for_state_changes`

The debouncing via `running`/`run_again` and `tick()` is a reasonable way to serialize the circular dependencies between model, run config, and tools while avoiding thrash. The stepwise `update_for_state_changes` (model change → structured output defaults → apply selected run config → reset to custom if diverged) is easy to reason about.

Right now, any exception thrown from `update_for_state_changes` will propagate out of `debounce_update_for_state_changes` (the `try/finally` doesn't catch), which can surface as an unhandled promise since the reactive call doesn't `await` it. Wrapping the body in a `try/catch/finally` that logs or otherwise handles errors would make this more robust against transient failures in `load_task_run_configs` or similar calls.
233-260: Run-config "custom" reset logic is more robust now

Breaking out `model_changed`, `provider_changed`, and `prompt_changed`, adding the `"unknown"` special-case for `structured_output_mode`, and switching to `arrays_equal` for comparing tools arrays makes the "has the user diverged from this saved config?" check much clearer and less fragile. This should avoid spurious resets when legacy configs use `"unknown"` and gives a cleaner tools comparison.
350-356: Exposing tools via copy and `clear_tools` aligns with ToolsSelector learnings

Returning a shallow copy from `get_tools` and providing `clear_tools()` to reset local tools state avoids external consumers mutating internal arrays by reference, while still giving parents control when they bind `tools`. This also keeps the persistent state surface centered on the multi-select `tools` array, with single-select behavior remaining inside `ToolsSelector` itself, which is consistent with the earlier guidance about not synchronizing single-select state to the store. Based on learnings, this is the intended behavior.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
app/desktop/studio_server/eval_api.py (1 hunks)
app/desktop/studio_server/provider_api.py (5 hunks)
app/desktop/studio_server/test_eval_api.py (3 hunks)
app/desktop/studio_server/test_provider_api.py (5 hunks)
app/web_ui/src/lib/api_schema.d.ts (9 hunks)
app/web_ui/src/lib/stores/run_configs_store.ts (3 hunks)
app/web_ui/src/lib/ui/run_config_component/run_config_component.svelte (6 hunks)
app/web_ui/src/lib/ui/run_config_component/saved_run_configs_dropdown.svelte (7 hunks)
app/web_ui/src/routes/(app)/run/+page.svelte (3 hunks)
libs/core/kiln_ai/datamodel/finetune.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- app/web_ui/src/lib/ui/run_config_component/saved_run_configs_dropdown.svelte
- app/web_ui/src/lib/stores/run_configs_store.ts
- app/desktop/studio_server/eval_api.py
- app/desktop/studio_server/test_provider_api.py
- libs/core/kiln_ai/datamodel/finetune.py
🧰 Additional context used
📓 Path-based instructions (10)
app/web_ui/**/*.svelte
📄 CodeRabbit inference engine (.cursor/rules/project.mdc)
app/web_ui/**/*.svelte: Use Svelte v4 (not v5) for the web frontend UI
Use DaisyUI for UI components in the web frontend
Use Svelte v4 (not v5) for the web UI framework
Files:
app/web_ui/src/routes/(app)/run/+page.svelte
app/web_ui/src/lib/ui/run_config_component/run_config_component.svelte
app/web_ui/**/*.{svelte,ts}
📄 CodeRabbit inference engine (.cursor/rules/project.mdc)
Use Tailwind CSS for styling in the web frontend
Files:
app/web_ui/src/routes/(app)/run/+page.svelte
app/web_ui/src/lib/api_schema.d.ts
app/web_ui/src/lib/ui/run_config_component/run_config_component.svelte
app/web_ui/**/*.{svelte,css,ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
Use Tailwind and DaisyUI for styling the frontend
Files:
app/web_ui/src/routes/(app)/run/+page.svelte
app/web_ui/src/lib/api_schema.d.ts
app/web_ui/src/lib/ui/run_config_component/run_config_component.svelte
app/web_ui/**/*.{svelte,ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
Make FastAPI backend calls from the Svelte web app to communicate with backend servers
Files:
app/web_ui/src/routes/(app)/run/+page.svelte
app/web_ui/src/lib/api_schema.d.ts
app/web_ui/src/lib/ui/run_config_component/run_config_component.svelte
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/project.mdc)
**/*.py: Use asyncio for asynchronous operations in Python
Use Pydantic v2 (not v1) for data validation
Files:
app/desktop/studio_server/provider_api.py
app/desktop/studio_server/test_eval_api.py
app/desktop/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Use python 3.13 for the desktop app
Files:
app/desktop/studio_server/provider_api.py
app/desktop/studio_server/test_eval_api.py
app/desktop/studio_server/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Use FastAPI for the desktop app server that extends libs/server with web app-specific APIs
Files:
app/desktop/studio_server/provider_api.py
app/desktop/studio_server/test_eval_api.py
app/web_ui/**/*.ts
📄 CodeRabbit inference engine (.cursor/rules/project.mdc)
Use TypeScript for frontend type safety
Files:
app/web_ui/src/lib/api_schema.d.ts
app/web_ui/**/*.{ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
Use TypeScript for frontend development
Files:
app/web_ui/src/lib/api_schema.d.ts
**/*test*.py
📄 CodeRabbit inference engine (.cursor/rules/project.mdc)
Use pytest for testing Python code
Use pytest for testing in python projects
Files:
app/desktop/studio_server/test_eval_api.py
🧠 Learnings (12)
📓 Common learnings
Learnt from: chiang-daniel
Repo: Kiln-AI/Kiln PR: 781
File: app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte:74-86
Timestamp: 2025-11-26T05:40:24.426Z
Learning: In the fine-tune creation flow (app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte), it is acceptable for a dataset containing tool data to be used when training a model that doesn't support function calling. The training will proceed without using the tool call data from the dataset. This allows for dataset reuse across different model types.
📚 Learning: 2025-11-11T00:07:51.535Z
Learnt from: sfierro
Repo: Kiln-AI/Kiln PR: 756
File: app/web_ui/src/lib/ui/run_config_component/tools_selector.svelte:19-20
Timestamp: 2025-11-11T00:07:51.535Z
Learning: In app/web_ui/src/lib/ui/run_config_component/tools_selector.svelte, the single_select_selected_tool prop should NOT be synchronized with the tools_store. Only the multi-select tools array should persist to the store. The single_select mode is intended to use ephemeral component-local state.
Applied to files:
app/web_ui/src/routes/(app)/run/+page.svelte
app/web_ui/src/lib/ui/run_config_component/run_config_component.svelte
📚 Learning: 2025-11-24T19:13:27.863Z
Learnt from: CR
Repo: Kiln-AI/Kiln PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-24T19:13:27.863Z
Learning: Applies to app/web_ui/**/*.{svelte,ts,tsx} : Make FastAPI backend calls from the Svelte web app to communicate with backend servers
Applied to files:
app/web_ui/src/routes/(app)/run/+page.svelte
📚 Learning: 2025-11-24T19:13:27.863Z
Learnt from: CR
Repo: Kiln-AI/Kiln PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-24T19:13:27.863Z
Learning: Applies to app/web_ui/**/*.svelte : Use Svelte v4 (not v5) for the web UI framework
Applied to files:
app/web_ui/src/routes/(app)/run/+page.svelte
📚 Learning: 2025-08-08T15:50:45.334Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:221-228
Timestamp: 2025-08-08T15:50:45.334Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.KilnModelProvider, naming policy: keep cross-provider feature flags unprefixed (e.g., reasoning_optional_for_structured_output) to allow reuse across providers; prefix provider-specific toggles with the provider name (e.g., siliconflow_enable_thinking) when the implementation is specific and potentially unsafe to generalize.
Applied to files:
app/desktop/studio_server/provider_api.py
📚 Learning: 2025-08-08T16:14:54.346Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:1979-1983
Timestamp: 2025-08-08T16:14:54.346Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py, for ModelName.deepseek_r1_distill_qwen_32b on provider siliconflow_cn (model_id "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"), do not set a parser (r1_thinking/optional_r1_thinking). Verified via testing that SiliconFlow returns final outputs that parse without an R1 parser, and reasoning may be omitted under structured output; rely on reasoning_optional_for_structured_output=True.
Applied to files:
app/desktop/studio_server/provider_api.py
📚 Learning: 2025-08-15T14:07:02.226Z
Learnt from: scosman
Repo: Kiln-AI/Kiln PR: 473
File: app/desktop/studio_server/provider_api.py:1081-1081
Timestamp: 2025-08-15T14:07:02.226Z
Learning: For Docker Model Runner, untested models are intentionally set with supports_structured_output=True, unlike Ollama where untested models have supports_structured_output=False. This is a deliberate design choice by the maintainers.
Applied to files:
app/desktop/studio_server/provider_api.py
📚 Learning: 2025-10-29T06:59:55.635Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 736
File: app/web_ui/src/lib/api_schema.d.ts:8143-8180
Timestamp: 2025-10-29T06:59:55.635Z
Learning: app/web_ui/src/lib/api_schema.d.ts is generated by openapi-typescript; do not propose manual edits. Schema changes should be made in the FastAPI backend (e.g., app/desktop/studio_server/data_gen_api.py or libs/server/kiln_server/*), then re-generate the TS types.
Applied to files:
app/web_ui/src/lib/api_schema.d.ts
📚 Learning: 2025-10-29T11:19:31.059Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 759
File: app/web_ui/src/lib/api_schema.d.ts:2485-2490
Timestamp: 2025-10-29T11:19:31.059Z
Learning: In libs/server/kiln_server/document_api.py and the generated app/web_ui/src/lib/api_schema.d.ts, CreateVectorStoreConfigRequest.properties is now a union of three distinct public schemas: LanceDBConfigFTSPropertiesPublic, LanceDBConfigVectorPropertiesPublic, and LanceDBConfigHybridPropertiesPublic. These are intentionally split to allow independent evolution; do not suggest merging them.
Applied to files:
app/web_ui/src/lib/api_schema.d.ts
📚 Learning: 2025-09-04T06:27:58.852Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 487
File: app/web_ui/src/lib/api_schema.d.ts:3315-3320
Timestamp: 2025-09-04T06:27:58.852Z
Learning: The file app/web_ui/src/lib/api_schema.d.ts is auto-generated by openapi-typescript from the backend API schema and should never be manually edited. Changes to API types should be made in the backend, which will then be reflected in the regenerated TypeScript definitions.
Applied to files:
app/web_ui/src/lib/api_schema.d.ts
📚 Learning: 2025-11-26T05:40:24.426Z
Learnt from: chiang-daniel
Repo: Kiln-AI/Kiln PR: 781
File: app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte:74-86
Timestamp: 2025-11-26T05:40:24.426Z
Learning: In the fine-tune creation flow (app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte), it is acceptable for a dataset containing tool data to be used when training a model that doesn't support function calling. The training will proceed without using the tool call data from the dataset. This allows for dataset reuse across different model types.
Applied to files:
app/web_ui/src/lib/api_schema.d.ts
📚 Learning: 2025-10-07T00:21:33.537Z
Learnt from: sfierro
Repo: Kiln-AI/Kiln PR: 698
File: app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/+page.svelte:800-812
Timestamp: 2025-10-07T00:21:33.537Z
Learning: In the Kiln web UI codebase, the `save_new_task_run_config` function in `app/web_ui/src/lib/stores/run_configs_store.ts` automatically refreshes the run configs store by calling `load_task_run_configs(project_id, task_id, true)` with force_refresh=true after successfully creating a new run config. This means callers of save_new_task_run_config do not need to manually refresh the store afterward, as the store will already contain the newly created config.
Applied to files:
app/web_ui/src/lib/ui/run_config_component/run_config_component.svelte
🧬 Code graph analysis (2)
app/desktop/studio_server/provider_api.py (2)
libs/core/kiln_ai/datamodel/finetune.py (2)
Finetune (23-127), model_id (99-109)
libs/core/kiln_ai/datamodel/datamodel_enums.py (1)
StructuredOutputMode (27-49)
app/desktop/studio_server/test_eval_api.py (3)
libs/core/kiln_ai/datamodel/finetune.py (1)
Finetune (23-127)
libs/core/kiln_ai/datamodel/datamodel_enums.py (1)
FineTuneStatusType (52-61)
libs/core/kiln_ai/datamodel/test_eval_model.py (1)
mock_task(19-20)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: Cursor Bugbot
- GitHub Check: Generate Coverage Report
- GitHub Check: Build, Typecheck, and Test Python (3.10)
- GitHub Check: Build, Typecheck, and Test Python (3.11)
- GitHub Check: Build, Typecheck, and Test Python (3.13)
- GitHub Check: Build, Typecheck, and Test Python (3.12)
- GitHub Check: Build Desktop Apps (ubuntu-22.04-arm)
- GitHub Check: Build Desktop Apps (macos-15-intel)
- GitHub Check: Build Desktop Apps (macos-latest)
- GitHub Check: Build Desktop Apps (ubuntu-22.04)
- GitHub Check: Build Desktop Apps (windows-latest)
🔇 Additional comments (11)
app/desktop/studio_server/provider_api.py (4)
159-160: LGTM! The new model_specific_run_config field is well-designed, with a clear comment explaining its purpose for fine-tuned models with baked-in run configurations.
1444-1458: LGTM! The helper function provides proper backward compatibility, with a clear migration path from the legacy structured_output_mode to the new run_config.structured_output_mode and a safe fallback to json_instructions.
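The fallback order described above can be sketched as follows. The class and helper names here are illustrative stand-ins, not the actual Kiln datamodel types:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunConfig:
    structured_output_mode: Optional[str] = None


@dataclass
class FinetuneRecord:
    run_config: Optional[RunConfig] = None
    structured_output_mode: Optional[str] = None  # legacy top-level field


def effective_structured_output_mode(ft: FinetuneRecord) -> str:
    # Prefer the new run_config field, fall back to the legacy field,
    # then default to "json_instructions".
    if ft.run_config and ft.run_config.structured_output_mode:
        return ft.run_config.structured_output_mode
    return ft.structured_output_mode or "json_instructions"


print(effective_structured_output_mode(FinetuneRecord(run_config=RunConfig("json_schema"))))  # json_schema
print(effective_structured_output_mode(FinetuneRecord(structured_output_mode="json_mode")))  # json_mode
print(effective_structured_output_mode(FinetuneRecord()))  # json_instructions
```

New finetunes always hit the first branch; old on-disk records without a run_config fall through to the legacy field or the default.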
1489-1492: LGTM! The refactored code properly uses the helper function and passes model_specific_run_config to ModelDetails, making the logic clearer and more maintainable.
1476-1476: Based on my verification, the review comment raises valid and well-founded concerns about error handling. Here's what I found:

Key findings:
- model_id() definitely can raise ValueError, confirmed in the implementation (raises when the parent task or project is None).
- Parent relationships CAN be broken in practice. The KilnParentedModel class attempts to load parents from disk if not in memory; if the file structure is corrupted, load_parent() can return None, causing model_id() to raise.
- Inconsistent error handling pattern: the custom_models() function in the same file (lines 1400-1448) has a try-except block that catches errors and logs them, while all_fine_tuned_models() has no error handling.
- No existing tests for corrupted data: the existing tests for all_fine_tuned_models() don't cover the error scenario.

The review comment's concern is legitimate: without error handling, corrupted finetune files would crash the entire /api/available_models endpoint.
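A guarded listing in the spirit of custom_models() might look like the sketch below. The class and helper names are hypothetical stand-ins for the real Kiln objects, but the try/except shape mirrors the recommended pattern:

```python
import logging

logger = logging.getLogger(__name__)


class GoodFinetune:
    id = "ft_ok"

    def model_id(self):
        return "kiln_fine_tune/ft_ok"


class BrokenFinetune:
    id = "ft_broken"

    def model_id(self):
        # Mirrors the real behavior when parent task/project links are missing.
        raise ValueError("parent task or project missing")


def all_fine_tuned_model_ids(finetunes):
    # Skip corrupted finetunes with a logged warning instead of letting one
    # broken record fail the whole model listing.
    model_ids = []
    for ft in finetunes:
        try:
            model_ids.append(ft.model_id())
        except ValueError as e:
            logger.warning("Skipping fine-tune %s: %s", ft.id, e)
    return model_ids


print(all_fine_tuned_model_ids([GoodFinetune(), BrokenFinetune()]))  # ['kiln_fine_tune/ft_ok']
```

One corrupted file then degrades to a warning in the logs rather than a 500 from the endpoint.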
Verify error handling consistency with similar code patterns.

The use of fine_tune.model_id() is correct and cleaner than manual ID construction. However, this method raises ValueError if the parent task or project is missing, which can occur if finetune files have corrupted parent relationships.

Currently, all_fine_tuned_models() lacks error handling, while the similar custom_models() function in the same file includes a try-except block that logs errors and continues. This inconsistency means a corrupted finetune would crash the /api/available_models endpoint.

Recommendation: add a try-except block similar to custom_models() to gracefully skip corrupted finetunes with a logged warning, rather than crashing the entire endpoint.

app/web_ui/src/routes/(app)/run/+page.svelte (2)
25-26: Model-specific run config ID state is well-scoped and typed correctly.

Using a nullable string for selected_model_specific_run_config_id with a clear comment keeps the model-specific suggested run config behavior explicit and localized to this page state. No issues found here.

203-214: Binding and propagating selected_model_specific_run_config_id through child components looks correct.

Threading selected_model_specific_run_config_id via a one-way prop into SavedRunConfigurationsDropdown and two-way binding it to RunConfigComponent cleanly centralizes the model-specific suggested run config logic in RunConfigComponent while keeping the dropdown in sync. This matches the existing selected_run_config_id pattern and fits the new model-specific behavior without introducing coupling problems.

app/web_ui/src/lib/api_schema.d.ts (3)
2625-2652: Fine-tune/run-config threading and capability flags are consistent with the PR goals.

The additions of:
- CreateFinetuneRequest.run_config_properties?: RunConfigProperties | null
- Finetune.run_config?: RunConfigProperties | null (with structured_output_mode explicitly marked legacy)
- FinetuneDatasetInfo.eligible_datasets / eligible_finetune_tags
- FinetuneProviderModel.supports_function_calling: boolean
- ModelDetails.model_specific_run_config?: string | null

together give a coherent surface for: (a) persisting the full run configuration used when training, (b) surfacing which models support function calling for fine-tunes, and (c) exposing eligibility-filtered datasets/tags in the UI. This aligns with the PR's objective of fine-tuning with traces & tools, while keeping backward compatibility via the legacy structured_output_mode field.

Given the retrieved learning that datasets with tool data may still be used with non-function-calling models (tool data is simply ignored at training time), just ensure the backend logic that populates eligible_datasets / eligible_finetune_tags and uses supports_function_calling doesn't over-restrict selection beyond what the UI flow expects. Any behavior tweaks should be implemented in the backend schema/handlers and then regenerated here. Based on learnings, this file is auto-generated.

Also applies to: 3800-3905, 3910-3921, 3956-3968, 4322-4366
9071-9075: Tool-aware finetune dataset info surface looks correct; verify backend filtering semantics.

Extending finetune_dataset_info to accept tool_ids?: string[] | null and adding eligible_datasets / eligible_finetune_tags to FinetuneDatasetInfo is a good fit for tool-filtered fine-tune flows. This should let the UI ask "given these tools, which datasets/tags are suitable?" while still being able to show the full historical context via existing_datasets / finetune_tags.

Given the documented behavior that datasets with tool data are still allowed for non-function-calling models (tool data is just unused), please confirm the backend logic for tool_ids and the "eligible" fields preserves that flexibility (e.g., doesn't block such datasets outright when no tools or a non-tool-capable model is selected). Any adjustments belong in the backend/OpenAPI schema, then re-generate this file. Based on learnings, this file is auto-generated.

Also applies to: 3910-3921
1595-1611: Both concerns in the review are valid: the trailing slash is inconsistent with established patterns, and functional overlap exists between endpoints.

The endpoint /api/projects/{project_id}/tasks/{task_id}/run_configs/ has a trailing slash, which breaks the pattern established by all other similar endpoints in the same file (e.g., task_run_configs, evals, eval_configs, progress, all without trailing slashes).

Additionally, the endpoints do have functional overlap:
- /task_run_configs returns only direct task run configs
- /run_configs/ returns task run configs plus finetune-derived run configs (extended functionality)

While api_schema.d.ts correctly reflects the backend definition, the inconsistency should be fixed in the FastAPI route definition (either add the trailing slash consistently or remove it). If run_configs/ is intentionally the newer/extended version, consider documenting that task_run_configs is deprecated to avoid future confusion.

Any changes must be made to app/desktop/studio_server/eval_api.py and the schema regenerated from the OpenAPI spec.

app/web_ui/src/lib/ui/run_config_component/run_config_component.svelte (2)
31-55: New tools selector settings and exported props are coherent with run-config usage.

Importing arrays_equal and ToolsSelectorSettings, exporting tools_selector_settings, hide_model_selector, selected_model_specific_run_config_id, model, and tools, and deriving requires_tool_support from tools.length gives a clear, externally-controllable surface for model/tool filtering and run-config serialization. This looks consistent with the rest of the component's responsibilities.

360-368: Template changes for model/tools selectors are consistent and flexible, with proper model handling confirmed.

Conditionally rendering AvailableModelsDropdown behind !hide_model_selector and wiring settings={tools_selector_settings} into both ToolsSelector instances (inline at line 382 and in Advanced Options at line 402) gives callers fine-grained control over visibility and behavior without duplicating logic. The Tailwind-based layout (flex flex-col gap-4) remains unchanged and compatible with the existing UI stack.

Verified: the only call site that hides the model selector is the fine-tune page (create_finetune/+page.svelte), which safely manages model selection separately via the $model_provider store and overrides the model in run config properties. Since the fine-tune page doesn't set requires_structured_output, the model-dependent behaviors in the component don't create issues. The default model from $ui_state.selected_model works as expected for other call sites that don't hide the selector.
@pytest.mark.asyncio
async def test_get_run_configs_excludes_unknown_and_failed_finetunes(
    client, mock_task_from_id, mock_task
):
    """Test that finetunes with unknown or failed status are excluded from run configs."""
    mock_task_from_id.return_value = mock_task

    run_config_props = RunConfigProperties(
        model_name="gpt-4",
        model_provider_name=ModelProviderName.openai,
        prompt_id="simple_chain_of_thought_prompt_builder",
        structured_output_mode=StructuredOutputMode.json_schema,
    )

    finetunes = [
        Finetune(
            id="ft_completed",
            name="Completed Finetune",
            provider="openai",
            base_model_id="model1",
            dataset_split_id="split1",
            system_message="System message",
            latest_status=FineTuneStatusType.completed,
            run_config=run_config_props,
            parent=mock_task,
        ),
        Finetune(
            id="ft_running",
            name="Running Finetune",
            provider="openai",
            base_model_id="model2",
            dataset_split_id="split2",
            system_message="System message",
            latest_status=FineTuneStatusType.running,
            run_config=run_config_props,
            parent=mock_task,
        ),
        Finetune(
            id="ft_unknown",
            name="Unknown Finetune",
            provider="openai",
            base_model_id="model3",
            dataset_split_id="split3",
            system_message="System message",
            latest_status=FineTuneStatusType.unknown,
            run_config=run_config_props,
            parent=mock_task,
        ),
        Finetune(
            id="ft_failed",
            name="Failed Finetune",
            provider="openai",
            base_model_id="model4",
            dataset_split_id="split4",
            system_message="System message",
            latest_status=FineTuneStatusType.failed,
            run_config=run_config_props,
            parent=mock_task,
        ),
        Finetune(
            id="ft_no_run_config",
            name="No Run Config Finetune",
            provider="openai",
            base_model_id="model5",
            dataset_split_id="split5",
            system_message="System message",
            latest_status=FineTuneStatusType.completed,
            run_config=None,
            parent=mock_task,
        ),
    ]

    for finetune in finetunes:
        finetune.save_to_file()

    response = client.get("/api/projects/project1/tasks/task1/run_configs/")

    assert response.status_code == 200
    configs = response.json()

    config_ids = [config["id"] for config in configs]

    assert "finetune_run_config::project1::task1::ft_completed" in config_ids
    assert "finetune_run_config::project1::task1::ft_running" in config_ids
    assert "finetune_run_config::project1::task1::ft_failed" in config_ids
    assert "finetune_run_config::project1::task1::ft_unknown" in config_ids
    assert "finetune_run_config::project1::task1::ft_no_run_config" not in config_ids
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Critical: Test name and docstring contradict the actual test behavior.
The test name test_get_run_configs_excludes_unknown_and_failed_finetunes and docstring claim that finetunes with unknown or failed status are excluded, but the assertions prove the opposite:
- Lines 1973-1974 assert that ft_unknown and ft_failed ARE included in config_ids
- Line 1975 shows the only excluded finetune is ft_no_run_config (which has run_config=None)
The actual behavior being tested is: "Include all finetunes that have a run_config, regardless of status."
Apply this diff to fix the test name and docstring:
@pytest.mark.asyncio
-async def test_get_run_configs_excludes_unknown_and_failed_finetunes(
+async def test_get_run_configs_includes_finetunes_with_run_config(
client, mock_task_from_id, mock_task
):
- """Test that finetunes with unknown or failed status are excluded from run configs."""
+ """Test that finetunes are included in run configs only if they have a run_config set."""
    mock_task_from_id.return_value = mock_task

🤖 Prompt for AI Agents
In app/desktop/studio_server/test_eval_api.py around lines 1889 to 1975, the
test name and docstring incorrectly state that finetunes with unknown or failed
status are excluded; update the test name and docstring to reflect that the test
verifies all finetunes that have a run_config are included regardless of status
(i.e., include completed, running, unknown, failed) and only finetunes with
run_config=None are excluded; rename the test (e.g.,
test_get_run_configs_includes_all_with_run_config) and change the docstring to
"Test that finetunes with a run_config are included in run configs regardless of
status, and finetunes without a run_config are excluded."
let prior_model: string | null = null
async function process_model_change() {
  // only run once immediately after a model change, not every reactive update
  if (prior_model === model) {
    return
  }
  prior_model = model
  // Special case on model change: if the model says it has a model-specific run config, select that run config.
  // Currently used by fine-tuned models which need to be called like they are trained.
  const model_details = available_model_details(
    model_name,
    provider,
    $available_models,
  )
  if (model_details?.model_specific_run_config) {
    selected_run_config_id = model_details.model_specific_run_config
    selected_model_specific_run_config_id =
      model_details.model_specific_run_config
  } else {
    selected_model_specific_run_config_id = null
  }
}
Potential override of user-selected run config when model has model_specific_run_config
process_model_change unconditionally sets selected_run_config_id to model_details.model_specific_run_config whenever the model changes and that property is present, and also updates selected_model_specific_run_config_id. Because model changes can be induced indirectly by selecting a saved run config (via update_current_run_options_for_selected_run_config), this can lead to the following behavior:
- User selects a saved run config whose model has a model_specific_run_config.
- That run config sets model to the chosen model.
- On the next debounced pass, process_model_change sees the "new" model and overwrites selected_run_config_id with the model-specific run config, effectively discarding the user's chosen saved config.
This seems at odds with the comment in Lines 139–143 that “Select a saved run config” should apply that config’s values. If the intent is to only auto-select model_specific_run_config when the user changes the model from the dropdown (not when the model is changed as a side effect of picking a run config), you may want to guard this block with additional context (e.g., only run when the last change came from the model dropdown, or skip when selected_run_config_id was just set explicitly by the user).
🤖 Prompt for AI Agents
In app/web_ui/src/lib/ui/run_config_component/run_config_component.svelte around
lines 200 to 222, process_model_change unconditionally overrides
selected_run_config_id when model_details.model_specific_run_config exists;
change it so the model-specific run config is only auto-selected when the user
explicitly changed the model (e.g., from the model dropdown) and not when the
model value was set as a side-effect of selecting a saved run config. Implement
a small guard such as tracking the last change source (e.g., lastChangeSource =
'model' vs 'runConfig') or check that selected_run_config_id was not just set by
update_current_run_options_for_selected_run_config before assigning
selected_run_config_id; if the change did originate from an explicit model
dropdown action, set selected_run_config_id and
selected_model_specific_run_config_id as now, otherwise only update
selected_model_specific_run_config_id (or leave both unchanged).
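The suggested guard can be sketched as a small state machine. This is a hypothetical Python model of the Svelte component's state (names are illustrative), showing how tracking the last change source prevents the override:

```python
class RunConfigState:
    def __init__(self):
        self.model = None
        self.selected_run_config_id = None
        # "model_dropdown" or "saved_run_config"
        self.last_change_source = None

    def pick_saved_run_config(self, config_id, config_model):
        # Applying a saved config changes the model as a side effect.
        self.last_change_source = "saved_run_config"
        self.selected_run_config_id = config_id
        self.model = config_model

    def pick_model(self, model):
        # Explicit model change from the model dropdown.
        self.last_change_source = "model_dropdown"
        self.model = model

    def process_model_change(self, model_specific_run_config):
        # Only override the selected run config on explicit model changes.
        if model_specific_run_config and self.last_change_source == "model_dropdown":
            self.selected_run_config_id = model_specific_run_config


# A saved-config selection now survives the debounced model-change pass:
state = RunConfigState()
state.pick_saved_run_config("user_config", "ft_model")
state.process_model_change("ft_model_specific_config")
print(state.selected_run_config_id)  # user_config
```

An explicit dropdown change would still auto-select the model-specific config, since last_change_source would be "model_dropdown".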
This PR is being reviewed by Cursor Bugbot
high_quality_count: Dict[str, int] = {}
reasoning_and_high_quality_count: Dict[str, int] = {}

required_tools_set = set(tool_filter) if tool_filter else None
Bug: Empty tool filter treated as no filter
The compute_finetune_tag_info function's docstring states "Only runs with exactly these tools will be included", but passing an empty list [] as tool_filter is treated the same as None (no filtering). This is because set(tool_filter) if tool_filter else None evaluates bool([]) as False. If someone wants to filter to runs with exactly zero tools, this won't work as documented. The condition should check if tool_filter is not None instead of just if tool_filter to properly distinguish between "no filter requested" and "filter to runs with no tools". The same pattern exists for eligible_datasets filtering at line 380. The frontend guards against this by only sending non-empty arrays, but the backend function's behavior is inconsistent with its documentation.
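The distinction can be reproduced in isolation. The function and data below are hypothetical stand-ins for compute_finetune_tag_info, showing the `is not None` check that separates "no filter" (None) from "filter to runs with exactly zero tools" ([]):

```python
def eligible_run_ids(runs, tool_filter=None):
    # Buggy version would use `if tool_filter:`, which treats [] like None.
    if tool_filter is None:
        return [r["id"] for r in runs]
    required = set(tool_filter)
    # "exactly these tools": the run's tool set must match the filter exactly.
    return [r["id"] for r in runs if set(r["tools"]) == required]


runs = [
    {"id": "no_tools", "tools": []},
    {"id": "search_only", "tools": ["search"]},
]
print(eligible_run_ids(runs))              # ['no_tools', 'search_only']
print(eligible_run_ids(runs, []))          # ['no_tools']
print(eligible_run_ids(runs, ["search"]))  # ['search_only']
```

With the truthiness check, the second call would incorrectly return both runs, since an empty list is falsy in Python.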
What does this PR do?
Big PR to support fine-tuning with Trace. This PR is based on the work from #758 where a new Trace based Dataset Formatter is introduced.
Checklists
Summary by CodeRabbit
New Features
Improvements
Tests
Note
Adds tool-aware fine-tune creation and model-specific run configs, extends dataset exports to include traces/tools (incl. Vertex), and updates UI to select compatible models/datasets with persisted state.
- New endpoint GET /api/projects/{project_id}/tasks/{task_id}/run_configs/ (includes fine-tune run configs).
- Fine-tune creation accepts run_config_properties, exposes supports_function_calling, filters finetune_dataset_info by tool_ids, and returns eligible_datasets / eligible_finetune_tags.
- Available models include model_specific_run_config; fine-tuned models expose consistent model_id, structured output mode, and optional model-specific run config.
- Finetunes persist run_config post-start (kiln_fine_tune model, generated prompt id, structured output mode).
- DatasetSplit computes per-split tool consistency via tool_info.

Written by Cursor Bugbot for commit 4adf7f4. This will update automatically on new commits. Configure here.