-
Notifications
You must be signed in to change notification settings - Fork 170
Merge jerm/2026-01-13-optimizer-in-vmcp into main #3373
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
22e020c to
3f3a011
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Large PR Detected
This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.
How to unblock this PR:
Add a section to your PR description with the following format:
## Large PR Justification
[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformationAlternative:
Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.
See our Contributing Guidelines for more details.
This review will be automatically dismissed once you add the justification section.
|
Here's the demo scripts I mentioned #3375 |
| @@ -0,0 +1,134 @@ | |||
| # Integrating Optimizer with vMCP | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the optimizer is only being used by the Operator, I would put it into cmd/thv-operator/pkg instead. Because this makes it easier for us to rip out the Operator into its own repo in future. It also allows us to keep local ToolHive CLI a separate as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was initially testing in docker mode (it does work with docker) but k8s is the main target. I'm happy either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to chris's suggestion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok - I'm outvoted. I'll do it.
Cheers
Large PR justification has been provided. Thank you!
|
✅ Large PR justification has been provided. The size review has been dismissed and this PR can now proceed with normal review. |
#3253) * feat: Add optimizer package with semantic tool discovery and ingestion This PR introduces the optimizer package, a Go port of the mcp-optimizer Python service that provides semantic tool discovery and ingestion for MCP servers. - **Semantic tool search** using vector embeddings (384-dim) - **Token counting** for LLM cost estimation - **Full-text search** via SQLite FTS5 - **Multiple embedding backends**: Ollama, vLLM, or placeholder (testing) - **Production-ready database** with sqlite-vec for vector similarity search
* feat: Add optimizer integration endpoints and tool discovery - Add find_tool and call_tool endpoints to vmcp optimizer - Add semantic search and string matching for tool discovery - Update optimizer integration documentation - Add test scripts for optimizer functionality
* feat: Add token metrics and observability to optimizer integration
…failures The checkPodsReady function was checking all pods with matching labels, including old pods that had completed (Phase: Succeeded) from previous deployments. This caused the auth discovery e2e test to fail when old pods were still present during deployment updates. Fix: Skip pods that are not in Running phase and ensure at least one running pod exists after filtering.
The test was failing with 'connection reset by peer' errors when trying to connect to the health endpoint. This can happen if pods crash or restart between the BeforeAll setup and the actual test execution. Fix: Add explicit pod readiness verification right before the health check and also check pod readiness inside the Eventually loop to catch pods that crash during health check retries. This makes the test more robust by ensuring pods are stable before attempting HTTP connections.
| @@ -0,0 +1,140 @@ | |||
| // SPDX-FileCopyrightText: Copyright 2025 Stacklok, Inc. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes in this and parent dir are for inspecting the local db. They could be ditched, but they are useful for debugging.
| crd-helm-wrapper | ||
| cmd/vmcp/__debug_bin* | ||
|
|
||
| # Demo files |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to stop me checking these in by accident. This can go.
Signed-off-by: nigel brown <nigel@stacklok.com>
jerm-dro
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect something is off with the diff here. I'm seeing some changes that I don't expect in this PR.
| BUILD_DATE: '{{dateInZone "2006-01-02T15:04:05Z" (now) "UTC"}}' | ||
| cmds: | ||
| - go install -ldflags "-s -w -X github.com/stacklok/toolhive/pkg/versions.Version={{.VERSION}} -X github.com/stacklok/toolhive/pkg/versions.Commit={{.COMMIT}} -X github.com/stacklok/toolhive/pkg/versions.BuildDate={{.BUILD_DATE}}" -v ./cmd/vmcp | ||
| - go install -tags="fts5" -ldflags "-s -w -X github.com/stacklok/toolhive/pkg/versions.Version={{.VERSION}} -X github.com/stacklok/toolhive/pkg/versions.Commit={{.COMMIT}} -X github.com/stacklok/toolhive/pkg/versions.BuildDate={{.BUILD_DATE}}" -v ./cmd/vmcp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will the build fail if you don't include this? Or will problems only be encountered at runtime?
I ask because you had to add it to a lot of places. If you missed one, I wouldn't want that to result in runtime failures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is an optional sqlite module. If it isn't there there will be a runtime "no such module: fts5" and it will crash. It is needed for BM25 searches.
I can't see a way round including this (can you?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll make sure a check fails for this.
Reinstated the deleted tests in pkg/vmcp/schema/reflect_test.go that were removed in commit 38cf21e. Updated the tests to work with the current optimizer implementation by: - Creating FindToolInput and CallToolInput test types that match the current optimizer tool schemas (optim_find_tool and optim_call_tool) - Updating tests to reflect current schema (tool_keywords as string instead of array, added limit and backend_id fields) - All tests now pass and validate schema generation and translation functions work correctly with optimizer tool inputs
Remove the EmbeddingService field from OptimizerConfig and all related conversion logic. Users should now provide the full service URL including port for in-cluster services (e.g., http://service-name.namespace.svc.cluster.local:port). This simplifies the codebase by removing Kubernetes-specific service resolution logic and making the configuration more explicit and platform-agnostic.
- Restore pkg/vmcp/server/adapter/optimizer_adapter.go with original structure - Use optim_ prefix for tool names (optim_find_tool, optim_call_tool) - Remove router check for optim_ prefix (optimizer tools don't go through router) - Eliminate schema duplication by defining schemas once in optimizer_adapter.go - Update server to use adapter.CreateOptimizerTools() directly - Remove obsolete EmbeddingService references from commands.go - Fix .gitignore pattern to avoid ignoring vmcp source files
Per PR review feedback, move the optimizer package from pkg/optimizer/ to cmd/thv-operator/pkg/optimizer/ to make it easier to extract the operator into its own repository in the future while keeping the ToolHive CLI separate. Updated all import statements and references across the codebase.
Signed-off-by: nigel brown <nigel@stacklok.com>
- Fix import ordering (gci) in optimizer db files - Fix unused receiver in capability_adapter.go - Add optimizer_adapter_test.go with updated tests for current API - Bump Helm chart version to 0.0.100
Merge jerm/2026-01-13-optimizer-in-vmcp into main
This PR merges 17 commits that integrate the MCP optimizer into vMCP, adding semantic tool discovery, observability, Kubernetes support, and various bug fixes and improvements.
Core Optimizer Integration
Add Optimizer Package (#3253)
Add Optimizer Integration Endpoints (#3318)
find_toolandcall_toolendpoints to vMCP optimizerResolve Tool Names in optim.find_tool (#3337)
Observability & Metrics
Add Token Metrics and Observability (#3347)
Add OpenTelemetry Tracing to Capability Aggregation
AggregateCapabilities(parent span)QueryAllCapabilities(parallel backend queries)QueryCapabilities(per-backend queries)ResolveConflicts(conflict resolution)MergeCapabilities(final merge)Kubernetes Integration
Add Dynamic/Static Mode Support (#3235)
Add DeepCopy and Kubernetes Service Resolution
DeepCopy()for automatic passthrough of config fields (Optimizer, Metadata, etc.)resolveEmbeddingService()to resolve Kubernetes Service names to URLsKubernetes Optimizer Integration Fixes (#3359)
WithContinuousListening()to use timeout-based approachTesting & Reliability Improvements
Run API E2E Test Server as Standalone Process (#3356)
Fix Flaky E2E Tests
checkPodsReadyto prevent flaky test failuresFix Unrecognized Dotty Names
Infrastructure
Bump Operator CRDs Chart Version
Documentation Updates
Summary
This PR consolidates the complete integration of the MCP optimizer into vMCP, enabling semantic tool discovery, reducing token usage, and providing comprehensive observability. The integration includes full Kubernetes support, robust error handling, and improved test reliability.
Related PRs
Large PR Justification