Add exploration queue for high selectivity multi-hop filter search #897
Draft
ChenSunriseJiaBao wants to merge 2 commits into microsoft:main from
Add complete 4-layer defense against high selectivity scenarios:

Layer 1: Exploration Queue (already implemented in multihop_search.rs)
- Uses RejectAndNeedExpand to enable continued graph traversal

Layer 2: Match Rate Detection
- Automatically detects high selectivity after 30 samples
- If match rate < 2%, enables exploration mode (RejectAndNeedExpand)
- Caches result using compare-and-swap for thread safety

Layer 3: Checkpoint-Based Timeout
- Checks timeout every 1000 visits instead of every visit
- Reduces syscall overhead from Instant::now() by ~99%

Layer 4: Two-Tier Early Stop
- Soft timeout (default 10ms): triggers when elapsed > soft AND matched >= min_matched_count
- Hard timeout (default 100ms): unconditional termination to bound worst-case latency

Added comprehensive tests for all 4 layers:
- Match rate detection (low/high match rate scenarios)
- Checkpoint interval behavior
- Hard early stop
- Soft early stop with/without enough matches
- Timeout clamping (soft <= hard)
- Default constants verification

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
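To make Layers 2 through 4 concrete, here is a minimal Rust sketch of how the match rate check, the checkpointed clock read, and the two-tier stop condition could fit together. It is not the code in this PR: the type names (`SelectivityDetector`, `EarlyStop`), method names, and constant names are hypothetical, and only the numeric values (30 samples, 2%, 1000 visits) and the soft/hard rule are taken from the commit message.

```rust
use std::sync::atomic::{AtomicU8, Ordering};
use std::time::Duration;

// Hypothetical names; the values come from the commit message.
const MIN_SAMPLES: u32 = 30;
const MATCH_RATE_THRESHOLD: f64 = 0.02;
const CHECKPOINT_INTERVAL: u64 = 1000;

const UNDECIDED: u8 = 0;
const NORMAL: u8 = 1;
const EXPLORE: u8 = 2;

/// Layer 2 (sketch): decide once whether the filter is highly selective.
pub struct SelectivityDetector {
    mode: AtomicU8,
}

impl SelectivityDetector {
    pub fn new() -> Self {
        Self { mode: AtomicU8::new(UNDECIDED) }
    }

    /// Returns true when exploration mode is (or becomes) enabled.
    pub fn observe(&self, visited: u32, matched: u32) -> bool {
        let current = self.mode.load(Ordering::Acquire);
        if current != UNDECIDED {
            return current == EXPLORE; // cached decision
        }
        if visited < MIN_SAMPLES {
            return false; // not enough samples yet
        }
        let rate = matched as f64 / visited as f64;
        let decided = if rate < MATCH_RATE_THRESHOLD { EXPLORE } else { NORMAL };
        // Compare-and-swap: the first thread to decide wins, others adopt its value.
        match self.mode.compare_exchange(UNDECIDED, decided, Ordering::AcqRel, Ordering::Acquire) {
            Ok(_) => decided == EXPLORE,
            Err(existing) => existing == EXPLORE,
        }
    }
}

/// Layers 3 and 4 (sketch): checkpointed clock reads and soft/hard timeouts.
pub struct EarlyStop {
    soft: Duration,
    hard: Duration,
    min_matched: usize,
}

impl EarlyStop {
    pub fn new(soft: Duration, hard: Duration, min_matched: usize) -> Self {
        // Clamp so that soft <= hard.
        Self { soft: soft.min(hard), hard, min_matched }
    }

    /// The caller reads the clock only when this returns true.
    pub fn is_checkpoint(visits: u64) -> bool {
        visits > 0 && visits % CHECKPOINT_INTERVAL == 0
    }

    /// Hard timeout stops unconditionally; soft timeout also needs enough matches.
    pub fn should_stop(&self, elapsed: Duration, matched: usize) -> bool {
        if elapsed > self.hard {
            return true;
        }
        elapsed > self.soft && matched >= self.min_matched
    }
}
```

In a search loop under these assumptions, the caller would call `Instant::now()` only when `EarlyStop::is_checkpoint(visits)` is true and pass the elapsed time to `should_stop`, so the clock is read once per 1000 visits rather than on every visit.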
Summary
- RejectAndNeedExpand variant to QueryVisitDecision enum to signal when exploration mode should be enabled
- … scratch.best is exhausted
- HighSelectivityBitmapFilter that uses RejectAndNeedExpand for high selectivity scenarios
- topk-high-selectivity-multihop-filter benchmark search type

Problem
When filter selectivity is very high (e.g., 99% of vectors filtered out), the standard multi-hop search can fail because:
- scratch.best (priority queue of matching nodes) gets exhausted quickly

Solution
The exploration queue enables continued graph traversal through non-matching nodes when the primary queue is empty. This is critical for low match rate scenarios where we need to traverse through many non-matching nodes to find the few that match.
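The fallback described above can be sketched on a toy graph. This is an illustrative sketch only, not the multihop_search.rs implementation: the `Decision` enum stands in for QueryVisitDecision (only the RejectAndNeedExpand variant is named in this PR; `Accept` and `Reject` are assumed), and the `search` function, its integer distances, and its adjacency-list input are all hypothetical.

```rust
use std::cmp::Reverse;
use std::collections::{BinaryHeap, HashSet};

/// Hypothetical stand-in for QueryVisitDecision.
#[derive(Clone, Copy, Debug)]
pub enum Decision {
    Accept,
    Reject,
    RejectAndNeedExpand,
}

/// Toy traversal: `adj` is an adjacency list, `dist[i]` is node i's
/// distance to the query. Returns up to `k` matching nodes in visit order.
pub fn search(
    adj: &[Vec<usize>],
    dist: &[u32],
    start: usize,
    k: usize,
    filter: impl Fn(usize) -> Decision,
) -> Vec<usize> {
    let mut best = BinaryHeap::new(); // matching nodes, closest first
    let mut explore = BinaryHeap::new(); // non-matching nodes kept for traversal
    let mut seen = HashSet::from([start]);
    let mut pending = vec![start];
    let mut results = Vec::new();

    loop {
        // Route newly discovered nodes to one of the two queues.
        for n in pending.drain(..) {
            match filter(n) {
                Decision::Accept => best.push(Reverse((dist[n], n))),
                Decision::RejectAndNeedExpand => explore.push(Reverse((dist[n], n))),
                Decision::Reject => {} // dropped: never expanded
            }
        }
        // Prefer matching nodes; use the exploration queue only when
        // the primary queue is empty.
        let (node, matched) = match best.pop() {
            Some(Reverse((_, n))) => (n, true),
            None => match explore.pop() {
                Some(Reverse((_, n))) => (n, false),
                None => break, // both queues exhausted
            },
        };
        if matched {
            results.push(node);
            if results.len() == k {
                break;
            }
        }
        // Expand the chosen node's unseen neighbors.
        for &nb in &adj[node] {
            if seen.insert(nb) {
                pending.push(nb);
            }
        }
    }
    results
}
```

On a chain 0-1-2-3 where only node 3 passes the filter, a filter that returns `Reject` for non-matching nodes ends the search immediately with no results, while one that returns `RejectAndNeedExpand` walks through nodes 0, 1, and 2 and reaches node 3.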
Test plan
- cargo test --package diskann
- cargo run --package diskann-benchmark -- run --input-file diskann-benchmark/example/async-multihop-high-selectivity-small.json --output-file output.json

🤖 Generated with Claude Code