Removing debug_provider.rs in favor of graph/test/provider#875
Removing debug_provider.rs in favor of graph/test/provider#875JordanMaples wants to merge 4 commits intomainfrom
Conversation
|
@microsoft-github-policy-service agree company="Microsoft" |
There was a problem hiding this comment.
Pull request overview
This PR removes the async DebugProvider from diskann-providers and migrates remaining tests/examples to use the shared diskann::graph::test::provider infrastructure (including adding adapters to make the test provider compatible with diskann-providers async test utilities that require DefaultContext).
Changes:
- Remove
debug_providermodule and deletedebug_provider.rsfromdiskann-providers. - Extend
diskann::graph::test::providerwith caching support (CacheableAccessor) and addDefaultContextProvider/DefaultContextStrategyadapters. - Update
diskann-providerscaching example/tests to use the test provider adapters; adjust Cargo features accordingly.
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
diskann/src/graph/test/provider.rs |
Adds accessor caching support and introduces DefaultContextProvider/DefaultContextStrategy to interop with diskann-providers test infra. |
diskann-providers/src/model/graph/provider/async_/mod.rs |
Stops exporting the removed debug_provider module. |
diskann-providers/src/model/graph/provider/async_/debug_provider.rs |
Deletes the redundant provider implementation and its tests. |
diskann-providers/src/model/graph/provider/async_/caching/example.rs |
Migrates caching example/tests from DebugProvider to the test provider adapters. |
diskann-providers/Cargo.toml |
Enables diskann testing feature to access the test provider types. |
.gitignore |
Ignores generated label files from synthetic label tests. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| B: crate::graph::SearchOutputBuffer<<Accessor<'a> as provider::HasId>::Id> + Send + ?Sized, | ||
| Next: glue::SearchPostProcess<Accessor<'a>, T> + Sync, | ||
| { | ||
| let filtered = candidates.filter(|n| !accessor.provider.is_deleted(n.id).unwrap_or(false)); |
There was a problem hiding this comment.
FilterDeletedIds builds filtered using a closure that captures accessor (accessor.provider...). That keeps an immutable borrow of accessor alive for the lifetime of the iterator, but next.post_process(...) needs &mut accessor, which will fail to compile due to conflicting borrows. Copy the provider reference out first (e.g., into a local provider binding) and have the filter closure capture that instead of accessor.
| let filtered = candidates.filter(|n| !accessor.provider.is_deleted(n.id).unwrap_or(false)); | |
| let provider = accessor.provider; | |
| let filtered = candidates.filter(|n| !provider.is_deleted(n.id).unwrap_or(false)); |
There was a problem hiding this comment.
the comment that this will fail to compile is confusing as I'm able to fully compile and test. Is copilot just confused here?
| B: crate::graph::SearchOutputBuffer<<Accessor<'a> as provider::HasId>::Id> + Send + ?Sized, | ||
| Next: glue::SearchPostProcess<Accessor<'a>, T> + Sync, | ||
| { | ||
| let filtered = candidates.filter(|n| !accessor.provider.is_deleted(n.id).unwrap_or(false)); |
There was a problem hiding this comment.
unwrap_or(false) here means that if is_deleted returns an error (e.g., the ID is missing), the candidate will be treated as not deleted and will remain in the stream. For delete/rewrite flows it’s usually safer to exclude unknown IDs (or propagate the error) to avoid returning stale/invalid nodes. Consider treating errors as deleted (or mapping the error into the step’s error type) rather than defaulting to false.
| let filtered = candidates.filter(|n| !accessor.provider.is_deleted(n.id).unwrap_or(false)); | |
| let filtered = candidates.filter(|n| !accessor.provider.is_deleted(n.id).unwrap_or(true)); |
| CachingError::Inner(e) | ||
| } | ||
| test_provider::AccessError::Transient(e) => { | ||
| panic!("unexpected transient error: {e}") |
There was a problem hiding this comment.
CachedFill is documented to eagerly propagate all inner errors, but this maps AccessError::Transient to a panic!, and (because TransientAccessError asserts it was acknowledged/escalated on drop) that panic path can also trigger a second panic during unwinding. It’s safer to convert transient errors into CachingError::Inner(...) (acknowledging/escalating as needed) instead of panicking.
| panic!("unexpected transient error: {e}") | |
| CachingError::Inner(e.escalate()) |
Delete diskann-providers' DebugProvider and migrate all 5 caching example tests (grid_search, grid_search_with_build, inplace_delete, test_uncacheable, and the CacheableAccessor compile-time check) to use diskann's test provider infrastructure instead. Changes: - diskann/src/graph/test/provider.rs: Add DefaultContextProvider, DefaultContextStrategy, FilterStartPoints post-processor, FilterDeletedIds, InplaceDeleteStrategy, and CacheableAccessor impl for Accessor. - diskann-providers/src/model/graph/provider/async_/caching/example.rs: Rewrite tests to use the new provider types. - Delete debug_provider.rs and remove its module declaration. - Remove redundant neighbor_writes counter; assertions now use the granular set_neighbors and append_neighbors counters directly. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The generate_label_test in diskann-providers writes output files to the cwd using relative paths. Ignore the resulting rand_labels_50_10K_*.txt files so they don't get accidentally committed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
d2b6624 to
bcebe41
Compare
Codecov Report❌ Patch coverage is
❌ Your patch status has failed because the patch coverage (70.69%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #875 +/- ##
==========================================
- Coverage 89.31% 89.26% -0.06%
==========================================
Files 445 444 -1
Lines 84095 83777 -318
==========================================
- Hits 75113 74785 -328
- Misses 8982 8992 +10
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
Cover the new test infrastructure added for the DebugProvider removal: - DefaultContextProvider: deref, id conversion, delete/status, set_element, default_accessor - DefaultContextStrategy: Default impl, search_accessor - Accessor::provider() getter - CacheableAccessor round-trip Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
@JordanMaples could you please update PR description with the strategy and overview of the changes, |
This is an attempt at addressing the observations in #852 that the Debug Provider is largely redundant given that the majority of the tests are already covered by the graph/test/provider. Copilot already attempted to address this in #855, however it deleted a number of needed tests from model/graph/provider/_async/caching/example.rs.
I guided copilot through the changes and migrations but as I'm still fairly new to rust please forgive me if I missed something terribly wrong.
Any other comments?
This Fixes #852 and Closes #855