Add DiskAnnPy Python wheel (diskannpy-rust) and PyPI publish workflow #872

YuanyuanTian-hh wants to merge 10 commits into main from
Conversation
Pull request overview
This PR introduces a new DiskAnnPy Python package (Rust pyo3 extension + Python wrapper) into the DiskANN workspace and adds CI automation to build/test/publish wheels.
Changes:
- Adds a new Rust DiskAnnPy crate exposing DiskANN APIs to Python via pyo3 (static disk index, async memory index, BfTree index, quantizers, shared utils).
- Adds a Python wrapper package (DiskAnnPy/python/diskannpy) with builders, index classes, file utilities, and type hints.
- Adds Python unit tests + fixtures and a GitHub Actions workflow to build/test/publish wheels to PyPI.
Reviewed changes
Copilot reviewed 48 out of 58 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
| DiskAnnPy/tests/test_static_disk_index.py | Adds StaticDiskIndex unit tests (recall, validation, relative paths). |
| DiskAnnPy/tests/test_quantization.py | Adds MinMax/Product quantizer workflow and accuracy tests. |
| DiskAnnPy/tests/test_builder.py | Adds build_disk_index input validation tests. |
| DiskAnnPy/tests/test_bftree_index.py | Adds BfTreeIndex tests including on-disk save/load and PQ paths. |
| DiskAnnPy/tests/test_async_index.py | Adds AsyncDiskIndex tests, including PQ and delete/insert flows. |
| DiskAnnPy/tests/fixtures/recall.py | Provides recall computation + PQ recall cutoff. |
| DiskAnnPy/tests/fixtures/create_test_data.py | Utilities to generate/write vector test data. |
| DiskAnnPy/tests/fixtures/build_disk_index.py | Fixture to build a StaticDiskIndex test dataset/index. |
| DiskAnnPy/tests/fixtures/build_bftree_index.py | Fixtures to build in-memory/on-disk BfTree indexes (PQ and non-PQ). |
| DiskAnnPy/tests/fixtures/build_async_index.py | Fixture to build async memory indexes (PQ and non-PQ). |
| DiskAnnPy/tests/fixtures/__init__.py | Re-exports fixtures for tests. |
| DiskAnnPy/tests/data/query_preexisting_vector.bin | Adds binary test data. |
| DiskAnnPy/tests/data/query_insert_vector.bin | Adds binary test data. |
| DiskAnnPy/tests/data/ann_metadata.bin | Adds binary test data. |
| DiskAnnPy/src/utils/search_result.rs | Adds Rust structs for Python-facing search results + stats helpers. |
| DiskAnnPy/src/utils/parallel_tasks.rs | Adds async “worker pool over iterator” helper. |
| DiskAnnPy/src/utils/mod.rs | Wires new utils modules into the Rust crate. |
| DiskAnnPy/src/utils/metric_py.rs | Adds Python-exposed Metric enum and parsing. |
| DiskAnnPy/src/utils/index_build_utils.rs | Adds runtime init + common error helpers. |
| DiskAnnPy/src/utils/graph_data_types.rs | Defines GraphDataType bindings for f32/u8/i8. |
| DiskAnnPy/src/utils/dataset_utils.rs | Adds dataset loading/alignment utilities + recall/truthset helpers. |
| DiskAnnPy/src/utils/data_type.rs | Adds Python-exposed DataType enum and parsing. |
| DiskAnnPy/src/utils/convert_py_array.rs | Adds helper to convert PyArray2 to Vec<Vec> (+ unit test). |
| DiskAnnPy/src/utils/ann_result_py.rs | Adds Python exception wrapper type for ANNError. |
| DiskAnnPy/src/static_disk_index.rs | Adds Rust implementation + Python bindings for StaticDiskIndex. |
| DiskAnnPy/src/quantization/README.md | Adds user documentation for quantizers. |
| DiskAnnPy/src/quantization/product.rs | Adds ProductQuantizer Python bindings. |
| DiskAnnPy/src/quantization/mod.rs | Exposes quantization modules/types. |
| DiskAnnPy/src/quantization/minmax.rs | Adds MinMaxQuantizer Python bindings. |
| DiskAnnPy/src/quantization/metric.rs | Adds metric parsing/translation for quantization. |
| DiskAnnPy/src/quantization/base.rs | Adds base quantizer class for shared metadata. |
| DiskAnnPy/src/lib.rs | Defines the _diskannpy Python module and registers exported classes/functions. |
| DiskAnnPy/src/build_disk_index.rs | Adds Rust entrypoint for building disk indexes via Python. |
| DiskAnnPy/src/build_async_memory_index.rs | Adds Rust entrypoint for building async memory indexes via Python. |
| DiskAnnPy/README.md | Adds DiskAnnPy build/test instructions (maturin-based). |
| DiskAnnPy/python/diskannpy/_static_disk_index.py | Adds Python wrapper for StaticDiskIndex. |
| DiskAnnPy/python/diskannpy/_files.py | Adds vector/tag file I/O helpers. |
| DiskAnnPy/python/diskannpy/_defaults.py | Adds Python-side defaults/constants. |
| DiskAnnPy/python/diskannpy/_common.py | Adds shared validation/helpers + metadata read/write. |
| DiskAnnPy/python/diskannpy/_builder.py | Adds Python builder functions for disk + async indexes. |
| DiskAnnPy/python/diskannpy/_bftree_diskann.py | Adds Python wrapper for BfTreeIndex (incl. save/load). |
| DiskAnnPy/python/diskannpy/_async_diskann.py | Adds Python wrapper for AsyncDiskIndex. |
| DiskAnnPy/python/diskannpy/__init__.py | Adds public API surface + type aliases for the package. |
| DiskAnnPy/pyproject.toml | Adds packaging metadata and dependencies for the Python project. |
| DiskAnnPy/py.typed | Marks package as typed. |
| DiskAnnPy/Cargo.toml | Adds Rust crate config/dependencies for the Python extension. |
| DiskAnnPy/.gitignore | Adds local ignores for the new Python/Rust build outputs. |
| Cargo.toml | Adds DiskAnnPy to the workspace members. |
| Cargo.lock | Updates lockfile for the new crate/dependencies. |
| .github/workflows/publish-diskannpy.yml | Adds wheel build/test/publish workflow using maturin + trusted publishing. |
```python
    if dtype == np.uint8:
        return np.uint8
    if dtype == np.int8:
        return np.int8
    if dtype == np.float32:
        return np.float32


def valid_dap_dtype(dtype: Type) -> dap.DataType:
    _assert_dtype(dtype)
    if dtype == np.uint8:
        return dap.DataType.Uint8
    if dtype == np.int8:
        return dap.DataType.Int8
    if dtype == np.float32:
        return dap.DataType.Float
```
valid_dtype() allows dtypes that are castable to the supported set (e.g., np.single, np.byte, np.ubyte) via _assert_dtype, but then only returns for exact np.float32/np.int8/np.uint8. For castable aliases it falls off the end and returns None, which will later break index building/loading. Map aliases to canonical dtypes (e.g., treat np.single as np.float32, np.byte as np.int8, np.ubyte as np.uint8) and ensure the function always returns a dtype after _assert_dtype passes.
Suggested change (canonicalize before the equality checks, replacing the exact-match code above):

```python
    canonical_dtype = np.dtype(dtype).type
    if canonical_dtype == np.uint8:
        return np.uint8
    if canonical_dtype == np.int8:
        return np.int8
    if canonical_dtype == np.float32:
        return np.float32
    # This should be unreachable because _assert_dtype ensures castability to _VALID_DTYPES
    raise ValueError(f"Unsupported dtype after canonicalization: {dtype!r}")


def valid_dap_dtype(dtype: Type) -> dap.DataType:
    _assert_dtype(dtype)
    canonical_dtype = np.dtype(dtype).type
    if canonical_dtype == np.uint8:
        return dap.DataType.Uint8
    if canonical_dtype == np.int8:
        return dap.DataType.Int8
    if canonical_dtype == np.float32:
        return dap.DataType.Float
    # This should be unreachable because _assert_dtype ensures castability to _VALID_DTYPES
    raise ValueError(f"Unsupported dtype after canonicalization: {dtype!r}")
```
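As a quick sanity check on the canonicalization step the suggestion relies on, `np.dtype(...).type` resolves numpy's scalar aliases to the exact types the equality checks expect:

```python
import numpy as np

# Aliases such as np.single/np.byte/np.ubyte resolve to the canonical
# scalar types, so the equality checks match after canonicalization.
for alias, canonical in [(np.single, np.float32),
                         (np.byte, np.int8),
                         (np.ubyte, np.uint8)]:
    assert np.dtype(alias).type is canonical

# String and np.dtype inputs canonicalize the same way:
assert np.dtype("float32").type is np.float32
assert np.dtype(np.dtype("int8")).type is np.int8
```

This is why the suggested code compares `np.dtype(dtype).type` rather than the raw `dtype` argument.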
```python
__all__ = [
    "build_disk_index",
    "StaticDiskIndex",
    "build_async_index",
    "AsyncDiskIndex",
    "BfTreeIndex",
    "defaults",
    "DistanceMetric",
    "VectorDType",
    "QueryResponse",
    "QueryResponseBatch",
    "VectorIdentifier",
    "VectorIdentifierBatch",
    "VectorLike",
    "VectorLikeBatch",
    "Metadata",
    "vectors_metadata_from_file",
    "vectors_to_file",
    "vectors_from_file",
    "tags_to_file",
    "tags_from_file",
    "valid_dtype",
]
```
__all__ exports "QueryResponseBatch", but no such symbol is defined in this module (the type is named QueryResponseBatchWithStats). This makes from diskannpy import QueryResponseBatch fail and also contradicts the docstring list above. Either rename/alias QueryResponseBatchWithStats to QueryResponseBatch, or update __all__ and docs to the correct name.
```python
warnings.warn(
    f"k_neighbors={k_value} asked for, but list_size={l_value} was smaller. Increasing {l_value} to {k_value}"
)
complexity = k_value
```
When k_value > l_value, the code warns that it is increasing l_value but assigns to a new variable complexity that is never used. As a result, the call to self._index.search(... l_value=l_value ...) still uses the too-small l_value. Update this to set l_value = k_value (or otherwise ensure the native call receives the adjusted value).
Suggested change:

```python
l_value = k_value
```
```rust
zipped.for_each_in_pool(
    &pool,
    |(((query, query_result_ids), distance_results), query_stats)| {
        let search_result = self
            .search_internal(query, recall_at, l_value, query_stats)
            .unwrap();
        *query_result_ids = search_result.ids;
        *distance_results = search_result.distances;
    },
);
```
batch_search calls search_internal(...).unwrap() inside the rayon loop. Any search error will panic, which will abort the Python process instead of returning a Python exception. Propagate errors out of the parallel loop (e.g., collect Results and return the first error, or use a shared error slot and short-circuit).
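The same pattern translates to the Python side of the codebase: submit the per-query work to a pool, let worker exceptions stay attached to their futures, and surface the first failure to the caller instead of crashing inside a worker. A minimal sketch with illustrative names (`search_one` stands in for the native search call):

```python
from concurrent.futures import ThreadPoolExecutor

def search_one(query: int) -> int:
    # Stand-in for the native search; raises instead of unwrapping/panicking.
    if query < 0:
        raise ValueError(f"bad query: {query}")
    return query * 2

def batch_search(queries):
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(search_one, q) for q in queries]
    # result() re-raises the worker's exception in the caller's thread,
    # so errors become ordinary exceptions instead of aborting the process.
    return [f.result() for f in futures]

assert batch_search([1, 2, 3]) == [2, 4, 6]
```

In Rust the analogue is collecting `Result`s (or writing into a shared error slot) inside the rayon closure and returning the first error after the pool finishes.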
```rust
pub async fn run<I, T, F, Fut>(iterator: I, num_tasks: usize, task_fn: F)
where
    I: Iterator<Item = T> + Send + 'static,
    T: Send + 'static,
    F: Fn(T) -> Fut + Send + Sync + Clone + 'static,
    Fut: Future + Send + 'static,
{
    let mut tasks = JoinSet::new();
    let iterator = Arc::new(Mutex::new(iterator));

    for _ in 0..num_tasks {
        let iterator_clone = iterator.clone();
        let task_fn = task_fn.clone();

        tasks.spawn(async move {
            loop {
                let item = {
                    let mut guard = iterator_clone.lock().await;
                    guard.next()
                };

                match item {
                    Some(item) => {
                        let _ = task_fn(item).await;
                    }
                    None => break,
                }
            }
        });
```
The helper drops the output of task_fn(item).await (let _ = ...). Callers in this repo pass async blocks that return ANNResult, so failures are silently ignored and operations may report success despite partial failure. Consider constraining Fut::Output to a Result and returning an aggregated error (or at least capturing/logging failures) instead of discarding the result.
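The "gather outcomes, then aggregate" shape suggested here can be sketched in Python with `asyncio.gather(return_exceptions=True)`, which collects every task's result or exception rather than silently discarding failures (names are illustrative):

```python
import asyncio

async def task_fn(item: str) -> str:
    # Stand-in for the per-item async work; fails on one input.
    if item == "bad":
        raise RuntimeError(f"task failed on {item!r}")
    return item.upper()

async def run_pool(items):
    # Collect every task's outcome instead of dropping it...
    results = await asyncio.gather(
        *(task_fn(i) for i in items), return_exceptions=True
    )
    # ...then surface an error if any task failed, so callers never see
    # a spurious success after a partial failure.
    errors = [r for r in results if isinstance(r, Exception)]
    if errors:
        raise errors[0]
    return results

assert asyncio.run(run_pool(["a", "b"])) == ["A", "B"]
```

The Rust equivalent is constraining `Fut::Output` to a `Result` and joining the `JoinSet` outputs into an aggregated `ANNResult`.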
DiskAnnPy/src/build_disk_index.rs (Outdated)
```rust
) -> ANNResultPy<()> {
    println!(
        "Starting index build with R: {} Lbuild: {} alpha: {} #threads: {} num_of_pq_chunks: {} build_DRAM_budget: {}",
        graph_degree, complexity, ALPHA, num_threads, num_of_pq_chunks, build_dram_budget
```
The log line prints ALPHA (the compile-time default) instead of the alpha parameter passed into build_disk_index. This can be misleading when callers override alpha; print the actual argument value.
Suggested change:

```rust
graph_degree, complexity, alpha, num_threads, num_of_pq_chunks, build_dram_budget
```
```python
@classmethod
def setUpClass(cls) -> None:
    cls._test_matrix = [
        build_random_vectors_and_async_index(np.float32, "l2"),
        build_random_vectors_and_async_index(np.uint8, "l2"),
        build_random_vectors_and_async_index(np.int8, "l2"),
        build_random_vectors_and_async_index(np.float32, "cosine"),
    ]
    cls._test_matrix_with_pq = [
        build_random_vectors_and_async_index(np.float32, "l2", use_pq=True, num_pq_bytes=5, use_opq=False),
        build_random_vectors_and_async_index(np.uint8, "l2", use_pq=True, num_pq_bytes=5, use_opq=False),
        build_random_vectors_and_async_index(np.int8, "l2", use_pq=True, num_pq_bytes=5, use_opq=False),
        build_random_vectors_and_async_index(np.int8, "cosine", use_pq=True, num_pq_bytes=5, use_opq=False),
    ]
```
These tests build multiple full indices in setUpClass using 10,000-point datasets (and test_async_index builds 8 of them). This is likely to make CI runs for wheel validation very slow/flaky. Consider reducing vector counts for unit tests, reusing a single built index across classes, and/or gating the large benchmarks behind an env var (e.g., only run the large-build matrix when explicitly requested).
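One way to gate the expensive builds, sketched with an assumed environment-variable name (`DISKANNPY_LARGE_TESTS` is illustrative, not part of the PR):

```python
import os
import unittest

# Illustrative opt-in flag; the real variable name would be a project choice.
RUN_LARGE = os.environ.get("DISKANNPY_LARGE_TESTS") == "1"

@unittest.skipUnless(RUN_LARGE, "set DISKANNPY_LARGE_TESTS=1 to run large index builds")
class TestAsyncIndexLarge(unittest.TestCase):
    @classmethod
    def setUpClass(cls) -> None:
        # The expensive 10,000-point index builds would go here; with the
        # class-level skip they never run unless explicitly requested.
        cls._test_matrix = []

    def test_placeholder(self) -> None:
        self.assertIsInstance(self._test_matrix, list)
```

A class-level `skipUnless` also skips `setUpClass`, so CI wheel validation pays nothing for the large matrix by default.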
```python
from typing import Optional

import numpy as np
from typing import Optional
```
Duplicate from typing import Optional import. Remove the redundant import to avoid lint noise and keep imports tidy.
Suggested change: delete the second `from typing import Optional` line.
```rust
// Allow the iterator to be shared
let dataset_iter = Arc::new(Mutex::new(
    VectorDataIterator::<StorageType, AdHoc<T>>::new(
        &data_path,
        Option::None,
        storage_provider,
    )?
    .enumerate(),
));
```
dataset_iter is protected by std::sync::Mutex but is locked from within async tasks (tasks.spawn(async move { ... lock() ... })). A blocking mutex in async code can stall Tokio worker threads under contention. Prefer tokio::sync::Mutex (or a channel/stream-based producer) for the shared iterator to avoid blocking the runtime.
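The hazard has a direct Python analogue: a `threading.Lock` held inside a coroutine blocks the entire event-loop thread, while `asyncio.Lock` suspends only the waiting task. A sketch of the cooperative shared-iterator pattern (all names illustrative):

```python
import asyncio

async def worker(name, lock, items, out):
    while True:
        # asyncio.Lock suspends this coroutine while another worker holds
        # the lock, instead of blocking the event-loop thread.
        async with lock:
            item = next(items, None)  # shared iterator, one consumer at a time
        if item is None:
            return
        await asyncio.sleep(0)  # stand-in for real async work per item
        out.append((name, item))

async def main():
    lock = asyncio.Lock()
    items = iter(range(6))
    out = []
    await asyncio.gather(*(worker(i, lock, items, out) for i in range(3)))
    # Every item is consumed exactly once across the three workers.
    return sorted(item for _, item in out)

assert asyncio.run(main()) == [0, 1, 2, 3, 4, 5]
```

`tokio::sync::Mutex` plays the `asyncio.Lock` role here: its `lock().await` yields to the runtime rather than parking a worker thread.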
```python
    if dtype == np.uint8:
        return np.uint8
    if dtype == np.int8:
        return np.int8
    if dtype == np.float32:
        return np.float32


def valid_dap_dtype(dtype: Type) -> dap.DataType:
    _assert_dtype(dtype)
    if dtype == np.uint8:
        return dap.DataType.Uint8
    if dtype == np.int8:
        return dap.DataType.Int8
    if dtype == np.float32:
        return dap.DataType.Float
```
valid_dap_dtype() has the same issue as valid_dtype(): _assert_dtype permits castable aliases (e.g., np.single), but this function only matches exact np.float32/np.int8/np.uint8 and otherwise returns None. Canonicalize aliases before the equality checks (or compare via np.dtype(dtype)), and ensure this function always returns a dap.DataType once _assert_dtype passes.
Suggested change (canonicalize before the equality checks, replacing the exact-match code above):

```python
    canonical = np.dtype(dtype).type
    if canonical == np.uint8:
        return np.uint8
    if canonical == np.int8:
        return np.int8
    if canonical == np.float32:
        return np.float32
    # This should be unreachable if _assert_dtype and _VALID_DTYPES stay in sync,
    # but is kept as a defensive check.
    raise ValueError(f"Unsupported vector dtype: {dtype!r}")


def valid_dap_dtype(dtype: Type) -> dap.DataType:
    """
    Utility method to determine the corresponding dap.DataType for a supported numpy dtype.
    """
    _assert_dtype(dtype)
    canonical = np.dtype(dtype).type
    if canonical == np.uint8:
        return dap.DataType.Uint8
    if canonical == np.int8:
        return dap.DataType.Int8
    if canonical == np.float32:
        return dap.DataType.Float
    # This should be unreachable if _assert_dtype and _VALID_DTYPES stay in sync,
    # but is kept as a defensive check.
    raise ValueError(f"Unsupported dap vector dtype: {dtype!r}")
```
…y-slice guard, unused import, duplicate import, __all__ exports; clean up CI triggers
@YuanyuanTian-hh please read the following Contributor License Agreement (CLA). If you agree with the CLA, please reply with the following information.

Contributor License Agreement

This Contribution License Agreement (“Agreement”) is agreed to by the party signing below (“You”),
Codecov Report

✅ All modified and coverable lines are covered by tests.

```
@@            Coverage Diff             @@
##             main     #872      +/-   ##
==========================================
- Coverage   90.45%   89.31%   -1.14%
==========================================
  Files         442      445       +3
  Lines       83248    84424    +1176
==========================================
+ Hits        75301    75407     +106
- Misses       7947     9017    +1070
```

Flags with carried forward coverage won't be shown.
Summary
Add DiskAnnPy, a PyO3-based Python wheel that exposes DiskANN's Rust index types to Python, and a GitHub Actions workflow to build and publish it to PyPI as diskannpy-rust.
What's included
- Rust bindings (DiskAnnPy/src/)
- Python layer (DiskAnnPy/python/diskannpy/)
- Build & publish (.github/workflows/publish-diskannpy.yml)
- Tests (DiskAnnPy/tests/)
CI changes
Test results
Wheel builds (all passed):
Wheel tests (all passed):
PyPI publish: successfully published diskannpy-rust==0.49.1
Workflow run: https://github.com/microsoft/DiskANN/actions/runs/23782386152
Known limitations
How to install

```shell
pip install diskannpy-rust
```
Copilot review fixes addressed
- Canonicalize dtype aliases (e.g., np.single → np.float32) and raise on unmapped types instead of returning None