-
Notifications
You must be signed in to change notification settings - Fork 379
Add API to load pre-built index in wrapped_async #749
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Adds a synchronous API in wrapped_async for loading a pre-built in-memory index from files by constructing a DataProvider via the storage loading traits, then wrapping it in the sync DiskANNIndex wrapper.
Changes:
- Introduces
SyncLoadWith+create_data_provider()indiskann-providers/src/index/wrapped_async.rsto load an index from a path/config. - Adds
futures-executordependency to synchronouslyblock_onthe asyncLoadWithfuture. - Updates
Cargo.lockfor the new dependency.
Reviewed changes
Copilot reviewed 2 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| diskann-providers/src/index/wrapped_async.rs | Adds synchronous load API and data-provider construction for pre-built index loading. |
| diskann-providers/Cargo.toml | Adds futures-executor dependency. |
| Cargo.lock | Records dependency resolution changes for futures-executor. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| let data_provider = DP::load_with(provider, &pq_context); | ||
|
|
||
| block_on(data_provider) | ||
| } |
Copilot
AI
Feb 10, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This introduces futures_executor::block_on + a new futures-executor dependency just to synchronously wait for DP::load_with. Since this module already depends on Tokio and constructs runtimes/handles, prefer using a Tokio runtime/handle to block_on the load (or otherwise avoid adding an extra executor dependency).
| prost = "0.14.1" | ||
| futures-util.workspace = true | ||
| futures-executor = "0.3" |
Copilot
AI
Feb 10, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New dependency is added with an inline version (futures-executor = "0.3") instead of using [workspace.dependencies] like most shared third-party crates in this repo. Consider adding it to the workspace dependency list and referencing it via workspace = true to keep versions consistent across crates.
| pub trait SyncLoadWith: Sized { | ||
|
|
||
| fn load_with(path: &str, index_config: IndexConfiguration) -> Self; | ||
| } | ||
|
|
||
| // Load static memory index from pre-built files synchronously | ||
| impl<DP> SyncLoadWith for DiskANNIndex<DP> | ||
| where | ||
| DP: DataProvider<InternalId = u32> + LoadWith<AsyncQuantLoadContext, Error = ANNError>, | ||
| { | ||
| fn load_with(path: &str, index_config: IndexConfiguration) -> Self { | ||
| let storage = FileStorageProvider; | ||
| let data_provider = create_data_provider(&storage, path, &index_config); | ||
|
|
||
| match data_provider { | ||
| Ok(dp) => DiskANNIndex::<DP>::new_with_current_thread_runtime(index_config.config, dp), | ||
| Err(e) => panic!("Failed to create data provider: {}", e), | ||
| } | ||
| } | ||
| } | ||
|
|
||
| pub fn create_data_provider<'a, P, DP>( | ||
| provider: &P, | ||
| path: &'a str, | ||
| index_config: &'a IndexConfiguration, | ||
| ) -> ANNResult<DP> | ||
| where | ||
| P: StorageReadProvider, | ||
| DP: DataProvider + LoadWith<AsyncQuantLoadContext, Error = ANNError>, | ||
| { | ||
| let pq_context = AsyncQuantLoadContext { | ||
| metadata: AsyncIndexMetadata::new(path), | ||
| num_frozen_points: index_config.num_frozen_pts, | ||
| metric: index_config.dist_metric, | ||
| prefetch_lookahead: index_config.prefetch_lookahead.map(|x| x.get()), | ||
| is_disk_index: false, // only support in-memory index loading here | ||
| prefetch_cache_line_level: index_config.prefetch_cache_line_level, | ||
| }; | ||
|
|
||
| let data_provider = DP::load_with(provider, &pq_context); | ||
|
|
||
| block_on(data_provider) | ||
| } |
Copilot
AI
Feb 10, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new synchronous load path (SyncLoadWith::load_with / create_data_provider) isn’t covered by tests. Given the number of moving parts (context creation, file layout, error propagation), please add a unit/integration test that round-trips an index to storage and loads it through this wrapper API (ideally using VirtualStorageProvider to avoid filesystem side effects).
|
|
||
| use crate::{model::IndexConfiguration, | ||
| storage::{StorageReadProvider, LoadWith, AsyncQuantLoadContext,AsyncIndexMetadata, file_storage_provider::FileStorageProvider}}; | ||
|
|
Copilot
AI
Feb 10, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new use crate::{... storage::{...}} import block is not formatted consistently (extra blank line, missing spaces after commas, long single line). Please run rustfmt / reformat this use to match the rest of the crate’s import style for readability and to avoid CI format/lint churn.
| fn load_with(path: &str, index_config: IndexConfiguration) -> Self { | ||
| let storage = FileStorageProvider; | ||
| let data_provider = create_data_provider(&storage, path, &index_config); | ||
|
|
||
| match data_provider { | ||
| Ok(dp) => DiskANNIndex::<DP>::new_with_current_thread_runtime(index_config.config, dp), | ||
| Err(e) => panic!("Failed to create data provider: {}", e), | ||
| } |
Copilot
AI
Feb 10, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SyncLoadWith::load_with panics on data-provider load failure. Since this is a public API for loading an index, it should return a Result (e.g., ANNResult<Self>) and propagate the underlying error instead of crashing the process.
| let pq_context = AsyncQuantLoadContext { | ||
| metadata: AsyncIndexMetadata::new(path), | ||
| num_frozen_points: index_config.num_frozen_pts, | ||
| metric: index_config.dist_metric, | ||
| prefetch_lookahead: index_config.prefetch_lookahead.map(|x| x.get()), | ||
| is_disk_index: false, // only support in-memory index loading here | ||
| prefetch_cache_line_level: index_config.prefetch_cache_line_level, | ||
| }; |
Copilot
AI
Feb 10, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
create_data_provider duplicates the existing storage::create_load_context() helper (diskann-providers/src/storage/index_storage.rs) that already builds an AsyncQuantLoadContext from IndexConfiguration. Reusing the shared helper would reduce drift and keep load semantics consistent across the crate.
Codecov Report❌ Patch coverage is
❌ Your patch status has failed because the patch coverage (0.00%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #749 +/- ##
==========================================
- Coverage 89.01% 88.97% -0.04%
==========================================
Files 428 428
Lines 78294 78321 +27
==========================================
- Hits 69691 69687 -4
- Misses 8603 8634 +31
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
Reference Issues/PRs
What does this implement/fix? Briefly explain your changes.
Any other comments?