-
Notifications
You must be signed in to change notification settings - Fork 238
Prefix caching for NPU in nested plugin config structure #3929
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR modifies the plugin configuration structure to support prefix caching for NPU devices by moving NPU-specific parameters into a nested DEVICE_PROPERTIES.NPU structure instead of keeping them at the top level.
Changes:
- Updated C++ code to check for and add
NPUW_LLM_ENABLE_PREFIX_CACHINGwithin the nestedDEVICE_PROPERTIES.NPUstructure - Modified Python export script to create the nested structure for NPU-specific properties including
MAX_PROMPT_LENandNPUW_LLM_ENABLE_PREFIX_CACHING
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/graph_export/graph_export.cpp | Enhanced conflict detection to search both top-level and nested structures, and changed the insertion logic to add prefix caching parameter within DEVICE_PROPERTIES.NPU |
| demos/common/export_models/export_model.py | Removed top-level MAX_PROMPT_LEN assignment and consolidated NPU-specific properties into the nested DEVICE_PROPERTIES.NPU structure |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| devicePropsItr = d.FindMember("DEVICE_PROPERTIES"); | ||
| } | ||
|
|
||
| auto npuItr = devicePropsItr->value.FindMember("NPU"); | ||
| if (npuItr == devicePropsItr->value.MemberEnd()) { | ||
| rapidjson::Value npuObj(rapidjson::kObjectType); | ||
| devicePropsItr->value.AddMember("NPU", npuObj, d.GetAllocator()); | ||
| npuItr = devicePropsItr->value.FindMember("NPU"); |
Copilot
AI
Jan 28, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After adding the member in line 621, the iterator lookup in line 622 is unnecessary. The AddMember operation returns void, but you can use the newly created deviceProps object directly or leverage the fact that the member was just added to avoid the redundant FindMember call. Consider storing a reference or iterator during creation to avoid this second lookup.
| devicePropsItr = d.FindMember("DEVICE_PROPERTIES"); | |
| } | |
| auto npuItr = devicePropsItr->value.FindMember("NPU"); | |
| if (npuItr == devicePropsItr->value.MemberEnd()) { | |
| rapidjson::Value npuObj(rapidjson::kObjectType); | |
| devicePropsItr->value.AddMember("NPU", npuObj, d.GetAllocator()); | |
| npuItr = devicePropsItr->value.FindMember("NPU"); | |
| devicePropsItr = d.MemberEnd(); | |
| --devicePropsItr; | |
| } | |
| auto npuItr = devicePropsItr->value.FindMember("NPU"); | |
| if (npuItr == devicePropsItr->value.MemberEnd()) { | |
| rapidjson::Value npuObj(rapidjson::kObjectType); | |
| devicePropsItr->value.AddMember("NPU", npuObj, d.GetAllocator()); | |
| npuItr = devicePropsItr->value.MemberEnd(); | |
| --npuItr; |
| devicePropsItr->value.AddMember("NPU", npuObj, d.GetAllocator()); | ||
| npuItr = devicePropsItr->value.FindMember("NPU"); | ||
| } | ||
|
|
||
| npuItr->value.AddMember("NPUW_LLM_ENABLE_PREFIX_CACHING", value, d.GetAllocator()); |
Copilot
AI
Jan 28, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to the DEVICE_PROPERTIES creation, after adding the NPU member in line 628, the FindMember call in line 629 is redundant. Consider refactoring to avoid this unnecessary lookup.
| devicePropsItr->value.AddMember("NPU", npuObj, d.GetAllocator()); | |
| npuItr = devicePropsItr->value.FindMember("NPU"); | |
| } | |
| npuItr->value.AddMember("NPUW_LLM_ENABLE_PREFIX_CACHING", value, d.GetAllocator()); | |
| npuObj.AddMember("NPUW_LLM_ENABLE_PREFIX_CACHING", value, d.GetAllocator()); | |
| devicePropsItr->value.AddMember("NPU", npuObj, d.GetAllocator()); | |
| } else { | |
| npuItr->value.AddMember("NPUW_LLM_ENABLE_PREFIX_CACHING", value, d.GetAllocator()); | |
| } |
| device_properties = { "NPU": npu_properties } | ||
| plugin_config['DEVICE_PROPERTIES'] = device_properties |
Copilot
AI
Jan 28, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The variable name 'device_properties' is ambiguous since it only contains NPU properties in this context. Consider renaming to 'npu_device_properties' or using a more descriptive name to clarify that this is specifically for NPU configuration.
| device_properties = { "NPU": npu_properties } | |
| plugin_config['DEVICE_PROPERTIES'] = device_properties | |
| npu_device_properties = { "NPU": npu_properties } | |
| plugin_config['DEVICE_PROPERTIES'] = npu_device_properties |
| npu_properties['NPUW_LLM_ENABLE_PREFIX_CACHING'] = True | ||
| device_properties = { "NPU": npu_properties } | ||
| plugin_config['DEVICE_PROPERTIES'] = device_properties | ||
|
|
Copilot
AI
Jan 28, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an extra blank line here. Consider removing this trailing whitespace to maintain consistent code formatting.
No description provided.