@przepeck
Collaborator

🛠 Summary

CVS-179961
Changes the --enable_prefix_caching param to be consistent with the OVMS CLI API, and drops explicit uses of it since it is now true by default.
Adds the devstral tool_parser option to export_model.

🧪 Checklist

  • Unit tests added.
  • The documentation updated.
  • Change follows security best practices.

Contributor

Copilot AI left a comment

Pull request overview

This PR updates the --enable_prefix_caching parameter to align with OVMS CLI API conventions by changing it from an action flag to a boolean type with a default value of True. Since prefix caching is now enabled by default, all explicit usages of this parameter have been removed from documentation examples. Additionally, the "devstral" option has been added to the tool_parser choices.

Changes:

  • Modified --enable_prefix_caching argument from action='store_true' to type=bool with default=True
  • Removed --enable_prefix_caching flags from all command examples across documentation
  • Added "devstral" as a new tool_parser option
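
A minimal sketch of what the first change means in practice, assuming export_model.py uses the standard library argparse (the parsers below are illustrative; only the option name and the two argument styles come from this overview). One thing worth checking: argparse passes the raw string to `bool()`, so with `type=bool` any non-empty value, including `false`, parses as True.

```python
import argparse

# Illustrative parsers: only the option name and the two argument styles
# are taken from the PR overview, the rest is an assumption.
old = argparse.ArgumentParser()
old.add_argument("--enable_prefix_caching", action="store_true")

new = argparse.ArgumentParser()
new.add_argument("--enable_prefix_caching", type=bool, default=True)

# Old style: the flag is off unless it is passed.
old_default = old.parse_args([]).enable_prefix_caching

# New style: on by default when the option is omitted.
new_default = new.parse_args([]).enable_prefix_caching

# New style with an explicit "false": bool("false") is True,
# because any non-empty string is truthy.
new_false = new.parse_args(["--enable_prefix_caching", "false"]).enable_prefix_caching

print(old_default, new_default, new_false)
```

If the intent is for users to be able to switch prefix caching off from the command line, the value would need a converter that inspects the string rather than plain `bool`.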

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| demos/continuous_batching/accuracy/README.md | Removed --enable_prefix_caching flag from export_model command example |
| demos/common/export_models/export_model.py | Changed enable_prefix_caching to boolean type with True default; added devstral to tool_parser choices |
| demos/common/export_models/README.md | Updated help text to reflect new boolean parameter type and added devstral to tool_parser options |
| demos/code_local_assistant/README.md | Removed --enable_prefix_caching flags from multiple export_model and ovms command examples |

@@ -21,7 +21,7 @@ mkdir models
python demos/common/export_models/export_model.py text_generation --source_model meta-llama/Meta-Llama-3-8B-Instruct --weight-format fp16 --kv_cache_precision u8 --config_file_path models/config.json --model_repository_path models
python demos/common/export_models/export_model.py text_generation --source_model meta-llama/Meta-Llama-3-8B --weight-format fp16 --kv_cache_precision u8 --config_file_path models/config.json --model_repository_path models
python demos/common/export_models/export_model.py text_generation --source_model OpenGVLab/InternVL2_5-8B --weight-format fp16 --config_file_path models/config.json --model_repository_path models
Collaborator Author

Should I add --enable_prefix_caching false to these commands?
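
One possible way to make `--enable_prefix_caching false` actually disable the feature, assuming the option is declared with argparse and `type=bool` as described in the overview. With plain `bool`, the string `false` is non-empty and therefore parses as True. `str_to_bool` below is a hypothetical helper for illustration, not something that exists in export_model.py.

```python
import argparse


def str_to_bool(value: str) -> bool:
    """Hypothetical helper: map common true/false spellings to a bool."""
    lowered = value.strip().lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected true or false, got {value!r}")


# Illustrative parser; only the option name and default come from the PR.
parser = argparse.ArgumentParser()
parser.add_argument("--enable_prefix_caching", type=str_to_bool, default=True)

# Explicit "false" now turns the option off.
disabled = parser.parse_args(["--enable_prefix_caching", "false"]).enable_prefix_caching

# Omitting the option keeps the default of True.
enabled = parser.parse_args([]).enable_prefix_caching

print(disabled, enabled)
```

With a converter like this, adding `--enable_prefix_caching false` to the accuracy commands would have the intended effect; without it, the value would need to be verified against how the script actually parses the option.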

przepeck and others added 2 commits January 30, 2026 11:36
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>