Parakeet: Support quantization on XNNPACK by mergennachin · Pull Request #17175 · pytorch/executorch

mergennachin · 2026-02-03T22:58:25Z

model.pte is 719.2 MB

Runtime output:

(executorch_dev) mnachin@mnachin-mbp executorch % ./cmake-out/examples/models/parakeet/parakeet_runner \
  --model_path examples/models/parakeet/parakeet_quantized_xnnpack/model.pte \
  --audio_path output.wav \
  --tokenizer_path examples/models/parakeet/parakeet_quantized_xnnpack/tokenizer.model
I tokenizers:regex.cpp:27] Registering override fallback regex
E tokenizers:hf_tokenizer.cpp:82] Error parsing json file: [json.exception.parse_error.101] parse error at line 2, column 1: syntax error while parsing value - invalid literal; last read: '<U+000A><U+000E>'
E tokenizers:tiktoken.cpp:59] invalid tiktoken line:
Transcribed text: mister Quilter is the apostle of the middle classes, and we are glad to welcome his gospel. Nor is Mr. Quilter's manner less interesting than his matter. He tells us that at this festive season of the year, with Christmas and roast beef looming before us, similes drawn from eating and its results occur most readily to the mind. He has grave doubts whether Sir Frederick Leighton's work is really Greek after all, and can discover

Segment timestamps:
0.24s - 5.6s : mister Quilter is the apostle of the middle classes, and we are glad to welcome his gospel.
6.24s - 6.96s : Nor is Mr.
7.12s - 10.24s : Quilter's manner less interesting than his matter.
11.04s - 22.96s : He tells us that at this festive season of the year, with Christmas and roast beef looming before us, similes drawn from eating and its results occur most readily to the mind.
23.44s - 29.76s : He has grave doubts whether Sir Frederick Leighton's work is really Greek after all, and can discover

pytorch-bot · 2026-02-03T22:58:30Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17175

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 22e744b with merge base eee5d96:

NEW FAILURE - The following job has failed:

pull / unittest-arm-backend-with-no-deps (test_pytest_models_tosa) / linux-job (gh)
RuntimeError: Command docker exec -t e2d7c709c2d7e47eea0fd2d3562118eb292a0c99568528b54225ae886dee402b /exec failed with exit code 1

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions · 2026-02-03T22:59:14Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with `release notes:`. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Copilot

Pull request overview

This PR adds support for dynamic quantization (8da4w) on the XNNPACK backend for the Parakeet TDT speech recognition model.

Changes:

Adds HQQ (Half-Quadratic Quantization) scale-only algorithm for 8da4w quantization configuration
Enables XNNPACK backend to handle both dynamically quantized operations and remaining floating-point operations using dual partitioners
Adds documentation and examples for using dynamic quantization with XNNPACK
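
For readers unfamiliar with the 8da4w shorthand (8-bit dynamic activations, 4-bit weights), the numerics can be sketched in plain Python. This is a conceptual illustration only: it uses simple max-abs scales rather than HQQ, assumes symmetric quantization for both weights and activations, uses a hypothetical group size of 4, and the helper names (`quantize_weights_4bit`, `quantize_activations_8bit`, `quantized_dot`) are invented here. It is not the code in `quantize.py`.

```python
def quantize_weights_4bit(weights, group_size=4):
    """Symmetric, scale-only 4-bit quantization with one scale per group.

    Assumption: max-abs scaling stands in for the HQQ scale search.
    """
    qvals, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        scale = max_abs / 7.0 if max_abs > 0 else 1.0
        scales.append(scale)
        # Clamp to the signed 4-bit range [-8, 7].
        qvals.extend(max(-8, min(7, round(w / scale))) for w in group)
    return qvals, scales


def quantize_activations_8bit(activations):
    """Dynamic quantization: the scale is derived from the live input."""
    max_abs = max(abs(a) for a in activations)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Clamp to the signed 8-bit range [-128, 127].
    return [max(-128, min(127, round(a / scale))) for a in activations], scale


def quantized_dot(activations, weights, group_size=4):
    """Integer dot product per weight group, rescaled back to float."""
    q_w, w_scales = quantize_weights_4bit(weights, group_size)
    q_a, a_scale = quantize_activations_8bit(activations)
    total = 0.0
    for g, w_scale in enumerate(w_scales):
        lo, hi = g * group_size, (g + 1) * group_size
        acc = sum(qa * qw for qa, qw in zip(q_a[lo:hi], q_w[lo:hi]))
        total += acc * a_scale * w_scale
    return total
```

The weight scales are fixed at export time, while the activation scale is recomputed for every input, which is what "dynamic" refers to.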

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File	Description
examples/models/parakeet/quantize.py	Adds `intx_choose_qparams_algorithm="hqq_scale_only"` parameter to 8da4w quantization config for improved quantization quality with grouped quantization
examples/models/parakeet/export_parakeet_tdt.py	Imports and uses `XnnpackDynamicallyQuantizedPartitioner` alongside `XnnpackPartitioner` for handling dynamic quantization ops; enables quantization fusion and constant propagation
examples/models/parakeet/README.md	Adds documentation and example command for using 8da4w dynamic quantization with XNNPACK backend
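
The dual-partitioner arrangement described for `export_parakeet_tdt.py` can be pictured as an ordered claim over graph nodes: the dynamic-quantization partitioner takes the quantized patterns first, and the general partitioner then takes whatever it supports among the rest. The toy model below is plain Python under that assumption; the `Partitioner` class, `assign_nodes`, the node names, and the `"not_delegated"` label are hypothetical stand-ins and do not reflect the ExecuTorch API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Partitioner:
    """Hypothetical stand-in: a name plus a predicate over node names."""
    name: str
    supports: Callable[[str], bool]


def assign_nodes(nodes: List[str], partitioners: List[Partitioner]) -> Dict[str, str]:
    """Give each node to the first partitioner, in list order, that supports it."""
    assignment: Dict[str, str] = {}
    for node in nodes:
        for partitioner in partitioners:
            if partitioner.supports(node):
                assignment[node] = partitioner.name
                break
        else:
            # No partitioner claimed the node.
            assignment[node] = "not_delegated"
    return assignment


# Illustrative predicates only; real partitioners match graph patterns.
dynamic_quant = Partitioner("dynamic_quant", lambda n: n.startswith("dq_linear"))
general_fp = Partitioner("general_fp", lambda n: n in {"conv1d", "layer_norm", "softmax"})
```

Order matters in this picture: listing the dynamic-quantization partitioner first lets it claim the quantized linear patterns before the general partitioner sees the remaining nodes.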

larryliu0820 · 2026-02-04T19:25:19Z

Can you add a CI job? I don't know which job specifically, but you should be able to run this script: https://github.com/pytorch/executorch/blob/main/.ci/scripts/test_model_e2e.sh

mergennachin · 2026-02-04T19:58:17Z

Duplicate of #17216

Old PR: #17175 CI: https://github.com/pytorch/executorch/actions/runs/21691143266/job/62550971847?pr=17216

Parakeet: Support quantization on XNNPACK

22e744b

mergennachin requested a review from lucylq as a code owner February 3, 2026 22:58

Copilot AI review requested due to automatic review settings February 3, 2026 22:58

meta-cla bot added the CLA Signed label Feb 3, 2026

mergennachin requested a review from larryliu0820 February 3, 2026 22:58

Copilot started reviewing on behalf of mergennachin February 3, 2026 22:58

mergennachin requested review from Gasoonjia, GregoryComer and JacobSzwejbka February 3, 2026 22:58

Copilot AI reviewed Feb 3, 2026

mergennachin requested a review from manuelcandales February 3, 2026 23:10

mergennachin closed this Feb 4, 2026

mergennachin mentioned this pull request Feb 4, 2026

Parakeet: Support quantization on XNNPACK #17216

Merged

mergennachin added a commit that referenced this pull request Feb 4, 2026

Parakeet: Support quantization on XNNPACK (#17216)

f2f337e

Old PR: #17175 CI: https://github.com/pytorch/executorch/actions/runs/21691143266/job/62550971847?pr=17216

mergennachin wants to merge 1 commit into main from export-D92178323


2 participants
