Fix KeyError in InsertIOQDQ pass for LLM quantization #17194

mohammed-saalim wants to merge 1 commit into pytorch:main from
Conversation
Extend `q_dq_map` to include dequantize ops mapping to themselves. This fixes a `KeyError` raised when nodes have dequantize encodings (e.g., `dequantize_per_tensor.default`) instead of quantize encodings. Fixes pytorch#16690
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17194
Note: Links to docs will display an error until the docs builds have been completed.

❌ 4 New Failures, 1 Unrelated Failure. As of commit 5ebb788 with merge base 2ace1cc.

NEW FAILURES - The following jobs have failed:
FLAKY - The following job failed but was likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
While changing the quantization recipe (like using an 8-bit KV cache) might change the graph structure, the InsertIOQDQ.py
@pytorchbot label "release notes: none"
Pull request overview
This PR fixes a `KeyError` in the InsertIOQDQ pass that occurred when quantizing LLMs (such as SmolLM2) for the Qualcomm QNN backend. The error was caused by missing entries in the `q_dq_map` dictionary for dequantize operations.
Changes:
- Extended `q_dq_map` with identity mappings for dequantize operations, to handle nodes that already have dequantize encodings
- Added three new entries mapping dequantize operations to themselves (per-tensor default, per-tensor tensor, and per-channel default)
Summary
This PR fixes a `KeyError` in the InsertIOQDQ pass that occurs when quantizing LLMs (such as SmolLM2) for the Qualcomm QNN backend.
Problem

In insert_io_qdq.py, the `q_dq_map` dictionary was missing entries for dequantize operations. When a node's quantization encoding was already a dequantize operation (e.g., `dequantize_per_tensor.default`), looking it up in the map during the `_insert` phase raised a `KeyError`.
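A minimal, self-contained sketch of the failure mode (the string keys here are stand-ins for the real op objects, used only to illustrate the lookup pattern):

```python
# Hypothetical reduction of the lookup done during the _insert phase.
# Before this PR, q_dq_map only had quantize ops as keys:
q_dq_map = {
    "quantize_per_tensor.default": "dequantize_per_tensor.default",
}

# A node whose encoding is already a dequantize op has no entry:
encoding = "dequantize_per_tensor.default"
io_op = q_dq_map[encoding]  # KeyError: 'dequantize_per_tensor.default'
```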
Solution

Extended the `q_dq_map` to include dequantize-to-self (identity) mappings for:

- `quantized_decomposed.dequantize_per_tensor.default`
- `quantized_decomposed.dequantize_per_tensor.tensor`
- `quantized_decomposed.dequantize_per_channel.default`

This allows the pass to correctly handle nodes that have already been processed into dequantized form.
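For reference, a sketch of what the extended map looks like, assuming the ops live under the edge dialect's `quantized_decomposed` namespace as the names above suggest (the surrounding pass class is elided, so this is not the literal diff):

```python
from executorch.exir.dialects._ops import ops as exir_ops

qd = exir_ops.edge.quantized_decomposed  # shorthand

q_dq_map = {
    # existing entries: each quantize op maps to its dequantize counterpart
    qd.quantize_per_tensor.default: qd.dequantize_per_tensor.default,
    qd.quantize_per_tensor.tensor: qd.dequantize_per_tensor.tensor,
    qd.quantize_per_channel.default: qd.dequantize_per_channel.default,
    # new identity entries: dequantize ops map to themselves, so nodes
    # already carrying a dequantize encoding no longer raise KeyError
    qd.dequantize_per_tensor.default: qd.dequantize_per_tensor.default,
    qd.dequantize_per_tensor.tensor: qd.dequantize_per_tensor.tensor,
    qd.dequantize_per_channel.default: qd.dequantize_per_channel.default,
}
```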
Testing
Verified that the pass module's `q_dq_map` now contains the expected 6 keys (a minimal check is sketched below).

Fixes #16690 (Qualcomm Quantization and Lowering for LLM fails).
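A spot-check along these lines, assuming `q_dq_map` is a class-level attribute and the import path follows the backend's `_passes` layout:

```python
# Assumed import path; adjust if the pass lives elsewhere.
from executorch.backends.qualcomm._passes.insert_io_qdq import InsertIOQDQ

# 3 original quantize entries + 3 new dequantize identity entries
assert len(InsertIOQDQ.q_dq_map) == 6

# Every dequantize key should map to itself.
for k, v in InsertIOQDQ.q_dq_map.items():
    if "dequantize" in str(k):
        assert k is v
```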
cc @cccclai @winskuo-quic @shewu-quic @haowhsu-quic @DannyYuyang-quic @cbilgin