Use set for faster duplicate schema detection #87

Stephen0512 · 2026-01-24T03:38:24Z

This PR improves schema deduplication performance by replacing a list with a set when tracking previously seen (targetNamespace, filename) pairs.

Since the seen collection is used only for membership checks within the current method, a set gives O(1) average-time lookups (versus O(n) for a list) and better matches the intended usage, while preserving existing behavior.
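For illustration, here is a minimal sketch of the pattern, assuming the code is Python and that each schema is identified by a `(targetNamespace, filename)` tuple. The function name `dedupe_schemas` and the `SchemaKey` alias are hypothetical and not taken from the repository; the actual method in this PR may be structured differently.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical alias: (targetNamespace, filename)
SchemaKey = Tuple[str, str]


def dedupe_schemas(schemas: Iterable[SchemaKey]) -> List[SchemaKey]:
    """Return the schema keys with duplicates dropped, keeping first occurrences."""
    seen: Set[SchemaKey] = set()      # membership checks only, O(1) on average
    unique: List[SchemaKey] = []      # separate list preserves input order
    for key in schemas:
        if key in seen:               # with a list this check would be O(n)
            continue
        seen.add(key)
        unique.append(key)
    return unique
```

Because the set is used only to answer "have we seen this pair before?", and the results are collected separately, swapping the list for a set should not change which entries are kept or their order. The keys must be hashable, which holds for tuples of strings.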

The issue was identified during an ongoing research project.

Use set for seen schema entries to improve lookup performance

2e72bb1
