-
Notifications
You must be signed in to change notification settings - Fork 243
Use faster method to get enum values #1543
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
|
/ok to test |
This comment has been minimized.
This comment has been minimized.
|
@mdboom great find. Is this something where the Cython generated cpp code could be optimized in the currently slow case to avoid all of those Python calls? |
Currently cuda-bindings' enums are based on the stdlib |
|
Backport failed for Please cherry-pick the changes locally and resolve any conflicts. git fetch origin 12.9.x
git worktree add -d .worktree/backport-1543-to-12.9.x origin/12.9.x
cd .worktree/backport-1543-to-12.9.x
git switch --create backport-1543-to-12.9.x
git cherry-pick -x a7284ca5f37abd2ee6267bbf73a83ba63317686e |
|
|
Cython does have a https://cython.readthedocs.io/en/latest/src/userguide/language_basics.html#structs-unions-enums |
|
@mdboom it'd be nice to regenerate and backport this PR to @jakirkham Thanks for reminder. We avoided |
Unfortunately, More info here: #1557 |
|
Thanks Michael and Leo! 🙏 Michael have you shared these issues with the Cython team? |
See background discussion here.
It turns out that instead of
doing this:
reduces the function call overhead of
cuTensorMapEncodeTiledby about 25%. Not all API functions have that many enum arguments (5), so I wouldn't expect such a huge win across the board, but seems very worth doing nonetheless.The reason this works is because
IntEnuminherits fromint, so there is a good fast path in C to grab its value quickly.enum.valuehas to do both a dictionary lookup forvalueand run the Python-written value property code.