Description
The Python stdlib `enum` module has notoriously poor performance. It would be nice to switch to something with fewer performance pitfalls. Measurement with Python 3.15's sampling profiler shows that approximately 23% of the time spent importing `cuda.bindings.driver` goes to creating enums. There is also (smaller) overhead when accessing them from Cython, since they are implemented in pure Python. (See #1543 -- there have reportedly been other performance issues over time.)
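For a rough feel of the class-creation cost outside of a full import, something like the following could be used. It is only a sketch: the member count, the names, and the choice of a plain class as the baseline are assumptions for illustration, and it is not the profiler measurement quoted above.

```python
import enum
import timeit

# Hypothetical member set; real enums in the bindings vary in size.
MEMBERS = {f"VALUE_{i}": i for i in range(200)}


def make_stdlib_enum():
    # Functional API: builds an IntEnum class from a name -> value mapping.
    return enum.IntEnum("Container", MEMBERS)


def make_plain_class():
    # Baseline: a plain class whose attributes are bare ints.
    return type("Container", (), dict(MEMBERS))


def time_creation(repeats=20):
    """Return (stdlib_seconds, plain_seconds) for `repeats` class creations."""
    stdlib_s = timeit.timeit(make_stdlib_enum, number=repeats)
    plain_s = timeit.timeit(make_plain_class, number=repeats)
    return stdlib_s, plain_s


if __name__ == "__main__":
    stdlib_s, plain_s = time_creation()
    print(f"stdlib IntEnum: {stdlib_s:.4f}s, plain class: {plain_s:.4f}s")
```

The absolute numbers will depend on the machine and Python version, so only the ratio between the two is worth looking at.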
The obvious alternative is to use Cython's cpdef enum syntax. It generates a C-style enum which is theoretically more performant, but for the Python exposure, it uses the same stdlib enum. This means we still have the massive performance penalty at import and on the Python side, and an additional layer of translation between the two. (Not to mention the fact we already have a simple C exposure generated in our cy* layer). Measurement of a prototype shows this actually has even worse performance, with enum creation time becoming 50% of import time.
It should be easy to create something that meets the basic needs of an enum type without these performance overheads. The tricky bit is that the API surface of stdlib enums is surprisingly large and includes some unusual things. We will need to decide which parts we care about and which we are willing to forgo -- maybe by analyzing our users' code, if possible.
Here is roughly the API surface we would need to cover (feel free to add if I missed something):
- For the enum "container":
  - The `__members__` attribute is a mapping from name to enum values
  - `__contains__` works for base values, i.e. `5 in container`
  - `__getitem__` works for names, i.e. `Container["ENUM_VALUE"]`
  - `__iter__` works to iterate over all enum values
  - `__len__` returns the number of enumeration values
- For the enum "values":
  - A helpful `__repr__`, e.g. `<Container.VALUE_ONE: 0>`
  - Inherits from `int` so numeric operations work, most importantly to bitwise-or flags, e.g. `BIT_FIELD_1 | BIT_FIELD_2`
  - `.value` gives the underlying int, `.name` gives the name
  - All of the sibling enumeration values are available as members on each value. This is an unusual thing to do, but it's possible some of our users rely on it, e.g. `y = Container.VALUE_ONE; assert y.VALUE_TWO == Container.VALUE_TWO`
  - `isinstance(value, Container) == True`. This is kind of a weird implementation detail, but again, something our users may rely on.
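To make the list above concrete, here is a minimal pure-Python sketch of one way to cover that surface with a metaclass and an `int` subclass. The names `FastEnumMeta` and `FastIntEnum` are hypothetical, and this is an illustration under those assumptions, not a proposed implementation (a real one would presumably be generated or written in Cython).

```python
from types import MappingProxyType


class FastEnumMeta(type):
    """Hypothetical metaclass supplying the container-side API."""

    def __new__(mcls, name, bases, namespace):
        # Treat public attributes with plain int values as enum members.
        raw = {k: v for k, v in namespace.items()
               if not k.startswith("_") and type(v) is int}
        attrs = {k: v for k, v in namespace.items() if k not in raw}
        cls = super().__new__(mcls, name, bases, attrs)
        cls._by_name = {}
        cls._by_value = {}
        for key, val in raw.items():
            member = int.__new__(cls, val)  # each value is an instance of the container
            member._name = key
            cls._by_name[key] = member
            cls._by_value.setdefault(val, member)  # first name wins for aliases
            setattr(cls, key, member)  # class attribute, so siblings resolve from any value
        return cls

    def __call__(cls, value):
        # Not in the list above: return the existing member for a value
        # instead of constructing a new, nameless int.
        try:
            return cls._by_value[value]
        except KeyError:
            raise ValueError(f"{value!r} is not a valid {cls.__name__}") from None

    @property
    def __members__(cls):
        return MappingProxyType(cls._by_name)

    def __contains__(cls, value):
        return value in cls._by_value

    def __getitem__(cls, name):
        return cls._by_name[name]

    def __iter__(cls):
        return iter(cls._by_name.values())

    def __len__(cls):
        return len(cls._by_name)


class FastIntEnum(int, metaclass=FastEnumMeta):
    """Hypothetical base class supplying the value-side API."""

    @property
    def name(self):
        return self._name

    @property
    def value(self):
        return int(self)  # plain int, not the subclass

    def __repr__(self):
        return f"<{type(self).__name__}.{self._name}: {int(self)}>"


class Container(FastIntEnum):
    VALUE_ONE = 0
    VALUE_TWO = 1


class Flags(FastIntEnum):
    BIT_FIELD_1 = 1
    BIT_FIELD_2 = 2


y = Container.VALUE_ONE
assert y.VALUE_TWO == Container.VALUE_TWO      # siblings reachable from a value
assert isinstance(y, Container)
assert Flags.BIT_FIELD_1 | Flags.BIT_FIELD_2 == 3  # result is a plain int
```

In this sketch the sibling access and the `isinstance` behaviour fall out of storing the members as class attributes that are instances of the class itself, so neither needs special handling.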
Though the generated code carefully adds comments to each of the values, these comments don't make it into the extension at all. We have the opportunity to improve that here and make `help(value)` return a proper docstring, probably at no performance cost (other than binary size).
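One way the per-value documentation could be carried is as an instance-level `__doc__` on each value. The sketch below assumes a hypothetical `DocumentedInt` type and made-up names and doc text. It only shows that the attribute can be attached and read back. How `help()` renders an instance with its own `__doc__` depends on `pydoc` and is not verified here.

```python
class DocumentedInt(int):
    """Hypothetical enum value type that carries a per-value docstring."""

    def __new__(cls, value, name, doc):
        self = super().__new__(cls, value)
        self._name = name
        # Instance-level __doc__ shadows the class docstring for this value only.
        self.__doc__ = doc
        return self

    def __repr__(self):
        return f"<{type(self).__name__}.{self._name}: {int(self)}>"


# Made-up example value and comment text, standing in for a generated comment.
VALUE_ONE = DocumentedInt(0, "VALUE_ONE", "The first example value.")

assert VALUE_ONE.__doc__ == "The first example value."
assert DocumentedInt.__doc__ != VALUE_ONE.__doc__
assert VALUE_ONE + 1 == 1  # still behaves as an int
```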