Add UnionChecker to Optimize `Union` or `|` Type Deserialization for Common Python Types (e.g., `int`, `Enum`, `Literal`) #166
Labels: Great To Have! (Important / Worth Addressing), help wanted (Extra attention is needed), performance, self-created (Opened by me!)
Adding my thoughts here in the form of pseudo-code. I think one approach can be to create a class `UnionChecker` or similar in `loaders.py` or a new module. It would be modeled similar to `LoadMixIn` and have a method for each common type that would be in a `Union` or `|` in a type annotation, such as `int`, `float`, `enum`, `datetime`, and so on.

Again, pseudo-code of how that could look:
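A minimal sketch of such a class, assuming one static `check_*` method per common type plus a small dispatcher. The method names (`check_int`, `check_float`, `check_datetime`, `check_enum`, `can_load`) are hypothetical and are not part of the library's existing API; the checks are deliberately conservative (for example, `"1_000"` is rejected even though `int()` would accept it).

```python
from datetime import datetime
from enum import Enum
from typing import Any


class UnionChecker:
    """Illustrative only: cheap 'can this value load as type X?' checks."""

    @staticmethod
    def check_int(o: Any) -> bool:
        if isinstance(o, bool):  # bool subclasses int; handle it separately
            return False
        if isinstance(o, int):
            return True
        if isinstance(o, str):
            # Allow one optional leading sign, then ASCII digits only.
            body = o[1:] if o[:1] in ('+', '-') else o
            return body.isascii() and body.isdigit()
        return False

    @staticmethod
    def check_float(o: Any) -> bool:
        if isinstance(o, bool):
            return False
        if isinstance(o, (int, float)):
            return True
        if isinstance(o, str):
            # For float, attempting the conversion is the simplest check.
            try:
                float(o)
            except ValueError:
                return False
            return True
        return False

    @staticmethod
    def check_datetime(o: Any) -> bool:
        if isinstance(o, datetime):
            return True
        if isinstance(o, str):
            try:
                datetime.fromisoformat(o)
            except ValueError:
                return False
            return True
        return False

    @staticmethod
    def check_enum(o: Any, enum_cls) -> bool:
        # Matches an existing member or any member's value.
        return isinstance(o, enum_cls) or any(o == m.value for m in enum_cls)

    @classmethod
    def can_load(cls, o: Any, tp: Any) -> bool:
        """Dispatch to the check for `tp`, falling back to isinstance()."""
        if isinstance(tp, type) and issubclass(tp, Enum):
            return cls.check_enum(o, tp)
        check = getattr(cls, f"check_{getattr(tp, '__name__', '')}", None)
        if check is not None:
            return check(o)
        return isinstance(tp, type) and isinstance(o, tp)
```

Dispatching by type name keeps the sketch short; a real implementation would more likely register checks explicitly per type.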
Then, in `parsers.py`, where we call `Parser.__contains__` in `UnionParser.__call__`, we can replace that check with something like:
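As a rough, self-contained illustration of that replacement: each member of the `Union` is paired with a validity check and a loader, and the first member whose check passes is used to load the value. Everything here (`load_union`, `Candidate`, `is_int_like`, `is_color`, `Color`) is a hypothetical stand-in and does not reflect the actual `UnionParser` internals.

```python
from enum import Enum
from typing import Any, Callable, Sequence, Tuple

# Hypothetical pairing of (validity check, loader) for each Union member.
Candidate = Tuple[Callable[[Any], bool], Callable[[Any], Any]]


class Color(Enum):
    RED = 'red'


def is_int_like(o: Any) -> bool:
    """True for ints (not bools) and unsigned ASCII digit strings."""
    if isinstance(o, bool):
        return False
    if isinstance(o, int):
        return True
    return isinstance(o, str) and o.isascii() and o.isdigit()


def is_color(o: Any) -> bool:
    """True when the value matches one of Color's member values."""
    return any(o == m.value for m in Color)


def load_union(o: Any, candidates: Sequence[Candidate]) -> Any:
    """Load `o` with the first Union member whose check accepts it."""
    for can_load, load in candidates:
        if can_load(o):
            return load(o)
    raise TypeError(f'{o!r} does not match any type in the Union')


# Candidates for a field annotated as `int | Color`.
INT_OR_COLOR = ((is_int_like, int), (is_color, Color))
```

With these definitions, `load_union('42', INT_OR_COLOR)` takes the `int` branch and `load_union('red', INT_OR_COLOR)` takes the `Color` branch, without relying on exceptions to pick between them.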
This should strike a good balance on efficiency and also fix the parsing support for `Union`, which is currently not fully supported in this library and which I would like to address.

Some Benchmarks
Also, in case someone tells me I'm worrying about micro-optimizations (which is possible), I made a small benchmark comparing validation against blind parsing for common types like `Enum` and `int`.
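A comparison of this kind could be set up roughly as follows with `timeit`. This is a sketch under stated assumptions, not the benchmark that produced the results discussed here: the `Color` enum, the helper names, and the non-matching inputs `'blue'` and `'abc'` are all made up for illustration, and timings will vary by machine and Python version.

```python
import timeit
from enum import Enum


class Color(Enum):
    RED = 'red'
    GREEN = 'green'


# Precomputed set of valid values, so validation is a single hash lookup.
COLOR_VALUES = frozenset(m.value for m in Color)


def blind_enum(o):
    """Blind parsing: call the constructor and catch the failure."""
    try:
        return Color(o)
    except ValueError:
        return None


def validated_enum(o):
    """Validate first, then construct only when the value is known."""
    return Color(o) if o in COLOR_VALUES else None


def blind_int(o):
    try:
        return int(o)
    except (ValueError, TypeError):
        return None


def validated_int(o):
    if isinstance(o, str):
        body = o[1:] if o[:1] in ('+', '-') else o
        return int(o) if body.isascii() and body.isdigit() else None
    return None


if __name__ == '__main__':
    # Time the failure path, where blind parsing pays for an exception.
    cases = [
        ('enum blind', lambda: blind_enum('blue')),
        ('enum validated', lambda: validated_enum('blue')),
        ('int blind', lambda: blind_int('abc')),
        ('int validated', lambda: validated_int('abc')),
    ]
    for name, fn in cases:
        print(f'{name:>15}: {timeit.timeit(fn, number=100_000):.4f}s')
```

The failure path is the interesting one for a `Union`, since every member type that does not match the value is tried and rejected before the right one is found.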
Results show that validating values for types like `Enum` and `int` is the better approach overall. There are some types, like `float`, for which it might be better to simply try parsing with the `float` constructor, but overall, validating the value/type first seems like a solid approach.

Hope this helps. I currently don't have time to implement this myself, which is why I'm jotting my thoughts down in case anyone wants to take a stab at it. I'm also going to update the issue with the "help wanted" label. For now, I'm going to focus on addressing other issues with the Dataclass Wizard library, but I'll keep tabs on this issue and will hopefully be back to look at it with fresh eyes sometime soon.
Originally posted by @rnag in #67 (comment)