-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
avsc_to_pydantic issue with nullable nested schemas #112
Comments
I am open to make a Pull request for this as long as I have your opinion and guidelines :) |
After some more investigation I think the problem can be more generalized to repetition of field names and class names. {
"type": "record",
"name": "possesions",
"namespace": "my_namespace",
"fields": [
{
"name": "car",
"type": [
"null",
{
"type": "record",
"name": "car",
"fields": [
{
"name": "id",
"type": [
"null",
{
"type": "record",
"name": "id",
"fields": [
{
"name": "vin",
"type": ["null", "string"],
"default": null
},
{
"name": "vrn",
"type": ["null", "string"],
"default": null
}
]
}
],
"default": null
}
]
}
],
"default": null
},
{
"name": "house",
"type": [
"null",
{
"type": "record",
"name": "house",
"fields": [
{
"name": "id",
"type": [
"null",
{
"type": "record",
"name": "id",
"fields": [
{
"name": "land_and_mortgage_register_number",
"type": ["null", "string"],
"default": null
}
]
}
],
"default": null
}
]
}
],
"default": null
}
]
} My idea to fix this would be to change class naming scheme to contain parent names as well. class possesions_car_id(BaseModel):
vin: Optional[str] = None
vrn: Optional[str] = None
class possesions_car(BaseModel):
id: Optional[possesions_car_id] = None
class possesions_house_id(BaseModel):
land_and_mortgage_register_number: Optional[str] = None
class possesions_house(BaseModel):
id: Optional[possesions_house_id] = None
class possesions(BaseModel):
car: Optional[possesions_car] = None
house: Optional[possesions_house] = None while currently the 2nd section named "id" overwrites the 1st one: class id(BaseModel):
land_and_mortgage_register_number: Optional[str] = None
class car(BaseModel):
id: Optional[id] = None
class house(BaseModel):
id: Optional[id] = None
class possesions(BaseModel):
car: Optional[car] = None
house: Optional[house] = None |
Hey @myjniak, Thanks for opening this issue and the clear description! Models currently have to have a unique name, because they are referred to by their name. This is necessary as the models are exported to a single file. In addition, as you mentioned, Pydantic doesn't support having two different submodels with the same name in a single model. So what to do about it? I agree, we shouldn't just ignore the bug. I don't see how aliasing would solve the issue. So a fix indeed could be to rename non-unique model names. Problem is that we don't know upfront which names collide. In addition, current implementation cycles through fields from top level to bottom and fields don't store information about higher level fields. I got it locally to work with just comparing current model with existing model in the class registry and adding a hash to class name based on its children if it's different. As I see it, To add parent info to the model name would require a preprocess iteration over the Avro schema. In this iterations the colliding classes are identified along with their parents. I welcome a PR to implement this. I can provide my WIP if you'd like. Before this is finished, I think pydantic-avro should probably raise an error on name collision. |
When encountering nested schemas, (type "record"), avsc_to_pydantic gives pydantic class attributes and the class name for that attribute exactly the same name.
According to this comment, pydantic might not support that:
pydantic/pydantic#7871 (comment)
you can't name a field to have the same name as a model
Here's a code to reproduce this:
Then we get this in return:
which will result in an exception, when instantiated:
The submodel class names should probably be transparent for the avsc_to_pydantic user, so my proposition to solve this problem would be to add a "Model" postfix to pydantic submodel class names:
If the class names are important for some reason though, then maybe field aliasing could help? Field(alias="something")
Let me know what do you think :)
The text was updated successfully, but these errors were encountered: