Currently, when designing a schema, the schema author must choose whether to inline objects or reference them (a common mistake is to assume inlining, when referencing is the default for objects with identifiers).
This is very natural when working with tree-based serializations (JSON, YAML): the object representation is isomorphic to the serialization, and it is up to the client to dereference, e.g. to look up the object whose identifier is stored in `rel.other_person`.
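A minimal sketch of this client-side dereferencing, using plain dataclasses rather than generated LinkML classes. Only `rel.other_person` comes from the discussion; `Person`, `Relationship`, and `persons_by_id` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Relationship:
    type: str
    other_person: str  # holds an identifier, not a Person object


@dataclass
class Person:
    id: str
    name: str
    relationships: list = field(default_factory=list)


# The object tree mirrors what a JSON/YAML document would contain.
persons = [
    Person("P:1", "Alice", [Relationship("SIBLING_OF", "P:2")]),
    Person("P:2", "Bob"),
]

# The client builds its own lookup table and dereferences by hand.
persons_by_id = {p.id: p for p in persons}
rel = persons[0].relationships[0]
other = persons_by_id[rel.other_person]
print(other.name)
```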
(assuming that `rel.other_person` is not inlined, since inlining this would lead to multiple paths in the serialization)
This is less natural when coming from an RDF or graph background. Arguably, why should the client care about tree serializations? It's more natural to follow the reference directly and get back the object itself rather than its identifier.
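A rough sketch of what that graph-style access could look like, assuming a hypothetical `Proxy` wrapper over the same kind of tree objects. This is an illustration of the idea only, not the ObjectIndex API; in particular, the hard-coded `REFERENCE_FIELDS` set stands in for information a real implementation would take from the schema.

```python
from dataclasses import dataclass, field


@dataclass
class Relationship:
    type: str
    other_person: str  # identifier of a Person


@dataclass
class Person:
    id: str
    name: str
    relationships: list = field(default_factory=list)


class Proxy:
    """Hypothetical wrapper that resolves identifier-valued fields on access."""

    # (class name, field name) pairs that hold references; hard-coded here,
    # whereas a schema-aware implementation would derive them.
    REFERENCE_FIELDS = {("Relationship", "other_person")}

    def __init__(self, obj, index):
        self._obj = obj
        self._index = index

    def __getattr__(self, name):
        # Only called for attributes not found on the proxy itself.
        value = getattr(self._obj, name)
        if (type(self._obj).__name__, name) in self.REFERENCE_FIELDS:
            return Proxy(self._index[value], self._index)
        if isinstance(value, list):
            return [self._wrap(v) for v in value]
        return self._wrap(value)

    def _wrap(self, value):
        if isinstance(value, (Person, Relationship)):
            return Proxy(value, self._index)
        return value


alice = Person("P:1", "Alice", [Relationship("SIBLING_OF", "P:2")])
bob = Person("P:2", "Bob")
index = {p.id: p for p in (alice, bob)}

rel = Proxy(alice, index).relationships[0]
print(rel.other_person.name)  # no manual lookup by the client
```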
In fact there is support for this mode, via ObjectIndex, but we don't advertise it widely:
https://github.com/linkml/linkml-runtime/blob/main/linkml_runtime/index/object_index.py
It is mostly used internally. For example, the LinkML expression language allows the more declarative form.
If we were to support ObjectIndex widely, we would need to be careful. It would present a different interface from the pydantic/dataclasses objects, and we are trying to consolidate on pydantic as the one true way. Alternatively, we could try to weave the ObjectIndex approach into the pydantic classes, but I think this gets complex quickly. Pydantic "wants" to be isomorphic to the tree-based serializations.
With ObjectIndex, there is the intriguing possibility of having backing stores other than in-memory ones, for example allowing the client to traverse a small subset of a billion-node graph. But this belongs outside a minimal core, and we quickly run into many of the cache-management issues that have plagued ORM systems. It's worth noting, though, that `gen-sqla` already produces objects that follow this approach.
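To make the backing-store idea concrete, here is a small sketch under stated assumptions: `LazyStore`, its `fetch` callback, and the `REMOTE` dictionary are all hypothetical, with `REMOTE` standing in for a large external graph. Nothing here reflects how ObjectIndex or gen-sqla are actually implemented.

```python
from typing import Callable, Dict


class LazyStore:
    """Hypothetical identifier-to-record store that loads records on first use."""

    def __init__(self, fetch: Callable[[str], dict]):
        self._fetch = fetch  # e.g. a database or HTTP lookup
        self._cache: Dict[str, dict] = {}
        self.fetch_count = 0

    def __getitem__(self, identifier: str) -> dict:
        if identifier not in self._cache:
            self.fetch_count += 1
            self._cache[identifier] = self._fetch(identifier)
        return self._cache[identifier]


# Stand-in for a large external graph; only requested nodes are loaded.
REMOTE = {
    "P:1": {"name": "Alice", "sibling": "P:2"},
    "P:2": {"name": "Bob", "sibling": "P:1"},
    "P:3": {"name": "Carol", "sibling": None},
}

store = LazyStore(REMOTE.__getitem__)
alice = store["P:1"]
bob = store[alice["sibling"]]
store["P:1"]  # repeated access is served from the cache
print(bob["name"], store.fetch_count)
```

Note that this cache is never invalidated, so a change in the underlying store would go unnoticed. That is exactly the kind of cache-management question ORM systems have to answer.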