Should automatic dereferencing of non-inlined objects be supported? #2176

cmungall · 2024-06-26T14:44:44Z

cmungall
Jun 26, 2024
Maintainer

Currently, when designing a schema, the schema author must choose when to inline objects and when to reference them. A common mistake is to assume inlining when, for objects with identifiers, referencing is the default.

This is very natural when working with tree-based serializations (JSON, YAML). The object representation is isomorphic to the serialization. It is up to the client to dereference, e.g.:

for person in container.persons.values():
    for rel in person.relationships:
        # rel.other_person holds an identifier, so the client looks it up explicitly
        p2 = container.persons[rel.other_person]
        print(person.name, rel.type, p2.name)
        ...

(assuming that rel.other_person is not inlined, since inlining it would lead to multiple paths in the serialization)

This is less natural when coming from an RDF or graph background. Arguably, why should the client care about tree serializations? It's more natural to do this:

for person in container.persons.values():
    for rel in person.relationships:
        print(person.name, rel.type, rel.other_person.name)
        ...
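To make the idea concrete, here is a minimal sketch of how that second loop could work on top of plain objects, using a thin proxy that resolves references on attribute access. This is not how ObjectIndex is implemented. The `Proxy` class, the `REFERENCE_SLOTS` table, and the toy `Person`/`Relationship`/`Container` classes are all hypothetical names made up for illustration; a real version would presumably read the reference information from the schema.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Relationship:
    type: str
    other_person: str  # stored as an identifier, not an inlined object


@dataclass
class Person:
    id: str
    name: str
    relationships: List[Relationship] = field(default_factory=list)


@dataclass
class Container:
    persons: Dict[str, Person] = field(default_factory=dict)


# Hypothetical lookup table: (class name, attribute) -> container slot that
# holds the referenced objects. Hard-coded here; assumed to come from the schema.
REFERENCE_SLOTS = {("Relationship", "other_person"): "persons"}


class Proxy:
    """Wraps an object and resolves reference-valued attributes on access."""

    def __init__(self, obj: Any, root: Container):
        self._obj = obj
        self._root = root

    def __getattr__(self, name: str) -> Any:
        # Only called for attributes not found on the proxy itself.
        value = getattr(self._obj, name)
        target = REFERENCE_SLOTS.get((type(self._obj).__name__, name))
        if target is not None:
            # Swap the identifier for the object it points to.
            value = getattr(self._root, target)[value]
        return self._wrap(value)

    def _wrap(self, value: Any) -> Any:
        # Wrap nested objects so that dereferencing keeps working further down.
        if isinstance(value, list):
            return [self._wrap(v) for v in value]
        if isinstance(value, dict):
            return {k: self._wrap(v) for k, v in value.items()}
        if hasattr(value, "__dataclass_fields__"):
            return Proxy(value, self._root)
        return value


container = Container(persons={
    "P1": Person("P1", "Alice", [Relationship("sibling_of", "P2")]),
    "P2": Person("P2", "Bob"),
})
root = Proxy(container, container)
for person in root.persons.values():
    for rel in person.relationships:
        print(person.name, rel.type, rel.other_person.name)
```

The underlying objects stay isomorphic to the tree serialization; only the proxy layer behaves like a graph. The cost is that clients see `Proxy` instances rather than the generated classes, which is exactly the "different interface" concern.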

In fact there is support for this mode, via ObjectIndex, but we don't advertise it widely:

https://github.com/linkml/linkml-runtime/blob/main/linkml_runtime/index/object_index.py

It is mostly used internally. For example, the LinkML expression language allows the more declarative form.

If we were to widely support ObjectIndex, we would need to be careful. This would present a different interface to the Pydantic/dataclasses objects, and we are trying to consolidate on Pydantic as the one true way. Alternatively we could try to weave the ObjectIndex approach into the Pydantic classes, but I think this gets complex quickly. Pydantic "wants" to be isomorphic to the tree-based serializations.

With ObjectIndex, there is the intriguing possibility of having different backing stores rather than in-memory. For example, allowing the client to traverse over a small subset of a billion-node graph. This belongs outside a minimal core, though, and we quickly get into a lot of the issues that have plagued ORM systems, such as managing caches. It's worth noting that gen-sqla already produces objects that follow this index.
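As a rough illustration of the backing-store idea, the sketch below resolves references lazily against a pluggable store, so only the nodes that are actually traversed get fetched. Everything here is hypothetical: `Store`, `DictStore`, `LazyNode` and `REFERENCE_FIELDS` are invented for the example and are not part of linkml-runtime, and a dict stands in for what would really be a database or remote graph.

```python
from typing import Any, Dict, Protocol


class Store(Protocol):
    """Hypothetical lookup interface for a backing store."""

    def get(self, identifier: str) -> Dict[str, Any]: ...


class DictStore:
    """In-memory stand-in for a database; counts fetches for demonstration."""

    def __init__(self, rows: Dict[str, Dict[str, Any]]):
        self.rows = rows
        self.fetches = 0

    def get(self, identifier: str) -> Dict[str, Any]:
        self.fetches += 1
        return self.rows[identifier]


# Hypothetical: field names whose values are identifiers of other records.
REFERENCE_FIELDS = {"friend"}


class LazyNode:
    """Fetches its record from the store on first attribute access."""

    def __init__(self, identifier: str, store: Store, cache: Dict[str, Dict[str, Any]]):
        self._identifier = identifier
        self._store = store
        self._cache = cache

    def _record(self) -> Dict[str, Any]:
        # Unbounded cache: fine for a demo, a real system needs eviction.
        if self._identifier not in self._cache:
            self._cache[self._identifier] = self._store.get(self._identifier)
        return self._cache[self._identifier]

    def __getattr__(self, name: str) -> Any:
        record = self._record()
        if name not in record:
            raise AttributeError(name)
        value = record[name]
        if name in REFERENCE_FIELDS:
            # Return another lazy node; nothing is fetched until it is used.
            return LazyNode(value, self._store, self._cache)
        return value


store = DictStore({
    "P1": {"name": "Alice", "friend": "P2"},
    "P2": {"name": "Bob", "friend": "P1"},
})
cache: Dict[str, Dict[str, Any]] = {}
alice = LazyNode("P1", store, cache)
print(alice.friend.name)         # fetches P1, then P2
print(alice.friend.friend.name)  # served entirely from the cache
print(store.fetches)
```

Even this toy version already has to decide when the cache is filled and never decides when it is invalidated, which is a small taste of the ORM-style problems mentioned above.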
