Data Scoping

Popax21 edited this page Feb 8, 2023 · 10 revisions

One of the biggest design challenges that Procedurline's (re-)design had to tackle was caching. More precisely, this refers to the following two aspects:

  • cache keying: when cached data can be utilized, and when the expensive cached operation has to be executed
  • cache invalidation: when cached data should be discarded from the cache, e.g. because the original data might have changed

Procedurline tackles both of these challenges using its data scoping system, which consists of data scopes (represented by the DataScope class in code) and data scope keys (DataScopeKey). A data scope is a named or anonymous abstract collection of "things" that something can belong to. For example, there is a scope all player-related things belong to ($PLAYER), one all sprites belong to ($SPRITE), and a global scope everything belongs to ($GLOBAL). Each "thing" belongs to some subset of all scopes, which is represented by those scopes being "registered" on a data scope key instance.

Scope keys can be in one of two states: valid or invalid. A scope key is invalidated either directly (using DataScopeKey.Invalidate()), or automatically when one of the data scopes it is registered on gets invalidated (using DataScope.Invalidate()). Data scopes, while they can be invalidated, do not have a valid/invalid state themselves. Invalidating a scope only causes its currently registered keys to become invalidated; keys which are registered after the invalidation are not considered invalid.
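The invalidation rules above can be sketched with a small toy model. This is written in Python purely for illustration and is not Procedurline's C# implementation; the class and method names only mirror the ones mentioned in the text, and everything else (fields, storage) is an assumption.

```python
class DataScope:
    """Toy stand-in for a data scope; it has no valid/invalid state itself."""

    def __init__(self, name=None):
        self.name = name          # None models an anonymous scope
        self.keys = []            # keys currently registered on this scope

    def register_key(self, key):
        self.keys.append(key)
        key.scopes.append(self)

    def invalidate(self):
        # Only keys registered at this moment are affected.
        for key in list(self.keys):
            key.invalidate()


class DataScopeKey:
    """Toy stand-in for a scope key with a valid/invalid state."""

    def __init__(self):
        self.scopes = []
        self.is_valid = True

    def invalidate(self):
        self.is_valid = False


player = DataScope("$PLAYER")

old_key = DataScopeKey()
player.register_key(old_key)

player.invalidate()            # invalidates old_key through the scope

new_key = DataScopeKey()
player.register_key(new_key)   # registered after the invalidation

print(old_key.is_valid)  # False
print(new_key.is_valid)  # True
```

The scope stores no "invalid" flag of its own, which is why `new_key` starts out valid even though it was registered on a scope that has been invalidated before.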

TLDR

  • Scope keys represent a collection of data scopes they are registered on - you can register new keys on a scope using DataScope.RegisterKey
  • Scope keys act like cache keys - if an object has a scope key which is equivalent to the one of a cached object, its cached data will be utilized
  • Data scopes can be "transparent" - transparent scopes are not considered when checking for scope key equivalence
  • You can invalidate both data scopes and scope keys - invalidating a data scope will invalidate all keys which are registered on it
  • "Registrar invalidation" is a special kind of invalidation which does not discard cached data while notifying scope key users that they should re-register their scope key's scopes - this can be used to "switch" the scopes of an already existing "thing" without affecting the cache
  • Usually, you would create a scope, register on it all target objects whose data you will modify, and then invalidate it once you change the way you process that data

Primitive interfaces

  • IScopedObject: represents an object which belongs to a certain set of data scopes. Not all "things" as referred to above have to implement this interface. It is mostly used to express that the object is itself aware of which scopes it belongs to, and is capable of registering them on a given scope key. DataScopeKey implements this interface.
  • IScopeRegistrar<T>: represents an external third party capable of assigning objects (also known as "targets") a set of scopes they belong to. Registrars are the more commonly used method to assign and register an object's scopes. Procedurline provides the ScopedObjectRegistrar<T> singleton, a naive scope registrar which proxies to IScopedObject if the target object implements that interface.
  • IScopedInvalidatable: represents something capable of being invalidated. This interface is intended to be implemented by SOURCES of invalidation, not to provide the ability to manually override a normal object's invalidation mechanism! Doing so WILL cause undefined behavior, as e.g. caches WILL NOT detect the invalidation, causing them to desync! As such, only DataScope and DataScopeKey implement this interface in Procedurline's code.
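The relationship between the first two interfaces can be illustrated with a minimal sketch of a proxying registrar. It is a Python toy model, not Procedurline's C# code: `ScopedObject`, `Key`, `Player` and `Rock` are hypothetical names, and scopes are represented by plain strings to keep the example short.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class ScopedObject(Protocol):
    """Models IScopedObject: the object knows its own scopes."""

    def register_scopes(self, key) -> None: ...


class Key:
    """Hypothetical scope key; scopes are just names in this sketch."""

    def __init__(self):
        self.scopes = set()


class ScopedObjectRegistrar:
    """Models a naive registrar that proxies to the target if it can."""

    def register_scopes(self, target, key):
        if isinstance(target, ScopedObject):
            target.register_scopes(key)
        # Targets unaware of their scopes get nothing registered here.


class Player:
    def register_scopes(self, key):
        key.scopes.update({"$GLOBAL", "$PLAYER"})


class Rock:
    """Does not know which scopes it belongs to."""


registrar = ScopedObjectRegistrar()
player_key, rock_key = Key(), Key()
registrar.register_scopes(Player(), player_key)
registrar.register_scopes(Rock(), rock_key)

print(sorted(player_key.scopes))  # ['$GLOBAL', '$PLAYER']
print(len(rock_key.scopes))       # 0
```

A non-naive registrar would instead decide the scopes of a target from the outside, which is what makes registrars usable for objects that cannot implement the interface themselves.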

Data Scopes

DataScope instances encapsulate a scope DataScopeKey instances can be registered on. They keep track of a collection of all scope keys which are registered on them (valid or invalid), and provide mechanisms to invalidate them. You can register a key on them using DataScope.RegisterKey. For debugging purposes, scopes can be named, but this isn't required - anonymous scopes (created by passing null as their name) are allowed as well.

Data scopes can be made transparent by setting DataScope.Transparent. Transparent data scopes still allow for key registration and invalidation as usual, and are present in a scope key's set of scopes. However, they are not taken into consideration when comparing keys for equality, and they do not show up in any debug output. As such, when utilizing them, special care has to be taken that their presence does not change the way data is processed, as they are not taken into account when caching data.
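How transparency affects key comparison can be sketched as follows. This is a Python toy model under simplifying assumptions, not Procedurline's implementation: validity is left out of the comparison, and the scope name "debug-overlay" is made up for the example.

```python
class DataScope:
    def __init__(self, name=None, transparent=False):
        self.name = name
        self.transparent = transparent


class DataScopeKey:
    def __init__(self, *scopes):
        self.scopes = set(scopes)   # includes transparent scopes

    def _compared_scopes(self):
        # Transparent scopes are skipped for equality and hashing.
        return frozenset(s for s in self.scopes if not s.transparent)

    def __eq__(self, other):
        return (isinstance(other, DataScopeKey)
                and self._compared_scopes() == other._compared_scopes())

    def __hash__(self):
        return hash(self._compared_scopes())


sprite = DataScope("$SPRITE")
overlay = DataScope("debug-overlay", transparent=True)

plain = DataScopeKey(sprite)
with_overlay = DataScopeKey(sprite, overlay)

print(plain == with_overlay)            # True
print(overlay in with_overlay.scopes)   # True
```

Because both keys compare equal, data cached under one would be reused for the other, which is why a transparent scope must not influence how the data is processed.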

DataScope implements IDisposable, and should be disposed once no longer in use. Doing so will prevent further key registration, and invalidate all currently registered keys.

Data Scope Keys

DataScopeKey instances encapsulate a collection of unique data scopes they are registered on. Two keys are considered equivalent if they are registered on the exact same set of non-transparent scopes. On a more implementation-focused level, scope keys also implement helper methods to e.g. copy a scope key's state onto another instance (DataScopeKey.Copy()), clone a scope key to create a new instance (DataScopeKey.Clone()), reset a scope key, returning it to a clean state (DataScopeKey.Reset()), and register all scopes it belongs to on another key (DataScopeKey.RegisterScopes(), provided by IScopedObject).

Scope keys are used to solve both the issue of cache keying (any data is cached based on the set of non-transparent scopes it belongs to) and the issue of cache invalidation (a cache entry is considered to be out of date once its scope key becomes invalidated). This is implemented by DataScopeKey instances implementing the hash code and equality comparison functions, and by them providing events for when invalidation occurs. Two DataScopeKey instances are considered equivalent when they have the same validity state and belong to the exact same set of non-transparent scopes.
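The interplay of keying and invalidation can be sketched with a toy cache. This is a Python illustration under stated assumptions, not Procedurline's C# code: `ScopedCache`, `on_invalidate` and `make_sprite_key` are hypothetical names, transparency is omitted, and the cache keeps its own key registered on the same scopes so it can react to invalidation.

```python
class DataScope:
    def __init__(self, name=None):
        self.name = name
        self.keys = []

    def register_key(self, key):
        self.keys.append(key)
        key.scopes.add(self)

    def invalidate(self):
        for key in list(self.keys):
            key.invalidate()


class DataScopeKey:
    def __init__(self):
        self.scopes = set()
        self.is_valid = True
        self.on_invalidate = []     # models the invalidation event

    def invalidate(self):
        if self.is_valid:
            self.is_valid = False
            for callback in list(self.on_invalidate):
                callback(self)

    def __eq__(self, other):
        return (isinstance(other, DataScopeKey)
                and self.is_valid == other.is_valid
                and self.scopes == other.scopes)

    def __hash__(self):
        return hash(frozenset(self.scopes))


class ScopedCache:
    def __init__(self, compute):
        self.compute = compute
        self.entries = {}
        self.misses = 0

    def get(self, key, target):
        if key in self.entries:             # cache keying
            return self.entries[key]
        self.misses += 1
        own_key = DataScopeKey()            # the cache's private key
        for scope in key.scopes:
            scope.register_key(own_key)
        # Cache invalidation: drop the entry once its key is invalidated.
        own_key.on_invalidate.append(lambda k: self.entries.pop(k, None))
        self.entries[own_key] = value = self.compute(target)
        return value


sprite = DataScope("$SPRITE")

def make_sprite_key():
    key = DataScopeKey()
    sprite.register_key(key)
    return key

cache = ScopedCache(lambda target: target.upper())
first, second = make_sprite_key(), make_sprite_key()
cache.get(first, "madeline")     # miss: the expensive operation runs
cache.get(second, "madeline")    # hit: equivalent key, same scopes
print(cache.misses)              # 1

sprite.invalidate()              # evicts the entry
cache.get(make_sprite_key(), "madeline")
print(cache.misses)              # 2
```

Note that the keys in this sketch must have all their scopes registered before they are used for a lookup, since the hash is derived from the registered scopes.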

Sometimes, one wishes to attach additional objects to the lifespan of a scope key, so that they are disposed once the key loses its validity. For these use cases, DataScopeKey provides the .TakeOwnership() method, which allows it to take ownership of an object implementing IDisposable. When the key then becomes invalid, or is reset while still being valid, all objects owned by the key are disposed. A key can also take ownership of itself, which will cause it to automatically be disposed once it loses its validity. Note that key resets will not cause the key to dispose itself; instead, it will simply no longer own itself after the reset.
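The ownership rules described above can be modeled roughly like this. It is a Python toy model, not the actual C# implementation: `Resource` is a hypothetical stand-in for any IDisposable object, and `dispose()` here only sets a flag instead of also resetting the key.

```python
class Resource:
    """Hypothetical stand-in for an object implementing IDisposable."""

    def __init__(self, name):
        self.name = name
        self.disposed = False

    def dispose(self):
        self.disposed = True


class DataScopeKey:
    def __init__(self):
        self.is_valid = True
        self.disposed = False
        self.owned = []

    def take_ownership(self, obj):
        self.owned.append(obj)

    def _release_owned(self, dispose_self):
        owned, self.owned = self.owned, []   # the key owns nothing afterwards
        for obj in owned:
            if obj is self:
                if dispose_self:
                    self.dispose()
            else:
                obj.dispose()

    def invalidate(self):
        if self.is_valid:
            self.is_valid = False
            self._release_owned(dispose_self=True)

    def reset(self):
        # A reset of a valid key disposes owned objects, but never the key.
        if self.is_valid:
            self._release_owned(dispose_self=False)
        self.is_valid = True

    def dispose(self):
        self.disposed = True


key = DataScopeKey()
texture = Resource("texture")
key.take_ownership(texture)
key.take_ownership(key)          # the key owns itself
key.reset()
print(texture.disposed, key.disposed)   # True False

buffer = Resource("buffer")
key.take_ownership(buffer)
key.take_ownership(key)
key.invalidate()
print(buffer.disposed, key.disposed)    # True True
```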

DataScopeKey implements IDisposable, and should be disposed once no longer in use. Doing so will reset the key (and as such remove it from all scopes it is registered on), and prevent further utilization of the instance.

Registrar Invalidation

Registrar invalidation is a mechanism provided by Procedurline to handle cases where one wishes to notify users of a change in the set of scopes their object / target is assigned, but without invalidating the data which is already cached under the old scope key. Both DataScope and DataScopeKey implement .InvalidateRegistrars(), which by default acts the same as a regular invalidation, in the sense that all affected scope keys will become invalidated. However, when a scope key instance is constructed with InvalidateOnRegistrarInvalidation set to false, registrar invalidation will do nothing other than emit the corresponding event. Caching-aware code can then take this event as a notification to re-register its scope key's scopes, as those might have changed, without discarding any data cached under the old set of scopes.

This mechanism can be used to e.g. make a certain target have multiple states it can be in, and to efficiently switch between them. This can be achieved by assigning each state a different data scope, and only assigning the target the scope corresponding to the state it should be currently in. When switching between states, the old state's scope's .InvalidateRegistrars() function is called, which causes the target's scopes to become reassigned (and as such the new state's scope to become registered instead), but also leaves the old state's data in the cache. Something similar to this is implemented by the DataScopeMultiplexer<T> helper class.
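The state-switching pattern can be sketched as follows. This is a Python toy model and not how DataScopeMultiplexer<T> is actually implemented; the state names, the `switch_state` helper, the callback list, and the plain dictionary used as a cache are all assumptions made for the example.

```python
class DataScope:
    def __init__(self, name):
        self.name = name
        self.keys = []

    def register_key(self, key):
        self.keys.append(key)
        key.scopes.add(self)

    def invalidate_registrars(self):
        for key in list(self.keys):
            key.invalidate_registrars()


class DataScopeKey:
    def __init__(self, invalidate_on_registrar_invalidation=True):
        self.scopes = set()
        self.is_valid = True
        self.invalidate_on_registrar_invalidation = invalidate_on_registrar_invalidation
        self.on_registrar_invalidate = []   # models the corresponding event

    def reset(self):
        for scope in self.scopes:
            scope.keys.remove(self)
        self.scopes.clear()
        self.is_valid = True

    def invalidate_registrars(self):
        if self.invalidate_on_registrar_invalidation:
            self.is_valid = False
        for callback in list(self.on_registrar_invalidate):
            callback(self)


state_scopes = {"idle": DataScope("idle"), "dash": DataScope("dash")}
target = {"state": "idle"}
cache = {}          # keyed by the names of the key's scopes
computed = []       # records every expensive computation

def register_scopes(key):
    # Acts as the registrar: assign only the current state's scope.
    key.reset()
    state_scopes[target["state"]].register_key(key)

def get_data(key):
    cache_key = frozenset(scope.name for scope in key.scopes)
    if cache_key not in cache:
        computed.append(target["state"])
        cache[cache_key] = "processed:" + target["state"]
    return cache[cache_key]

def switch_state(new_state):
    old_scope = state_scopes[target["state"]]
    target["state"] = new_state
    old_scope.invalidate_registrars()   # triggers re-registration only

key = DataScopeKey(invalidate_on_registrar_invalidation=False)
key.on_registrar_invalidate.append(register_scopes)
register_scopes(key)

get_data(key)            # computes the "idle" data
switch_state("dash")
get_data(key)            # computes the "dash" data
switch_state("idle")
print(get_data(key))     # processed:idle (served from the cache)
print(computed)          # ['idle', 'dash']
```

Switching back to a previously used state costs no recomputation, because the old state's entry was never evicted and the key itself stayed valid throughout.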