In many contexts, it is useful to track changes in a database over time. For some Django applications, the excellent version history of reversion or the change tracking of auditlog is a perfect fit. For other applications, however, these plugins have a drawback: they serialize the data and move it away from the base table (using Type 1 slowly changing dimensions). Django PITA solves this problem for cases where previous versions are fundamental to the meaning of a table's data and should be queryable just as easily as current versions. In those cases, Type 2 is a better method of version tracking, because the data stays within the original table in a row marked as replaced.
Django PITA provides an abstraction layer over Point-in-Time Architecture, allowing you to use your models just like regular ones while querying past states and versions as easily as with any Django query.
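To make the Type 2 idea concrete, here is a minimal in-memory sketch in plain Python. It is not Django PITA's implementation: the `rows` list, the `save` and `as_of` helpers, and the `replaced_at` field are hypothetical names chosen for illustration. It only shows how keeping replaced rows in the same table allows an "as of" query.

```python
from datetime import datetime, timedelta

# Hypothetical in-memory "table"; each dict is one version of a row.
rows = []

def save(row_id, header, now):
    """Mark the current version of row_id (if any) as replaced and append a new one."""
    for row in rows:
        if row["row_id"] == row_id and row["replaced_at"] is None:
            row["replaced_at"] = now  # the old row stays in the table, marked as replaced
    rows.append({"row_id": row_id, "header": header, "created_at": now, "replaced_at": None})

def as_of(moment):
    """Return the row versions that were current at the given moment."""
    return [
        row for row in rows
        if row["created_at"] <= moment
        and (row["replaced_at"] is None or moment < row["replaced_at"])
    ]

t0 = datetime(2024, 1, 1)
save(1, "draft title", t0)
save(1, "Final Title", t0 + timedelta(days=5))

print([r["header"] for r in as_of(t0 + timedelta(days=1))])   # ['draft title']
print([r["header"] for r in as_of(t0 + timedelta(days=10))])  # ['Final Title']
```

Because old and new versions live side by side, the past-state query is just another filter over the same table, which is the property the paragraphs above rely on.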
- PointInTimeModel that can be subclassed and used like any other Django model
- Query rows as they were at a particular past time
- Specify a time frame for an object to logically exist and query such "active" rows
- Rollback to a particular time for an object
- FrozenForeignKey for linking to a particular version of a PITA model object
- Easy integration into existing Django projects
You can install Django PITA using pip:
pip install django-pita
This has been tested on a PostgreSQL database and should work on MySQL, but SQLite is not recommended because it does not support the constraints used by PointInTimeModel.
from django.utils import timezone
from django.db.models import CharField, TextField
from pita.models import PointInTimeModel
class Article(PointInTimeModel):
    header = CharField(max_length=256)
    body = TextField()
...
# objects manager behaves like any other Django model
Article.objects.all()
# records manager provides access to past versions
# such as the state of the Article table as of 10 days ago
Article.records.version(version_at=timezone.now() - timezone.timedelta(days=10))
You can specify a time range that you want an Article to be considered active using the pre-defined start_at and end_at attributes of a PointInTimeModel. Note that start_at defaults to creation time and end_at defaults to None (meaning no end).
next_year = timezone.now() + timezone.timedelta(days=365)
a = Article.objects.create(end_at=next_year, header="2024 Annual Report")
b = Article.objects.create(start_at=next_year, header="2025 Annual Report")
# returns queryset that contains Article a and not b
Article.objects.active()
# returns queryset that contains b and not a
Article.objects.active(active_at=next_year)
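The example above suggests that the active window is treated as half-open: at `next_year`, `a` (whose `end_at` is `next_year`) is no longer active, while `b` (whose `start_at` is `next_year`) already is. A minimal plain-Python sketch of that rule follows. The `is_active` and `active` helpers are hypothetical and the half-open interval is an assumption inferred from the example, not a statement about Django PITA's internals.

```python
from datetime import datetime, timedelta

def is_active(start_at, end_at, active_at):
    """Assumed rule: start_at <= active_at < end_at, where end_at=None means no end."""
    return start_at <= active_at and (end_at is None or active_at < end_at)

def active(rows, active_at):
    """Return the headers of the rows that are active at the given time."""
    return [r["header"] for r in rows if is_active(r["start_at"], r["end_at"], active_at)]

now = datetime(2024, 6, 1)
next_year = now + timedelta(days=365)

a = {"header": "2024 Annual Report", "start_at": now, "end_at": next_year}
b = {"header": "2025 Annual Report", "start_at": next_year, "end_at": None}

print(active([a, b], now))        # ['2024 Annual Report']
print(active([a, b], next_year))  # ['2025 Annual Report']
```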
Sometimes, you may need to roll back changes or even purge an object (completely remove it from the database). Each of these actions gets its own permission that can be granted to administrators as needed (for API use; Django Admin is not yet supported). Purge can be useful when you need to remove an accidental historic row that is preventing the deletion of another model object (i.e., PROTECTED).
# undoes the last change to Article a
a.rollback_latest()
# returns b to its state 2 days ago
b.rollback_to_at(timezone.now() - timezone.timedelta(days=2))
# permanently removes a and all its history from the database
a.purge()
Perhaps you need to maintain a link to a particular version of an article even if it changes in the future. You can use FrozenForeignKey for that.
from django.db import models
from pita.models import FrozenForeignKey
class Revision(models.Model):
    article = FrozenForeignKey(Article)
    notes = models.TextField()
...
draft = Article.objects.create(header="the next best python package")
revision = Revision.objects.create(article=draft, notes="Title is missing capitalization")
draft.header = "The Next Best Python Package"
draft.save()
# prints: the next best python package
print(revision.article.header)
Although a new row is created whenever a PointInTimeModel object is changed, the latest version maintains the same primary key in the table. This means other models using a regular ForeignKey to the PITA model will stay linked to the most up-to-date version. This also means that you should not trust the primary key of objects when querying past versions. Instead, refer to row_id, which is the same for all versions of an object and is the primary key of the most current version.
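The primary key behavior described above can be sketched in a few lines of plain Python. This is an illustration under assumptions, not Django PITA's code: the `table` dict, the `create` and `update` helpers, and the `current` flag are hypothetical. Only `row_id` and the "latest version keeps its primary key" rule come from the text.

```python
import itertools

_next_pk = itertools.count(1)
table = {}  # hypothetical table: primary key -> row dict

def create(header):
    """Insert a new object; its row_id equals its own primary key."""
    pk = next(_next_pk)
    table[pk] = {"row_id": pk, "header": header, "current": True}
    return pk

def update(pk, header):
    """Copy the old state to a new row, then overwrite the row at pk in place."""
    old = dict(table[pk], current=False)
    table[next(_next_pk)] = old    # the historic version gets a fresh primary key
    table[pk]["header"] = header   # the latest version keeps the original primary key

pk = create("the next best python package")
update(pk, "The Next Best Python Package")

print(table[pk]["header"])  # The Next Best Python Package
history = [key for key, row in table.items() if row["row_id"] == pk]
print(history)  # [1, 2]
```

A regular foreign key pointing at `pk` keeps seeing the newest state, while every version of the object can still be collected through the shared `row_id`.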
Steps to turn a regular model into a PointInTimeModel:
- Inherit PointInTimeModel
- Run python manage.py makemigrations (you can set the default for created_at and start_at to timezone.now())
- Use the Django shell to loop over the objects in the model and perform the following:
    for obj in MyModel.objects.all():
        obj.row_id = obj.id
        obj._save()
And now you are all set to start using MyModel with version history tracking in the background.
If your project uses the Django REST Framework for its API, make sure you have djangorestframework installed, or use
pip install django-pita[drf]
to install it as a dependency.
To best interface with a PointInTimeModel using a ModelViewSet-like API, inherit from PointInTimeModelViewSet in pita.api. The main difference is that in this viewset you must specify a model_class and define filter_queryset instead of using the usual queryset or get_queryset. (This is by design: the actual model query, along with the PITA implementation, is hidden in get_queryset, and the filter_queryset method makes it clear that you should not typically query the model directly.)
from pita.api import PointInTimeModelViewSet
class MyViewSet(PointInTimeModelViewSet):
    model_class = Article

    def filter_queryset(self, qs):
        if self.request.data.get("author") is not None:
            return qs.filter(modified_by__id=self.request.data.get("author"))
        return qs
Note that filter_queryset will be passed a queryset of model_class objects and should return a queryset just as get_queryset usually does.
The PointInTimeModelViewSet comes with several useful functionalities built-in:
- Any GET request can specify active_at and/or version_at URL arguments to filter the query accordingly. (Note that if version_at is unspecified, the current version, i.e., the objects manager, is used.)
- User is automatically saved in modified_by model attribute in a POST, PATCH, or PUT request
- rollback and purge actions are defined and restricted to users with the corresponding permissions on the model
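As an illustration of the URL arguments listed above, here is a small sketch that builds a GET URL carrying `version_at` and/or `active_at`. The `/api/articles/` path and the `build_url` helper are hypothetical, and passing ISO 8601 timestamps is an assumption; check which datetime format your deployment accepts.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

def build_url(base, version_at=None, active_at=None):
    """Append version_at / active_at query arguments (assumed ISO 8601) to a base URL."""
    params = {}
    if version_at is not None:
        params["version_at"] = version_at.isoformat()
    if active_at is not None:
        params["active_at"] = active_at.isoformat()
    return f"{base}?{urlencode(params)}" if params else base

moment = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)

# Hypothetical endpoint path; the timestamp is percent-encoded by urlencode.
url = build_url("/api/articles/", version_at=moment)
print(url)
```

Calling `build_url("/api/articles/")` with no arguments returns the base URL unchanged, which corresponds to querying the current version.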
The default permission scheme uses Django permissions to determine authorization for actions based on the HTTP method (i.e., GET requires view permission, POST requires add permission, etc.). If you would like to override some, but not all, permissions, here is an example of how you could do so by subclassing the default permissions class for your model.
from rest_framework import permissions
from pita.api import get_pita_permissions_class, PointInTimeModelViewSet
class MyViewSet(PointInTimeModelViewSet):
    model_class = MyModel

    class CustomPermissions(get_pita_permissions_class(MyModel), permissions.BasePermission):
        def has_permission(self, request, view):
            # handle your custom situations here
            if view.action == "my_action":
                return request.user.has_perm("my_custom_permission")
            # this handles PITA specific actions and defaults the rest to DjangoModelPermissionsStrict
            return super().has_permission(request, view)

    def get_permissions(self):
        permission_classes = [permissions.IsAuthenticated, self.CustomPermissions]
        return [permission() for permission in permission_classes]
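For reference, the method-based scheme described above can be sketched as a plain mapping from HTTP method to Django's standard `app_label.action_modelname` permission string. The GET and POST entries come from the text; the PUT, PATCH, and DELETE entries follow Django's usual model permissions and are an assumption about the PITA defaults. The `required_permission` helper and the `blog` app label are hypothetical.

```python
# HTTP method -> Django permission action (PUT/PATCH/DELETE entries are assumed)
METHOD_TO_ACTION = {
    "GET": "view",
    "POST": "add",
    "PUT": "change",
    "PATCH": "change",
    "DELETE": "delete",
}

def required_permission(method, app_label, model_name):
    """Build the permission string that user.has_perm() would be asked about."""
    action = METHOD_TO_ACTION[method.upper()]
    return f"{app_label}.{action}_{model_name.lower()}"

print(required_permission("GET", "blog", "Article"))    # blog.view_article
print(required_permission("PATCH", "blog", "Article"))  # blog.change_article
```

The permission names used for the rollback and purge actions are not covered here, since they are defined by the package itself.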