Oct. 26, 2016, 2 min read

Collecting related Objects in Django

When storing client- or project-related data in a multitenancy application, you may quickly find yourself needing to extract all data for a single client or project. One use case is rolling back all data for a specific client to some backup state without affecting any of the other clients. Assuming you have modeled your database so that one Model is the "parent" of all its related data, you have to collect all "child" objects of that parent. This quickly becomes a pain to build manually, since not all Django relations are explicit:

ManyToManyFields, for example, create intermediate junction tables that are either implicit in the ORM (meaning you never write a class for them) or, more complicated still, routed through a custom Model class (using the through=MyModel keyword).

Luckily, Django already provides this functionality. Have you ever tried to delete an object in the admin that has one or more objects referencing it? Django collects all of these relationships and displays them in a list, to make sure you are okay with deleting the related data alongside it (Django emulates the SQL ON DELETE CASCADE behaviour for foreign keys declared with on_delete=CASCADE, which preserves referential integrity in the database).

We can reuse this functionality like this:

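from django.contrib.admin.utils import NestedObjects
from django.db import DEFAULT_DB_ALIAS

# queryset selects the "parent" objects, e.g. Client.objects.filter(id=5)
collector = NestedObjects(using=DEFAULT_DB_ALIAS)
collector.collect(queryset)
# Maps each Model class to the instances collected for it
related_objects = collector.data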

This yields a dictionary that maps Model classes to collections of their instances. It contains the objects matching the queryset (for a single client this could be something like queryset = Client.objects.filter(id=5)) as well as all related objects.

This collector class even resolves ManyToManyFields correctly, which its parent (django.db.models.deletion.Collector) would fail to do. You can instead get a hierarchical list of the dependent objects using collector.nested(), or flatten the data into a list yourself (which is what I did to be able to export it):

related_objects = [instance for instances in collector.data.values()
                   for instance in instances]

The flattened list can be passed to the generic serialize function that Django offers:

from django.core import serializers
stringified_data = serializers.serialize("json", related_objects)

Based on that, you can build a custom exporter and importer that work with the data of a single tenant only :)