Occasionally, when using a Django ManyToManyField, I find myself wanting to query the join table directly. Consider for example the models below:
class Category(models.Model):
    name = models.TextField()

class Entry(models.Model):
    data = models.TextField()
    categories = models.ManyToManyField(Category)
Suppose I have an entry and want to find entries which are similar, defined here as having the most categories in common with it. I could define a through table:
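class Relationship(models.Model):
    # on_delete is required from Django 2.0 on; CASCADE was the earlier default
    entry = models.ForeignKey('Entry2', on_delete=models.CASCADE)
    category = models.ForeignKey('Category', on_delete=models.CASCADE)

class Entry2(models.Model):
    data = models.TextField()
    categories = models.ManyToManyField(Category, through=Relationship)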
However, if all I want to do is query the relationship (rather than store extra data with each relationship, which is the common use case for through tables), there is another way: generating the model on the fly.
Digression: Dynamic classes in Python
We take a step back to briefly mention dynamically created classes in Python. Beyond "the usual" way to define classes using the class keyword, it is also possible to create classes dynamically. The following are equivalent:
class Spam(object):
    parrots = 2

    def speak(self):
        return "We have %s parrots" % self.parrots

>>> Ham = type(
...     'Ham',
...     (object, ),
...     dict(
...         parrots=2,
...         speak=lambda self: "We have %s parrots" % self.parrots
...     )
... )
>>> s = Spam()
>>> s.parrots
2
>>> s.speak()
'We have 2 parrots'
>>> h = Ham()
>>> h.parrots
2
>>> h.speak()
'We have 2 parrots'
Everything in Python is an object, classes included: just like we instantiate s by calling Spam(), we can instantiate classes by calling type() with a name, a tuple of base classes, and a dictionary of attributes. A good explanation of this (along with loads of other good stuff) can be found in this Stack Overflow answer.
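One practical reason to call type() instead of using the class keyword is that attribute names become plain dictionary keys, so they can be computed at runtime. The following is a minimal sketch of that idea; make_counter_class and its arguments are made-up names for illustration, not part of any library.

```python
def make_counter_class(class_name, counter_names):
    """Build a class whose attribute names are only known at runtime."""
    # Attribute names are plain dictionary keys, so they can be computed.
    attrs = dict((name, 0) for name in counter_names)

    def total(self):
        # Sum the current value of every counter attribute.
        return sum(getattr(self, name) for name in counter_names)

    attrs['total'] = total
    return type(class_name, (object,), attrs)


Inventory = make_counter_class('Inventory', ['parrots', 'lumberjacks'])
stock = Inventory()
stock.parrots = 2
stock.lumberjacks = 3
print(Inventory.__name__)  # Inventory
print(stock.total())       # 5
```

The same trick is what makes it possible to name model fields after whatever the relationship metadata reports.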
Back to Django
Now that we know how to dynamically create classes, we can apply this knowledge to Django models. Before continuing, we need one more piece.
Models for existing tables
Normally when you create a Django model, Django will create database tables for you. However, it is also possible to create Django models that "sit on top of" already existing database tables. This is accomplished by specifying a db_table option on the Meta class:
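class Relationship(models.Model):
    # on_delete is required from Django 2.0 on; CASCADE was the earlier default
    entry = models.ForeignKey('Entry', on_delete=models.CASCADE)
    category = models.ForeignKey('Category', on_delete=models.CASCADE)

    class Meta:
        db_table = 'app_entry_categories'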
Now for the fun bit: we can read the model metadata and create this model on the fly, at runtime.
from django.db import models


def create_m2m_model(cls, field_name):
    """Dynamically create a model to query m2m relations directly.

    Example:

    >>> m2m = create_m2m_model(Entry, 'categories')
    >>> m2m.objects.count()
    37
    >>> m2m.objects.all()[0].category
    <Category: Pirates>
    """
    field = cls._meta.get_field(field_name)

    # this side of the relation: attribute name, target model, db column
    subfield1 = field.m2m_field_name()
    subrelation1 = cls
    subcolumn1 = field.m2m_column_name()

    # the other side of the relation
    subfield2 = field.m2m_reverse_field_name()
    # remote_field replaced rel in Django 1.9; older versions use field.rel.to
    subrelation2 = field.remote_field.model
    subcolumn2 = field.m2m_reverse_name()

    table_name = field.m2m_db_table()
    module = cls.__module__

    # we only want to use this for reads, not writes
    def save(self, *args, **kwargs):
        pass

    # on_delete is required from Django 2.0 on; CASCADE was the earlier default
    attrs = {
        'save': save,
        subfield1: models.ForeignKey(
            subrelation1, db_column=subcolumn1, on_delete=models.CASCADE),
        subfield2: models.ForeignKey(
            subrelation2, db_column=subcolumn2, on_delete=models.CASCADE),
    }

    model = type(
        table_name,
        (models.Model,),
        dict(
            __module__=module,
            Meta=type(
                'Meta',
                (),
                dict(
                    managed=False,
                    db_table=table_name,
                )
            ),
            **attrs
        )
    )
    return model
Some explanation
We use the _meta property of our Django model to introspect the relationship. Note: _meta is technically an undocumented API and, as such, is subject to change. Use at your own risk.
We find the types involved in the relationship, the name of the join table and the columns used. We then create a model class that is mapped to that table. Note the Meta class being created as well, as another instance of type. Setting managed = False on it tells Django not to create or alter the table, which already exists. The only extra piece is the __module__ attribute, which is required by Django. We set it explicitly to the module of the model.
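To see the shape of that nested type() call in isolation, here is a small sketch that runs without Django. It uses plain object as a stand-in for models.Model, so none of Django's model machinery is involved; build_class, 'myapp.models' and the answer attribute are hypothetical names chosen for the example.

```python
def build_class(name, module, table_name, **attrs):
    """Mimic the shape of the type() call above with a plain base class."""
    # The nested options class is itself created by calling type().
    meta = type('Meta', (), dict(managed=False, db_table=table_name))
    return type(
        name,
        (object,),  # stand-in for models.Model
        dict(__module__=module, Meta=meta, **attrs)
    )


Join = build_class('Join', 'myapp.models', 'app_entry_categories', answer=42)
print(Join.__module__)     # myapp.models
print(Join.Meta.db_table)  # app_entry_categories
print(Join.Meta.managed)   # False
print(Join.answer)         # 42
```

With a real models.Model base, Django's metaclass reads the Meta options and the __module__ value while the class is being built, which is why both have to be present in the attribute dictionary.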
Addendum
How did I use this to find similar entries?
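import operator
from functools import reduce  # reduce is no longer a builtin in Python 3

from django.db.models import Count, Q

EntryCategories = create_m2m_model(Entry, 'categories')
categories = entry.categories.all()
if categories:
    # match join-table rows that point at any of this entry's categories
    filters = reduce(
        operator.or_, (Q(category=category) for category in categories))
    # count matching rows per entry and keep the five best matches
    related_qs = (
        EntryCategories.objects
        .exclude(entry=entry.pk)
        .filter(filters)
        .values('entry')
        .annotate(count=Count('category'))
        .order_by('-count')[:5]
    )
    # we can't just add another .values() clause to the qs
    related_pks = [value['entry'] for value in related_qs]
    # managers are only accessible on the model class, not on instances
    related = Entry.objects.filter(pk__in=related_pks)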
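For readers who want to check the logic of that query without a database, the following sketch does roughly the same counting in plain Python. It assumes the join table is available as (entry_id, category_id) pairs with no duplicate pairs; most_similar and the sample rows are invented for illustration.

```python
from collections import Counter


def most_similar(rows, entry_id, limit=5):
    """Rank other entries by how many categories they share with entry_id.

    rows is an iterable of (entry_id, category_id) pairs, one per row
    of the join table.
    """
    rows = list(rows)
    # Categories of the entry we are comparing against.
    wanted = set(cat for ent, cat in rows if ent == entry_id)
    # Roughly exclude(entry=...).filter(<category in wanted>), followed by
    # values('entry').annotate(count=Count('category')).
    shared = Counter(
        ent for ent, cat in rows if ent != entry_id and cat in wanted)
    # Roughly order_by('-count')[:limit].
    return shared.most_common(limit)


rows = [(1, 'a'), (1, 'b'), (1, 'c'),
        (2, 'a'), (2, 'b'),
        (3, 'c'),
        (4, 'd')]
print(most_similar(rows, 1))  # [(2, 2), (3, 1)]
```

Entry 2 shares two categories with entry 1 and entry 3 shares one, while entry 4 shares none and so does not appear at all, just as it would be filtered out of the queryset.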