
Django has a sophisticated signals system to trigger certain logic when you save or delete objects, change many-to-many relations, etc. People very often make use of such signals, and for some edge cases they are indeed the only effective solution, but there are only a few cases where using signals is appropriate.

Why is it a problem?

Signals have a variety of problems and unforeseen consequences. In the sections below, we list a few.

Signals can be circumvented

One of the main problems with signals is that they do not always run. Indeed, the pre_save and post_save signals will not run when we save or update objects in bulk. For example, if we create multiple Posts with:

Post.objects.bulk_create([
    Post(title='foo'),
    Post(title='bar'),
    Post(title='qux')
])

then the signals do not run. The same happens when you update posts, for example with:

Post.objects.all().update(views=0)

Often people assume that the signals will run in those cases, and for example perform calculations in the signal handlers: they recalculate a certain field based on the updated values. Since one can update a field without triggering the corresponding signals, this results in an inconsistent value. Signals thus give a false sense of security that the handler will indeed update the object accordingly.
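To make this concrete, here is a minimal sketch (with a hypothetical popular field on the Post model, derived from views) of such a handler; the bulk .update(…) at the end bypasses it entirely:

from django.db.models.signals import pre_save
from django.dispatch import receiver

@receiver(pre_save, sender=Post)
def update_popularity(sender, instance, **kwargs):
    # recalculate the derived field every time a Post is saved with .save()
    instance.popular = instance.views >= 1000

# this bypasses pre_save completely: popular is now out of sync with views
Post.objects.all().update(views=0)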

Signals are request unaware

Every now and then people try to implement a signal that needs a request object to perform a certain piece of logic. For example, one might want to send an email to the person that made the HTTP request each time a certain model is created.

Django's signal processing, however, does not capture the request. This makes sense, since these signals are fired by models being created, updated, removed, etc., and models are request unaware as well.

The fact that signals are request unaware makes it harder to implement certain behavior, for example a signal that updates the owner field of a model object with the currently logged in user. Strictly speaking it is possible to implement this, for example by inspecting the call stack: if the signal is triggered by a view, eventually one will find a call to that view, and thus one can obtain the request object. Another way would be to implement middleware that keeps track of the user making the request, and then use that, but this introduces a global state antipattern.

This means that, while technically there are some ways to make signals request aware, these solutions are often ugly, and furthermore can fail if, for example, a management command triggers the change.
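As a rough sketch of the middleware approach mentioned above (a hypothetical CurrentUserMiddleware added to the MIDDLEWARE setting), one can see why this is global state, and why it breaks down outside the request/response cycle:

import threading

_state = threading.local()  # module-level, hence global, state

class CurrentUserMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        _state.user = getattr(request, 'user', None)
        try:
            return self.get_response(request)
        finally:
            _state.user = None

def get_current_user():
    # returns None outside the request/response cycle,
    # for example in a management command or a background task
    return getattr(_state, 'user', None)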

Signals can raise exceptions and break the code flow

If the signals run, for example when we call .save() on a model object, then the handlers will run. Contrary to popular belief, signals do not run asynchronously, but synchronously: there is a list of functions, and these all run one after the other. A second problem is that a signal handler might raise an error, and this will thus result in the function that triggered the signal raising that error. Developers often do not take this into account.

If such an error is raised, then eventually the .save() call will raise that error. Even if the developer takes this into account, it is hard to anticipate the consequences: if there are multiple handlers for the same signal, then some of the handlers may already have made changes whereas others might not have been invoked. This makes it more complicated to repair the object, since the handlers might already have changed it partially.
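A small sketch with two handlers on a hypothetical Order model illustrates the problem:

from django.db.models.signals import post_save
from django.dispatch import receiver

@receiver(post_save, sender=Order)
def reserve_stock(sender, instance, **kwargs):
    ...  # runs first, and already modifies other records

@receiver(post_save, sender=Order)
def send_confirmation(sender, instance, **kwargs):
    raise RuntimeError('mail server unreachable')

# order.save() now raises RuntimeError, but only after
# reserve_stock has already made its changes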

Signals can result in infinite recursion

It is also rather easy to get stuck in an infinite loop with signals. Take for example a Profile model with a signal that will remove the User if we remove the Profile:

from django.conf import settings
from django.db import models
from django.db.models.signals import pre_delete
from django.dispatch import receiver

class Profile(models.Model):
    user = models.OneToOneField(
        settings.AUTH_USER_MODEL,
        on_delete=models.CASCADE
    )

# …

@receiver(pre_delete, sender=Profile)
def delete_profile(sender, instance, **kwargs):
    instance.user.delete()

If we now remove a Profile, this will get stuck in an infinite loop. Indeed, first we start removing the Profile; this triggers the signal, which will remove the related user object. But when removing the user, Django will determine what to do with the related objects, and because of the CASCADE it will thus first remove the Profile, again triggering the signal. It is easy to end up with infinite recursion when defining signals, especially if we use signals on two models that are related to each other.

Signals run before updating many-to-many relations

The pre_save and post_save signals of an object run immediately before and after the object is saved to the database. If we have a model with a ManyToManyField, then when we create that object and the signals run, the ManyToManyField is not yet populated. This is because a ModelForm first needs to create the object: only then does the object have a primary key, and only then can Django start populating the many-to-many relation. If we for example have two models Author and Book with a many-to-many relation, and we want to use a signal that counts the number of books an Author has written, then the following signal will not work when we create an Author and use the form to specify the books as well:

from django.db.models.signals import pre_save
from django.dispatch import receiver

@receiver(pre_save, sender=Author)
def save_author(sender, instance, **kwargs):
    # note: pre_save does not pass a created parameter; that is post_save only
    instance.num_books = instance.books.count()

Regardless of whether we use a pre_save or post_save signal, at that moment in time instance.books.all() is an empty queryset when we create the Author (with pre_save on a brand new object, accessing the relation will even raise an error, since the object has no primary key yet).
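The order of operations in a ModelForm makes this clear; the sketch below (with a hypothetical AuthorForm) spells out what happens when the form is saved:

form = AuthorForm(request.POST)
if form.is_valid():
    author = form.save(commit=False)
    author.save()    # pre_save and post_save fire here: author.books is still empty
    form.save_m2m()  # only now are the books linked; no model signals fire for this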

Signals make altering objects less predictable

Even if only one handler is attached to the signal, and that handler can never raise an error, the handler is still often not an elegant solution. Another developer might not be aware of its existence, since it has only a "weak" binding to the model, and thus it makes the effect of saving an object less predictable.

Signals do not run in data migrations

One can construct a data migration [Django-doc]; such a migration could for example populate a database table:

from django.db import migrations

def forwards(apps, schema_editor):
    ModelName = apps.get_model('app_name', 'ModelName')
    ModelName.objects.create(
        first_name='foo', last_name='Bar'
    )

class Migration(migrations.Migration):
    dependencies = [
        ('app_name', '1234_other_migration'),
    ]

    operations = [
        migrations.RunPython(forwards, migrations.RunPython.noop),
    ]

Here we thus create a ModelName record. We however use a historical model: that is, the ModelName with the fields defined at the state after applying all the dependencies, not necessarily the current ModelName model. Regardless of whether that model is equivalent to the current model, it is quite limited. One of the consequences is that signals are not attached to that model: if there is for example a post_save signal defined for ModelName, then neither the historical nor the current signals will run, and hence the anticipated effects of these signals will not take place.

Signals do not run when other programs make changes

Finally, other programs can also make changes to the database; these changes will not trigger the signals, and this can eventually leave the database in an inconsistent state. Another program could for example create a new book for an author, but will not update the field in the Author model that keeps track of the number of books written by that author. It would be quite hard to "translate" all the handlers in Django to every other program that interacts with the same database.

What can be done to resolve the problem?

Often it is better to avoid using signals. One can implement a lot of logic without signals.

Calculating properties on-demand

The most robust way to count the number of Books of an Author is not to store the number of books in a field, but to use .annotate(…) [Django-doc] to annotate the Authors with the number of Books they have written each time we query them. We thus can make a query that looks like:

from django.db.models import Count

Author.objects.annotate(
    num_books=Count('books')
)

If the number of Books is not that large, this will often still scale quite well, and it is more robust: if somehow another program removed a book, or a view was triggered that circumvented the update logic, the query will still report the correct number of books.

Here we of course recalculate the number of Books per Author on every query. If the number of Books and Authors grows, this can become a performance bottleneck.
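Using the annotation is straightforward; assuming the Author model has a name field, a view or shell session can do:

from django.db.models import Count

for author in Author.objects.annotate(num_books=Count('books')):
    # num_books is computed by the database, so it is always in sync
    print(author.name, author.num_books)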

Encapsulating update logic in the view/form and ModelAdmin

Another option might be to encapsulate the handler logic in a specific function. For example if we want to count the number of books of an Author each time we save/update a Book, we can implement the logic:

def update_book(book):
    # recalculate the denormalized book count of the book's author
    author = book.author
    author.num_books = author.books.count()
    author.save()

and then we can call this function in the views where we create/update the book. For example:

def my_view(request):
    if request.method == 'POST':
        form = BookForm(request.POST, request.FILES)
        if form.is_valid():
            book = form.save()
            update_book(book)
            # …
        # …
    # …

We can also do this in the ModelAdmin, and construct a mixin to use in class-based views (see the sketch after the admin example):

from django.contrib import admin

class MyModelAdmin(admin.ModelAdmin):

    def save_model(self, request, obj, form, change):
        # first let the admin save the object, then update the derived data
        super().save_model(request, obj, form, change)
        update_book(obj)
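For class-based views, the mixin mentioned above could look like this (a sketch, assuming the standard CreateView/UpdateView flow):

from django.views.generic.edit import CreateView

class UpdateBookMixin:
    def form_valid(self, form):
        # let the view save the object first, then update the derived data
        response = super().form_valid(form)
        update_book(self.object)
        return response

class BookCreateView(UpdateBookMixin, CreateView):
    model = Book
    fields = ['title', 'author']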

If the task takes too much time, you can set up a queue: the view then enqueues a message that triggers a task to update the data. This is however not specific to encapsulating the logic in a function: if you work with signals, these signals can time out just as well, and thus render the server unresponsive.
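For example, with a task queue such as Celery (a sketch, assuming Celery is configured for the project), the view enqueues the work instead of doing it inline:

from celery import shared_task

@shared_task
def update_book_task(book_pk):
    # runs in a worker process, outside the request/response cycle
    update_book(Book.objects.get(pk=book_pk))

# in the view, instead of update_book(book):
#     update_book_task.delay(book.pk)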

Extra tips

Signals can still be a good solution if you want to handle events raised by a third party Django application; in many cases this is the only effective way to hook into such events. For example, the auth module provides signals when the user logs in, logs out, or fails to log in [Django-doc]. These signals are typically more reliable, since they are not triggered by the ORM. For third party applications, signals are thus often an effective way to communicate with these applications.
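For example, logging logins through the user_logged_in signal; note that, unlike model signals, these auth signals do receive the request:

from django.contrib.auth.signals import user_logged_in
from django.dispatch import receiver

@receiver(user_logged_in)
def log_login(sender, request, user, **kwargs):
    # auth signals, unlike model signals, are request aware
    print(f'{user} logged in from {request.META.get("REMOTE_ADDR")}')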

If you define signals yourself, you should be aware that these will not run in data migrations. You will thus have to replicate the logic that is otherwise done by the signals in the data migration file as well.
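For example, if a post_save handler normally fills in a derived field (here a hypothetical slug field on ModelName), the data migration has to perform that work itself:

from django.utils.text import slugify

def forwards(apps, schema_editor):
    ModelName = apps.get_model('app_name', 'ModelName')
    obj = ModelName.objects.create(first_name='foo', last_name='Bar')
    # the post_save handler does not run on historical models,
    # so we replicate its (hypothetical) effect explicitly
    obj.slug = slugify(obj.first_name)
    obj.save(update_fields=['slug'])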