Install Celery with Django on Heroku and create a Periodic Task

Previously, I wrote about How to install Celery on Django and Create a Periodic Task. This post extends it to the particular case of installing Celery on Heroku.

You can also learn more about how to deploy a Django app on Heroku.

Create a Task and a Task History model

First of all, since in production on Heroku you won’t use the logger, we’ll create a Django model to store information about the periodic tasks, such as the time they were performed.

Create a model in your app, in myapp/models.py:

<pre><code>
# -*- coding: utf-8 -*-
from django.db import models
from django.utils.translation import ugettext_lazy as _
import jsonfield


class TaskHistory(models.Model):
    # Relations
    # Attributes - mandatory
    name = models.CharField(
        max_length=100,
        verbose_name=_("Task name"),
        help_text=_("Select a task to record"),
    )
    # Attributes - optional
    history = jsonfield.JSONField(
        default=dict,  # a callable, so each instance gets its own empty dict
        verbose_name=_("history"),
        help_text=_("JSON containing the tasks history"),
    )
    # Manager
    # Functions

    # Meta and unicode
    class Meta:
        verbose_name = _('Task History')
        verbose_name_plural = _('Task Histories')

    def __unicode__(self):
        return _("Task History of Task: %s") % self.name
</code></pre>

In this file:

  • The function ugettext_lazy is used to mark text for translation.
  • The package jsonfield, which can be installed with pip install django-jsonfield, lets you define a Django model field that stores a JSON object.
  • The name of the TaskHistory instance will be the name of the task to monitor.
  • history is a JSON field that will be updated every time a task is performed.
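
To make the idea of a JSON field a bit more concrete, here is a small sketch in plain Python (no Django needed). It only illustrates the concept: a dictionary is turned into JSON text when it is stored and back into a dictionary when it is read. It is not how jsonfield is implemented internally, and the timestamps are made-up sample values.

```python
import json

# Sample value for the history field: one entry per task run (made-up data).
history = {"14-02-2014 20:01:00": 4}

# Updating the field is an ordinary dict update ...
history.update({"14-02-2014 20:02:00": 4})

# ... which is serialized to JSON text when it is stored ...
stored_text = json.dumps(history, sort_keys=True)

# ... and parsed back into a plain dict when it is read again.
loaded = json.loads(stored_text)
```

So, from the point of view of your code, history behaves like a normal Python dictionary.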

Then, open the file myapp/tasks.py and write:

<pre><code>
# -*- coding: utf-8 -*-
from datetime import datetime

from celery.schedules import crontab
from celery.task import periodic_task
from celery.utils.log import get_task_logger

from myapp.models import TaskHistory

logger = get_task_logger(__name__)


# A periodic task that will run every minute (the symbol "*" means every)
@periodic_task(run_every=crontab(hour="*", minute="*", day_of_week="*"), ignore_result=True)
def scraper_example():
    logger.info("Start task")
    now = datetime.now()
    date_now = now.strftime("%d-%m-%Y %H:%M:%S")
    # Perform all the operations you want here
    result = 2 + 2
    # The name of the task, used to find the correct TaskHistory object
    name = "scraper_example"
    taskhistory = TaskHistory.objects.get_or_create(name=name)[0]
    taskhistory.history.update({date_now: result})
    taskhistory.save()
    logger.info("Task finished: result = %i", result)
</code></pre>

Note that:

  • scraper_example is a periodic task that will run every minute
  • The results of this task are ignored with ignore_result=True (I think it is the default behavior for a periodic task, but just to be sure…)
  • taskhistory is an instance of the TaskHistory model with the same name as the periodic task
  • The history field of the taskhistory is updated and saved every time a task is performed
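
If you want to play with the logic of the task body without Celery or a database, the following sketch may help. It is only an approximation: record_run is a hypothetical helper, and the plain dictionary store stands in for the TaskHistory table. It also shows a detail of the timestamp key: since the key has a resolution of one second, two runs within the same second would share a key.

```python
from datetime import datetime


def record_run(store, name, result, now=None):
    """Mimic the task body: find or create the history for ``name`` and add an entry.

    ``store`` is a plain dict standing in for the TaskHistory table.
    """
    now = now or datetime.now()
    key = now.strftime("%d-%m-%Y %H:%M:%S")
    history = store.setdefault(name, {})  # stand-in for get_or_create(name=name)[0]
    history[key] = result                 # stand-in for history.update(...) and save()
    return key


store = {}
record_run(store, "scraper_example", 2 + 2, now=datetime(2014, 2, 14, 20, 1, 0))
record_run(store, "scraper_example", 2 + 2, now=datetime(2014, 2, 14, 20, 2, 0))
# Same timestamp as the previous call: the later result replaces the earlier one.
record_run(store, "scraper_example", 5, now=datetime(2014, 2, 14, 20, 2, 0))
```

With a task that runs once per minute this overwriting should not happen in practice, but it is worth keeping in mind if you schedule something more often.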

Finally, in your myapp/admin.py file write:

from django.contrib import admin
from myapp import models


class TaskHistoryAdminModel(admin.ModelAdmin):
    # Columns shown in the admin change list
    list_display = ("name",)


admin.site.register(models.TaskHistory, TaskHistoryAdminModel)

 

And now, let’s see how to run this periodic task, and see that it actually works! In your terminal, type:

python manage.py celeryd -B -l info

to start celery and celerybeat. The -l info option asks the workers to log every message with a level of “info” or higher. In another tab, write:

python manage.py runserver

Go to the Django admin, in my case it was at http://127.0.0.1:8000/admin. Go to Myapp -> TaskHistory -> scraper_example to see the changes that your task is making to the database.
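
A side note on the -l info option used when starting the worker: the idea of a log level as a threshold can be seen with Python's standard logging module, which Celery's logging builds on. This is just a sketch of the concept, not Celery's own logging setup; ListHandler and the logger name are made up for the example.

```python
import logging


class ListHandler(logging.Handler):
    """Collect messages in a list instead of printing them."""

    def __init__(self):
        logging.Handler.__init__(self)
        self.messages = []

    def emit(self, record):
        self.messages.append("%s: %s" % (record.levelname, record.getMessage()))


handler = ListHandler()
logger = logging.getLogger("loglevel_demo")  # hypothetical logger name
logger.addHandler(handler)
logger.setLevel(logging.INFO)  # the threshold, analogous to "-l info"
logger.propagate = False

logger.debug("below the threshold, dropped")
logger.info("Start task")
logger.warning("above the threshold, kept")
```

Only the info and warning messages end up in handler.messages; the debug message is discarded.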

Ok, now, let’s try this on Heroku… uf uf!

Django and Celery on Heroku

First, log in to your Heroku account. If you want to have two or more Heroku accounts on your computer, check this post! If you want to know how to set up a Django app on Heroku, check this other post!

We need to install some addons in Heroku. The first one is CloudAMQP, which is a hosted RabbitMQ service.

$ heroku addons:add cloudamqp

In my case, I had to verify the account on the Heroku website, which meant giving them my credit card details (although the requested service was free).

Note that if you signed up for a free account, you can have up to 3 open connections at the same time. To stay within that limit, open your settings.py file and limit your connections with:

BROKER_POOL_LIMIT = 3

Next, we need to find the RabbitMQ user, host, password and vhost to build the BROKER_URL setting variable.

You can obtain the BROKER_URL with

$ heroku config | grep CLOUDAMQP_URL

and then extract the user, host, password and vhost. Another option is to use a browser: go to your Heroku dashboard, https://dashboard.heroku.com/apps, and click on the app you are working on. You will see a list of your installed addons. Click on the CloudAMQP one and you will see your BROKER_URL directly.
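
If you prefer not to pick the URL apart by hand, something like the following sketch could do it for you. It assumes the URL has the usual amqp://user:password@host/vhost shape; split_broker_url is a hypothetical helper, and the URL below is a placeholder with made-up credentials, not a real CloudAMQP value.

```python
try:
    from urllib.parse import urlsplit  # Python 3
except ImportError:
    from urlparse import urlsplit  # Python 2


def split_broker_url(url):
    """Split an AMQP URL into user, password, host, port and vhost."""
    parts = urlsplit(url)
    return {
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,  # None when the URL has no explicit port
        "vhost": parts.path.lstrip("/"),
    }


# Placeholder URL with made-up credentials.
example = split_broker_url("amqp://myuser:mypassword@example-host.test/myvhost")
```

Here example["vhost"] is "myvhost" and example["host"] is "example-host.test".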

If you are using more than one settings file (for development, production and testing), your old BROKER_URL variable, the one for the local development and testing environments, is probably in the common settings file. Move the old BROKER_URL into the development and testing settings files and add the new one to the production settings file.
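
A possible alternative to copying the URL into your production settings (this is not what I did in this post, just a sketch) is to read it from the environment, since Heroku exposes config variables such as CLOUDAMQP_URL to your app as environment variables. The helper get_broker_url and the local default URL are assumptions for this example; adjust them to your setup.

```python
import os

# Local RabbitMQ default; an assumption for this sketch, adjust to your setup.
LOCAL_BROKER_URL = "amqp://guest:guest@localhost:5672//"


def get_broker_url(environ=None):
    """Use CLOUDAMQP_URL when it is set (on Heroku), else the local broker."""
    environ = os.environ if environ is None else environ
    return environ.get("CLOUDAMQP_URL", LOCAL_BROKER_URL)


# In a settings file this would read:
BROKER_URL = get_broker_url()
BROKER_POOL_LIMIT = 3
```

This way the same settings file works locally and on Heroku, and the credentials stay out of your repository.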

Then, edit your Procfile and add the line:

worker: python manage.py celery worker -B -l info

And…. let’s try it! Commit your changes and push them to Heroku:

$ git add .

$ git commit -m "Celery for Heroku"

$ git push heroku master

$ heroku run python manage.py migrate

$ heroku run python manage.py syncdb

Finally, to actually start the worker specified in the Procfile you need to type:

$ heroku ps:scale worker=1

Check out the admin interface on Heroku, http://yourapp.herokuapp.com/admin/, and go to TaskHistories. Select the instance scraper_example and observe the results. Every minute a new line is written! Wonderful that it worked 🙂

Moreover, if you run

$ heroku ps

you should see something like:

=== web (1X): gunicorn yourproject.wsgi

web.1: up 2014/02/14 19:41:21 (~ 20m ago)

=== worker (1X): python manage.py celery worker -B -l info

worker.1: up 2014/02/14 20:01:42 (~ 1s ago)

BUT!! After checking that everything works correctly, stop the worker, because otherwise, with this worker active, you will end up paying!

$ heroku ps:scale worker=0

And just to make sure, if you type:

$ heroku ps

you should only see

=== web (1X): gunicorn yourproject.wsgi

web.1: up 2014/02/14 19:41:21 (~ 25m ago)

Finally, you can also see the logging on Heroku with

$ heroku logs -n 200

where -n 200 specifies the number of logs to be returned (by default 100).

If you want to filter your logs you can use something similar to:

$ heroku logs --source app --ps worker.1

which will give you only the app logs produced by worker.1.

Hope it was helpful!

And please, give a +1 if you liked it! 🙂


Please, add +Marina Mele in your comments. This way I will get a notification email and I will answer you as soon as possible! :-)