Previously, I wrote about How to install Celery on Django and Create a Periodic Task. This post extends it to the particular case of installing Celery on Heroku.
You can also learn more about how to deploy a Django app on Heroku.
Create a Task and a Task History model
First of all, since you won’t use the logger in production on Heroku, we’ll create a Django model to store information about the periodic tasks, such as the time they were performed.
Create a model in your app, in myapp/models.py:
<pre><code>
# -*- coding: utf-8 -*-
from django.db import models
from django.utils.translation import ugettext_lazy as _
import jsonfield


class TaskHistory(models.Model):
    # Relations
    # Attributes - mandatory
    name = models.CharField(
        max_length=100,
        verbose_name=_("Task name"),
        help_text=_("Select a task to record"),
    )
    # Attributes - optional
    history = jsonfield.JSONField(
        default={},
        verbose_name=_("history"),
        help_text=_("JSON containing the tasks history")
    )
    # Manager
    # Functions

    # Meta & unicode
    class Meta:
        verbose_name = _('Task History')
        verbose_name_plural = _('Task Histories')

    def __unicode__(self):
        return _("Task History of Task: %s") % self.name
</code></pre>
<p>
In this file:
- The function ugettext_lazy is used to translate text.
- The package jsonfield, which can be installed with pip install django-jsonfield, lets you define a Django model field that stores a JSON object.
- The name of the TaskHistory instance will be the name of the task to monitor.
- history is a JSON field that will be updated every time a task is performed.
Then, open the file myapp/tasks.py and write:
<pre><code>
# -*- coding: utf-8 -*-
from celery.task.schedules import crontab
from celery.decorators import periodic_task
from celery.utils.log import get_task_logger
from datetime import datetime

from myapp.models import TaskHistory

logger = get_task_logger(__name__)


# A periodic task that will run every minute (the symbol "*" means every)
@periodic_task(run_every=(crontab(hour="*", minute="*", day_of_week="*")), ignore_result=True)
def scraper_example():
    logger.info("Start task")
    now = datetime.now()
    date_now = now.strftime("%d-%m-%Y %H:%M:%S")
    # Perform all the operations you want here
    result = 2+2
    # The name of the Task, used to find the correct TaskHistory object
    name = "scraper_example"
    taskhistory = TaskHistory.objects.get_or_create(name=name)[0]
    taskhistory.history.update({date_now: result})
    taskhistory.save()
    logger.info("Task finished: result = %i" % result)
</code></pre>
<p>
Note that:
- scraper_example is a periodic task that will run every minute
- The results of this task are ignored with ignore_result=True (I think it is the default behavior for a periodic task, but just to be sure…)
- taskhistory is an instance of the TaskHistory model with the same name as the periodic task
- The history field of the taskhistory is updated and saved every time the task is performed
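To make the last point concrete, here is a minimal sketch of the bookkeeping the task performs. It uses a plain dict in place of the JSON field so that it runs without Django, and the helper name record_run is hypothetical (it is not part of the task above). The timestamp format is the one used in scraper_example.

```python
from datetime import datetime


def record_run(history, result, now=None):
    """Store `result` in `history` under a timestamp key and return the key.

    `history` stands in for `taskhistory.history` (a plain dict here).
    `record_run` is a hypothetical helper used only for illustration.
    """
    now = now or datetime.now()
    # Same format as in scraper_example: day-month-year hour:minute:second
    key = now.strftime("%d-%m-%Y %H:%M:%S")
    # Two runs within the same second would share a key, and the later
    # result would overwrite the earlier one.
    history[key] = result
    return key


history = {}
record_run(history, 2 + 2, now=datetime(2014, 2, 14, 20, 1, 0))
record_run(history, 2 + 2, now=datetime(2014, 2, 14, 20, 2, 0))
print(sorted(history.items()))
```

Since the task runs every minute, each run adds one new key, so the stored JSON keeps growing for as long as the worker is active.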
Finally, register the TaskHistory model in your myapp/admin.py file so that it shows up in the Django admin.
And now, let’s see how to run this periodic task, and see that it actually works! In your terminal, type:
python manage.py celeryd -B -l info
to start celery and celerybeat (the -B option runs the beat scheduler inside the worker process). The -l info option asks the workers to log every message with a level of “info” or higher. In another tab, write:
python manage.py runserver
Go to the Django admin, which in my case was at http://127.0.0.1:8000/admin. Go to Myapp -> TaskHistory -> scraper_example to see the changes that your task is making to the database.
Ok, now, let’s try this on Heroku… uf uf!
Django and Celery on Heroku
First, log in to your Heroku account. If you want to have two or more Heroku accounts on your computer, check this post! If you want to know how to set up a Django app on Heroku, check this other post!
We need to install some addons in Heroku. The first one is CloudAMQP, which is a hosted RabbitMQ service.
$ heroku addons:add cloudamqp
In my case, I had to verify the account on the Heroku website, which meant giving them my credit card details (although the requested service was free).
Note that if you signed up for a free account, you can have up to 3 open connections at the same time. To be consistent, open your settings.py file and limit your connections with:
BROKER_POOL_LIMIT = 3
Next, we need to find the RabbitMQ user, host, password and vhost to build the BROKER_URL setting variable.
You can obtain the BROKER_URL from your app's config vars, for example by running heroku config (the CloudAMQP addon usually stores it under CLOUDAMQP_URL), and then extract the user, host, password and vhost. Another option is to use a browser: go to your Heroku dashboard, https://dashboard.heroku.com/apps, and click on the app you are working on. You will see a list of your installed addons. Click on CloudAMQP Little and you will see your BROKER_URL directly.
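If you want to check which user, host, password and vhost are inside a broker URL, the extraction can be sketched with the Python 3 standard library. This is only an illustration: the function name split_broker_url and the example URL are made up, and the sketch assumes the usual amqp://user:password@host/vhost layout.

```python
from urllib.parse import urlparse, unquote


def split_broker_url(url):
    """Split an AMQP URL into user, password, host, port and vhost."""
    parts = urlparse(url)
    return {
        "user": unquote(parts.username or ""),
        "password": unquote(parts.password or ""),
        "host": parts.hostname,
        "port": parts.port,  # None when the URL has no explicit port
        # The vhost is the URL path without its leading slash
        "vhost": unquote(parts.path[1:]),
    }


# Made-up example URL, not a real CloudAMQP credential
example = "amqp://myuser:s3cret@myhost.example.com/myvhost"
print(split_broker_url(example))
```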
If you are using more than one settings file for development, production and testing, you probably have the old BROKER_URL variable in the common settings file (the one corresponding to the local development or testing environments). Move the old BROKER_URL into the development and testing settings files and add the new one to the production settings file.
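As an alternative to pasting the production URL into a settings file, you could read it from the environment and fall back to a local broker. The sketch below is a variation on the approach described here, not the original setup: the helper broker_url_from_env is hypothetical, the local URL assumes a default RabbitMQ install with the guest account, and CLOUDAMQP_URL is assumed to be the config var set by the addon.

```python
import os

# Assumption: a local RabbitMQ with the default guest account for development
LOCAL_BROKER_URL = "amqp://guest:guest@localhost:5672//"


def broker_url_from_env(environ, default=LOCAL_BROKER_URL):
    """Return the broker URL from `environ`, falling back to `default`.

    CLOUDAMQP_URL is the config var assumed to be set by the addon.
    """
    return environ.get("CLOUDAMQP_URL", default)


# In a production settings file this could become:
BROKER_URL = broker_url_from_env(os.environ)
BROKER_POOL_LIMIT = 3
```

With this variant the credentials stay out of the repository, at the cost of depending on the config var being present on Heroku.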
Then, edit your Procfile and add the line:
worker: python manage.py celery worker -B -l info
And…. let’s try it! Commit your changes and push them to Heroku:
$ git add .
$ git commit -m "Celery for Heroku"
$ git push heroku master
$ heroku run python manage.py migrate
$ heroku run python manage.py syncdb
Finally, to actually start the worker specified in the Procfile you need to type:
$ heroku ps:scale worker=1
Check out the admin interface on Heroku, http://yourapp.herokuapp.com/admin/, and go to TaskHistories. Select the instance scraper_example and observe the results. Every minute a new line is written! Wonderful that it worked 🙂
Moreover, if you run
$ heroku ps
you should see something like:
=== web (1X): gunicorn yourproject.wsgi
web.1: up 2014/02/14 19:41:21 (~ 20m ago)
=== worker (1X): python manage.py celery worker -B -l info
worker.1: up 2014/02/14 20:01:42 (~ 1s ago)
BUT!! After checking that everything works correctly, stop the worker, because otherwise, with this worker active, you will end up paying!
$ heroku ps:scale worker=0
And just to make sure, if you type:
$ heroku ps
you should only see
=== web (1X): gunicorn yourproject.wsgi
web.1: up 2014/02/14 19:41:21 (~ 25m ago)
Finally, you can also see the logging on Heroku with
$ heroku logs -n 200
where -n 200 specifies the number of logs to be returned (by default 100).
If you want to filter your logs you can use something similar to:
$ heroku logs --source app --ps worker.1
which will give you only the app logs produced by worker.1.
Hope it was helpful!
And please, give a +1 if you liked it! 🙂
Marina Mele has experience in artificial intelligence implementation and has led tech teams for over a decade. On her personal blog (marinamele.com), she writes about personal growth, family values, AI, and other topics she’s passionate about. Marina also publishes a weekly AI newsletter featuring the latest advancements and innovations in the field (marinamele.substack.com)