published: January 8, 2011 — tags: django, i18n, jinja2, python

Jinja2 internationalization in Django.

It took me a while to understand what is going on under the hood of Django and Jinja2 when it comes to i18n (internationalization). I'm sure this piece of information will be useful to anyone using Jinja2 templates inside Django who wants to support more than one language.

The problem.

It's fairly easy to globalize your sites using Django with its own templating system. You are walked through the process in detail by well-written documentation. Django's design itself is the reason for this clarity. Jinja2, on the other hand, is a well-optimized and fast templating language that appeals to many people who know Python... until they start reading the documentation on creating extensions. Filters are easy and similar to Django's, tags are tougher, but what about i18n?

Jinja's manual says very little about getting i18n to work. You just have to feed a GNUTranslations object to Jinja's Environment, but where do you get it from? Creating it yourself is a really bad idea, because you would have to specify the languages and the paths to the directories with *.mo files by hand, while Django does the same thing in parallel. The good news is that Django provides GNUTranslations for you via its own subclass, DjangoTranslation. And here the good news ends, because one such object is created for every installed language. You could probably merge them into one catalog, but why go through all this trouble? In fact you don't have to. The best approach is to re-use the systems already implemented in Django with as little effort as possible.

Before we get to the point.

For the next part to make any sense at all, let's make some assumptions first. While writing this, I am using Django-1.2, Jinja2-2.5.5 and Coffin-0.3.3 (although the last component isn't required).

In the following example I will be using 2 languages: the default 'en-us' and additionally 'pl'. The minimum contents of the settings.py file would be something like this:

# we don't want settings to call gettext - we only want to extract strings
def gettext(s): return s
LANGUAGE_CODE = 'en-us'
LANGUAGES = (
    ('en-us', gettext('English')),
    ('pl', gettext('Polish')),
)
USE_I18N = True
USE_L10N = True       # not required for this demo but important

MIDDLEWARE_CLASSES = (
    'django.middleware.common.CommonMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.middleware.locale.LocaleMiddleware',
)
INSTALLED_APPS = (
    'django.contrib.auth',
    'django.contrib.sessions',
    'coffin',         # only if you use it for setting up Jinja2
)

Additionally, for the Polish language we should create a translation file called django.po² in the locale/pl/LC_MESSAGES sub folder of the Django project and fill it with the following contents, using UTF-8 encoding (as with every other file):

msgid ""
msgstr ""
"Project-Id-Version: irrelevant\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2010-12-25 13:28+0100\n"
"PO-Revision-Date: 2010-12-25 13:28+0100\n"
"Last-Translator: irrelevant\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms:  nplurals=3; plural=(n==1 ? 0 : n%10>=2 && n%10<=4 && (n%"
"100<10 || n%100>=20) ? 1 : 2);\n"

#, python-format
msgid "Hello %(name)s!"
msgstr "Witaj %(name)s!"

msgid "Hello!"
msgstr "Witam!"

msgid "stranger"
msgstr "nieznajomy"

#, python-format
msgid "There is %(count)s apple."
msgid_plural "There are %(count)s apples."
msgstr[0] "Jest %(count)s jabłko."
msgstr[1] "Są %(count)s jabłka."
msgstr[2] "Jest %(count)s jabłek."

The msgid strings are specified only for the purpose of this example. If you already have a django.po file in that folder, you can add the lines starting from msgid "Hello %(name)s!" at the end of it and run the command: python manage.py compilemessages -l pl.
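The Plural-Forms header in the file above is a C expression that picks one of the three msgstr[] entries. As a rough illustration only, here is the same rule hand-translated to Python 3. The names polish_plural_index, FORMS and apples are made up for this sketch; gettext evaluates the header for you, so you never write this yourself.

```python
def polish_plural_index(n):
    """Return the msgstr[] index selected by the Polish Plural-Forms rule."""
    # plural=(n==1 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2)
    if n == 1:
        return 0
    if 2 <= n % 10 <= 4 and (n % 100 < 10 or n % 100 >= 20):
        return 1
    return 2


# Hypothetical stand-in for the three msgstr[] entries of the apple message.
FORMS = [
    "Jest %(count)s jabłko.",
    "Są %(count)s jabłka.",
    "Jest %(count)s jabłek.",
]


def apples(n):
    """Pick the plural form for n and fill in the count."""
    return FORMS[polish_plural_index(n)] % {"count": n}
```

Note that 12 falls into the third form while 22 falls into the second, which is why Polish needs three entries instead of two.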

Normally you would aid the process of creating/updating translation files by using a gettext extractor. Django provides friendly access to xgettext. You can use python manage.py makemessages -l pl to extract the strings that need translating into Polish, and the file locale/pl/LC_MESSAGES/django.po will be updated. After you edit this file, run python manage.py compilemessages -l pl to create django.mo - a file ready to be used by gettext.
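To check that the compiled file landed where gettext expects it, you can ask the standard library's gettext.find for it. The sketch below is only an illustration: find_catalog is a helper name invented here, and the temporary locale tree with an empty placeholder django.mo exists just so the lookup has something to find.

```python
import gettext
import os
import tempfile


def find_catalog(localedir, language, domain="django"):
    """Return the path of <localedir>/<language>/LC_MESSAGES/<domain>.mo, or None."""
    return gettext.find(domain, localedir=localedir, languages=[language])


# Throwaway locale tree with an empty placeholder file, just to show the lookup.
localedir = os.path.join(tempfile.mkdtemp(), "locale")
os.makedirs(os.path.join(localedir, "pl", "LC_MESSAGES"))
open(os.path.join(localedir, "pl", "LC_MESSAGES", "django.mo"), "wb").close()

found = find_catalog(localedir, "pl")    # path ending in pl/LC_MESSAGES/django.mo
missing = find_catalog(localedir, "es")  # None: there is no Spanish catalog here
```

In a real project you would pass the path of your own locale folder instead of the temporary one.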

Although manage.py makemessages will work for Django code and Django templates, it has no idea about Jinja2 and the differences in its i18n tags. It will not work with trans tags. Therefore I'd rather use a more easily configurable tool that is well supported by Jinja. I'm talking about Babel, which you can install using easy_install Babel. If you're not using Django templates at all (except perhaps for the built-in ones), you could just as well switch to Babel completely for all message extraction - even JavaScript.¹

To use Babel, create a config file, for example babel.cfg, with these contents:

[python: **.py]

[jinja2: **/templates/**.html]
encoding = utf-8
babel_style = False       
# babel_style is off, because we won't be using babel's gettext

# actually you don't even need to write the following part:
[extractors]
jinja2 = jinja2.ext:babel_extract

Then, still inside your project's root directory, run the following commands:

pybabel extract -F babel.cfg -o messages.pot .
pybabel init -D django -i messages.pot -d locale -l pl
    # edit locale/pl/LC_MESSAGES/django.po filling it with translations
    # then remove the fuzzy markers:    #, fuzzy ...
    # finally run this:
pybabel compile -D django -i locale/pl/LC_MESSAGES/django.po -d locale -l pl

Once the django.po file has been created, do not run pybabel init again, or it will overwrite all your changes. Instead, run pybabel update with the same arguments. In any case, create backups of the *.po files after major changes rather than spend all day cursing when they vanish.

The explicit and transparent way to get i18n.

Django provides its own wrappers around the *gettext functions, and they have some very nice properties. They pick the right DjangoTranslation object according to the language stored in a thread-local variable, fall back to the default settings.LANGUAGE_CODE when none is set, and do no translation at all if settings.USE_I18N is False. If you want to use translated strings inside your views, you should import those functions in your files:³

from django.utils.translation import ugettext, ungettext
_ = ugettext

The second line is there just to save typing later on. Note that I'm using the unicode versions and not plain gettext and ngettext - using the plain ones would mean getting encoding/decoding exceptions in templates sooner or later. It's best not to use the non-unicode versions at all.
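To make the behaviour of those wrappers more concrete, here is a toy model of the dispatch: one catalog per language, the active language kept per thread, and a fallback to the default. This is not Django's implementation. CATALOGS, DEFAULT_LANGUAGE, USE_I18N and translate are hypothetical stand-ins for the DjangoTranslation objects, the two settings and ugettext.

```python
import threading

# Hypothetical in-memory catalogs standing in for one DjangoTranslation per language.
CATALOGS = {
    "en-us": {},
    "pl": {"Hello!": "Witam!", "stranger": "nieznajomy"},
}
DEFAULT_LANGUAGE = "en-us"  # plays the role of settings.LANGUAGE_CODE
USE_I18N = True             # plays the role of settings.USE_I18N

_active = threading.local()


def activate(language):
    """Remember the language for the current thread only."""
    _active.language = language


def get_language():
    """Return this thread's language, or the default if none was activated."""
    return getattr(_active, "language", DEFAULT_LANGUAGE)


def translate(message):
    """Look the message up in the current thread's catalog; fall back to the msgid."""
    if not USE_I18N:
        return message
    catalog = CATALOGS.get(get_language(), CATALOGS[DEFAULT_LANGUAGE])
    return catalog.get(message, message)
```

Because the language lives in a thread-local, calling activate('pl') while serving one request does not change what another thread sees.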

The simplest way to use gettext in Jinja2 templates is to add those functions to the context. Simple as that. To illustrate, start a shell from within your project folder using python manage.py shell and try it out. I will be using the translation file specified above, the activate(language) function to switch the current language, and a helper function x(string, keywords...) which prints a template rendered from a string. An example session might look like this:

>>> from django.utils.translation import ugettext, ungettext, get_language, activate
>>> from jinja2 import Environment
>>> env = Environment(extensions=['jinja2.ext.i18n'])
>>> def x(s, **kw):
...   kw.update(dict(gettext=ugettext, ngettext=ungettext))
...   print env.from_string(s).render(kw)
...
>>> x("{{ _('Hello!') }}")
Hello!
>>> get_language()
'en-us'
>>> activate('pl')
>>> get_language()
'pl'
>>> x("{{ _('Hello!') }}")
Witam!
>>> x("{{ _('Hello %(name)s!')|format(name=who) }}", who='Mike')
Witaj Mike!
>>> x("{% trans name=who %}Hello {{ name }}!{% endtrans %}", who='Mike')
Witaj Mike!
>>> x("{{ ngettext('There is %(count)s apple.', 'There are %(count)s apples.', n) % {'count':n} }}", n=1)
Jest 1 jabłko.
>>> x("{{ ngettext('There is %(count)s apple.', 'There are %(count)s apples.', n) % {'count':n} }}", n=2)
Są 2 jabłka.
>>> x("{{ ngettext('There is %(count)s apple.', 'There are %(count)s apples.', n) % {'count':n} }}", n=5)
Jest 5 jabłek.
>>> x("{% trans count=n %}There is {{ count }} apple.{% pluralize %}There are {{ count }} apples.{% endtrans %}", n=5)
Jest 5 jabłek.
>>> x("{% trans count=n %}There is {{ count }} apple.{% pluralize %}There are {{ count }} apples.{% endtrans %}", n=1)
Jest 1 jabłko.
>>> activate('en-us')
>>> x("{% trans count=n %}There is {{ count }} apple.{% pluralize %}There are {{ count }} apples.{% endtrans %}", n=1)
There is 1 apple.

Works just like it was supposed to. What's even more fascinating, you don't have to type gettext - the underscore is automatically interpreted as gettext. Using this in actual views is straightforward: either do it just like that or, like me, use Coffin, a Jinja2 adapter for Django which adds useful filters and interoperability, and even loads Jinja's i18n extension by itself when it detects that settings.USE_I18N is True. You will, however, still need to add gettext and ngettext to the context.

This is not a big deal - I use a wrapper around render_to_response anyway, because it would really kill me if I had to write render_to_response('template.html', { 'request':request, 'gettext':ugettext, 'ngettext':ungettext, 'is_auth':request.user.is_authenticated(), 'var':something }) all the time. Instead I simply write r2r('template.html', request, var=something). There's really nothing wrong with making things easier. My r2r function is imported from another file, but for the sake of clarity I'll define it in-line. This is what a simple internationalized view could look like:

# -*- encoding:utf-8 -*-
from django.utils.translation import ugettext, ungettext
from coffin.shortcuts import render_to_response

def r2r(template, request, **context):
    context['request'] = request
    context['is_auth'] = request.user.is_authenticated()
    context['gettext'] = ugettext
    context['ngettext'] = ungettext
    return render_to_response(template, context)

def index(request):
    return r2r('index.html', request)

As for the index.html template, it could look like this (provided you have a useful base.html template to extend):

{% extends "base.html" %}
{% block main %}
<p>{% trans name=request.user.username|default(_('stranger'), true) -%}
Hello {{ name }}!{% endtrans %}</p>
{% endblock main %}

This will print one of the following messages (assuming Mike is the only registered user):

Hello Mike!
Hello stranger!
Witaj Mike!
Witaj nieznajomy!

The hyphen at the end of the trans tag block is essential for it to work, because we didn't create an entry in django.po starting with the newline character. You can of course avoid having to think about it if you write the whole trans block in one line or use gettext extractors like Babel.
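The reason the hyphen matters is that gettext looks a msgid up by exact string match. The plain-Python sketch below models that with a dictionary; CATALOG and lookup are hypothetical stand-ins for the compiled django.mo and for ugettext, and the two msgid strings are what the trans block would hand over with and without the hyphen, as described above.

```python
# Hypothetical catalog holding the one django.po entry that matters here.
CATALOG = {"Hello %(name)s!": "Witaj %(name)s!"}


def lookup(msgid):
    """Exact-match lookup; an unknown msgid comes back untranslated, as with gettext."""
    return CATALOG.get(msgid, msgid)


with_hyphen = "Hello %(name)s!"       # "-%}" stripped the newline after the tag
without_hyphen = "\nHello %(name)s!"  # the newline stayed part of the msgid

translated = lookup(with_hyphen) % {"name": "Mike"}       # 'Witaj Mike!'
untranslated = lookup(without_hyphen) % {"name": "Mike"}  # '\nHello Mike!'
```

The second lookup silently falls back to English, which is easy to miss because no error is raised.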

The other way round.

The previous example didn't require us to make any changes to Jinja's Environment. We could have approached the problem differently, though. Instead of adding gettext to the template's context explicitly each time, we could construct a fake GNUTranslations class that only defines gettext and ngettext, each pointing at the django.utils.translation functions, and proceed exactly as Jinja's manual suggests. There is, however, no single good place to put this code. urls.py looks like a good choice because it is always run. If you're using Coffin, you can add this to urls.py:

from django.utils.translation import ugettext, ungettext

class JinjaTranslations:
    def gettext(self, message): 
        return ugettext(message)
    def ngettext(self, singular, plural, number): 
        return ungettext(singular, plural, number)

from coffin.common import env
env.install_gettext_translations(JinjaTranslations(), newstyle=False)

and without Coffin just get Jinja's Environment (env) some other way. That's it, you are done. There is no need to add anything to the template context by hand as in the previous solution. You also get something else: the ability to use new-style gettext, which has been available since Jinja 2.5. It's off by default, but you only need to set newstyle=True in the call to env.install_gettext_translations to get it working. When it is enabled, instead of writing:

{{ _('Hello %(name)s!')|format(name=who|e) }}
{{ ngettext('There is %(count)s apple.', 
   'There are %(count)s apples.', n)|format(count=n) }}

we would have to write:⁴

{{ _('Hello %(name)s!', name=who|e) }}
{{ ngettext('There is %(count)s apple.', 
   'There are %(count)s apples.', n, count=n) }}

I must admit that new style looks cleaner, but it comes with a price tag attached:

you will have to use only named, keyword arguments (which is actually good, considering different word order in many languages, but is only useful if the translation string has at least two spots to fill),
the only way to get the new style working is by using env.install_gettext_translations(..., newstyle=True) as described above,
you won't be able to use old style gettext with format filter to fill missing pieces any more. That would raise an exception. Therefore some people would rather use old style maintaining same notation whether they have a normal string or a translation string.
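The last point is easier to see with a simplified model of what the new style does: the gettext callable is wrapped so that the keyword arguments are interpolated into the translated string right away. The sketch below is plain Python written under that assumption and is not Jinja's actual code; make_newstyle_gettext and CATALOG are names made up for the illustration.

```python
def make_newstyle_gettext(func):
    """Wrap a plain gettext callable so keyword arguments are interpolated for you."""
    def gettext(message, **variables):
        return func(message) % variables
    return gettext


# Hypothetical one-entry catalog standing in for the compiled django.mo.
CATALOG = {"Hello %(name)s!": "Witaj %(name)s!"}
new_gettext = make_newstyle_gettext(lambda message: CATALOG.get(message, message))

greeting = new_gettext("Hello %(name)s!", name="Mike")  # 'Witaj Mike!'

try:
    # Old-style call: no keywords are passed, so the wrapper's own "%" step
    # fails before a format filter would get the chance to fill in the name.
    new_gettext("Hello %(name)s!")
    old_style_error = None
except KeyError as exc:
    old_style_error = exc
```

A string without placeholders still passes through untouched, so only translation strings that expect arguments are affected.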

¹ JavaScript string extraction with Babel won't be shown here, but it's easy. Django stores translations for JavaScript files in separate djangojs.po files, so that they can be served to the scripts requiring translations rather than handing them the whole translation file. So you could probably write a separate config file, babeljs.cfg, containing the line [javascript: **.js] and use commands similar to the ones shown above. To simplify the task you could make a bash script that runs all of them. The only problem I have encountered trying it out is that line continuation won't work (while it does in Python); for example, the following JavaScript code will be stored as msgid "lo!":
```
s2 = gettext('Hel'\
             'lo!');
```
² Translation files are stored in the locale folder of your project or app. Inside locale there is a folder for each language you are translating into, called pl, es, it, but also pt_BR or de_AT if there's a variant, a flavour of the language. In their LC_MESSAGES sub folders you'll find translation files called domain.po and their compiled versions domain.mo, where domain can be any name (in system locales it's usually the application name), but Django simply uses django.
³ Except in definitions of models and forms in models.py, or in any other static data. For those, use ugettext_lazy and ungettext_lazy instead, which can be imported from the django.utils.translation module.

⁴ There's an error in the official documentation: a keyword argument has to be supplied no matter what if it appears in the translation string, and the number used as the pluralization parameter has nothing to do with it. Have a look and compare:

{# this will work provided such translation exists #}
{{ ngettext('There is an apple.', 'There are apples.', 3) }}

{# this will work #}
{{ ngettext('There is %(count)s apple.', 'There are %(count)s apples.', 3, count=3) }}

{# these will both raise an exception! #}
{{ ngettext('There is %(count)s apple.', 'There are %(count)s apples.', count=3) }}
{{ ngettext('There is %(count)s apple.', 'There are %(count)s apples.', 3) }}
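The four calls can be replayed in plain Python with a simplified model of new-style ngettext, in which the number only selects the plural form and every placeholder must arrive as a keyword. This is a sketch under that assumption, not Jinja's code; make_newstyle_ngettext, english_ngettext and outcome are hypothetical helpers.

```python
def make_newstyle_ngettext(func):
    """Wrap a plain ngettext callable; keywords fill the placeholders."""
    def ngettext(singular, plural, number, **variables):
        return func(singular, plural, number) % variables
    return ngettext


def english_ngettext(singular, plural, number):
    """Stand-in for an untranslated catalog: English plural rule only."""
    return singular if number == 1 else plural


ngettext = make_newstyle_ngettext(english_ngettext)


def outcome(*args, **kwargs):
    """Return the rendered string, or the name of the exception that was raised."""
    try:
        return ngettext(*args, **kwargs)
    except (KeyError, TypeError) as exc:
        return type(exc).__name__


ONE = "There is %(count)s apple."
MANY = "There are %(count)s apples."

no_placeholder = outcome("There is an apple.", "There are apples.", 3)
with_both = outcome(ONE, MANY, 3, count=3)
keyword_only = outcome(ONE, MANY, count=3)  # the number argument is missing
number_only = outcome(ONE, MANY, 3)         # the count keyword is missing
```

In this model the last two calls fail for different reasons: the first lacks the positional number, the second lacks the value for the count placeholder.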
