Internationalization of Elixir applications with Gettext and Transifex

Our Forza Football app is translated into many languages, and therefore our push notifications have to be translated as well. This becomes an interesting problem once there are more target languages than you can handle by yourself.

One of the most important features of our Forza Football app is sending notifications to subscribed users when events such as goals happen in football matches (we wrote about our notifications before). Since our users are based in many different countries, we need to translate these notifications into many different languages. In this post, I will talk about how we tackle this problem in Elixir (that’s what we wrote our push system in) and about the tools we use to do so.

Gettext for Elixir

The main tool we’re using for translation is Gettext for Elixir. This library is an implementation of GNU Gettext for Elixir applications; in short, Gettext is a system for internationalization of software based on having source strings in the source code that are extracted to translation files where the translations for different languages live. The Gettext for Elixir README explains this in more depth if you’re interested.

When we’re building a notification payload to send to a user, we have code that looks like this:

import Pushboy.Gettext, only: [gettext: 2]

gettext "%{minute}′ Red Card - %{player_name} (%{team_name})",
        minute: minute,
        player_name: player_name,
        team_name: team_name

Don’t worry about the name Pushboy, that’s just what our “pusher” app is called. gettext/2 in the code above is a macro automatically defined in our Pushboy.Gettext module by Gettext:

defmodule Pushboy.Gettext do
  use Gettext, otp_app: :pushboy
end

The source string ("%{minute}′ Red Card ...") is how we will identify this translation in the future: at runtime, Gettext uses that source string to find the translation for the right language.

Extracting translations

Once we’re ready for internationalizing our code, the first thing we do is extract the strings that need translation out of the source code. Gettext provides a Mix task for this:

$ mix gettext.extract
Extracted priv/gettext/default.pot

This task reads all the calls to Gettext macros (such as gettext/2 above) at compile-time and extracts them to POT files (with a .pot extension) that look like this:

# This would go in priv/gettext/default.pot

#: lib/pushboy/event/red_card.ex:86
msgid "%{minute}′ Red Card - %{player_name} (%{team_name})"
msgstr ""

POT files are template files that are only meant to hold a list of all the strings to translate. In the file shown above, msgid is the identifier of the string to translate (which is the string itself) and msgstr is where the translation goes. However, translations are not stored in POT files, because a POT file is not specific to any language. Instead, translated strings are stored in PO (.po) files; each PO file lives in a directory specific to a language. For example, if we wanted to translate our string to Italian, we would have a file that looks like this:

# This would go in priv/gettext/it/LC_MESSAGES/default.po

#: lib/pushboy/event/red_card.ex:86
msgid "%{minute}′ Red Card - %{player_name} (%{team_name})"
msgstr "%{minute}′ Cartellino rosso - %{player_name} (%{team_name})"

Gettext reads such PO files at compile time to make the lookup of translations as fast as possible. As you can see, the things between %{ and } are interpolation variables: they’re not meant to be translated, and they will be replaced at runtime with some dynamic value.
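To make the two ideas above more concrete (translations compiled into the module, and %{...} placeholders filled in at runtime), here is a minimal sketch in plain Elixir. It is not Gettext's actual implementation: the TranslationSketch module, its hard-coded @translations list, and the translate/3 function are all hypothetical names made up for illustration.

```elixir
defmodule TranslationSketch do
  # Hypothetical hard-coded data standing in for what would be read from PO files.
  @translations [
    {"it", "%{minute}′ Red Card - %{player_name} (%{team_name})",
     "%{minute}′ Cartellino rosso - %{player_name} (%{team_name})"}
  ]

  # At compile time, generate one function clause per {locale, msgid} pair,
  # so that a lookup at runtime is just pattern matching.
  for {locale, msgid, msgstr} <- @translations do
    defp lookup(unquote(locale), unquote(msgid)), do: unquote(msgstr)
  end

  # Fallback clause: with no translation available, use the source string itself.
  defp lookup(_locale, msgid), do: msgid

  # Looks up the translation and then replaces the interpolation variables.
  def translate(locale, msgid, bindings) do
    locale
    |> lookup(msgid)
    |> interpolate(bindings)
  end

  # Replaces every %{key} with the matching binding; unknown keys are left as-is.
  defp interpolate(string, bindings) do
    values = Map.new(bindings, fn {key, value} -> {Atom.to_string(key), to_string(value)} end)

    Regex.replace(~r/%\{(\w+)\}/, string, fn placeholder, key ->
      Map.get(values, key, placeholder)
    end)
  end
end
```

Calling TranslationSketch.translate/3 with the "it" locale picks the Italian clause, while any locale without a translation falls back to the English source string; in both cases the interpolation happens after the lookup.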

With the POT file and the PO file above, our translation would look like this:

iex> import Pushboy.Gettext, only: [gettext: 2]
iex> Gettext.put_locale(Pushboy.Gettext, "it")
iex> gettext "%{minute}′ Red Card - %{player_name} (%{team_name})",
...>         minute: 38,
...>         player_name: "Cristiano Ronaldo",
...>         team_name: "Real Madrid"
"38′ Cartellino rosso - Cristiano Ronaldo (Real Madrid)"

Translating into different languages

While we solved the problem of how we internationalize our application, we still haven’t solved the problem of how we translate our strings. Our app supports many more languages than our team members speak, so translating in-house is not an option.

To fix this problem, we use Transifex. Transifex is a website that provides both an easy-to-use interface for organizing and editing translations and support for outsourcing translations. Basically, they partnered up with a few translation services so that you can “order” translations from such services and get strings translated to the languages of your choice by people who speak those languages.

While we use external translation services for most languages, we also have native speakers for around six languages in the company; with Transifex, we can easily let these native speakers contribute to the translations. This is possible since Transifex’s interface is straightforward and less tech-savvy users can use it intuitively.

Figure: The Transifex interface

Gettext integration

Transifex integrates with many translation platforms, and Gettext is one of them. You can upload PO/POT files to Transifex and download translations as PO files. Transifex has the concept of “resources”, which are different “domains” of translations (for example, one resource could be notifications, while another one could be error messages, and so on). Luckily, Transifex resources map exactly to Gettext domains: in Gettext, you can use the dgettext/3 macro to extract a translation to a different domain (gettext/2 uses the "default" domain), and each domain ends up in a different PO(T) file (we had default.pot in the examples above).

The workflow is roughly this: first, we extract strings to translate from our source code into a POT file with the mix gettext.extract task, as shown above. Then, we upload this POT file to Transifex in order to update, add, or remove the strings to translate.

Figure: Uploading POT files to Transifex

After that, we wait for our coworkers to translate what they can and for the translation services to take care of translating. Once all the translations are ready, we download the PO files for all the languages we need from Transifex.

Command-line tools

The workflow described above works but is pretty tedious, involves a lot of manual interactions, and becomes slower the more languages we need to translate to. Lucky for us, Transifex provides a nifty command-line tool called tx: this tool allows us to “push” source strings to translate when we run mix gettext.extract and to “pull” translated strings into PO files once they’re available.

tx can be configured via the .tx/config file in the root of your project. Ours looks like this:

[main]
host = https://www.transifex.com

[pushboy.default]
type = PO
source_file = priv/gettext/default.pot
source_lang = en
file_filter = priv/gettext/<lang>/LC_MESSAGES/default.po

In this file, we configure the pushboy.default resource (which maps to our "default" Gettext domain) and instruct the tx tool that:

we want to use the Gettext format (PO and POT)
our source strings to translate live in priv/gettext/default.pot
our source language is English
the translations that we’ll download should end up inside priv/gettext, under the language they’re translated to (<lang> is replaced by tx)
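To illustrate the last point, the <lang> substitution can be pictured as a plain string replacement. The snippet below is only a sketch of that idea in Elixir, not how tx is implemented; the FileFilterSketch module and its function names are hypothetical.

```elixir
defmodule FileFilterSketch do
  # Replaces the <lang> placeholder of a file_filter with a language code.
  def expand(file_filter, lang) do
    String.replace(file_filter, "<lang>", lang)
  end

  # Expands the same file_filter for a list of language codes.
  def expand_all(file_filter, langs) do
    Enum.map(langs, &expand(file_filter, &1))
  end
end

file_filter = "priv/gettext/<lang>/LC_MESSAGES/default.po"

# Prints one PO path per language, for example the Italian and Swedish ones.
file_filter
|> FileFilterSketch.expand_all(["it", "sv"])
|> Enum.each(&IO.puts/1)
```

The resulting paths follow the same priv/gettext/<locale>/LC_MESSAGES/<domain>.po layout used for the Italian PO file earlier, which is why the downloaded files can be picked up by Gettext without moving anything around.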

With the configuration file above, we can now push new, edited, or removed source strings to translate with tx push --source (--source ensures we only push the updated POT) and pull updated translations with tx pull.

Cleaning up Transifex PO files

We’re pretty strict when it comes to style in our codebase, and we found that the PO files that Transifex generated for us were not quite to our taste. Luckily, Gettext provides tools to easily parse and modify PO/POT files. We ended up writing a new Mix task, mix translations.pull, that abstracts the interaction with tx away.

In this task, we first call tx pull, then we iterate over all the pulled PO files and “reformat” them the way we want:

defp reformat_po_file(path) do
  reformatted_po =
    path
    |> Gettext.PO.parse_file!()
    |> reformat_headers()
    |> remove_top_of_the_file_comments()

  File.write!(path, Gettext.PO.dump(reformatted_po))
end

# We get rid of the Last-Translator header
defp reformat_headers(%Gettext.PO{headers: headers} = po) do
  new_headers = Enum.reject(headers, &String.starts_with?(&1, "Last-Translator"))
  %Gettext.PO{po | headers: new_headers}
end

# We get rid of comments that Transifex leaves at the top of the PO file
defp remove_top_of_the_file_comments(%Gettext.PO{} = po) do
  %Gettext.PO{po | top_of_the_file_comments: []}
end
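The helpers above are only one part of the task. As a rough sketch of how the surrounding Mix task might be wired together, assuming the PO files live under priv/gettext as in the .tx/config shown earlier: the po_files/1 helper is a hypothetical name, and reformat_po_file/1 is reduced to a placeholder here so that the sketch stands on its own.

```elixir
defmodule Mix.Tasks.Translations.Pull do
  use Mix.Task

  @shortdoc "Pulls translations from Transifex and reformats the PO files"

  @impl Mix.Task
  def run(_args) do
    # Run `tx pull` and stream its output; a non-zero exit status
    # makes the pattern match fail and the task crash.
    {_output, 0} = System.cmd("tx", ["pull"], into: IO.stream(:stdio, :line))

    "priv/gettext"
    |> po_files()
    |> Enum.each(&reformat_po_file/1)
  end

  # Hypothetical helper: finds every PO file under <root>/<lang>/LC_MESSAGES/.
  def po_files(root) do
    Path.wildcard(Path.join(root, "*/LC_MESSAGES/*.po"))
  end

  # Placeholder standing in for the reformat_po_file/1 function shown above.
  defp reformat_po_file(path) do
    Mix.shell().info("Reformatting #{path}")
  end
end
```

Naming the module Mix.Tasks.Translations.Pull is what makes the task available as mix translations.pull.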

The complete workflow

Our complete workflow is currently the following:

after modifying our source code, we run mix gettext.extract and get a bunch of updated POT files
we run tx push --source to update the strings to translate on Transifex
we wait for our coworkers and for the translation services to translate the updated strings
we run mix translations.pull and get updated translations

We really enjoy this workflow as it allows us to programmatically update, push, and pull translations, and at the same time it allows us to scale really easily when adding new languages (as nothing changes in the workflow, we just need to order translations for those languages as well on Transifex).

Conclusion

We took a look at how we translate push notifications for our Forza Football app into several languages in a way that is easy to use, fast, and scales well with the number of translations and languages we have.