Internationalization of Elixir applications with Gettext and Transifex
Our Forza Football app is translated into many languages, and therefore our push notifications have to be translated as well. This becomes an interesting problem once there are more target languages than you can handle by yourself.
One of the most important features of our Forza Football app is sending notifications to subscribed users when events such as goals happen in football matches (we wrote about our notifications before). Since our users are based in many different countries, we need to translate these notifications into many different languages. In this post, I will talk about how we tackle this problem in Elixir (the language our push system is written in) and about the tools we use to do so.
Gettext for Elixir
The main tool we’re using for translation is Gettext for Elixir. This library is an implementation of GNU Gettext for Elixir applications; in short, Gettext is a system for internationalization of software based on having source strings in the source code that are extracted to translation files where the translations for different languages live. The Gettext for Elixir README explains this in more depth if you’re interested.
When we’re building a notification payload to send to a user, we have code that looks like this:
import Pushboy.Gettext, only: [gettext: 2]

gettext "%{minute}′ Red Card - %{player_name} (%{team_name})",
  minute: minute,
  player_name: player_name,
  team_name: team_name
Don’t worry about the name Pushboy, that’s just what our “pusher” app is called. gettext/2 in the code above is a macro automatically defined in our Pushboy.Gettext module by Gettext:
defmodule Pushboy.Gettext do
  use Gettext, otp_app: :pushboy
end
The source string ("%{minute}′ Red Card ...") is how we will identify this translation in the future, and Gettext will find the translation for the right language at runtime based on that source string.
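To make the idea of looking up a translation by its source string more concrete, here is a minimal sketch in plain Elixir. It is not how Gettext is implemented (Gettext compiles PO files into the backend module); the LookupSketch module and its in-memory table are assumptions made up for illustration. It only shows the two steps involved: find the translated string for a locale, falling back to the source string when there is none, then fill in the %{...} bindings.

```elixir
defmodule LookupSketch do
  # Hypothetical in-memory table keyed by {locale, msgid}.
  # Gettext itself builds its lookups from PO files at compile time.
  @translations %{
    {"it", "%{minute}′ Red Card - %{player_name} (%{team_name})"} =>
      "%{minute}′ Cartellino rosso - %{player_name} (%{team_name})"
  }

  # Falls back to the source string when the locale has no translation.
  def translate(locale, msgid, bindings) do
    @translations
    |> Map.get({locale, msgid}, msgid)
    |> interpolate(bindings)
  end

  # Replaces each %{name} with the matching binding; unknown names are left as-is.
  defp interpolate(string, bindings) do
    values = Map.new(bindings, fn {key, value} -> {Atom.to_string(key), to_string(value)} end)

    Regex.replace(~r/%\{(\w+)\}/u, string, fn whole, name ->
      Map.get(values, name, whole)
    end)
  end
end
```

Calling LookupSketch.translate("it", msgid, bindings) returns the Italian string with the bindings filled in, while a locale without an entry gets the English source string back.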
Extracting translations
Once we’re ready for internationalizing our code, the first thing we do is extract the strings that need translation out of the source code. Gettext provides a Mix task for this:
$ mix gettext.extract
Extracted priv/gettext/default.pot
This task reads all the calls to Gettext macros (such as gettext/2 above) at compile time and extracts them to POT files (with a .pot extension) that look like this:
# This would go in priv/gettext/default.pot
#: lib/pushboy/event/red_card.ex:86
msgid "%{minute}′ Red Card - %{player_name} (%{team_name})"
msgstr ""
POT files are template files that are only meant to hold a list of all the strings to translate. In the file shown above, msgid is the identifier of the string to translate (which is the string itself) and msgstr is where the translation goes. However, translations are not stored in POT files, because a POT file is not specific to any language. Instead, translated strings are stored in PO (.po) files; each PO file lives in a directory specific to a language. For example, if we wanted to translate our string to Italian, we would have a file that looks like this:
# This would go in priv/gettext/it/LC_MESSAGES/default.po
#: lib/pushboy/event/red_card.ex:86
msgid "%{minute}′ Red Card - %{player_name} (%{team_name})"
msgstr "%{minute}′ Cartellino rosso - %{player_name} (%{team_name})"
Gettext reads such PO files at compile time to make the lookup of translations as fast as possible. As you can see, the things between %{ and } are interpolation variables: they’re not meant to be translated, and they will be replaced at runtime with some dynamic value.
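Since interpolation variables must survive translation untouched, a small sanity check can catch a translator accidentally renaming one. The sketch below is not part of Gettext or of our codebase; the PlaceholderCheck module is a hypothetical helper, written in plain Elixir, that compares the variables found in a msgid with those found in its msgstr.

```elixir
defmodule PlaceholderCheck do
  # Returns the sorted, de-duplicated %{name} variables found in a string.
  def placeholders(string) do
    ~r/%\{(\w+)\}/u
    |> Regex.scan(string, capture: :all_but_first)
    |> List.flatten()
    |> Enum.uniq()
    |> Enum.sort()
  end

  # True when the translation uses exactly the same variables as the source.
  def same_placeholders?(msgid, msgstr) do
    placeholders(msgid) == placeholders(msgstr)
  end
end
```

With the Italian example above, same_placeholders?/2 returns true; it would return false if %{minute} had been turned into something like %{minuto}.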
With the POT file and the PO file above, our translation would look like this:
iex> import Pushboy.Gettext, only: [gettext: 2]
iex> Gettext.put_locale(Pushboy.Gettext, "it")
iex> gettext "%{minute}′ Red Card - %{player_name} (%{team_name})",
...> minute: 38,
...> player_name: "Cristiano Ronaldo",
...> team_name: "Real Madrid"
"38′ Cartellino rosso - Cristiano Ronaldo (Real Madrid)"
Translating into different languages
While we solved the problem of how we internationalize our application, we still haven’t solved the problem of how we translate our strings. Our app supports many more languages than our team members speak, so translating in-house is not an option.
To fix this problem, we use Transifex. Transifex is a website that provides both an easy-to-use interface for organizing and editing translations as well as support for outsourcing translations. Basically, they partnered up with a few translation services so that you can “order” translations from such services and get strings translated to different languages of your choice by users that speak those languages.
While we use external translation services for most languages, we also have native speakers for around six languages in the company; with Transifex, we can easily let these native speakers contribute to the translations. This is possible since Transifex’s interface is straightforward and less tech-savvy users can use it intuitively.
Gettext integration
Transifex integrates with many translation platforms, and Gettext is one of them. You can upload PO/POT files to Transifex and download translations as PO files. Transifex has the concept of “resources”, which are different “domains” of translations (for example, one resource could be notifications, while another one could be error messages, and so on). Luckily, Transifex resources map exactly to Gettext domains: in Gettext, you can use the dgettext/3 macro to extract a translation to a different domain (gettext/2 uses the "default" domain), and each domain ends up in a different PO(T) file (we had default.pot in the examples above).
The workflow is roughly this: first, we extract strings to translate from our source code into a POT file with the mix gettext.extract task, as shown above. Then, we upload this POT file to Transifex in order to update, add, or remove the strings to translate.
After that, we wait for our coworkers to translate what they can and for the translation services to take care of translating. Once all the translations are ready, we download the PO files for all the languages we need from Transifex.
Command-line tools
The workflow described above works but is pretty tedious, involves a lot of manual interactions, and becomes slower the more languages we need to translate to. Lucky for us, Transifex provides a nifty command-line tool called tx: this tool allows us to “push” source strings to translate when we run mix gettext.extract and to “pull” translated strings into PO files once they’re available.
tx can be configured via the .tx/config file in the root of your project. Ours looks like this:
[main]
host = https://www.transifex.com
[pushboy.default]
type = PO
source_file = priv/gettext/default.pot
source_lang = en
file_filter = priv/gettext/<lang>/LC_MESSAGES/default.po
In this file, we configure the pushboy.default resource (which maps to our "default" Gettext domain) and instruct the tx tool that:
- we want to use the Gettext format (PO and POT)
- our source strings to translate live in priv/gettext/default.pot
- our source language is English
- the translations that we’ll download should end up inside priv/gettext, under the language they’re translated to (<lang> is replaced by tx)
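To illustrate the last point, the sketch below mimics the <lang> substitution in plain Elixir. It is not how tx is implemented; the FileFilterSketch module is a hypothetical name, and the only thing taken from our configuration is the file_filter value itself.

```elixir
defmodule FileFilterSketch do
  # The file_filter value from the .tx/config shown above.
  @file_filter "priv/gettext/<lang>/LC_MESSAGES/default.po"

  # Returns the path where the PO file for the given language would end up.
  def po_path(lang, file_filter \\ @file_filter) do
    String.replace(file_filter, "<lang>", lang)
  end
end
```

For "it", this yields priv/gettext/it/LC_MESSAGES/default.po, which is exactly where Gettext expects the Italian PO file for the "default" domain.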
With the configuration file above, we can now push new, edited, or removed source strings to translate with tx push --source (--source ensures we only push the updated POT) and pull updated translations with tx pull.
Cleaning up Transifex PO files
We’re pretty strict when it comes to style in our codebase, and we found that the PO files that Transifex generated for us were not perfect for our taste. Luckily, Gettext provides tools to easily parse and modify PO/POT files. What we ended up having is a new Mix task, mix translations.pull, that abstracts the interaction with tx away.
In this task, we first call tx pull, then we iterate over all pulled PO files and “reformat” them the way we want:
defp reformat_po_file(path) do
  reformatted_po =
    path
    |> Gettext.PO.parse_file!()
    |> reformat_headers()
    |> remove_top_of_the_file_comments()

  File.write!(path, Gettext.PO.dump(reformatted_po))
end

# We get rid of the Last-Translator header
defp reformat_headers(%Gettext.PO{headers: headers} = po) do
  new_headers = Enum.reject(headers, &String.starts_with?(&1, "Last-Translator"))
  %Gettext.PO{po | headers: new_headers}
end

# We get rid of comments that Transifex leaves at the top of the PO file
defp remove_top_of_the_file_comments(%Gettext.PO{} = po) do
  %Gettext.PO{po | top_of_the_file_comments: []}
end
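For context, the skeleton of such a Mix task could look roughly like the sketch below. This is not our actual task: the shape of run/1, the public po_files/1 helper, and the hard-coded priv/gettext root are assumptions, and reformat_po_file/1 is reduced to a logging placeholder so that the sketch does not depend on Gettext.

```elixir
defmodule Mix.Tasks.Translations.Pull do
  use Mix.Task

  @shortdoc "Pulls translations from Transifex and reformats the PO files"

  @impl Mix.Task
  def run(_args) do
    # Stream the output of `tx pull` to the terminal and stop on failure.
    case System.cmd("tx", ["pull"], into: IO.stream(:stdio, :line)) do
      {_output, 0} -> :ok
      {_output, status} -> Mix.raise("tx pull exited with status #{status}")
    end

    Enum.each(po_files("priv/gettext"), &reformat_po_file/1)
  end

  # Finds every PO file under <root>/<lang>/LC_MESSAGES/.
  def po_files(root) do
    Path.wildcard(Path.join([root, "*", "LC_MESSAGES", "*.po"]))
  end

  # Placeholder: the reformat_po_file/1 shown above would go here.
  defp reformat_po_file(path) do
    Mix.shell().info("Reformatting #{path}")
  end
end
```

Because the module lives under Mix.Tasks, Mix picks it up as mix translations.pull without any extra configuration.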
The complete workflow
Our complete workflow is currently the following:
- after modifying our source code, we run mix gettext.extract and get a bunch of updated POT files
- we run tx push --source to update the strings to translate on Transifex
- we wait for our coworkers and for the translation services to translate the updated strings
- we run mix translations.pull and get updated translations
We really enjoy this workflow as it allows us to programmatically update, push, and pull translations, and at the same time it allows us to scale really easily when adding new languages (as nothing changes in the workflow, we just need to order translations for those languages as well on Transifex).
Conclusion
We took a look at how we translate push notifications for our Forza Football app into several languages in a way that is easy to use, fast, and scales well with the number of translations and languages we have.