tl;dr: I recently spoke at HTML5DevConf about how Coursera internationalizes (i18n) and localizes (l10n) our web app. This post recaps the talk.
Coursera’s mission is to make online education as accessible and affordable as possible. Over half of our learners are outside of the US, making i18n critical to how we fulfill that mission.
This means that our team has had to learn a lot about i18n and l10n.
i18n: designing your tech stack so that every user-facing string can be swapped for its equivalent in another language.
l10n: actually translating the strings of your UI into a particular language.
Let’s talk about the first 6 steps needed to begin building your first localized app.
Step 1. Choose an i18n library
There are a number of i18n libraries to choose from. A compact i18n library will do two things for you:
- replace all wrapped strings with a translated string
- determine how you organize your translated strings
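To make those two jobs concrete, here is a minimal sketch of what a tiny i18n helper does. The `phrases` table and the `createTranslator` function are made up for illustration and are not the API of any particular library.

```javascript
// Hypothetical phrase tables, keyed by language code. How these tables are
// organized (one object, one file per language, etc.) is the second job of
// an i18n library.
const phrases = {
  en: { greeting: "Hello, world" },
  es: { greeting: "Hola, mundo" },
};

// Returns a translation function bound to one language. Replacing wrapped
// strings with their translation is the first job of an i18n library.
function createTranslator(lang, fallbackLang = "en") {
  const table = phrases[lang] || {};
  const fallback = phrases[fallbackLang] || {};
  return function t(key) {
    // Prefer the requested language, then the fallback, then the raw key.
    if (key in table) return table[key];
    if (key in fallback) return fallback[key];
    return key;
  };
}

const t = createTranslator("es");
console.log(t("greeting")); // "Hola, mundo"
```

Falling back to a default language (and finally to the key itself) keeps the UI readable while translations are still incomplete.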
The best library for your team depends on some of your other library choices. Angular, React, webpack, RequireJS and even jQuery all have i18n libraries that they integrate with.
If you don’t want to spend too much time on this decision, you can’t go wrong with the simplicity of Polyglot. You can always move to another library once you learn more.
We started off with the RequireJS i18n plugin plus some in-house tools, and we will soon be transitioning to a custom webpack plugin.
Step 2. Determine the user’s language
Next, you’ll need to determine what language to serve up to your users.
There are three different techniques you can use to determine your users’ language. At Coursera, we use a mix of all three:
- ask your users directly in their account settings
- use the Accept-Language HTTP request header
- use IP geo detection to map IP to country to language
Asking your users directly gives you the cleanest signal of what language to serve. However, this will only work for users who can manage to find your account settings page.
The Accept-Language HTTP request header gives you a list of languages the user may be OK with receiving. This is the fastest way to make a decision, though we have found a few edge cases: some people read multiple languages, sometimes the header is empty, and so on.
When all else fails, MaxMind’s GeoIP database lets you map an IP address to a country, and from there to a likely language.
While none of the three techniques is foolproof, combining them will give you a good guess at what language your user is comfortable reading.
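One way to combine the three signals is a simple priority chain. The sketch below is an assumption about how that could look, not Coursera’s actual logic: the `SUPPORTED` list, the `COUNTRY_TO_LANG` table and the function names are hypothetical, and the `country` value is expected to come from whatever geo IP lookup you use.

```javascript
// Languages the app has translations for (hypothetical).
const SUPPORTED = ["en", "zh", "es"];

// Hypothetical country-to-language table; a real one would be much larger.
const COUNTRY_TO_LANG = { US: "en", CN: "zh", MX: "es" };

// Parse a header such as "zh-CN,zh;q=0.9,en;q=0.8" into base language
// codes, ordered from most to least preferred.
function parseAcceptLanguage(header) {
  if (!header) return [];
  return header
    .split(",")
    .map((part) => {
      const [tag, ...params] = part.trim().split(";");
      const qParam = params.find((p) => p.trim().startsWith("q="));
      const q = qParam ? parseFloat(qParam.trim().slice(2)) : 1;
      return {
        lang: tag.trim().toLowerCase().split("-")[0],
        q: Number.isNaN(q) ? 0 : q,
      };
    })
    .filter((entry) => entry.lang && entry.lang !== "*")
    .sort((a, b) => b.q - a.q)
    .map((entry) => entry.lang);
}

function pickLanguage({ userSetting, acceptLanguage, country } = {}) {
  // 1. An explicit account setting is the cleanest signal.
  if (SUPPORTED.includes(userSetting)) return userSetting;
  // 2. Otherwise take the first supported language from the header.
  const fromHeader = parseAcceptLanguage(acceptLanguage).find((lang) =>
    SUPPORTED.includes(lang)
  );
  if (fromHeader) return fromHeader;
  // 3. Otherwise guess from the geo-detected country.
  const fromGeo = COUNTRY_TO_LANG[country];
  if (SUPPORTED.includes(fromGeo)) return fromGeo;
  // 4. Nothing matched: fall back to the default language.
  return "en";
}

console.log(pickLanguage({ acceptLanguage: "fr;q=0.9,zh-CN;q=0.8,en;q=0.5" })); // "zh"
```

The order matters: the account setting wins because the user chose it on purpose, while the geo lookup comes last because a country is only a rough hint about language.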
Step 3. Configure your library
Next, you’ll need to decide how to configure your i18n library to load up the required language strings.
There are two approaches you can take: dynamically load language-specific strings on the client, or pre-compile language-specific versions of your application.
We used to serve up strings dynamically, but found that this gets slow as the number of strings grows. We are moving towards building a separate bundle of our app for each language, and serving the right one directly once we determine (on the server) what language a user needs.
Here is an example of our webpack config that we use to build our app with bundled Chinese strings:
We then serve up the Chinese bundle of our web app when we believe it is preferred over our English bundle.
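On the server, that choice can be as small as mapping the detected language to a bundle path. This is a sketch under assumptions: the `/dist/app.<lang>.js` naming and the `BUILT_BUNDLES` set are hypothetical.

```javascript
// Languages for which a bundle has actually been built (hypothetical).
const BUILT_BUNDLES = new Set(["en", "zh"]);

// Returns the bundle to serve, falling back to English when no bundle
// exists for the requested language.
function bundlePathFor(lang) {
  const chosen = BUILT_BUNDLES.has(lang) ? lang : "en";
  return `/dist/app.${chosen}.js`;
}

console.log(bundlePathFor("zh")); // "/dist/app.zh.js"
console.log(bundlePathFor("fr")); // "/dist/app.en.js"
```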
Step 4. Extract all the strings
This step takes three actions on your part: import the strings into your views, wrap your strings with your translation function, and extract your strings into the appropriate JSON folder/file structure.
Here is a React component before extraction:
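As a rough illustration of the before and after, here is a sketch that uses plain render functions in place of a real React component so it runs on its own. The `_t` helper and the `strings` table are hypothetical stand-ins for whatever your i18n library provides.

```javascript
// Before extraction: the user-facing string is hard-coded in the view.
function renderHelloBefore() {
  return "<h1>Hello, world!</h1>";
}

// After extraction: the string lives in a per-language table (normally a
// separate file that the view imports).
const strings = {
  "Hello, world!": "Ellohay, orldway!", // pig latin stand-in for a translation
};

// Hypothetical translation function: look up the source string and fall
// back to the source string itself when no translation exists.
function _t(source) {
  return Object.prototype.hasOwnProperty.call(strings, source)
    ? strings[source]
    : source;
}

// The view now wraps every user-facing string in _t().
function renderHelloAfter() {
  return `<h1>${_t("Hello, world!")}</h1>`;
}

console.log(renderHelloBefore()); // "<h1>Hello, world!</h1>"
console.log(renderHelloAfter()); // "<h1>Ellohay, orldway!</h1>"
```

Because every string goes through `_t`, a build step can scan the source for `_t("...")` calls and collect the strings that need translating.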
Our i18n plugin has us store many language-specific JSON files in the app/l10n/hello folder. Here is an example of app/l10n/hello/piglatin.js:
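A sketch of what such a per-language file might hold is shown here. The entries and the `missingKeys` helper are made up for illustration, and the plugin’s real file format may differ.

```javascript
// Pig latin strings (sketch): maps each source string to its translation.
const piglatin = {
  "Hello": "Ellohay",
  "Welcome back": "Elcomeway ackbay",
};

// Small helper for spotting gaps: lists the source strings that a language
// table does not translate yet.
function missingKeys(sourceStrings, table) {
  return sourceStrings.filter(
    (source) => !Object.prototype.hasOwnProperty.call(table, source)
  );
}

console.log(missingKeys(["Hello", "Sign out"], piglatin)); // ["Sign out"]
```

A fake language like pig latin is handy during development because untranslated strings stand out immediately in the UI.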
Step 5. Find translators
Translating your strings seems simple but can get complicated very quickly, especially when you have strings with variables that need interpolation.
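Interpolation gets tricky because word order changes between languages. The sketch below assumes a made-up `{name}` placeholder syntax and a pseudo-language `xx` whose only purpose is to show the variables swapping places; neither comes from a real library.

```javascript
// Replace {placeholders} with values; unknown placeholders are left as-is
// so that missing variables are easy to spot.
function interpolate(template, vars) {
  return template.replace(/\{(\w+)\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(vars, key) ? String(vars[key]) : match
  );
}

// String concatenation bakes English word order into the code:
//   "Taught by " + instructor + " at " + school
// A single template per language lets each translation reorder the variables.
const templates = {
  en: "Taught by {instructor} at {school}",
  xx: "At {school}, by {instructor} taught", // pseudo-language, reordered
};

const vars = { instructor: "Ada", school: "Example University" };
console.log(interpolate(templates.en, vars)); // "Taught by Ada at Example University"
console.log(interpolate(templates.xx, vars)); // "At Example University, by Ada taught"
```

Translators need the whole sentence, placeholders included, to produce a natural result, which is why sending them sentence fragments tends to go badly.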
Start out simple and use Google Translate. Quality will range from not-so-great to barely understandable, depending on the language, but it will give your team data to start playing around with right away.
Once you start to productionize your work, turn to your co-workers, friends, family and even your own users for translations! Coursera went through all of these stages, and we are excited that over 30,000 users have joined our mission by helping us translate our platform.
Scaling this process up takes an investment in automation. Early on, you can just send JSON files around and have them translated, but as interpolation and community size scale up, you’ll want to use tools like Transifex.
Step 6. Broadcast language support
This last step is less a step and more a piece of advice to help you market (read: SEO) your support of other languages to new users.
When serving your HTML pages in another language, take advantage of the lang attribute.
Also, add link tags that point to the alternate-language versions of the page you are serving:
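As a sketch of both pieces of advice, the helper below generates `rel="alternate"` link tags with `hreflang` values and the opening `html` tag with its `lang` attribute. The `example.com` domain, the one-subdomain-per-language URL scheme and the language list are assumptions for illustration.

```javascript
// Languages the site is available in (hypothetical).
const LANGUAGES = ["en", "zh", "es"];

// Hypothetical URL scheme: one subdomain per language, same path.
function alternateUrl(lang, pathname) {
  return `https://${lang}.example.com${pathname}`;
}

// One <link rel="alternate"> per language version of the current page.
function alternateLinkTags(pathname) {
  return LANGUAGES.map(
    (lang) =>
      `<link rel="alternate" hreflang="${lang}" href="${alternateUrl(lang, pathname)}">`
  ).join("\n");
}

// The served page also declares its own language on the root element.
function htmlOpenTag(lang) {
  return `<html lang="${lang}">`;
}

console.log(htmlOpenTag("zh"));
console.log(alternateLinkTags("/courses"));
```

The list of alternates normally includes the page’s own language as well, so every version of the page carries the same set of links.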
Lastly, have static language alternatives of your site. We follow John Resig’s advice on using subdomains to organize our localized alternatives.
These alternatives are friendly to both search engines and your users.
The above 6 steps should be enough to get feedback from your users and to help you determine if this is a worthwhile investment for your product. If the answer is yes, you’ll quickly learn why i18n is hard.
As we spend more time on our i18n work, we will consider writing up our experience with some of the trickier i18n subjects like interpolation, dates, currency, string fragmentation, translation memory usage, pluralization and many others! (We are currently using Format.js to help solve some of these.)
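To give a taste of why these are tricky, here is a small sketch that uses the built-in `Intl` API of modern JavaScript runtimes rather than Format.js. The `courseCount` message table and the helper names are hypothetical.

```javascript
// Pluralization: the set of plural categories depends on the language.
// English only needs "one" and "other"; languages such as Russian also
// use "few" and "many".
function pluralCategory(locale, n) {
  return new Intl.PluralRules(locale).select(n);
}

// Hypothetical English messages keyed by plural category.
const courseCount = {
  one: "{n} course",
  other: "{n} courses",
};

function formatCourseCount(n) {
  const template = courseCount[pluralCategory("en", n)] || courseCount.other;
  return template.replace("{n}", String(n));
}

// Currency: symbol position and separators vary by locale.
function formatPrice(locale, currency, amount) {
  return new Intl.NumberFormat(locale, { style: "currency", currency }).format(amount);
}

// Dates: field order and month names vary by locale. The time zone is
// pinned to UTC so the result does not depend on the machine running it.
function formatDate(locale, date) {
  return new Intl.DateTimeFormat(locale, { dateStyle: "long", timeZone: "UTC" }).format(date);
}

console.log(formatCourseCount(1)); // "1 course"
console.log(formatCourseCount(3)); // "3 courses"
console.log(formatPrice("en-US", "USD", 1234.5)); // "$1,234.50"
console.log(formatDate("en-US", new Date(Date.UTC(2015, 0, 5)))); // "January 5, 2015"
```

Hand-written `n === 1 ? singular : plural` checks stop working as soon as a language has more than two plural forms, which is why plural rules are better left to a library or the platform.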
Lastly, if you think you don’t have the time or resources, please reconsider: we originally had an intern start these efforts for us!