Coding for the world, part 2: Keeping it together—or is separation a thing?

We are living in odd times, and keeping it together is not always easy. I hope all my readers are doing well. As much as I would like to have the power to make things smoother around the world, I am afraid I don’t. My superpower is concentrating on a rather smaller-scale issue: helping you get the code right for localization. Let’s get started.

Keeping it together

Today it is all about keeping content together: all the content, even a colon or full stop. For localization, it is very important to have all the content of a string together and not pieced together after translation by code.

Different languages have different punctuation marks and rules. 

The Spanish translation of “Where?” is “¿Dónde?”; note the inverted question mark at the beginning of the translation. The Spanish reader likes to know right from the start what kind of sentence they are about to read. In French there is a non-breaking space before the question mark: “Où ?”. Some languages don’t use spaces between words at all, such as Japanese, where “Add user” becomes “ユーザーの追加”. Date ranges differ as well: French uses a regular dash surrounded by spaces, “2 - 5 juin 2022”, whereas German uses an en dash (slightly longer than the regular dash) with no spaces, “2.–5. Juni 2022”.

Let’s check out a code example:

English:

"WhatsNextHeaderForGetStarted": "What’s next?",

Greek translation:

"WhatsNextHeaderForGetStarted": "Τι ακολουθεί;",

As you may notice, there is no “?” character at all. If you add the question mark AFTER localization and send out something like this:

"WhatsNextHeaderForGetStarted": "What’s next",

The Greek would look like this in the UI and would not read as native to the Greek user:

Τι ακολουθεί?

In Greek, the character that looks like an English semicolon (;) is the question mark, while the job of the English semicolon is done by a raised point (·).
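
The difference can be sketched in a few lines of Python. This is only an illustration: the two dictionaries stand in for per-locale resource files, and the function names are hypothetical, not part of any real library.

```python
# Hypothetical resource tables standing in for per-locale resource files.
# Poor: the strings were sent out for translation without punctuation.
STRINGS_WITHOUT_PUNCTUATION = {
    "en": {"WhatsNextHeaderForGetStarted": "What’s next"},
    "el": {"WhatsNextHeaderForGetStarted": "Τι ακολουθεί"},
}

# Good: each string is complete, including its own punctuation.
STRINGS_COMPLETE = {
    "en": {"WhatsNextHeaderForGetStarted": "What’s next?"},
    "el": {"WhatsNextHeaderForGetStarted": "Τι ακολουθεί;"},
}


def header_with_appended_mark(locale: str) -> str:
    """Poor: code glues an English question mark onto the translation."""
    return STRINGS_WITHOUT_PUNCTUATION[locale]["WhatsNextHeaderForGetStarted"] + "?"


def header_from_complete_string(locale: str) -> str:
    """Good: the translator decided the punctuation, code only looks it up."""
    return STRINGS_COMPLETE[locale]["WhatsNextHeaderForGetStarted"]


if __name__ == "__main__":
    print(header_with_appended_mark("el"))    # English "?" forced onto Greek
    print(header_from_complete_string("el"))  # punctuation chosen by the linguist
```

The second function has less to do, and that is the point: the code never needs to know which punctuation a language uses.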

Here is another example. Suppose you have “Comment:” in the UI which is followed by a comment a user made. Now say you would like to have “Comment:” translated into French. You see within the code you already have “Comment” available in French from some previous UI localization project.

The English looks like this:

"FormsColumnButtonComment": "Comment",

And the French like this:

"FormsColumnButtonComment": "Ajouter un commentaire",

Why send “Comment:” out for localization when you can just reuse the same string and add a colon through coding? That would look like this in the UI:

Ajouter un commentaire:

Whereas, in this context, the French user expects to see:

Commentaire :

What happened? With your shortcut in using an existing similar translation and adding a colon through code, you introduced the following errors in the French UI:

  • The existing string is for a button and the French translates to “Add a comment”, but you needed a noun. A call to action used as a label will read awkwardly and cause confusion.
  • French puts a space before the colon. Leaving it out is a basic typographical mistake that a French reader will notice right away.
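
One way to avoid both errors is to give the label its own resource key and keep the colon inside the string. The following is a minimal sketch under that assumption; the key `FormsColumnLabelComment` is hypothetical and made up for this illustration, while `FormsColumnButtonComment` is the existing button string from above.

```python
# Hypothetical resource table; a real project would load per-locale files.
STRINGS = {
    "en": {
        "FormsColumnButtonComment": "Comment",   # button: call to action
        "FormsColumnLabelComment": "Comment:",   # label: noun, colon included
    },
    "fr": {
        "FormsColumnButtonComment": "Ajouter un commentaire",
        # The translator supplies the noun and the space before the colon
        # (written here as a non-breaking space, U+00A0).
        "FormsColumnLabelComment": "Commentaire\u00a0:",
    },
}


def reused_button_as_label(locale: str) -> str:
    """Poor: reuse the button string and append a colon in code."""
    return STRINGS[locale]["FormsColumnButtonComment"] + ":"


def comment_label(locale: str) -> str:
    """Good: a dedicated, complete string for the label context."""
    return STRINGS[locale]["FormsColumnLabelComment"]


if __name__ == "__main__":
    print(reused_button_as_label("fr"))
    print(comment_label("fr"))
```

Two keys for what looks like the same English word may feel redundant, but each key describes one context, and the translator can treat each context correctly.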

Speaking of punctuation: Don’t forget about double-byte characters, which are an entirely different world and might be worth covering in a future blog post. However, punctuation is just the starting point. Think of syntax, the order of words within a sentence. Word order can change dramatically in other languages.

English is an SVO language and its syntax follows the formula Subject-Verb-Object. Some languages follow SOV, such as Korean, and other languages follow VSO, such as Arabic. 

Example

  • SVO: I saw Lyly.
  • SOV: I Lyly saw.
  • VSO: Saw I Lyly.

To make it even more dramatic, French usually places the adjective after the noun and German tends to split up verbs.

Here is an example: you would like to have “Update billing” translated, but also have “Update software” and “Update page”. You now just take “Update” and send it out for translation along with three more strings: “billing”, “software” and “page”.

Once you receive all the translations back, you just code the needed elements together. This works beautifully in English, but is a disaster for other languages. The German translation would need to read “Billing update”; for the Spanish the translation of the verb depends on the gender of the noun; and the Japanese would need to remove spacing between both words. In addition, “Update” without context can be a verb or a noun—a big difference in translation!
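
Here is a minimal sketch of the two approaches for German. The German strings are illustrative only and have not been reviewed by a linguist, and the keys and function names are hypothetical.

```python
# Poor: fragments translated in isolation and glued together in code.
FRAGMENTS_DE = {
    "Update": "Aktualisieren",
    "billing": "Abrechnung",
    "software": "Software",
    "page": "Seite",
}

# Good: one complete string per UI label, translated as a whole.
PHRASES_DE = {
    "UpdateBilling": "Abrechnung aktualisieren",
    "UpdateSoftware": "Software aktualisieren",
    "UpdatePage": "Seite aktualisieren",
}


def glued_label(noun_key: str) -> str:
    """Poor: forces English word order (verb first) onto the translation."""
    return f'{FRAGMENTS_DE["Update"]} {FRAGMENTS_DE[noun_key]}'


def complete_label(phrase_key: str) -> str:
    """Good: the word order is whatever the translator wrote."""
    return PHRASES_DE[phrase_key]


if __name__ == "__main__":
    print(glued_label("billing"))           # English order, wrong for German
    print(complete_label("UpdateBilling"))  # order chosen by the translator
```

Three full strings cost a little more translation effort than four fragments, but they also give the translator the context needed to tell the verb “Update” from the noun.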

Separation is a thing when it comes to UI elements like controls

Separating from things can cause anxiety or relief and of course it depends on what we are separating from and when and how we do it. Separating from a baby tooth when you are in kindergarten might be less memorable than separating from a tooth when in college. Let’s see what separation can mean to you when it comes to coding and UI elements.

“Wait. Stop,” you might think. “We just learned that separating content is a bad thing.” Let me tell you, it is different when it comes to UI elements.

UI elements such as controls can be a bit tricky at times. UI string text and UI elements need to be separated. If the string text and the UI element form a unit, meaning a sentence structure, many linguists will have trouble making sense of the UI and giving it a native sound.

I always understand matters better with an example. Here it comes:

Good

UI element, good example 1

Poor

UI element, poor example 2

For the Poor example, the position of the UI controls is not something that can be easily moved during translation, so the linguists need to work with the sentence structure of the English, which can result in very awkward readings for the end user. Think of “Saw I Lyly”!
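
In code, the difference often shows up in how a row of the UI is described. The sketch below assumes a hypothetical reminder setting; the layout tuples, resource keys and English strings are all invented for illustration and are not taken from the examples pictured above.

```python
# Hypothetical English resource strings.
STRINGS_EN = {
    "ReminderPrefix": "Send a reminder",
    "ReminderSuffix": "days before expiration",
    "ReminderLabel": "Days before expiration to send a reminder:",
}

# Poor: the control sits in the middle of a sentence, so the sentence is
# split into two fragments whose order is fixed by the layout.
POOR_LAYOUT = [
    ("text", "ReminderPrefix"),
    ("number_input", "reminder_days"),
    ("text", "ReminderSuffix"),
]

# Good: one complete label, followed by the control.
GOOD_LAYOUT = [
    ("text", "ReminderLabel"),
    ("number_input", "reminder_days"),
]


def translatable_keys(layout):
    """Return the resource keys a translator receives for this layout."""
    return [key for kind, key in layout if kind == "text"]


if __name__ == "__main__":
    print(translatable_keys(POOR_LAYOUT))  # two fragments of one sentence
    print(translatable_keys(GOOD_LAYOUT))  # one self-contained string
```

With the good layout the translator gets one self-contained string and can order its words freely, because nothing in the layout depends on where a word falls in the sentence.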

Pluralization is also a critical factor here: using “day(s)” doesn’t work for localization. Think of “knife” and “knives”: not every language can just add “(s)” to indicate a plural and a singular. In addition, some languages have up to six different plurals with different inflections of adjectives, nouns and verbs. (More about this in Part 3: It is all about those clues.)

Instead of using plural and singular: 

UI element example 3, poor

Keep it neutral:

UI element example 4, good
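
A neutral label can be sketched like this. The key `ReminderDaysLabel` and both functions are hypothetical and serve only as an illustration.

```python
# Hypothetical neutral labels: a plain plural noun followed by a colon,
# so the string never has to agree with the number shown next to it.
NEUTRAL_LABELS = {
    "en": {"ReminderDaysLabel": "Days:"},
    "de": {"ReminderDaysLabel": "Tage:"},
}


def poor_text(count: int) -> str:
    """Poor: English-only plural trick baked into the code."""
    return f"{count} day(s)"


def neutral_field(locale: str, count: int) -> dict:
    """Good: label and value stay separate UI elements."""
    return {"label": NEUTRAL_LABELS[locale]["ReminderDaysLabel"], "value": count}


if __name__ == "__main__":
    print(poor_text(1))
    print(neutral_field("de", 1))
```

Because the label does not depend on the value, the same string works for 1, 2 or 25, whatever plural forms the language has.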

The same applies to other UI elements, such as dropdown menus.

Good

UI element example 5, good

Good

UI element example 6, good

Poor

UI element example 7, poor

Bottom line: You as a developer don’t really need to know how many different variations of language rules you code for. All you need to remember is to keep content together and to keep UI controls out of sentence structure. This might sometimes result in more lines of code, but it is surely easier to have the linguists handle their language than for you to think of each and every variation that might exist. Even if the language set you code for is known, more languages can always be added in future versions, each with its own set of grammar and syntax rules. Leave it to the linguists; you can trust them. You might even have a globalization or localization team in your company that can support you and provide libraries, best practices, and code snippets.

See you in the next part when we talk about “It is all about those clues”. Until then, sunny greetings from the linguist that you can trust!

Note: Thanks to Carlos Barbero-Cortés of DocuSign for consultation and feedback.

Author: Bettina Becker, Sr. Language Manager