
Fix translation logic to properly credit translator.

Johnny Yeng requested to merge a2902793/webapp:develop into develop

Fix https://gitlab.com/gitlab-org/gitter/webapp/issues/2404

Current Problem

The code mentioned below is in server/handlers/root.js L44-47
Back when all translation files were named with plain ISO-639-1 codes (ar.js, bg.js, cs.js, de.js, etc.), users whose accept-language header carried the same language code but different region codes were all served the same translation: ar-AE, ar-BH, ar-DZ ... would all serve ar.js.

Example:
accept-language: ar-AE,ar;q=0.9,en;q=0.8

var locale = req.i18n.getLocale();
//locale = ar
var requested = req.headers['accept-language'] || ''; //ar-AE,ar;q=0.9,en;q=0.8
requested = requested.split(';')[0] || ''; //ar-AE,ar
requested = requested.split(/,\s*/)[0]; //ar-AE
requested = requested.split('-')[0]; //ar
//requested = ar
if (locale !== requested) { //false, 'ar' equals 'ar', so the block is skipped
    //show "Want this in..."
}

Since locale equals requested, the credits are shown properly.



However, after the first translation file that follows the "same language code but includes a subtag" pattern (zh-TW) was added, things changed.

Example:
accept-language: zh-TW,zh;q=0.9,en;q=0.8

var locale = req.i18n.getLocale();
//locale = zh-TW
var requested = req.headers['accept-language'] || ''; //zh-TW,zh;q=0.9,en;q=0.8
requested = requested.split(';')[0] || ''; //zh-TW,zh
requested = requested.split(/,\s*/)[0]; //zh-TW
requested = requested.split('-')[0]; //zh
//requested = zh
if (locale !== requested) { //true, 'zh-TW' differs from 'zh'
    //show "Want this in..."
}

Since locale (zh-TW) does not equal requested (zh), the page is served correctly but "Want this in..." still appears.
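
The two walkthroughs above can be reproduced outside the server with a small standalone sketch. The function names parseRequestedLanguage and showsPrompt are hypothetical and only mirror the parsing steps quoted from root.js; they are not part of the webapp code.

```javascript
// Hypothetical helper mirroring the original parsing steps from the handler.
function parseRequestedLanguage(acceptLanguage) {
  let requested = acceptLanguage || '';
  requested = requested.split(';')[0] || ''; // keep everything before the first q-value
  requested = requested.split(/,\s*/)[0];    // keep the first listed tag
  requested = requested.split('-')[0];       // always strip the subtag (the old behaviour)
  return requested;
}

// Returns true when the "Want this in..." prompt would be shown.
function showsPrompt(locale, acceptLanguage) {
  return locale !== parseRequestedLanguage(acceptLanguage);
}

console.log(showsPrompt('ar', 'ar-AE,ar;q=0.9,en;q=0.8'));    // false: credits are shown
console.log(showsPrompt('zh-TW', 'zh-TW,zh;q=0.9,en;q=0.8')); // true: prompt shown by mistake
```

The second call shows the problem: the served locale keeps its subtag while the parsed header value does not, so the comparison can never match.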




My Solution

I think the accept-language header is parsed this way so that the local name feature of the langs package can be used, for instance showing Want this in 中文 ? instead of Want this in Chinese ?. The langs package uses ISO-639-1 language codes; in other words, it does not support regional or script varieties of a language.

As for why I think supporting languages beyond ISO-639-1 is important: languages such as English use a phonographic writing system, meaning every word or sentence is spelled from the same small set of letters, a-z. In contrast, languages like Chinese use a logographic writing system, in which a character represents a word or a phrase.
The reason I bring this up is that the difference between en-US and en-GB may not be big, since they vary only in some phrase usage. However, zh-CN and zh-TW, while both Chinese, vary a lot: not only do their characters look different, but the usage of words and phrases differs too. An example is the word collaboration: zh-CN uses 协作. Character-wise it is already different, since the first character is the simplified form of the traditional one (simplified: 协, traditional: 協). On top of that, we don't even use 協作 for collaboration in zh-TW; we use 合作.
My proposal is to check the served locale: if it contains a "-" sign, a subtag translation was requested and is being served, so there is no need to strip requested down to the plain language code for the langs package's local name lookup.
+if (!locale.includes("-")) {
    requested = requested.split('-')[0]
+}
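
To make the effect of this guard concrete, here is a minimal standalone sketch. The function name shouldShowPrompt is hypothetical; it wraps the parsing steps from this merge request so they can be run without an Express request object.

```javascript
// Hypothetical wrapper around the proposed logic; returns true when
// the "Want this in..." prompt would be shown.
function shouldShowPrompt(locale, acceptLanguage) {
  let requested = acceptLanguage || '';
  requested = requested.split(';')[0] || '';
  requested = requested.split(/,\s*/)[0];
  // Only reduce to the plain language code when the served locale has no subtag.
  if (!locale.includes('-')) {
    requested = requested.split('-')[0];
  }
  return locale !== requested;
}

console.log(shouldShowPrompt('zh-TW', 'zh-TW,zh;q=0.9,en;q=0.8')); // false
console.log(shouldShowPrompt('ar', 'ar-AE,ar;q=0.9,en;q=0.8'));    // false
console.log(shouldShowPrompt('en', 'fr-FR,fr;q=0.9,en;q=0.8'));    // true
```

The last call assumes a visitor who prefers French while the English page is served, which is the case where the prompt is meant to appear.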




Future Support for Other Subtag Translation

When req.i18n.getLocale() encounters a language tag with a region code, it checks whether a matching translation file exists; if not, it falls back to the plain language (it-IT.js doesn't exist, so it.js is used). Suppose someone someday commits a Spanish translation with Latin American slang and vocabulary and names the translation file es-419.js. This scenario counts as "language code exists but includes a subtag". When a user with es-419 visits the homepage:

var locale = req.i18n.getLocale();
//locale = es-419
var requested = req.headers['accept-language'] || ''; //es-419,es;q=0.9
requested = requested.split(';')[0] || ''; //es-419,es
requested = requested.split(/,\s*/)[0]; //es-419
if (!locale.includes("-")) { //es-419.js exists, es-419.js is served, locale = es-419, so the condition is false
    requested = requested.split('-')[0]; //skipped, requested stays es-419
}
//requested = es-419
if (locale !== requested) { //false, 'es-419' equals 'es-419'
    //show "Want this in..."
}

Since locale (es-419) equals requested (es-419), the page is served correctly and "Translation kindly done by..." is shown.
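
The fallback behaviour described in this section can be illustrated with a rough sketch. This is not the i18n library's implementation: resolveLocale, the available list, and the default-locale fallback are all assumptions made for illustration only.

```javascript
// Hypothetical resolver: exact tag first, then the plain language code,
// then an assumed default locale.
function resolveLocale(tag, available, defaultLocale) {
  if (available.includes(tag)) return tag;       // e.g. es-419.js exists
  const base = tag.split('-')[0];
  if (available.includes(base)) return base;     // e.g. it-IT falls back to it.js
  return defaultLocale;                          // assumption: nothing matched
}

// Assumed set of translation files, for illustration only.
const available = ['en', 'es', 'es-419', 'it', 'zh-TW'];

console.log(resolveLocale('es-419', available, 'en')); // es-419
console.log(resolveLocale('it-IT', available, 'en'));  // it
console.log(resolveLocale('ja-JP', available, 'en'));  // en
```

Under these assumptions, a served locale containing a "-" always means that a subtag translation file was matched exactly, which is the property the proposed check relies on.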


I added some test translation files such as es-419 and it-CH to simulate future translations with subtags. Languages with no subtag version are served the plain language translation; those with a subtag version show the correct content with credits. This is what the Traditional Chinese page looks like before
and after
var requested = req.headers['accept-language'] || '';
requested = requested.split(';')[0] || '';
requested = requested.split(/,\s*/)[0];
+if (!locale.includes("-")) {
    requested = requested.split('-')[0]
+}

As discussed on February 6, 2020 4:06 PM

Edited by Tomas Vik
