Fix translation logic to properly credit translator.
Fix https://gitlab.com/gitlab-org/gitter/webapp/issues/2404
Current Problem
The code discussed below is in server/handlers/root.js L44-47.
Back when all translation.js files were named with plain ISO 639-1 codes (ar.js, bg.js, cs.js, de.js, etc.), users whose accept-language header carried the same language code but a different region code were all served the same translation: ar-AE, ar-BH, ar-DZ, ... would all be served ar.js.
Example:
accept-language: ar-AE,ar;q=0.9,en;q=0.8
var locale = req.i18n.getLocale();
//locale = ar
var requested = req.headers['accept-language'] || ''; //ar-AE,ar;q=0.9,en;q=0.8
requested = requested.split(';')[0] || ''; //ar-AE,ar
requested = requested.split(/,\s*/)[0]; //ar-AE
requested = requested.split('-')[0]; //ar
//requested = ar
if (locale !== requested) { //false, both are ar
//show "Want this in..."
}
Since locale equals requested, the credits are shown properly.
However, after the first translation file matching the "same language code but includes a subtag" pattern (zh-TW) was added, things changed.
Example:
accept-language: zh-TW,zh;q=0.9,en;q=0.8
var locale = req.i18n.getLocale();
//locale = zh-TW
var requested = req.headers['accept-language'] || ''; //zh-TW,zh;q=0.9,en;q=0.8
requested = requested.split(';')[0] || ''; //zh-TW,zh
requested = requested.split(/,\s*/)[0]; //zh-TW
requested = requested.split('-')[0]; //zh
//requested = zh
if (locale !== requested) { //true, zh-TW vs zh
//show "Want this in..."
}
Since locale = zh-TW does not equal requested = zh, the page is served correctly but "Want this in..." still appears.
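To make the mismatch easy to reproduce outside the handler, the parsing steps quoted above can be wrapped in a small function. This is only a sketch: requestedLanguage and showsPrompt are hypothetical names that do not exist in root.js, and the locale is passed in directly instead of coming from req.i18n.getLocale().

```javascript
// Hypothetical helper mirroring the current parsing steps.
function requestedLanguage(acceptLanguage) {
  var requested = acceptLanguage || '';
  requested = requested.split(';')[0] || ''; // keep everything before the first q-value
  requested = requested.split(/,\s*/)[0];    // keep the first listed tag
  requested = requested.split('-')[0];       // always strip the region subtag
  return requested;
}

// Hypothetical: the "Want this in..." prompt is shown when the served
// locale differs from the requested language.
function showsPrompt(locale, acceptLanguage) {
  return locale !== requestedLanguage(acceptLanguage);
}

console.log(showsPrompt('ar', 'ar-AE,ar;q=0.9,en;q=0.8'));    // false
console.log(showsPrompt('zh-TW', 'zh-TW,zh;q=0.9,en;q=0.8')); // true
```

The second call returns true even though the zh-TW translation is exactly what the user asked for, which is the behaviour described above.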
My Solution
I think the accept-language header was parsed this way in order to use the local name feature of the langs package, for instance showing "Want this in 中文?" instead of "Want this in Chinese?". The langs package uses ISO 639-1 language codes; in other words, it does not support regional varieties of a language.
Why do I think supporting tags beyond ISO 639-1 is important?
Languages such as English are written with phonograms: every word or sentence is built from the same small set of characters, a-z. In contrast, languages like Chinese are written with logograms, where a character represents a word or a phrase. I bring this up because the difference between en-US and en-GB may not be large, since they vary only in some phrase usage. However, zh-CN and zh-TW may both be Chinese but differ a lot: not only do their characters look different, the usage of words and phrases differs too. Take the word "collaboration": zh-CN uses 协作. Character-wise it is already different, since the first character is the simplified form of the traditional one (simplified: 协, traditional: 協). On top of that, we don't even use 協作 for "collaboration" in zh-TW; we use 合作.
+if (!locale.includes("-")) {
requested = requested.split('-')[0]
+}
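For illustration, here is the same change written as a standalone function, so both cases can be checked without running the server. It is a sketch under assumptions: requestedLanguageFor is a hypothetical name, and locale is passed in directly rather than read from req.i18n.getLocale().

```javascript
// Hypothetical helper: the parsing steps with the proposed guard applied.
function requestedLanguageFor(locale, acceptLanguage) {
  var requested = acceptLanguage || '';
  requested = requested.split(';')[0] || '';
  requested = requested.split(/,\s*/)[0];
  // Only strip the region subtag when the served locale has no subtag.
  if (!locale.includes('-')) {
    requested = requested.split('-')[0];
  }
  return requested;
}

console.log(requestedLanguageFor('ar', 'ar-AE,ar;q=0.9,en;q=0.8'));    // ar
console.log(requestedLanguageFor('zh-TW', 'zh-TW,zh;q=0.9,en;q=0.8')); // zh-TW
```

With the guard in place, locale and requested match in both cases, so the credit is shown instead of the prompt. Note that the comparison stays case-sensitive, so a header sending zh-tw would still not match a locale of zh-TW.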
Future Support for Other Subtag Translation
When req.i18n.getLocale() encounters a language tag with a region code, it checks whether a matching translation.js exists; if not, it falls back to the plain language (it-IT.js doesn't exist, so it.js is used). Suppose someone someday contributes a Spanish translation with Latin American slang or vocabulary and names the translation file es-419.js. This scenario counts as "language code exists but includes a subtag". When a user whose accept-language starts with es-419 visits the homepage:
var locale = req.i18n.getLocale();
//locale = es-419
var requested = req.headers['accept-language'] || ''; //es-419,es;q=0.9
requested = requested.split(';')[0] || ''; //es-419,es
requested = requested.split(/,\s*/)[0]; //es-419
if (!locale.includes("-")) { //false: es-419.js exists, es-419.js is served, locale = es-419
requested = requested.split('-')[0]; //skipped, requested stays es-419
}
//requested = es-419
if (locale !== requested) { //false, both are es-419
//show "Want this in..."
}
Since locale = es-419 equals requested = es-419, the page is served correctly and "Translation kindly done by..." is shown.
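The fallback behaviour described for req.i18n.getLocale() can be modelled roughly as follows. This is not the i18n library's actual implementation, only a sketch of the behaviour as described here: resolveLocale, the available list, and the 'en' default are all assumptions made for illustration.

```javascript
// Hypothetical model of the described fallback: exact tag first,
// then the plain language code, then a default.
function resolveLocale(tag, available, fallback) {
  if (available.indexOf(tag) !== -1) return tag;   // e.g. es-419.js exists
  var base = tag.split('-')[0];
  if (available.indexOf(base) !== -1) return base; // e.g. it-IT falls back to it.js
  return fallback;
}

// Assumed stand-in for the translation files present on disk.
var available = ['en', 'es', 'es-419', 'it', 'zh-TW'];

console.log(resolveLocale('es-419', available, 'en')); // es-419
console.log(resolveLocale('it-IT', available, 'en'));  // it
console.log(resolveLocale('fr-FR', available, 'en'));  // en
```

Under this model, a locale containing "-" is only ever returned when a subtagged translation file exists, which is why the guard checks locale rather than the header.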
I added some test translation.js files such as es-419 and it-CH to simulate future translations with subtags. Tags with no subtag version are served the plain language translation; tags with a subtag version show the correct content with credits. This is what the Traditional Chinese page looked like before:
var requested = req.headers['accept-language'] || '';
requested = requested.split(';')[0] || '';
requested = requested.split(/,\s*/)[0];
+if (!locale.includes("-")) {
requested = requested.split('-')[0]
+}
As discussed in