Translation files for many languages, 'locale/*/gitlab.po', haven't been updated for a long time

Summary

GitLab uses locale/*/gitlab.po to store the translated strings for each language, and the translation work happens on the Crowdin platform, https://translate.gitlab.com/project/gitlab-ee. If I understand correctly, the basic workflow is:

  1. locale/gitlab.pot is updated by a commit;
  2. gitlab.pot is synced to the Crowdin platform by the git push trigger;
  3. The translation team for each language begins working on the new/old strings;
  4. After a while, the GitLab Crowdin Bot syncs the newly approved translations, locale/{lang}/gitlab.po, back as a commit to a Merge Request;
  5. The MR gets merged, and we are all happy.

However, something seems to be wrong in the process, and as a result the translations for many languages haven't been updated for quite a while.

[Screenshot: 2018-06-29_01-29-13]

Let's take zh_CN as an example. If we check the git history of locale/zh_CN on the master-i18n branch of gitlab-ee:

[Screenshot: 2018-06-29_01-56-22] https://gitlab.com/gitlab-org/gitlab-ee/commits/master-i18n/locale/zh_CN

As we can see, the file hasn't been updated since 5 Apr. However, there were definitely many updates during that period.

Even the GitLab Crowdin Bot detected the updates:

[Screenshot: 2018-06-29_02-03-38] https://gitlab.com/gitlab-crowdin-bot

In the above screenshot, we can see that the bot detected the updates and made 3 commits to locale/zh_CN/gitlab.po: 3ff3c924, 9f973878 and 032575cf.

So what happened to those commits, and why are those updates not in master or master-i18n?

If we look deeper, we find that those commits belong to merge request !6187 (closed), which was closed because the static analysis failed.

That might make sense at first: if the static analysis fails, we shouldn't merge the MR. However, if we check the static analysis log, it shows that the failure was caused by errors in locale/uk/gitlab.po and locale/fr/gitlab.po, not by anything wrong with locale/zh_CN/gitlab.po.

Some static analyses failed:

**** bin/rake lint:all failed with the following error(s):


1275 files inspected, 0 lints detected
scss-lint found no lints
Errors in `/builds/gitlab-org/gitlab-ee/locale/uk/gitlab.po`:
  and 1 fixed vulnerability
    <та %d виправлена вразливість> is using unknown variables: [%d]
Errors in `/builds/gitlab-org/gitlab-ee/locale/fr/gitlab.po`:
  Last %d day
    <La veille> is missing: [%d]

https://gitlab.com/gitlab-org/gitlab-ee/-/jobs/77637287
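
To make the two errors in the log concrete, here is a minimal sketch of this kind of placeholder check. It is not GitLab's actual linter: the function name `placeholder_errors` and the regular expression are assumptions made for illustration, and it only compares a single msgid with a single msgstr (no plural forms).

```python
import re
from typing import List

# Assumption: only %d, %s and named %{name} placeholders are considered here.
PLACEHOLDER = re.compile(r"%\{\w+\}|%[ds]")


def placeholder_errors(msgid: str, msgstr: str) -> List[str]:
    """Compare the placeholders of a source string and its translation.

    Hypothetical helper; the messages imitate the wording of the job log above.
    """
    expected = set(PLACEHOLDER.findall(msgid))
    used = set(PLACEHOLDER.findall(msgstr))
    errors = []
    unknown = sorted(used - expected)   # in the translation but not in the source
    missing = sorted(expected - used)   # in the source but not in the translation
    if unknown:
        errors.append(f"<{msgstr}> is using unknown variables: [{', '.join(unknown)}]")
    if missing:
        errors.append(f"<{msgstr}> is missing: [{', '.join(missing)}]")
    return errors


# The two failing entries from the log, plus one entry that passes.
print(placeholder_errors("and 1 fixed vulnerability", "та %d виправлена вразливість"))
print(placeholder_errors("Last %d day", "La veille"))
print(placeholder_errors("Last %d days", "Les %d derniers jours"))
```

The point is that such a check is local to one entry in one file, so an error in uk or fr says nothing about zh_CN.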

The commits in MR !6187 (closed) cover many languages, and only 2 of them failed the static analysis. By closing the MR instead of merging it, we lost all the good updates for the languages other than uk and fr, including zh_CN in our case. The Merge Request prior to !6187 (closed) is !5281 (closed), which was created on 6 Apr and was also closed, so many translation updates from that period are lost as well.

I think we have a workflow problem in the translation file synchronization, and many issues are caused by this problem:

  • https://gitlab.com/gitlab-org/gitlab-ce/issues/47978
  • https://gitlab.com/gitlab-org/gitlab-ce/issues/47979
  • https://gitlab.com/gitlab-org/gitlab-ce/issues/48017
  • https://gitlab.com/gitlab-org/gitlab-ce/issues/48124
  • https://gitlab.com/gitlab-org/gitlab-ce/issues/48233
  • https://gitlab.com/gitlab-org/gitlab-ce/issues/48271
  • https://gitlab.com/gitlab-org/gitlab-ce/issues/48327
  • https://gitlab.com/gitlab-org/gitlab-ce/issues/48459
  • https://gitlab.com/gitlab-org/gitlab-ce/issues/48468
  • https://gitlab.com/gitlab-org/gitlab-ce/issues/48519
  • https://gitlab.com/gitlab-org/gitlab-ce/issues/48524
  • https://gitlab.com/gitlab-com/support-forum/issues/3599

https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/20048 tries to address this issue by not regenerating the .po files; however, that does not fix the root cause of the problem. Yes, if we don't regenerate the .po files, we won't have the fuzzy matches in them. But if we synced the updates back from Crowdin correctly, there wouldn't be any fuzzy entries in the first place, as those strings have been translated already. Until we sync the translations back from Crowdin, those strings will not be translated, and we will have mixed English and localized strings in the UI.

How about letting the GitLab Crowdin Bot create branches/merge requests on a per-language basis? That way, the updates for one language will not be blocked by another language, and a static analysis failure caused by one language can be fixed in a following update of that language.
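
A rough sketch of the per-language split, only to illustrate the idea. This is not how the Crowdin bot works today; `group_by_language` and `plan_merge_requests` are hypothetical names, and the set of failing languages is assumed to come from running the static analysis on each language separately.

```python
import re
from collections import defaultdict

# Matches locale/{lang}/gitlab.po and captures the language code.
LOCALE_PATH = re.compile(r"^locale/([^/]+)/gitlab\.po$")


def group_by_language(changed_paths):
    """Group changed translation files by language (hypothetical helper)."""
    groups = defaultdict(list)
    for path in changed_paths:
        match = LOCALE_PATH.match(path)
        if match:  # ignore anything that is not a per-language .po file
            groups[match.group(1)].append(path)
    return dict(groups)


def plan_merge_requests(changed_paths, failing_languages):
    """Split the languages into those that can be merged now and those that are blocked."""
    groups = group_by_language(changed_paths)
    mergeable = {lang: paths for lang, paths in groups.items()
                 if lang not in failing_languages}
    blocked = {lang: paths for lang, paths in groups.items()
               if lang in failing_languages}
    return mergeable, blocked


changed = [
    "locale/zh_CN/gitlab.po",
    "locale/uk/gitlab.po",
    "locale/fr/gitlab.po",
    "locale/gitlab.pot",
]
mergeable, blocked = plan_merge_requests(changed, failing_languages={"uk", "fr"})
print(sorted(mergeable))  # languages whose updates can go in right away
print(sorted(blocked))    # languages that wait for a fix on Crowdin
```

With one merge request per language, the zh_CN update from the example above would have been merged even though uk and fr failed.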
