
Draft: Internationalization

Benjamin Winger requested to merge bmwinger/openmw:l10n into master

Following up https://forum.openmw.org/viewtopic.php?f=2&t=7224, I've created a proof of concept implementation of internationalization/localization using project fluent.

Implementation Details

Currently localizes a handful of CLI help message strings for the bsatool and openmw executables.

As there is no project fluent C++ library yet, I've instead included a small Rust library with a C++ interface via cxx.

Locale detection is done using locale_config. Unfortunately, this doesn't support getting the system locale encoding (Rust usually only uses UTF-8, and I guess the developers didn't anticipate it being used in concert with C++ code), so I've disabled the use of Unicode isolating characters for the moment (as far as I'm aware, they're only needed to support right-to-left text). The encoding could instead be provided to the Rust code from the C++ side.
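To illustrate what disabling isolation changes, here is a minimal sketch that is not the code in this merge request. The `substitute` function and the `{name}` placeholder syntax are hypothetical stand-ins for Fluent's placeables. The two marks are U+2068 (First Strong Isolate) and U+2069 (Pop Directional Isolate), which are invisible in a UTF-8 terminal but can show up as garbage bytes when the output encoding is something else.

```rust
/// Unicode "First Strong Isolate": starts a directionally isolated run.
const FSI: char = '\u{2068}';
/// Unicode "Pop Directional Isolate": ends the isolated run.
const PDI: char = '\u{2069}';

/// Replace every `{name}` in `pattern` with `value`.
/// The `{name}` syntax is a hypothetical placeholder format, not FTL.
/// When `use_isolating` is true, the value is wrapped in FSI/PDI so that
/// right-to-left arguments do not reorder the surrounding text.
fn substitute(pattern: &str, name: &str, value: &str, use_isolating: bool) -> String {
    let placeholder = format!("{{{}}}", name);
    let replacement = if use_isolating {
        format!("{}{}{}", FSI, value, PDI)
    } else {
        value.to_string()
    };
    pattern.replace(&placeholder, &replacement)
}
```

With isolation off, the formatted string contains only the bytes of the pattern and the argument, which is why it is the safer setting until the output encoding is known.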

fluent-templates is used as a high-level interface to fluent-rs. It provides two different loaders. One is a static loader which compiles the localizations into the executable, which is what I've set up now as it seems appropriate for those localizations builtin to openmw, as then there's no need to install the .ftl files. There's also a runtime loader which could eventually be used for loading localizations that come with mods.

Note that at the moment, the only supported arguments are strings due to having to pass the arguments through the C++/Rust bridge, and it would take a little more work to fully support FluentValue's options.
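One possible direction for richer arguments, sketched under assumptions: if keys and values keep crossing the bridge as plain strings, the Rust side could guess a type for each value before handing it to the formatter. `ArgValue` and `collect_args` below are hypothetical names, and `ArgValue` is only a stand-in for illustration, not fluent's `FluentValue`.

```rust
/// Hypothetical typed argument; a stand-in for illustration only.
#[derive(Debug, PartialEq)]
enum ArgValue {
    Text(String),
    Number(f64),
}

/// Pair up keys and values as they might arrive over the bridge
/// (two parallel lists of strings) and guess a type for each value.
/// Returns an error if the two lists have different lengths.
fn collect_args(keys: &[String], values: &[String]) -> Result<Vec<(String, ArgValue)>, String> {
    if keys.len() != values.len() {
        return Err(format!("{} keys but {} values", keys.len(), values.len()));
    }
    let args = keys
        .iter()
        .zip(values)
        .map(|(key, raw)| {
            // Treat the value as a number only if it parses to a finite f64;
            // this keeps strings such as "inf" or "NaN" as text.
            let value = match raw.parse::<f64>() {
                Ok(n) if n.is_finite() => ArgValue::Number(n),
                _ => ArgValue::Text(raw.clone()),
            };
            (key.clone(), value)
        })
        .collect();
    Ok(args)
}
```

Guessing has an obvious drawback: a file name that happens to look like a number would be treated as one, so passing an explicit type tag across the bridge may be the better long-term design.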

Currently I've set the base/fallback language to be en, as I don't know what English variant OpenMW is officially using, if any.
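As a rough picture of what a base/fallback language means in practice, the sketch below tries the requested locale, then its less specific parents, then the fallback. It is a simplification written for illustration: `fallback_chain` and `lookup` are hypothetical names, the catalogs are plain hash maps, and real locale negotiation handles more cases than trimming subtags.

```rust
use std::collections::HashMap;

/// Candidate locales to try, most specific first, ending with `fallback`.
/// For example "de-AT" with fallback "en" gives ["de-AT", "de", "en"].
fn fallback_chain(requested: &str, fallback: &str) -> Vec<String> {
    let mut chain = Vec::new();
    let mut current = requested;
    loop {
        if !current.is_empty() {
            chain.push(current.to_string());
        }
        // Drop the last "-subtag" to get the next, less specific locale.
        match current.rfind('-') {
            Some(idx) => current = &current[..idx],
            None => break,
        }
    }
    if !chain.iter().any(|locale| locale == fallback) {
        chain.push(fallback.to_string());
    }
    chain
}

/// Look up message `id` in the first catalog along the chain that has it.
fn lookup<'a>(
    catalogs: &'a HashMap<String, HashMap<String, String>>,
    requested: &str,
    fallback: &str,
    id: &str,
) -> Option<&'a str> {
    let chain = fallback_chain(requested, fallback);
    chain
        .iter()
        .find_map(|locale| catalogs.get(locale)?.get(id).map(String::as_str))
}
```

With only an `en` catalog installed, a user whose locale is `de-AT` would still get the English help text rather than a missing-message error.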

I've also included the Cargo.lock file, as this is standard for applications (it's omitted for libraries). Its inclusion means that you won't get unexpected issues due to updates (and dependencies which don't respect semver), but it also means it will need to be updated periodically so that you actually get dependency updates. It could be omitted instead if that is preferable.

Why Project Fluent?

Alternatives:

  1. gettext: Mature and widely available, however it uses the source text as the message identifier.
  2. International Components for Unicode (ICU4C, in the case of the C/C++ library): Between Fluent and gettext in age, and seems about as capable. Frankly I haven't looked at this much until just now, as I had difficulty finding the relevant parts of their documentation. For future reference, a decent overview of the format/capabilities can be found here.

Both ICU and Fluent use string identifiers to reference messages, so I think gettext can probably be ruled out.

Between the two of them, their capabilities seem quite similar. For example, they both support complex selectors, though I find Fluent's syntax more intuitive.

In general, I've found the fluent documentation somewhat more approachable, as unlike ICU they separate their translation documentation from their API documentation. I've also yet to find a good example project using ICU.

There are two notable features which fluent has which I don't think ICU supports:

  1. Doc comments, whose contents are displayed in Pontoon.
  2. Terms, Attributes, Parameterization and Pattern matching on attributes. A good example of this can be found in this comment. Basically, between them, these features let localization messages share information using a functional interface. One major goal of Project Fluent is to be as flexible as possible from the side of the translations.

There's also a comparison in the project fluent documentation here which has some more detailed information about the differences.

On the other hand, Fluent's lack of a C++ library may mean that integration with ICU would be easier. In general, my conclusions at the moment are that Fluent is nicer to use, but ICU doesn't seem to be that lacking in functionality and may have fewer hoops to jump through to integrate with OpenMW and its build systems (I say not having tried to do so...).

Another thing to keep in mind is that, in the long run, this may affect not just the localization of OpenMW itself but also that of mods, as it would be possible for OpenMW to support mod localization by including ICU or Fluent (FTL) files alongside the plugins. There are some additional considerations, and I'll create an issue about them soon. For now I'll mention that it may be necessary to test how much flexibility is available with respect to loading messages and overriding previously defined ones (e.g. if each plugin has its own localization file). Fluent supports overriding and custom message bundle creation, but I'm not sure about ICU.
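The overriding behaviour that would need testing can be pictured with plain hash maps. This is only a sketch of the desired semantics, assuming one message table per plugin applied in load order; `merge_in_load_order` is a hypothetical name and none of this uses the Fluent or ICU APIs.

```rust
use std::collections::HashMap;

/// Merge per-plugin message tables in load order.
/// A message defined by a later plugin replaces the same ID from an earlier one;
/// messages that are not redefined are kept as they were.
fn merge_in_load_order(layers: &[HashMap<String, String>]) -> HashMap<String, String> {
    let mut merged = HashMap::new();
    for layer in layers {
        for (id, text) in layer {
            merged.insert(id.clone(), text.clone());
        }
    }
    merged
}
```

Whichever library is chosen would need to offer an equivalent of this "last definition wins" behaviour, or allow OpenMW to build it on top.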

Building

This is currently causing a number of build failures.

  • Android: Rust library could probably be cross-compiled using cross.
  • Windows MSVC: Looks like a possible path issue related to cxxbridge's generated C++ files. I'll spend some more time looking into it later if this approach is approved; debugging is slow when all you have to work with is CI output.
  • Windows Ninja: I don't really understand how this build is set up; some insight would be useful. It appears to be invoked with cmake, but it fails to find the rule for libl10n.a in components/l10n/CMakeLists.txt.
