
Convert translationof to proper nodes

Jonas Hahnfeld requested to merge hahnjo/lilypond:convert-translationof into master

As laid out in the mailing list thread on the translated documentation (https://lists.gnu.org/archive/html/lilypond-devel/2023-01/msg00047.html), it is possible to replace the custom setup using @translationof with an approach using the original @nodes.

In the new setup, all references (including the @menus) are made to the original English node names, while texi2html and texi2pdf automatically use the (translated) section names as labels for all internal references within the same document. For cross-references to other manuals, it is now required to specify the translated label to display (via the *named macros), instead of relying on generated xref-maps.

A large fraction of the changes was created with the script convert-translationof.py.
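The script itself is not reproduced here. As a rough illustration of the core rewrite only, the following sketch assumes that the old layout put a line "@translationof ENGLISH-NAME" after the sectioning command that follows a translated @node line; this layout is an assumption, and translationof_to_node is a hypothetical helper name, not part of convert-translationof.py. It leaves @menu entries and references untouched, which also have to point to the English node names.

```shell
# Hypothetical sketch, NOT the convert-translationof.py from this merge request.
# Assumed old layout:
#   @node Tutorium
#   @chapter Tutorium
#   @translationof Tutorial
# Prints the converted file to stdout: the most recent @node line takes the
# English name and the @translationof line is dropped. The translated
# sectioning line is kept as it is.
translationof_to_node() {
  awk '
    { line[NR] = $0 }
    /^@node / { last_node = NR }
    /^@translationof / {
      if (last_node) {
        name = $0
        sub(/^@translationof[ \t]+/, "", name)
        line[last_node] = "@node " name
        skip[NR] = 1
        last_node = 0
      }
    }
    END {
      for (i = 1; i <= NR; i++)
        if (!(i in skip)) print line[i]
    }
  ' "$1"
}
```

Called as translationof_to_node some-file.itely (a placeholder file name), it would print the file with "@node Tutorial" followed by the unchanged "@chapter Tutorium".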


PDF is the easier of the two formats to compare: the produced files are mostly visually identical. Minor differences can be observed in the following:

  • In the German manuals, the front page and the corresponding index entry change slightly because of the corrected translation.
  • In the German LM, the English section on Installing now has fixed internal references to the Tutorial ("Tutorium"). This is because the contained @ref commands were already referring to the English node names, which didn't work for partially translated manuals. The other translations are not affected because it is conveniently also called "Tutorial" in the respective languages, and the French manual has a proper translation linking to "Tutoriel".
  • In the NR, all translations have references fixed in the included English appendices (see previous point).
  • Same story in the Catalan Usage Manual.

The comparison of the HTML pages produced by texi2html-1.82 (the version used for the official documentation) is a bit trickier. First of all, the anchor name of each @node changes to the original English name; this is good because it allows jumping to the same part of the page across translations. In addition, the name attributes present in the table of contents change, which should be irrelevant for our purposes. To reduce the number of differences, it is possible to create a copy of each of the two out-www directories to compare and use sed to remove parts of the generated files that are known to differ:

dir=out-www.00-master
cp -a $dir $dir.sed
find $dir.sed/ -name "*.html" | xargs -n1 sed -i '/<ul class="toc">/q' # remove all lines following the first occurrence
find $dir.sed/ -name "*.html" | xargs -n1 sed -i '/^<a name="/d' # remove all lines starting with an anchor

Afterwards it is feasible to compare the two filtered directories using diff -ur out-www.00-master.sed/ out-www.01-convert-translationof.sed/ -x "*.ly" -x "*.pdf". Of the remaining changes, a large fraction are "fixed" links to the IR or the Snippets that now correctly refer to files based on the English node names. It is not fully clear why this was wrong before or what exactly changes (because the references to these two manuals, which do not have @translationof, are not touched at all), but I observe that it happens for all nodes that have a translation in the current manual. Also, apostrophes in the labels are replaced by the escape sequence &rsquo;. Both kinds of differences can be removed by running:

find $dir.sed/ -name "*.html" | xargs -n1 sed -i "s/&rsquo;/'/g"
find $dir.sed/ -name "*.html" | xargs -n1 sed -i '/^<a href="..\/internals\//d'
find $dir.sed/ -name "*.html" | xargs -n1 sed -i '/^<a href="..\/snippets\//d'
find $dir.sed/ -name "*.html" | xargs -n1 sed -i '/^<a href="internals-big-page/d'
find $dir.sed/ -name "*.html" | xargs -n1 sed -i '/^<a href="snippets-big-page/d'

This finally leaves the changes described above for the German front page and indices, and a number of fixed links in the Japanese and Chinese translations (which we do not produce in PDF form).
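The filtering steps above can be collected into one helper so that both trees are treated identically. This is a minimal sketch: the function names filter_html and normalize_tree are made up for illustration, the sed scripts are the ones quoted above, and GNU sed is assumed for the -i option.

```shell
# Hypothetical helper: run one sed script in place on every HTML file below $1.
# One sed process is started per file, so the "q" command only ends the
# processing of the current file.
filter_html() {
  find "$1" -name '*.html' -exec sed -i "$2" {} \;
}

# Copy the tree $1 to $1.sed and strip the parts known to differ.
# $1.sed must not exist yet, otherwise cp would copy into it.
normalize_tree() {
  cp -a "$1" "$1.sed"
  filter_html "$1.sed" '/<ul class="toc">/q'   # drop everything after the TOC start
  filter_html "$1.sed" '/^<a name="/d'         # drop lines starting with an anchor
  filter_html "$1.sed" "s/&rsquo;/'/g"         # undo the apostrophe escaping
  filter_html "$1.sed" '/^<a href="..\/internals\//d'
  filter_html "$1.sed" '/^<a href="..\/snippets\//d'
  filter_html "$1.sed" '/^<a href="internals-big-page/d'
  filter_html "$1.sed" '/^<a href="snippets-big-page/d'
}

# Usage, with the directory names from the text:
#   normalize_tree out-www.00-master
#   normalize_tree out-www.01-convert-translationof
#   diff -ur out-www.00-master.sed/ out-www.01-convert-translationof.sed/ -x "*.ly" -x "*.pdf"
```

The filters are kept as separate passes in the original order, so the result should match running the individual commands one after the other.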

I also had a quick look at the documentation produced by texi2html-5.0 (which has been working for some time but is not actively used): The generated HTML files show many more differences, mostly because texi2html-5.0 apparently emits the raw @ref{...} into the HTML code if it cannot find the referenced @node. And there apparently existed many such cases in the Japanese and Chinese translations that already referenced the English node names, which worked due to a quirk in lilypond-texi2html.init with texi2html-1.82.
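To locate such unresolved references in a generated tree, a plain text search may be enough. The sketch below assumes that the unresolved reference shows up literally as the text "@ref{" in the HTML output, which is only what the observation above suggests; list_raw_refs is a hypothetical helper name.

```shell
# Hypothetical helper: list the HTML files below $1 that still contain a
# literal "@ref{". -F treats the pattern as a fixed string, -l prints only
# the file names, and --include restricts the recursive search to *.html.
list_raw_refs() {
  grep -rlF --include='*.html' '@ref{' "$1"
}

# Usage (directory name is a placeholder):
#   list_raw_refs out-www | wc -l
```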


Regarding the organization of this merge request: It contains only the changes to convert @translationof to proper nodes. I keep the removal of the xref-maps for a follow-up, but just to give an idea: It should allow us to remove 200-300 LOC from lilypond-texi2html.init (finally bringing us below 1000 LOC), quite a bit of complexity from the documentation build, and of course the script extract_texi_filenames.py.

To make reviewing somewhat feasible, each language is currently treated in two commits: one with the automatic changes from the script convert-translationof.py and one with some manual adaptations. Please note that only a full conversion of each translation will build, so right now every second commit is "broken". I plan to squash the two commits per language when the changes are ready to be merged. (done now)

Edited by Jonas Hahnfeld
