
Implement Argo pre-processing for anchor IDs in translated docs

Goal

Implement an automated pre-processing step that adds English anchor IDs to translated headings in documentation files, so that cross-language links stay consistent while the heading text itself is properly translated.

Current state

This issue was identified in the CI pipeline run for our Japanese docs MR. The errors occurred because:

  • In the Japanese version, we're linking to English anchor IDs
  • Markdownlint expects anchors to match headings in the current document
  • Since headings are in Japanese, their auto-generated IDs differ from English ones

As a result, Markdownlint currently reports errors (rule MD051) on translated documentation: the Japanese files contain links with English anchor IDs that match no heading in the same document. For example:

doc-locale/ja-jp/ci/yaml/_index.md:29:17 MD051/link-fragments Link fragments should be valid [Context: "[グローバルキーワード](#global-keywords)"]

The underlying problem is that auto-generated anchor IDs differ between languages:

  • English: ## Global keywords → #global-keywords
  • Japanese: ## グローバルキーワード → #グローバルキーワード
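The mismatch can be illustrated with a minimal sketch of ID generation. The rules below (lowercase, drop punctuation, replace spaces with hyphens) are a simplified assumption, not the exact algorithm used by the docs site or by Markdownlint, and `auto_anchor` is a hypothetical helper name.

```python
import re


def auto_anchor(heading_text: str) -> str:
    """Approximate an auto-generated heading ID (simplified, assumed rules)."""
    text = heading_text.strip().lower()
    # Drop punctuation; \w is Unicode-aware, so Japanese characters are kept.
    text = re.sub(r"[^\w\s-]", "", text)
    # Collapse each run of whitespace into a single hyphen.
    return re.sub(r"\s+", "-", text)


print(auto_anchor("Global keywords"))
print(auto_anchor("グローバルキーワード"))
```

Under these assumed rules the two headings yield the two different IDs listed above, which is why a link to `#global-keywords` has no matching heading in the Japanese file.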

We need a solution that preserves the English anchor IDs while allowing the heading text to be properly translated:

  • Japanese with English anchor: ## グローバルキーワード {#global-keywords}

Discussed solutions

We explored three possible approaches and agreed to follow the first:

  1. Use explicit English heading IDs in Japanese markdown
    • ## グローバルキーワード {#global-keywords}
    • https://docs.gitlab.com/ja-jp/ci/yaml/#global-keywords
    • Benefits: consistent anchors across languages, better supports the Docs Global Gateway initiative (&107)

Rejected options:

  2. Use Japanese auto-generated anchor IDs
    • https://docs.gitlab.com/ja-jp/ci/yaml/#グローバルキーワード
    • Issues with this option: URL escape sequences make links unreadable when copied (#%E3%82%B0%E3%83%AD%E3%83%BC%E3%83%90%E3%83%AB%E3%82%AD%E3%83%BC%E3%83%AF%E3%83%BC%E3%83%89), harder to maintain cross-language references
  3. Modify anchor links to point to English document
    • Issues with this option: poor user experience (jolting users between languages), requires redirects, not recommended
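The readability issue with Japanese auto-generated IDs can be reproduced with Python's standard library. This is only an illustration of percent-encoding; the variable names are arbitrary.

```python
from urllib.parse import quote, unquote

fragment = "グローバルキーワード"

# Percent-encode the fragment as UTF-8, which is what typically ends up
# in a URL when it is copied and pasted.
encoded = quote(fragment)
print("#" + encoded)

# Decoding restores the original text, so the link still works;
# it is just hard for a human to read or maintain.
print(unquote(encoded) == fragment)
```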

Decided implementation approach

Implement pre-processing in Argo, through an enhancement to the GitLab Agent. The solution should add explicit English anchor IDs to the headings of the content to be translated, before translation occurs, using the Markdown heading attribute syntax supported by Hugo ({#anchor-id}).
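A minimal sketch of such a pre-processing step is shown below. It is not Argo's or the GitLab Agent's actual code: `add_heading_ids` and `slugify` are hypothetical names, the slug rules are a simplified assumption, and only ATX headings (`#` to `######`) are handled.

```python
import re

HEADING_RE = re.compile(r"^(#{1,6})\s+(.*?)\s*$")
EXPLICIT_ID_RE = re.compile(r"\{#[^}]+\}\s*$")
# Built programmatically to avoid literal fence markers in this example.
FENCE_MARKERS = ("`" * 3, "~" * 3)


def slugify(text: str) -> str:
    """Simplified, assumed slug rules: lowercase, no punctuation, hyphens."""
    text = re.sub(r"[^\w\s-]", "", text.strip().lower())
    return re.sub(r"\s+", "-", text)


def add_heading_ids(markdown: str) -> str:
    """Append {#english-id} to each English heading that lacks an explicit ID."""
    out = []
    in_fence = False
    for line in markdown.splitlines():
        # Toggle on fence lines so '#' comments inside code blocks are skipped.
        if line.lstrip().startswith(FENCE_MARKERS):
            in_fence = not in_fence
        match = None if in_fence else HEADING_RE.match(line)
        if match and not EXPLICIT_ID_RE.search(line):
            hashes, title = match.groups()
            line = f"{hashes} {title} {{#{slugify(title)}}}"
        out.append(line)
    return "\n".join(out)


print(add_heading_ids("## Global keywords"))
```

Because the ID is attached to the English source heading, a translator changes only the heading text and the `{#global-keywords}` suffix carries over unchanged. The sketch does not de-duplicate repeated headings, which a real implementation would need to handle.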

From a localization management perspective, it is typically recommended to avoid modifying source files in the middleware, that is, en route between the source system and the translation system. However, this specific customization requires minimal development effort and can be easily reversed (a two-way door).

Edited by Oleksandr Pysaryuk