[Idea pitch] Building a target updates system on GitLab

Summary

Most of our translated content currently lives in two places: the translation memory (TM) and GitLab. While we have historically treated the TM as the single source of truth (SSOT), our long-term goal is for GitLab to be that SSOT and for the TM to mirror what’s in GitLab, not the other way around.

Moving fully to that model, however, implies a significant infrastructure investment (Argo–Phrase–GitLab orchestration, TM sync logic, automation, etc.) that still needs to be evaluated for effort, cost, and risk.

GitLab already gives us strong advantages for iterating on translations: we get version control on both source and target files, clear review history, CODEOWNERS, and merge policies. But our current technical stack is still primarily source-driven: Argo watches source files, creates translation requests, and routes them through Phrase/TM. It has no mechanism for detecting or protecting edits made directly to target files.

Problem

Because we don’t have system-level tracking of target updates, we regularly run into two issues:

  1. Target overwrites from outdated TM entries
    • Target files are edited directly in GitLab (SEO tweaks, copy improvements, bug fixes, etc.).
    • Later, an updated English file is sent through the standard translation workflow, which re-leverages TM entries that do not include those GitLab-only edits.
    • When the Translation MR is created and merged, the “new” translation overwrites the improved target content that was previously edited directly in GitLab.
  2. High manual overhead for preserving target updates
    • If a Content Manager or localization expert wants to improve copy or add SEO optimizations in localized files, they must:
      • Manually track what changed and where.
      • Locate the corresponding files/projects in Argo–Phrase.
      • Manually find and update each affected segment in the TM, one by one, so that the next translation request does not reintroduce the old text.

Proposal

Build a target-aware monitoring and protection layer that treats GitLab target files as the SSOT, prevents overwrites, and (over time) keeps the TM in sync with GitLab.

  1. Short-term: monitor and protect target edits

    • Implement a mechanism that can detect and record:
      • When a target update happened
      • What changed (content diff)
      • Where (file, path, project, locale)
      • Who made the change (author / role)
    • Use that information to block or surface conflicts when a Translation MR would overwrite newer target content in GitLab, for example by:
      • Automatically flagging potential target-update conflicts on Translation MRs.
      • Forcing an explicit decision (keep GitLab version / keep incoming translation / manual merge) instead of silently overwriting.

    The intent is not to forbid direct target edits in GitLab, but to make them first-class citizens in the workflow so they are never silently lost.
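
A minimal Python sketch of the short-term idea, under stated assumptions: the record fields mirror the when / what / where / who list above, and the conflict check is a simple three-way comparison. The names `TargetUpdate`, `Decision`, `record_update`, and `needs_decision` are hypothetical, as is the assumption that we can retrieve the target text as it stood at the last merged Translation MR. It does not call Argo, Phrase, or the GitLab API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from difflib import unified_diff
from enum import Enum
from typing import Optional


@dataclass(frozen=True)
class TargetUpdate:
    """One direct edit to a target file (all field names are hypothetical)."""
    project: str          # where: GitLab project
    path: str             # where: file path inside the project
    locale: str           # where: target locale, e.g. "ja"
    author: str           # who made the change
    changed_at: datetime  # when the change happened
    diff: str             # what changed, as a unified diff


class Decision(Enum):
    """The explicit choices a reviewer would be asked to make."""
    KEEP_GITLAB = "keep GitLab version"
    KEEP_INCOMING = "keep incoming translation"
    MANUAL_MERGE = "manual merge"


def record_update(project: str, path: str, locale: str, author: str,
                  before: str, after: str,
                  changed_at: Optional[datetime] = None) -> Optional[TargetUpdate]:
    """Return a TargetUpdate for a target edit, or None if nothing changed."""
    if before == after:
        return None
    diff = "".join(
        unified_diff(
            before.splitlines(keepends=True),
            after.splitlines(keepends=True),
            fromfile=f"a/{path}",
            tofile=f"b/{path}",
        )
    )
    return TargetUpdate(
        project=project,
        path=path,
        locale=locale,
        author=author,
        changed_at=changed_at or datetime.now(timezone.utc),
        diff=diff,
    )


def needs_decision(last_delivered: str, current: str, incoming: str) -> bool:
    """Three-way check for a Translation MR.

    last_delivered: target text from the last merged Translation MR (assumed available)
    current:        target text in GitLab right now
    incoming:       target text the new Translation MR wants to write
    """
    edited_in_gitlab = current != last_delivered
    would_change_it = incoming != current
    return edited_in_gitlab and would_change_it
```

When `needs_decision` returns `True`, the Translation MR would be flagged and the reviewer asked to pick one of the `Decision` values. When it returns `False`, either nobody touched the target since the last delivery or the incoming translation already matches it, so the MR can proceed as it does today.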

  2. Long-term: keep TM and other assets in sync with GitLab

    • Once we have reliable detection and conflict-prevention in place, extend the system so that confirmed target updates can:
      • Propagate into the TM, termbase (TB), QA scripts, and AI prompts, either automatically or via a controlled queue (Argo, dashboards, or labels).
      • Feed into AI-driven translation engines and QA so that upstream assets always learn from corrections made in GitLab MRs.
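
One way to picture the "controlled queue" option is the Python sketch below. It is illustrative only: `ConfirmedSegment`, `PropagationQueue`, and the `apply` callback are hypothetical names, and the callback stands in for whatever would actually write to the TM or another asset. No real Phrase or Argo API is used or implied.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class ConfirmedSegment:
    """A target correction that has been reviewed and merged (hypothetical shape)."""
    locale: str
    source_text: str
    target_text: str
    merge_request: str  # free-form reference to the MR that confirmed the change


class PropagationQueue:
    """Holds confirmed corrections until a sync job pushes them downstream."""

    def __init__(self) -> None:
        # Keyed by (locale, source text) so a later correction replaces an earlier one.
        self._pending: Dict[Tuple[str, str], ConfirmedSegment] = {}

    def enqueue(self, segment: ConfirmedSegment) -> None:
        self._pending[(segment.locale, segment.source_text)] = segment

    def __len__(self) -> int:
        return len(self._pending)

    def drain(self, apply: Callable[[ConfirmedSegment], None]) -> int:
        """Apply every pending segment via the callback and return how many were applied."""
        applied = 0
        for key in list(self._pending):
            apply(self._pending.pop(key))
            applied += 1
        return applied


# Usage: a plain dict stands in for the TM in this sketch.
tm: Dict[Tuple[str, str], str] = {}
queue = PropagationQueue()
queue.enqueue(ConfirmedSegment("ja", "Get started", "first draft", "MR-A"))
queue.enqueue(ConfirmedSegment("ja", "Get started", "reviewed copy", "MR-B"))
applied_count = queue.drain(lambda s: tm.__setitem__((s.locale, s.source_text), s.target_text))
```

Keying the queue by locale and source text means only the most recent confirmed correction for a segment reaches the TM, which is the behavior we would want if the same string is edited in several MRs before a sync runs.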

Why now

  • We are scaling continuous localization across both tech docs and about.gitlab.com, which increases the frequency of Translation MRs and the risk surface for silent target overwrites.
  • We already have manual prototypes and partial solutions (labels, trackers, merge-conflict-based workflows) for propagating target updates to the TM.
  • Formalizing this into a single, system-level solution would reduce manual toil, protect quality improvements made by CMs and reviewers, and move us closer to a GitLab-as-SSOT model for localized content.
Edited by Maria Jose Salmeron Ibáñez