Claude Code POC: git-based translation memory and target updates for marketing YML
### Executive summary
In a single Claude Code session, a Python script mined the full git history of the [about-gitlab-com](https://gitlab.com/gitlab-com/marketing/digital-experience/about-gitlab-com) project and produced a database of 146,236 individual string changes by human editors, including 2,917 changes by the seven Content Managers and Language Specialists (Emi, Noriko, Megumi, Hendrik, Maud, Laura, and Patrick).
Each record captures the _before_ value, the _after_ value, the _English source_ at the time of the edit, the change type, and a clickable link to the MR where the change was made.
I also used this database as _translation memory_ to retranslate a file from English to Japanese via the Claude API, passing all JA strings previously curated (target updates) by Emi as protected segments. The output matched the live Japanese file at 95% accuracy. The remaining 5% were technical details the LLM was not aware of, like the `fix-links` script.
This could change how we think about target updates: if such a database can in fact be used as a _TM_, the manual sync step between git and a TMS is potentially unnecessary. The only reason we have to watch for target updates today is that we have two systems: GitLab / git and a TMS.
### The problem the POC was solving
The about-gitlab-com repo contains \~4,000 non-English .yml files across 6 locales. Some strings in those files were hand-edited by language specialists and content managers. The question was: **which strings were edited by humans?**
#### Step 1: find all human commits in all languages first
The script asked git for every commit that touched the `content/` folder, then discarded any commit from `gitlab-argo-bot`. What remained was humans. Merge commits were also skipped, because those don't contain actual content changes.
Result: **3,034 human commits** identified, spanning **August 2024 to April 2026**, roughly 20 months of git history.
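The commit-mining step can be sketched roughly like this. This is a minimal reconstruction, not the actual script from the session; the bot author name is taken from the text above, everything else is illustrative:

```python
import subprocess

BOT_AUTHORS = {"gitlab-argo-bot"}  # the bot author excluded in the POC

def filter_human_commits(log_lines):
    """Keep (sha, author) pairs whose author is not a known bot."""
    humans = []
    for line in log_lines:
        sha, author = line.split("\t", 1)
        if author not in BOT_AUTHORS:
            humans.append((sha, author))
    return humans

def human_commits(repo_path="."):
    """Non-merge commits touching content/, bot commits excluded."""
    out = subprocess.run(
        ["git", "-C", repo_path, "log", "--no-merges",
         "--format=%H\t%an", "--", "content/"],
        capture_output=True, text=True, check=True,
    ).stdout
    return filter_human_commits(out.splitlines())
```

`--no-merges` handles the merge-commit exclusion directly in git, so the script never sees them.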
#### Step 2: reduce the output to non-English changes only
For each of those 3,034 commits, the script asked git: _what files did this commit change?_ The criteria: the file must be under `content/` but not `content/en-us/`, and must not be under `content/<locale>/blog/` or `content/<locale>/the-source/`.
This eliminated 1,986 commits from developers fixing configs, writing blog posts, etc. **1,048 commits** remained that actually touched non-English yml.
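The path filter from this step reduces to a small predicate. A sketch under the criteria listed above (the actual script's implementation may differ):

```python
def is_target_yml(path):
    """Non-English .yml under content/, excluding blog and the-source."""
    parts = path.split("/")
    if parts[0] != "content" or not path.endswith((".yml", ".yaml")):
        return False
    if len(parts) < 3 or parts[1] == "en-us":
        return False
    if parts[2] in ("blog", "the-source"):  # excluded content types
        return False
    return True
```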
#### Step 3: extract English, target-_before_ and target-_after_
For each of those 1,048 commits and each yml file it touched, the script retrieved two snapshots from git:
- **Before**: the file as it existed in the parent commit, just before this change
- **After**: the file as it existed after the commit
Then I realized I had forgotten to ask Claude Code to also include the English source text, so the entire script had to be re-run. Result: it extracted the English, and there is now an `english_source` column in the CSV / Gsheet.
#### Step 4: parse yml and flatten it
A yml file is a nested structure, sections inside sections. To compare them meaningfully, the script flattened both the _before_ and _after_ versions into a dictionary of `dotted.key.path + value`.
For example, this YAML...
```yaml
content:
- componentContent:
header: AIの探索を行動に移す
```
...turned into this:
```text
content.0.componentContent.header -> AIの探索を行動に移す
```
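The flattening can be sketched as a small recursive walk, assuming the YAML has already been parsed into Python structures (e.g. with PyYAML):

```python
def flatten(node, prefix=""):
    """Flatten nested dicts/lists into {dotted.key.path: value}."""
    flat = {}
    if isinstance(node, dict):
        for key, value in node.items():
            flat.update(flatten(value, f"{prefix}{key}."))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            flat.update(flatten(value, f"{prefix}{index}."))
    else:
        flat[prefix[:-1]] = node  # drop the trailing dot on the leaf key
    return flat

# Parsed form of the YAML example above:
doc = {"content": [{"componentContent": {"header": "AIの探索を行動に移す"}}]}
# flatten(doc) == {"content.0.componentContent.header": "AIの探索を行動に移す"}
```

Because list positions become numeric path segments, the same key appearing in different list items stays distinct, which is what makes before/after comparison per string possible.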
#### Step 5: classify each change
(This was an arbitrary classification by the LLM, just for the demo's sake.)
For each changed string, a heuristic analysis was run. If the text had changed, its similarity / fuzziness was measured, and each change was categorized into one of three buckets: `correction` (one word swapped, a punctuation fix, a small tweak), `revision` (substantially rewritten), or `transcreation` (more than 50% of the characters were new).
To reiterate: the thresholds (85% and 50% fuzzy similarity) are the kind of heuristics translation memories use, based purely on character arithmetic regardless of target language; they are not linguistic standards. They also work for long strings but can misclassify short ones.
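A sketch of this heuristic using `difflib.SequenceMatcher`; the exact similarity measure the session's script used is not shown here, so treat the ratio choice as an assumption:

```python
from difflib import SequenceMatcher

def classify(before, after):
    """Bucket a changed string by fuzzy similarity (before != after)."""
    ratio = SequenceMatcher(None, before, after).ratio()
    if ratio >= 0.85:
        return "correction"     # small tweak: a word swap, punctuation
    if ratio >= 0.50:
        return "revision"       # substantially rewritten
    return "transcreation"      # more than half the characters are new
```

The short-string weakness is visible immediately: swapping one word in a three-word string can drop the ratio below 0.85 even though it is clearly a correction.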
#### Step 6: output
Everything was written to two files on my local machine:
**`human_edits_database.json`**: full nested structure. One entry per commit, with author, date, subject, and for each file a list of before/after values and classification.
**`language_specialist_edits.csv`**: one row per changed string. Columns: `sha`, `date`, `author_name`, `author_email`, `commit_subject`, `mr_link`, `locale`, `file`, `yaml_key`, `change_type`, `english_source`, `before`, `after`. **The `mr_link` column contains a clickable hyperlink to the GitLab MR where the change was made.**
#### Results
- **146,236** individual string changes captured across all 6 locales by any user, not only CMs
- **2,914** target changes by CMs and language specialists specifically
- As a spot check, Emi's latest MR https://gitlab.com/gitlab-com/marketing/digital-experience/about-gitlab-com/-/merge_requests/4751/diffs was fully captured: every string change appears in the database.
### Tracking target changes and passing them to TMS
A typical flow is: a Translation MR merges translations, then language specialists or CMs catch quality issues and open follow-up MRs to fix them (for example, [!4756](https://gitlab.com/gitlab-com/marketing/digital-experience/about-gitlab-com/-/merge_requests/4756) correcting formality register in German after [!4467](https://gitlab.com/gitlab-com/marketing/digital-experience/about-gitlab-com/-/merge_requests/4467)), and the only record of those corrections is the git commit.
Our database captures all of these corrections. How to operationalize that is a process decision: we could load corrections into Phrase as confirmed segments, or automatically alert the vendor when a correction is detected on a file that is in scope, and so on.
If there is no TMS, there is no second system to sync with. The database built from git history is the only translation memory, and every agentic translation run reads from it. The "transfer target updates" step does not exist in this architecture.
### Keeping the csv database of target updates up to date
I had an agent create a second script, `scan_incremental.py`, that scans only commits newer than a given SHA and writes the results to separate files (`new_edits.csv` and `new_language_specialist_edits.csv`).
It captured a few changes from Emi's three latest MRs on 2026-04-06, right as I was running this POC.
### Google Sheets
- [language_specialist_edits](https://docs.google.com/spreadsheets/d/1Bm4khcowoIDb8c33KEHHMkhwvs5PxS9qTtLhzgf9dMw/edit?gid=1274019884#gid=1274019884): full database of all language specialist edits (2,917 rows, August 2024 to April 2026)
- [new_language_specialist_edits](https://docs.google.com/spreadsheets/d/18_aSSpz-K-b-caqkSNzfThNykczGLcMQ3nL1LEmo6oo/edit?gid=1345357774#gid=1345357774): incremental result
### LLM translates and uses target updates as translation memory
A second POC tested whether the database of human edits could serve directly as translation memory in an LLM-based workflow, without a TMS like Phrase.
The script `translate_poc.py`, written by the agent during the Claude Code session, took the English source `content/en-us/assessments/ai-modernization-assessment/results-exploring.yml` and asked Claude to translate it to Japanese, **with one constraint:** the 34 strings Emi had curated for that file (33 strings from MR [!4751](https://gitlab.com/gitlab-com/marketing/digital-experience/about-gitlab-com/-/merge_requests/4751), and 1 string from MR [!4765](https://gitlab.com/gitlab-com/marketing/digital-experience/about-gitlab-com/-/merge_requests/4765), both visible in the [language_specialist_edits sheet](https://docs.google.com/spreadsheets/d/1Bm4khcowoIDb8c33KEHHMkhwvs5PxS9qTtLhzgf9dMw/edit?gid=1274019884#gid=1274019884) filtered to `ja-jp` / `results-exploring.yml`) were passed as protected segments and the LLM was instructed to use them verbatim as a TM.
The result: the LLM used the target updates as a TM. It made a few mistakes, for example in the locale prefixes that our [`fix-links.mjs`](https://gitlab.com/gitlab-com/marketing/digital-experience/about-gitlab-com/-/blob/main/scripts/fix-links.mjs) would usually fix after translation; the LLM simply did not know about that script.
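The protected-segment constraint amounts to prompt construction. A hypothetical sketch; the wording is illustrative, not the actual prompt from `translate_poc.py`:

```python
def build_translation_prompt(english_yaml, protected):
    """Prompt passing curated target strings as protected TM segments.

    `protected` maps dotted YAML keys to human-curated Japanese strings
    harvested from the edits CSV. Wording is an assumption, not the POC's.
    """
    tm_block = "\n".join(f"- {key}: {value}" for key, value in protected.items())
    return (
        "Translate the following YAML file from English to Japanese.\n"
        "For the keys listed below, use the given translations VERBATIM; "
        "they are human-curated and must not be changed.\n\n"
        f"Protected segments:\n{tm_block}\n\n"
        f"English source:\n{english_yaml}\n"
    )
```

The returned string is what gets sent in the Claude API call; the protected segments travel inside the prompt rather than through any TMS integration.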
The hypothesis held: in a no-TMS workflow, the database of human edits can indeed function like a translation memory. The "TM" is a CSV file derived from git history, and the "TMS" is a Claude API call.
### Open questions
This POC demonstrates harvesting target updates and using them as translation memory.
A TMS also provides other things that have value:
* an editor for a translator / reviewer with source and target side by side
* built-in QA checks
* project management and coordination of human translators, plus audit trails different from git's (you can compare versions of files from workflow steps, etc.)
* glossary enforcement is a separate question, too, although the [French tech docs POC](https://gitlab.com/gitlab-com/localization/docs-site-localization/-/work_items/903) did show that a termbase can be fed directly to the LLM as context, which handled it without a TMS
It is worth exploring whether and how these fit into marketing YML translation workflows.