
GitLab has problems handling ANSI files, e.g. Windows-1252 characters. This also results in merges or "online edits" deleting characters

Summary

GitLab seems to "forget" special characters when converting/preparing ANSI (Windows-1252) files for viewing. The characters are still present in the original files.

When working with these commits on GitLab, e.g. using the "edit file" function or doing merge requests with online edits, the "invisible" characters are then removed from the file.

When a file is converted between UTF-8 and Windows-1252, the problem becomes visible in the GitLab diff.

Some Windows-1252 files, however, are shown and handled correctly by GitLab.
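One possible explanation, which is only my assumption and not confirmed by GitLab, is that the Windows-1252 bytes get treated as UTF-8 somewhere and the invalid bytes are silently discarded. The following minimal Python sketch shows what such a lenient decode would do to a short sample string (the sample text is made up for illustration):

```python
# Hypothetical sample text containing characters outside ASCII.
text = "Größe: 5 µm, Preis 3 €"

# The bytes as they would be stored in a Windows-1252 (ANSI) file.
raw = text.encode("cp1252")

# Strict UTF-8 decoding rejects these bytes ...
try:
    raw.decode("utf-8")
    is_valid_utf8 = True
except UnicodeDecodeError:
    is_valid_utf8 = False

# ... while a lenient decode silently drops every byte it cannot interpret.
# This mimics the "forgotten" characters; it is NOT GitLab's actual code.
lenient = raw.decode("utf-8", errors="ignore")

print(is_valid_utf8)
print(repr(lenient))
```

If the result of such a lenient decode were written back by the online editor, the special characters would be gone from the file, which would match the follow-up bug described in this report.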

Steps to reproduce

  1. create an example Windows-1252 file containing all special characters, e.g. from Wikipedia, and ensure that it is stored in Windows-1252 (ANSI) format
  2. commit it
  3. convert it to UTF-8, e.g. via Notepad++
  4. commit it and push to GitLab
  5. view the diff online and watch the special characters appear (THE BUG)
  6. convert the file back to Windows-1252
  7. commit and push
  8. go to this commit on GitLab
  9. edit the file online, add some lines and commit
  10. fetch the edit commit and open the file in an editor: the special characters have been removed (FOLLOW-UP BUG)
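For anyone who wants to build the test file for steps 1, 3 and 6 without Notepad++, here is a minimal Python sketch. The function names are hypothetical helpers, and it relies on Python's `cp1252` codec, which leaves a few byte values in the 0x80-0x9F range undefined, so those are skipped:

```python
def cp1252_special_chars() -> str:
    """Return every character that Python's cp1252 codec maps to a byte >= 0x80."""
    chars = []
    for value in range(0x80, 0x100):
        try:
            chars.append(bytes([value]).decode("cp1252"))
        except UnicodeDecodeError:
            # Byte value is undefined in Python's cp1252 codec; skip it.
            pass
    return "".join(chars)


def cp1252_to_utf8(data: bytes) -> bytes:
    """Re-encode Windows-1252 bytes as UTF-8 (step 3)."""
    return data.decode("cp1252").encode("utf-8")


def utf8_to_cp1252(data: bytes) -> bytes:
    """Re-encode UTF-8 bytes as Windows-1252 (step 6)."""
    return data.decode("utf-8").encode("cp1252")


sample = cp1252_special_chars()
ansi_bytes = ("plain ASCII line\n" + sample + "\n").encode("cp1252")
utf8_bytes = cp1252_to_utf8(ansi_bytes)
```

Writing `ansi_bytes` and `utf8_bytes` to a file in binary mode (for example with `pathlib.Path.write_bytes`) gives the two file versions to commit. Binary mode matters here, because text mode could re-encode the content or translate line endings.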

Example Project

See my repository https://gitlab.com/HannoHugenberg/encoding/commits/GitlabUI_Ansi_win1252 on branch GitlabUI_Ansi_win1252.

See this commit for the hiding of the "special" characters when converting from UTF-8 with BOM to Windows-1252.

See this commit for the removal of the "hidden special characters" when using GitLab's online "edit and commit" function.

See this commit for a Windows-1252 file WITHOUT problems.

The same has happened on our on-premise installation, where an "online edit with following merge commit" removed some special characters from our files. This resulted in bad text in a customer installation.

What is the current bug behavior?

Non-ASCII characters of Windows-1252 files are not rendered or are dropped. This results in corrupted files after online editing AND in an unreliable "request - approve - merge" workflow, since we cannot guarantee that the visible changes are correct!
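As a workaround on our side, a check like the following could be run on the file content before and after an online edit to detect dropped characters. This is a minimal sketch: `lost_non_ascii` is a hypothetical helper, and the sample bytes are made up rather than taken from the example repository:

```python
from collections import Counter


def lost_non_ascii(before: bytes, after: bytes, encoding: str = "cp1252") -> Counter:
    """Count non-ASCII characters present in `before` but missing in `after`."""
    def non_ascii(data: bytes) -> Counter:
        return Counter(ch for ch in data.decode(encoding) if ord(ch) > 127)

    # Counter subtraction keeps only characters whose count went down.
    return non_ascii(before) - non_ascii(after)


# Made-up example: the online edit added a line but lost three characters.
before = "Preis: 5 € für Müller\n".encode("cp1252")
after = "Preis: 5  fr Mller\nnew line\n".encode("cp1252")

lost = lost_non_ascii(before, after)
print(lost)
```

The two byte strings could be obtained with `git show <commit>:<path>` for the commits before and after the online edit. An empty result only shows that no non-ASCII characters disappeared; it does not prove that the rest of the edit is correct.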

A possibly related issue is our merge request bug report #16556 (comment 234245949). ZD: https://gitlab.zendesk.com/agent/tickets/135763 (internal use only)

The files involved there are also in Windows-1252 format.

What is the expected correct behavior?

All characters should be visible.

Relevant logs and/or screenshots

It is reproducible on GitLab.com; see the linked repository and example commits.

Output of checks

This bug happens on GitLab.com

Edited by Hanno Hugenberg