8-bit UTF-8 data included in HTML part in commit email
Unicode character U+2212, encoded in UTF-8, is included in generated email without further encoding.
Summary
<span class="deleted-file">
−
ansible/upgrade-scripts/11-etcdconfd
</span>
The minus sign character (U+2212, alone on a line) is encoded in UTF-8 (three bytes: 0xE2 0x88 0x92), but the Content-Transfer-Encoding headers for both the message part and the containing email are set to 7bit.
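The mismatch can be checked with a short Python sketch. This is only an illustration, not GitLab code: the helper name is_7bit is hypothetical, and it tests only the high bit of each byte, not the other constraints of the 7bit encoding such as line length.

```python
# Illustration only: confirm that U+2212 in UTF-8 is not 7-bit clean.
MINUS_SIGN = "\u2212"  # the character emitted for a deleted file


def is_7bit(data: bytes) -> bool:
    """Return True if every byte has its high bit clear (hypothetical helper)."""
    return all(byte < 0x80 for byte in data)


encoded = MINUS_SIGN.encode("utf-8")
print(encoded.hex())     # e28892
print(is_7bit(encoded))  # False
```

Any part containing these bytes therefore cannot be labeled 7bit as it stands.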
Steps to reproduce
Push a file deletion into a project; email notifications will include the incorrectly encoded MIME part.
What is the current bug behavior?
Raw 8-bit data (UTF-8 encoded text) is sent, described as 7bit data.
What is the expected correct behavior?
Since RFC 2822 data is required to be 7-bit, the character should be converted to appropriate HTML markup as the message is assembled (&minus;, &#8722;, or &#x2212;).
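A minimal sketch of the conversion being asked for, written in Python purely for illustration (GitLab's mailer is not written in Python, and the function name to_7bit_html is an assumption). It relies on the standard xmlcharrefreplace error handler, which replaces every non-ASCII character with a decimal numeric character reference.

```python
# Illustration only: make an HTML fragment 7-bit clean by replacing
# non-ASCII characters with decimal numeric character references.
def to_7bit_html(html: str) -> str:
    """Return html with every non-ASCII character written as &#NNNN;."""
    return html.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


snippet = (
    '<span class="deleted-file">\n'
    "\u2212\n"
    "ansible/upgrade-scripts/11-etcdconfd\n"
    "</span>"
)
# The line holding U+2212 comes out as &#8722;
print(to_7bit_html(snippet))
```

The result contains only ASCII, so a 7bit Content-Transfer-Encoding header would then describe the part correctly, and HTML renderers still display the minus sign.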
Results of GitLab environment info
GitLab CE 10.5.0 (34d57661)