Atom feeds render (and sanitize) Markdown fields
What does this MR do and why?
We currently output several fields, such as `issuable.title`, `issuable.description`, `project.description`, `work_item_detail.title` and `work_item_detail.description`, directly into Atom feeds, without rendering the content to HTML (and therefore without any sanitisation).
Fix that, add specs to prevent regressions, and also make a bunch of our assertions in these specs use the DOM and not string matching or regexes (!) on XML.
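To illustrate why asserting on the DOM is more robust than string matching on serialised XML, here is a minimal sketch in Python (standard library only). It is not taken from the specs in this MR: the feed fragment and the `entry_content_text` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical feed fragment, for illustration only.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>Example</title>
    <content type="html">&lt;p&gt;Hello &amp;amp; welcome&lt;/p&gt;</content>
  </entry>
</feed>"""


def entry_content_text(feed_xml):
    """Return the text of the first entry's <content> node (XML-unescaped once)."""
    root = ET.fromstring(feed_xml)
    node = root.find("atom:entry/atom:content", ATOM_NS)
    assert node is not None, "feed has no entry content"
    return node.text or ""


content = entry_content_text(FEED)

# A DOM assertion compares against the decoded node text.
assert content == "<p>Hello &amp; welcome</p>"

# A substring check on the raw XML has to spell out the escaped form instead,
# which is easy to get subtly wrong.
assert "<p>Hello" not in FEED
assert "&lt;p&gt;Hello &amp;amp; welcome" in FEED
```

The DOM version states what the feed reader will actually receive, so the expectation does not change if the serialiser picks a different but equivalent escaping.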
References
See https://gitlab.com/gitlab-org/gitlab/-/work_items/594830+ (staff-only). Fix in public confirmed at https://gitlab.com/gitlab-org/gitlab/-/work_items/594830#note_3192999560 (staff-only).
Screenshots or screen recordings
Here I have a merge request with:
- Title: `Add new file <script>hello</script> &lt;script&gt;hi&lt;/script&gt;`
- Description (entered using plain-text editor): `<marquee>Description</marquee> with <script>hello</script> &lt;script&gt;hi&lt;/script&gt;.`
It renders this way today on the web interface:
(The title is rendered incorrectly; see Stop stripping HTML from issuable titles (#582606 - closed); to be fixed by Use regular non-Markdown single line pipeline f... (!224839 - merged).)
The description rendering is entirely as expected: the `<script>hello</script>` is sanitised out, and the `&lt;`s and `&gt;`s render as visible `<` and `>` characters.
Now let's look at the before and after renders of the merge requests Atom feed with just this MR:
| Before | After |
|---|---|
| ![]() | ![]() |
(The feed token visible in these screenshots was reset after taking them, even though it's extremely unlikely you could access a Docker container on my laptop.)
Note the critical difference:
- The Before side has `<content type="html">&lt;marquee&gt;Description&lt;/marquee&gt; with &lt;script&gt;hello…` in it, which represents an actual `<script>` tag in the `content` node's text!
  - The `<marquee>` tag we wrote is present as `&lt;marquee&gt;`, too; this is live.
  - `type="html"` means you should consider the content of the tag (i.e. after unescaping one level of entities) as HTML. Thus this represents a real payload of the following: `<marquee>Description</marquee> with <script>hello</script> &lt;script&gt;hi&lt;/script&gt;.`
- The After side has `<content type="html">&lt;p data-sourcepos="1:1-1:91" dir="auto"&gt;Description with &amp;lt;script&amp;gt;…`:
  - A paragraph tag is rendered, as we'd expect for rendered user content.
  - The `<marquee>` is sanitised out; we don't permit this.
  - The first `<script>` tag as written in the description is sanitised out during render (since it's input as a literal `<script>`).
  - The not-actually-`<script>` tag (input as `&lt;script&gt;`) is present as `&lt;script&gt;` in the `content` node's text.
  - As above, `type="html"` means the real payload is the following: `<p data-sourcepos="1:1-1:91" dir="auto">Description with &lt;script&gt;hi&lt;/script&gt;.</p>`
  - This is identical to the GitLab render:
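The two levels of escaping can be checked mechanically. The following is a minimal sketch in Python (standard library only), assuming simplified `<content>` elements modelled on the Before and After feeds described above; the exact strings are re-escaped by hand for illustration and the real feed elements may differ in detail. The `TagCollector` and `live_tags` names are hypothetical. It unescapes one level, as an XML parser does, and then lists the tags an HTML parser sees in the resulting payload.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Illustrative stand-ins for the feed's <content> nodes (assumed, hand-escaped).
BEFORE = ('<content type="html">&lt;marquee&gt;Description&lt;/marquee&gt; with '
          '&lt;script&gt;hello&lt;/script&gt; '
          '&amp;lt;script&amp;gt;hi&amp;lt;/script&amp;gt;.</content>')
AFTER = ('<content type="html">&lt;p data-sourcepos="1:1-1:91" dir="auto"&gt;'
         'Description with &amp;lt;script&amp;gt;hi&amp;lt;/script&amp;gt;.'
         '&lt;/p&gt;</content>')


class TagCollector(HTMLParser):
    """Records the name of every start tag the HTML parser encounters."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def live_tags(content_xml):
    """Return the tags a reader would treat as real HTML in a type="html" node."""
    payload = ET.fromstring(content_xml).text  # XML parsing unescapes one level
    collector = TagCollector()
    collector.feed(payload)  # the payload is then interpreted as HTML
    collector.close()
    return collector.tags


print(live_tags(BEFORE))  # ['marquee', 'script']
print(live_tags(AFTER))   # ['p']
```

In both cases the escaped `&lt;script&gt;hi&lt;/script&gt;` stays text; only the Before payload carries tags that the author of the description controls.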

Let's look at how these render differently in an actual RSS reader (Thunderbird).
| Before | After |
|---|---|
| ![]() | ![]() |
- The Before side happily renders the `<marquee>`: this input is entirely user-controlled. It's animating.
- The After side does not: this is rendered the same as it is in GitLab.
- Note that the title now matches the rendering of the title in GitLab. When Use regular non-Markdown single line pipeline f... (!224839 - merged) is merged, this will be fixed here too.
How to set up and validate locally
- Set up an MR in a project on your GDK with the same content I have above.
- Go to the merge requests feed for that project.
- Note that the `content` tag contains `&lt;marquee&gt;`: this means that the content of the `content` tag is the text `<marquee>`, which in turn will be interpreted as HTML and rendered.
- Check this branch out.
- Refresh the feed.
- Note the marquee is gone completely from `content`; it got sanitised out, as the field is now properly rendered.
- Note that it's still in the `description` tag, but represented as text: the content of the XML tag is `&amp;lt;marquee&amp;gt;…`, which represents the HTML `&lt;marquee&gt;`, which is in turn the text `<marquee>`, and not a tag. This behaviour is unchanged in this MR: we truncate the raw, unrendered `description` to 240 characters for the `<description>` tag in Atom feeds, and since we don't declare `type="html"`, it's always treated as plain text. This is fine, and we don't really want to change it; truncating HTML is not easy.
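As a rough illustration of why truncating rendered HTML is harder than truncating raw text, here is a minimal sketch in Python (standard library only). The `OpenTagTracker` and `unclosed_after_cut` names and the sample fragment are hypothetical, and the cut length is kept small for readability rather than using the real 240-character limit. It cuts an HTML fragment at a fixed length and reports which elements are left open.

```python
from html.parser import HTMLParser


class OpenTagTracker(HTMLParser):
    """Tracks which elements are still open after parsing a fragment."""

    def __init__(self):
        super().__init__()
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Only pop when the end tag matches the innermost open element.
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()


def unclosed_after_cut(html_fragment, limit):
    """Cut the fragment at a fixed length and list the elements left open."""
    tracker = OpenTagTracker()
    tracker.feed(html_fragment[:limit])
    tracker.close()
    return tracker.open_tags


# Hypothetical rendered fragment, for illustration only.
rendered = '<p dir="auto">Some <strong>bold</strong> text</p>'

print(unclosed_after_cut(rendered, len(rendered)))  # []
print(unclosed_after_cut(rendered, 31))             # ['p', 'strong']
```

A fixed-length cut of plain text has no such structure to break, which is why truncating the raw field and treating it as text is the simpler option.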
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.






