Don't redact Markdown documents that don't contain private information

Currently our Markdown pipeline more or less works as follows:

  1. Render the Markdown to HTML, store the result in a database column
  2. The next time we need this data, load the cached HTML instead of the Markdown
  3. Parse the HTML string into an HTML document
  4. Redact the document
  5. Serialize the document back to a String

In most cases a document doesn't contain anything that has to be redacted (e.g. a link to a private issue). If we stored a boolean (e.g. has_private_references) for every field, we could change the setup to the following:

  1. Render Markdown to HTML, cache it
  2. On the next request, if has_private_references is false, display the HTML string as-is, without parsing
  3. If it's true, parse and redact
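
The proposed flow can be sketched roughly as follows. This is only an illustration in Python, not our actual pipeline code: `CachedField`, `render_and_cache`, `redact`, `display` and the `PRIVATE_ISSUES` set are all hypothetical names, the "renderer" only turns `#123` into an issue link, and a regex substitution stands in for the real parse/redact/serialize steps.

```python
import re
from dataclasses import dataclass

# Hypothetical: IDs of issues that are private at render time.
PRIVATE_ISSUES = {42}

# Matches the issue links produced by the stand-in renderer below.
ISSUE_LINK = re.compile(r'<a href="/issues/(\d+)">(.*?)</a>')


@dataclass
class CachedField:
    """What would be stored per Markdown field: the HTML plus the flag."""
    html: str
    has_private_references: bool


def render_and_cache(markdown: str) -> CachedField:
    """Step 1: render to HTML and record whether any reference is private."""
    referenced = [int(n) for n in re.findall(r"#(\d+)", markdown)]
    html = re.sub(r"#(\d+)", r'<a href="/issues/\1">#\1</a>', markdown)
    flag = any(n in PRIVATE_ISSUES for n in referenced)
    return CachedField(html=html, has_private_references=flag)


def redact(html: str, can_read) -> str:
    """Step 3 (slow path): replace links the viewer may not see with plain text."""
    def replace_link(match):
        issue_id = int(match.group(1))
        return match.group(0) if can_read(issue_id) else match.group(2)
    return ISSUE_LINK.sub(replace_link, html)


def display(field: CachedField, can_read) -> str:
    """Step 2: skip parsing and redaction entirely when the flag is false."""
    if not field.has_private_references:
        return field.html
    return redact(field.html, can_read)
```

With this sketch, `display(render_and_cache("See #7"), can_read)` returns the cached string without ever calling `redact`, while a field that mentions `#42` still goes through the per-viewer check.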

This could cut down response times for most issues/merge requests by quite a bit, since the parse, redact and serialize steps would be skipped. One caveat: if you refer to a public resource (e.g. an issue) that is later made private, the link will stay visible, because the cached flag would still be false. However, since the resource was public to begin with, I don't think this is a very big issue.

cc @DouweM @rspeicher
