Generating of merge request (diff) comments based on output of CI builds

Problem

Currently GitLab uses Danger as an addition to our various linters and test suite. Danger is mostly used to remind authors to take certain steps based on the changes, such as adding a changelog entry.

A problem that Danger suffers from, or really any other solution that uses the API to "push" data into an MR, is that these solutions won't work for forks. To communicate with the API we need a token, and we can expose these in one of two ways:

Using secret variables, meaning they are not available for forks.
Including them directly in the CI YAML. This means they can (and are guaranteed to) be abused.

A third option would be to come up with some kind of complicated system where CI build tokens can be used, but only to submit comments on MRs. This is such a specific use case that I don't think this would be very useful.

There are also some problems specific to Danger, such as testing being painful, and it integrating less well with editors (due to it using its own file names no editor recognises by default).

Proposal

Instead of trying to make Danger work for forks, I propose a more generic solution. Whenever a build runs and produces an artifact (e.g. comments.json), GitLab will process this artifacts file. The contents of this file will then be used to produce merge request (diff) comments, if there is an associated MR for the build/pipeline. This would be similar to other features such as SAST.

The file would just be JSON, containing an array of objects in (roughly) the following format:

[
  {
    "type": "note", 
    "note": "This is a regular note",
    "discussion": false
  },
  {
    "type": "note", 
    "note": "This is a discussion note", 
    "discussion": true
  },
  {
    "type": "diff_note",
    "note": "This is a note on a diff",
    "file": "app/models/user.rb",
    "line": 14
  }
]

Supporting discussion notes is deliberate. Combined with requiring all discussions to be resolved before a merge, this allows us to create what essentially are MR todos. It also allows developers to discuss the comment, without having to worry about unrelated comments getting interleaved.

The user associated with these notes would be a built-in system user, much like the ghost user we already have. This system user would be created using a migration for existing installations, or on the fly when it is used for the first time, again similar to the ghost user. The name would be "System user", and the avatar would be the GitLab logo. Since you can't manage the user profile (except using a Rails console), this probably requires some sort of if user.system? check whenever we generate avatars.

We can't use real users, as letting people configure this could lead to abuse (e.g. I could set Linus Torvalds as the user). Building some kind of confirmation system for this would be complex.

Parsing of the comments file is ideally done asynchronously using Sidekiq. The job for this would just be given the build ID, and maybe the MR ID (if we can't infer this based on the build). This allows us to process large files without blocking CI.

With this setup, we'd replace Danger with one or more CI builds that perform the work Danger currently does. This allows us to run it for forks, and automatically fail the builds if certain requirements are not met (e.g. a changelog is missing), making it a little bit harder to ignore these failures. It also allows customers to use this, with any programming language of choice.

Drawbacks

We have to essentially reinvent part of Danger, then replace our Danger integration with this new setup.

Related resources

Edited Sep 19, 2018 by Yorick Peterse