Create a changelog feature in GitLab and use it for GitLab Releases
As part of [the Q4 FY21 OKR](https://gitlab.com/gitlab-com/gl-infra/mstaff/-/issues/19), we will be focusing on implementing changelog generation in GitLab itself. This epic will track the work necessary as part of this OKR, as well as provide an overview of what we have today and where we want to move to.

Related epic: https://gitlab.com/groups/gitlab-org/-/epics/4983

<details>
<summary>Table of contents</summary>

[[_TOC_]]

</details>

* :white_check_mark: Completed
* DRI: @yorickpeterse
* [Issue board](https://gitlab.com/gitlab-com/gl-infra/delivery/-/boards/2189805)

## The problem

Many projects need to record the changes made between releases, in a way more detailed than a release announcement blog post. The approach taken by projects varies, but tends to be either completely manual or semi-automated.

As a release manager, generating a changelog involves multiple steps. In the case of GitLab, this is automated using custom tooling that others can't use. If done manually, release managers may not have the information necessary to produce an accurate changelog. Even if they do, this process is time-consuming and prone to error.

As a developer, adding changelogs involves a few steps that can be overlooked easily. Different projects using different workflows also make it more difficult to contribute changes, especially for beginners.

As a user, the different approaches lead to different output formats for changelogs. This requires that the user familiarises themselves with these different formats. Some formats may not include all the information the user wants.
Users may wish to use the changelogs to:

* See if there are any new features worth trying out
* Find out if that one bug they have been dealing with for 6 months has been fixed
* See how many performance improvements have been made that perhaps aren't highlighted in the changelog
* Use the changelog to build a release post themselves, perhaps for some social media platform

In this epic we propose building a solution into GitLab. This solution makes it easier to:

1. Contribute changelog entries as a developer
1. Generate changelogs as a release manager
1. Consume changelogs as a user

By building this into GitLab, all users of GitLab can benefit from this functionality. In addition, the Delivery team no longer needs to maintain its own custom changelog generation code.

## Proposal

We will move to a setup where changelog entries are generated based on commit titles. The output will remain a changelog file. Commits are excluded by default, and can be included by adding a tag to the commit message. Commits can further be classified as a feature, bug, etc., again by adding a tag to the commit message.

To make all this easier, we'll also extend GitLab to support the following:

* Editing of commit messages from a merge request
* Adding these tags straight from the UI

We'll need help from Gitaly to add an API of sorts to update the messages of existing commits. We'll need help from frontend and UX to build the UI for editing commit messages when viewing merge requests.

We will likely also need to extend some of our review tooling and process, in order to make this process as pain-free as we possibly can. For example, we could run [Vale](https://github.com/errata-ai/vale) or [gitlint](https://github.com/jorisroovers/gitlint) against commit messages to help developers with the writing process.

## Technical details

To enrich changelog output, such as by marking a change as EE specific or marking it as a "feature", we will use [Git trailers](https://git-scm.com/docs/git-interpret-trailers).
Git trailers are built into Git, making it easier to interact with and manage this data as a developer. Since these tags are usually placed at the end of a commit body, they don't reduce the number of characters one can fit in a subject line (assuming one follows the 50 character rule).

### Opt-in for changelogs

Commits are not included by default, as not every commit warrants a changelog entry. To include a commit in the changelog, one would add the `Changelog: true` tag to the commit message. This can be added manually, or using `git interpret-trailers` (`git commit` doesn't support this at the moment per https://gitlab.com/gitlab-org/git/-/issues/52).

### Commit metadata

To categorise commits as features, bugs, etc., one would add a `Type:` tag. The exact values possible and the names to use in changelog files can be specified in a configuration file.

As an example, this is what a feature commit may look like:

```gitcommit
Expose creation/update times for issue links

The issue links API now exposes the fields created_at and updated_at for
each issue link. This allows clients to determine when an issue link is
created or updated.

See https://gitlab.com/gitlab-org/gitlab/-/issues/283948 and
https://gitlab.com/gitlab-com/gl-infra/delivery/-/issues/1250 for more
information.

Changelog: true
Type: added
```

To mark a change as an EE-only change, one would add the `EE: true` tag.

Linking to merge requests would be done by adding the `Merge-request: X` tag, with X being either the full URL of a merge request, its ID, or the short reference (`gitlab-org/gitlab!48051` for example). Upon generating the changelog, GitLab will resolve this to a full URL. If the tag is left out but the MR can be derived from the commit, GitLab may include the merge request (this depends on how expensive it is to get this information).

This metadata could also be used to automatically add labels to merge requests, but this won't be part of our first iteration.
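
To make the trailer-based flow more concrete, here is a minimal Python sketch of how opted-in commits could be grouped into changelog sections. It is only an illustration, not the planned implementation: the trailer parsing is a simplification of what `git interpret-trailers` does, and the names `SECTION_TITLES`, `parse_trailers` and `changelog_sections`, the `(EE)` marker, and the "Other" fallback section are all assumptions made for this example.

```python
import re
from collections import defaultdict

# Hypothetical mapping from `Type:` values to section titles. In the
# proposal, the real values would come from the configuration file.
SECTION_TITLES = {"added": "Added", "fixed": "Fixed", "changed": "Changed"}

# A trailer line looks like `Key: value`.
TRAILER = re.compile(r"^([A-Za-z0-9-]+):\s*(.+)$")


def parse_trailers(message):
    """Return the trailers found in the last paragraph of a commit message.

    Simplification: the last paragraph only counts as a trailer block if
    every line in it looks like `Key: value`. Keys are lowercased.
    """
    paragraphs = [p for p in message.strip().split("\n\n") if p.strip()]
    if len(paragraphs) < 2:
        return {}  # a subject line alone cannot carry trailers
    trailers = {}
    for line in paragraphs[-1].splitlines():
        match = TRAILER.match(line.strip())
        if match is None:
            return {}
        trailers[match.group(1).lower()] = match.group(2).strip()
    return trailers


def changelog_sections(messages):
    """Group the subject lines of opted-in commits by section title."""
    sections = defaultdict(list)
    for message in messages:
        trailers = parse_trailers(message)
        if trailers.get("changelog", "").lower() != "true":
            continue  # commits are excluded unless they opt in
        subject = message.strip().splitlines()[0]
        if trailers.get("ee", "").lower() == "true":
            subject += " (EE)"  # hypothetical way to flag EE-only changes
        kind = trailers.get("type", "")
        # Unknown or missing types fall back to "Other" (an assumption).
        sections[SECTION_TITLES.get(kind, "Other")].append(subject)
    return dict(sections)


EXAMPLE = """Expose creation/update times for issue links

The issue links API now exposes created_at and updated_at.

Changelog: true
Type: added
"""

print(changelog_sections([EXAMPLE, "Fix typo in README"]))
```

With these two inputs, only the first commit ends up in the result, under the "Added" section. The second commit has no `Changelog: true` trailer and is skipped, which mirrors the opt-in behaviour described above.
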
### API interface

The generation process is done in a synchronous API call. We won't be deferring anything to Sidekiq for this initial iteration.

### Configuration

Configuring this process is done using a YAML configuration file, located in the repository at `.gitlab/changelog.yml`. This file can specify data such as the following:

* What `Type:` values can be used, and their human-readable names (used as sections in the changelog)
* Where to store the generated changelog file (`CHANGELOG.md` by default)
* Whatever we need to mark changes as EE-only (not sure about this yet, this will require some additional thinking)

In the future we may support additional options, such as changing the output format.

## Success criteria

1. All GitLab projects released using Release Tools use the changelogs feature, provided they need to produce a changelog.
1. The changelog feature is adopted by the community.

The first one is obvious, as we want to use the feature as part of our release process. The second one is a little more difficult to measure, and determining the success here will largely depend on community feedback.

## Future plans

In the future we may add additional input sources, such as merge request titles or changelog entry files. What inputs exactly we add will depend on feedback we get from users of the changelog feature.

In addition, we may support different output formats, such as the GNU changelog format.

## Other approaches

We considered other approaches, such as using changelog files or merge request titles as input. These approaches either introduce considerable problems (e.g. not supporting our security releases workflow), or don't end up improving the developer experience enough to justify the effort (such as using changelog entry files as input).

### Merge request titles as input

An alternative to the above proposal is to use merge request titles as input, instead of commit titles.
When we want to generate a changelog, we provide the ref of the last release tag, and the ref we will tag for the new release. GitLab takes this range of commits, then determines what the merge requests are that those commits originated from. Optionally, it reduces the list of merge requests to those deployed to a certain environment (production in our case).

Using these merge requests, we use their titles to add changelog entries. Within the changelog, entries are grouped based on the presence of certain merge request labels.

Using this approach, we don't need to change the way we write commit messages; instead we need to change how we write merge request titles.

This approach introduces a few challenges:

1. Security releases take place on a private mirror, thus the above process is only aware of security merge requests. If we ever want to include regular merge requests (which is rare, but has happened), they won't make it into the changelog.
1. Each merge request maps to a single changelog entry. If an MR introduces multiple commits that warrant their own changelog entries, the MR author has no choice but to create one merge request for every such commit. This can complicate both the developer and reviewer workflow.
1. We may not always be able to determine what the source merge request is of a commit. This can happen if the commit SHA changes (e.g. after cherry picking it), or if it's rebased/rewritten in some other way.
1. Not all projects use merge requests (as often as we do), meaning they likely wouldn't be able to use this.

In addition to these challenges, using merge request titles wouldn't improve the quality of our commit messages. This means that they don't become more useful when debugging something.

Because of this, we believe that starting with commit titles as input allows us to achieve the best results.
Support for merge requests is something we could add later if desired, building on the foundation necessary for using commit titles as changelog input.