The golden standard of a GraphQL API is not to version (or need to version) your API, but there are some cases where breaking changes may need to be made. One example is removing (backwards-compatible) deprecated parts of the schema.
GitLab currently can not deliver breaking changes to its GraphQL API without breaking integrations without warning.
We'll announce upcoming breaking changes at least three months before making changes to the GraphQL schema, to give integrators time to make the necessary adjustments. Changes go into effect on the first day of a quarter (January 1st, April 1st, July 1st, or October 1st). For example, if we announce a change on January 15th, it will be made on July 1st.
Seeding this issue with my thoughts at the moment:
Announcements
Simpler
Probably a good place to start from
More prone to breaking integrations if people can't update their apps in time
Puts more onus on integrators
Unmaintained integrations will die over time
Would require maintaining a branch of the deprecations, and ideally deploying that to somewhere where the schema can be previewed against production data and integrations tested (in the case of integrations with GitLab.com)
I'm not sure where the announcements would take place and uncertain who else at GitLab we would need in order to help with this
Versioning
Is opt-in for integrators, and it's clear that they are ready for the changes
Would allow older apps to work after the change if the apps are not updated in time
Maintaining separately versioned schemas sounds complicated, especially as the number of versions grow. We'd probably have to separate schemas entirely and backport security patches etc.
@.luke I think announcements are fine in general, but they don't handle this case very well:
Addressing inconsistencies where deprecation isn't possible
If the incompatibility we're choosing to introduce is such that there's no way for the clients to gracefully migrate, I think we're better off offering two versions of the API. Otherwise, the migration window is effectively 0 - people can't really start working on preparing for the new schema until it's released.
In general, I'd like all our changes to allow a window where both old and new queries work in production. How long that period should be is up for debate, of course, but if we're committed to making breaking changes without it, I think it's reasonable to provide two versions of the graphql API (just as we did for v3 and v4 REST API) for that time.
@.luke@nick.thomas Up to now, we've been deprecating fields within the same schema. Would it be possible to stick to that (for most cases). Though we should set a deadline (issue?) in which we actually remove the deprecated fields.
From the example in &1838 (comment 212602089), we can't change the type of a field like that without breaking the clients. But could we still use deprecation coupled with announcements?
What do we think about introducing a new field for the updated value.
Initial state
field:state,IssueStateEnum
Updated Enum type
field:state,Deprecated::IssueStateEnum,deprecated: "Please use `issue_state`"field:issue_state,IssueStateEnum
Remove old implementation
field:issue_state,IssueStateEnum
Optionally we could also update the redundant naming here:
field:issue_state,IssueStateEnum,deprecated: "Please use state"field:state,IssueStateEnum
Looking at it, it looks somewhat cumbersome, but I don't think it's worse than maintaining a copy of the entire schema and having versioned endpoints.
@reprazent what if we want to end up with state having the new type? I'm not sure that we want to have fields named things like issue_state forever just because a field with a subtly different type got there first. But then the alternative seems to be that we do the same dance twice: state -> issue_state -> state. (The second step is simpler, though, because it would just be an alias that we intend to remove.)
what if we want to end up with state having the new type? I'm not sure that we want to have fields named things like issue_state forever just because a field with a subtly different type got there first. But then the alternative seems to be that we do the same dance twice: state -> issue_state -> state. (The second step is simpler, though, because it would just be an alias that we intend to remove.)
@smcgivern Yep, that's what I had in mind, sadly that doubles our turnaround time, and consumers need to adjust their implementation twice .
@reprazent yes, I'm fine with that kind of deprecation :). Both fields work for the deprecation period, and it gives users a period in which they can seamlessly migrate to the new field.
I'm not sure I'm qualified to give the best input on strategy, but for timings, our natural cadences are:
A monthly release.
An annual major release.
However, 2 isn't particularly useful for users of GitLab.com, because if we tied removals to our annual release, people would have between one and eleven months to address the deprecation - potentially even less, considering auto-deploy.
So I think the simplest strategy for us to follow correctly, and for API users to be able to understand, is to set a minimum N month window where we support the old field. If we started with three months, does that sound reasonable? We would need to make sure we take care of the nuances of GitLab.com deployments in this documentation, too.
I acknowledge that this is slightly worse for API users than GitHub's schedule, because we could potentially remove fields in every monthly release, as opposed to just four times a year in their case. But our internal processes are different to theirs, and this feels like a more natural fit. If it turns out to be a problem, we can always change our approach.
@reprazent@smcgivern We could make it a requirement that we mention in our deprecation_reason strings what milestone that part of the schema will be up for removal in.
With a cop to ensure that our deprecation reasons end all end with these messages.
To use an existing deprecated field in our schema as an example:
field:designs,::Types::DesignManagement::DesignCollectionType,null: true,method: :design_collection,deprecation_reason: 'use design_collection. This field will be removed in 14.0'
If we started with three months, does that sound reasonable?
@smcgivern This seems a bit short to me. Perhaps six months instead? In the scenario of the integration being an app, the maintainer needs to become aware of the deprecation, make the changes, and have that update go out to all users. It's possible for this to take place in three months, but with six months it decreases the time pressure!
It looks like we waited two major releases before removing REST API v3. Two years is a long time, but at least it follows the semantics of breaking changes on major versions only.
@ntepluhina do you have any insight or opinion on how long a deprecated part of a GraphQL schema should be supported for before being removed? Do you know of any other GraphQL API consumers you could ask their opinion of also? How would they expect to know about deprecated schema changes?
@.luke I'm OK with any reasonable-ish window. I think six months is probably on the higher end of what I'd like (at least to start with, before we implement this), but I'm perfectly happy with using that as our baseline
We could make it a requirement that we mention in our deprecation_reason strings what milestone that part of the schema will be up for removal in.
With a cop to ensure that our deprecation reasons end all end with these messages.
@.luke That sounds excellent, I was also thinking about creating and linking an issue from the deprecation_reason perhaps we could render that as a link in our docs?
@.luke@smcgivern For newly introduced features, or fields that we know are barely used, we could allow shorter timeframes. Should we specify a minimum (3 releases) and a maximum (6 releases).
I don't think it's reasonable to possibly remove deprecated fields on every release. I think that would put an undue burden on customers and integrators. They would constantly have to keep on top of it every month.
The benefit of doing it on major release boundaries is that you know, on 13.0, I've got to have my integration upgraded. And it may be easier to maintain backward compatibility - running against a 12.x self-hosted system I'm going to get these fields, running against a 13.x system these others are valid.
But once a year is too slow. Quarterly sounds reasonable, and where if you make a change now, it doesn't get removed until the end of the next quarter. So I agree with @reprazent, minimum of 3 releases and a maximum 6 releases, on a well defined schedule.
I like @.luke's cop idea. I also think it would be cool to be able to do a query and get back a list of the deprecations scheduled, pulled from our schema...
But once a year is too slow. Quarterly sounds reasonable, and where if you make a change now, it doesn't get removed until the end of the next quarter. So I agree with @reprazent, minimum of 3 releases and a maximum 6 releases, on a well defined schedule.
@digitalmoksha By well defined schedule, do you mean having a release that can remove deprecated fields every 3 releases and defining these in advance?
For example, every third release from a major release: 12.0, 12.3, 12.6, 12.9... Then the developers decide which of those releases (minimum 3 ahead) a specific deprecated field should be removed based on usage and when it was introduced.
For example, every third release from a major release: 12.0, 12.3, 12.6, 12.9... Then the developers decide which of those releases (minimum 3 ahead) a specific deprecated field should be removed based on usage and when it was introduced.
OK. I'm not in love with that from a project management perspective, because I think that it will take a really long time to get through a double rename (#32292 (comment 223220882)), but I appreciate that it's much friendlier for API consumers.
If we're all in favour of that, then I guess we can go ahead and create an MR describing this?
(Note that clients wanting to support GitLab.com and other instances may simply have to handle both cases, as they can't guarantee upgrades will happen on any particular schedule.)
every third release from a major release: 12.0, 12.3, 12.6, 12.9...
Exactly. I agree, a double rename takes a while, which is a bummer. It might help a little if we make sure our deprecation notice mention that we're doing the rename, as opposed to eventually removing the field outright.
I worry about those who do a lot of scripting using our API. It's a terrible feeling to have created a script to automate a process that either you personally or your business relies on, and then have it break when you need it, and you have to spend time trying to figure out what's gone wrong and what the fix is.
I think we're a pretty big consumer of our own API (in regards to tooling). @yorickpeterse@marin wdyt about a schedule like this?
I would say that generally three months is more than enough time for users to adjust their scripts. If three months is not enough they will likely need many more months, and waiting for that will slow us down. Also no matter how long we wait, somebody will always be too late.
In terms of process, the only way you can do this without versioning is introducing new fields that eventually replace old ones; as discussed above. This can be annoying at times, but it's the only backwards compatible way of doing this.
I worry about those who do a lot of scripting using our API. It's a terrible feeling to have created a script to automate a process that either you personally or your business relies on, and then have it break when you need it, and you have to spend time trying to figure out what's gone wrong and what the fix is.
Could we allow integrators to add themselves to an announcement list? Some GraphQL client libraries will print deprecation warnings to developers, but there are plenty of scenarios where a developer still wouldn't see those warnings, or may not be using those libraries. For anyone seriously integrating with our GraphQL API I'd imagine they would really appreciate being able to have an announcement pushed to them if they've signalled they want to know breaking changes.
What do you think about this? We could use an issue (or an epic, not sure which is better), as the push communication mechanism? We call it something like: "News: GraphQL Developer Updates". We lock the issue so that we're the only ones that can add updates to it (we don't want comments flooding users who just want to stay up to date), and anyone can add themselves to be notified. It pushes to people when we update it, supports linking to issues and MRs directly, etc We can publish the link where ever we need to for discoverability, and it keeps us from having to maintain another newsletter or other mechanism.
Maybe we already do something like this, though nothing springs to mind. Anyway, just a thought for something low friction...
I like the idea of using something that already exists, like issues or epics. If it's possible to use epics in this way then that would be ideal, as we'll probably be making issues for quarterly milestones to track which deprecated parts of the schema to remove. If people can watch an Epic that's locked (I'm not sure if that's possible) then it would be a natural fit. We could announce using it, but it would also gather everything together.
Having breaking changes every quarter breaks semver every quarter. I understand that we are itching to not drag this on, but inducing breaking changes outside of major releases will cause a lot of pain. I do not believe we should be even considering this, and we should keep in line with the current policy because a lot of thought has went into it gitlab-com/Product#50 (closed) .
That second link is to a cogent case against strict semantic versioning. Given that the semver spec uses RFC 2119 terms (MUST, SHOULD, etc.) it's not clear to me what this actually means in practice - and before I read that page, I thought I knew how our versioning worked.
From the release side having major or minor release does not matter much, it feels like purely marketing.
So I'm not very clear on what your position is.
As I said above, if we only did 'breaking changes' (and Jeremy Ashkenas's Gist linked in the docs I quoted is a great argument against that as any kind of useful term) in major releases, we actually make it worse for consumers: we can introduce a breaking change at any point up to the major release, instead of having a predictable period of time.
Major - For significant changes, or when any backward-incompatible changes are introduced to the public API.
That second link is to a cogent case against strict semantic versioning.
I do not recall why 3f57022f added it, but the discussion in gitlab-com/Product#50 (closed) should point out to where we are at right now with our policy. I do not think I ever even acknowledged that link existed. I'll submit an MR to remove it as this is definitely no longer our stance.
I said that in response to timing of our major releases. Whether we do a major release May 22nd, every month or we decide to cram breaking changes into patch releases matters little to the actual process of releasing.
So I'm not very clear on what your position is.
My position does not matter as I am following what our agreed definition in the table states.
If you want to change that policy, start a conversation with Product to get that change implemented and we will release according to that policy.
@smcgivern If the API is part of our application, and our API versioning assumes compatibility with certain releases, bumping the API version means a breaking change. Breaking change is managed by the versioning policy. I am sorry if I am causing confusion.
@marin thanks. Sorry it took a while to respond, I was trying to capture this issue as best I could for Product. (In the end, I just opted for quoting what Luke already said.)
@smcgivern I like the part where we have a minimum of 6 months to allow users time to adapt. I also like the strict scheduling of removal of deprecations.
If the yearly major version bump is set in stone, I'd prefer breaking semver over dragging deprecations with us for over a year.
@smcgivern After mulling this over for a week, FWIW I'm going to throw my hat into the "not breaking semver" ring. I feel that maintaining a public API comes with a reasonably great responsibility for minimising the disruption made to people who rely on it. Especially until we have a community of integrators that we can ask for feedback from, erring on the side of semver is probably wise.
@.luke thanks for the update, I think that's also a reasonable position. When would you say a change has to be announced? Semver doesn't care; if we introduced something in 15.11 and break it in 16.0, that is still following semver. But that doesn't work very well if people update frequently, and certainly doesn't work well for GitLab.com, so I think we still need an extra layer somewhere.
@smcgivern It's a really good question, and it's a tough one because we're adding more time before a change can happen. To lay out some options I've made a spreadsheet comparing a >=3 vs >=6 release warning of changes vs the number of releases we'd need to wait before the change could happen if we were to follow semver, and if we were to do twice-yearly breaking changes. (Apologies that my spreadsheet-fu is very poor!).
One thing I'm wondering is if we were to follow semver, would users tolerate a shorter announcement period than what we're proposing in gitlab-com/Product#507 (closed) where the proposal is for a minimum 6-month announcement period, or whether following semver doesn't change the annoucement period length at all.