Parallel Maven uploads lead to 409 Conflict responses
🔥 Problem
In #367356 (comment 1341809603), we discovered that the Maven Repository can receive parallel publications for the exact same package name and version.
This can lead to the same file (i.e. the same filename) being uploaded at the same time. The problem is that Maven clients upload a .sha1 digest after uploading a file.
We have these events:
1. Client A uploads file foobar.txt for package test, version 1.2.3.
2. Client B uploads file foobar.txt (different contents) for package test, version 1.2.3. This is now the most recent foobar.txt file. This is the file that will be considered when checking .sha1 digests.
3. Client B uploads file foobar.txt.sha1. It matches the sha1 of (2.), so the backend replies 200 Ok.
4. Client A uploads file foobar.txt.sha1. This is the sha1 of (1.). It doesn't match the sha1 of (2.), so the backend returns 409 Conflict. 💥
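The four events above can be reproduced with a toy model. This is only a sketch: FakeMavenBackend is a hypothetical in-memory stand-in, not the actual backend code, and it assumes the digest check only ever looks at the most recent upload of a given filename.

```python
import hashlib


class FakeMavenBackend:
    """Toy in-memory stand-in for the backend (hypothetical, for illustration only)."""

    def __init__(self):
        # (package, version, filename) -> list of uploaded contents, oldest first
        self.files = {}

    def upload_file(self, package, version, filename, content):
        self.files.setdefault((package, version, filename), []).append(content)
        return 200

    def upload_sha1(self, package, version, filename, digest):
        # Assumption: only the most recent upload of the file is checked.
        latest = self.files[(package, version, filename)][-1]
        return 200 if hashlib.sha1(latest).hexdigest() == digest else 409


backend = FakeMavenBackend()
key = ("test", "1.2.3", "foobar.txt")
content_a = b"contents from client A"
content_b = b"contents from client B"

backend.upload_file(*key, content_a)  # 1. client A uploads foobar.txt
backend.upload_file(*key, content_b)  # 2. client B uploads foobar.txt
status_b = backend.upload_sha1(*key, hashlib.sha1(content_b).hexdigest())  # 3.
status_a = backend.upload_sha1(*key, hashlib.sha1(content_a).hexdigest())  # 4.

print(status_b, status_a)  # prints "200 409"
```

Swapping events 3 and 4 does not change the outcome: client A's digest is rejected as soon as client B's file has become the most recent one.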
This issue is to discuss if the backend should handle (4.) in a different way or not.
🏟 Additional considerations
- Parallel Maven publications are not common, but I can see it happening with 2 CI pipelines running and publishing the exact same package name + version.
- Maven already has a workaround for this situation: snapshot versions. Snapshot versions add a timestamp-based suffix, so two parallel snapshot uploads can be handled without any problem.
- Is it technically wrong to answer 409 in (4.)? Well, the backend always works with the most recent files. If it receives a sha1 from an older file, then there is something fishy (a parallel upload) about the upload. It could be ok to keep it as it is.
- Measuring impact. At the time of this writing, in the last 24 hours, gitlab.com received:
  - 72 538 successful sha1 uploads (204 Created)
  - 48 conflict errors in sha1 uploads (409 Conflict)
  - 409 Conflict represents about 0.07% of all uploads.
- Please note that the upload of the .sha1 file will fail, but the Maven publication command will still go on and even be successful. In other words, a failing sha1 upload does not interrupt the Maven package publication.
- Given the challenges with the possible solutions (see below), I'm not sure that this issue is worth solving.
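For reference, the conflict rate quoted in the impact measurement can be recomputed from the two counts. This assumes that "all uploads" means the sum of successful and conflicting sha1 uploads over the same 24 hours; the variable names are only illustrative.

```python
successful = 72_538  # successful sha1 uploads in the 24-hour window
conflicts = 48       # sha1 uploads rejected with 409 Conflict

# Assumption: "all uploads" = successful + conflicting sha1 uploads.
rate = conflicts / (successful + conflicts)
print(f"{rate:.2%}")  # prints "0.07%"
```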
🚒 Solution
Things are not simple because Maven clients do not send an identifier that links all uploads of the same package publication. They do send one, but only for snapshot versions.
Here are some leads for the solution:
- Inspect older files to see if the uploaded sha1 exists.
- Use something like a lease so that parallel uploads can't happen.
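As a rough illustration of the first lead, the check could compare the uploaded digest against every stored upload of the file instead of only the latest one. This is a minimal sketch under assumptions: check_sha1 and its return values are hypothetical names, and it assumes the backend still has the contents (or digests) of older uploads available.

```python
import hashlib


def check_sha1(uploads, digest):
    """Classify an uploaded sha1 against all uploads of one filename.

    uploads: contents of every upload of that filename, oldest first
             (hypothetical storage model, not the real backend schema).
    """
    digests = [hashlib.sha1(content).hexdigest() for content in uploads]
    if digests and digests[-1] == digest:
        return "matches_latest"
    if digest in digests:
        return "matches_older"  # likely a parallel upload
    return "no_match"


uploads = [b"contents from client A", b"contents from client B"]
result_a = check_sha1(uploads, hashlib.sha1(uploads[0]).hexdigest())
result_b = check_sha1(uploads, hashlib.sha1(uploads[1]).hexdigest())
print(result_a, result_b)  # prints "matches_older matches_latest"
```

Keeping "matches_older" distinct from "matches_latest" would let the backend decide separately how to answer that case, rather than silently treating it as a normal success.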
The concern I have with the above solutions is that we could have edge cases. Verifying a .sha1 is a useful feature. When the backend returns an error for a .sha1, it indicates a very precise situation: my most recent file doesn't match the signature you sent. This means that we could have an issue with the most recent file.
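The lease lead could look roughly like the sketch below. Everything in it is an assumption for illustration: LeaseTable, the owner argument and the 30 second TTL are hypothetical, and since Maven clients do not send a publication identifier for non-snapshot versions, the backend would have to derive the owner from something else (for example a token or a CI job), which is one of the edge cases to think through.

```python
import time


class LeaseTable:
    """In-memory lease per (package, version); a real backend would need shared storage."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.leases = {}  # (package, version) -> (owner, expires_at)

    def acquire(self, package, version, owner):
        """Return True if owner holds the lease after this call."""
        now = self.clock()
        key = (package, version)
        holder = self.leases.get(key)
        if holder is not None and holder[0] != owner and holder[1] > now:
            return False  # someone else is currently publishing this version
        self.leases[key] = (owner, now + self.ttl)  # take or renew the lease
        return True


# Fake clock so the example does not have to sleep.
now = [0.0]
leases = LeaseTable(ttl_seconds=30.0, clock=lambda: now[0])

a_first = leases.acquire("test", "1.2.3", "client-a")    # lease is free
b_blocked = leases.acquire("test", "1.2.3", "client-b")  # held by client-a
now[0] = 31.0                                            # lease has expired
b_later = leases.acquire("test", "1.2.3", "client-b")
print(a_first, b_blocked, b_later)  # prints "True False True"
```

With such a lease, client B's upload in event (2.) would be refused while client A is still publishing, so the rejection would move from the .sha1 upload to the file upload itself.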