Skip to content

Fix Debian background job

What does this MR do and why?

quoting @10io for #340307 (closed):

Tests on staging showed that background jobs for the debian registry are not working as expected with object storage.

The main takeaway is that the background job code is not compatible with object storage. Next, the current code will not properly handle large files (see this bug)

Here are the details:

  • 🐛 When object storage is enabled, the first background job (that processes the .changes file) is not working. (blocker)
    • This is coming on the actions in the package file object. We update the package reference but we can't simply package_file.update!(package: package). This will break the link between the package_file and the physical file in object storage.

^ Fixed (using ::Packages::UpdatePackageFileService).

  • I see several uses of #use_file. This is known to leave the package_file in an improper state that can't be saved. Basically, if package_file.file.use_file is called, then package_file.save! or package_file.update! can't be called (triggers a validation error).
  • 🔧 Solution: package file updates where the package id or file_name are modified must be handled in a custom way. This is the result of the work on a similar nuget bug. In short, use this. Also use use_open_file which is better than use_file.

^ Fixed (#use_file replaced by #use_open_file).

  • 🐛 When an error occurs during the first background job, the files are left behind in the incoming package. Those files takes physical space in object storage.
    • The situation is the following, when uploading a debian package, multiple files are uploaded and the last upload is the .changes file.
    • All the package files are put in the same package: a versionless package named incoming.
    • The problem with this approach is that if the worker fails, it will delete the .changes file but not the others.

^ not fixed in this MR. Proposal: cleanup old incoming files with a cronjob (see Prune dangling incoming files from #300917 (closed))

  • 🐛 When a package is uploaded and it references a codename not present in the project distributions, the second job is still enqueued.
    • I'm not sure why the background job doesn't simply fail if the codename is not found.

^ Added to #300917 (closed).

  • 🐛 The process changes worker moves the package files in a way that will not support large files. Similar to the nuget bug we got a few weeks ago.
    • 🔧 Solution: same as above, switch to the new service that updates the package files in a way that is optimized for large files
  • 📈 Code cleanup/refactoring. Debian package files are using a custom object storage path generator.
    • This is not necessary and actually can be risky. That change is not compatible with the service that updates the package files.

^ Fixed (using ::Packages::UpdatePackageFileService, and custom object storage path removed).

Edited by Mathieu Parent

Merge request reports