Skip to content

LFS uploader should not overwrite filename

Kamil Trzciński requested to merge lfs-uploader-filename-fix into master

What does this MR do?

LFS uploader should not overwrite filename

It seems that having a LfsUploader#filename has very nasty side effects. When it is set it breaks CarrierWave behavior of using the presence of data in database (nil or not nil) to verify whether this is object has file or it does not have.

Overwriting filename with precomputed content instead of relaying of what is provided by uploader leads to the case that when we create a new object we always save the filename, even when we didn't create the corresponding file on corresponding file storage. This leads to the cases that CarrierWave thinks that this file does actually exist and it should be present. This makes to have disreptancy between DB and FS/OS view as we then don't know whether the file should or should not exist.

The fix is to get rid of buggy behavior, use UploadedFile to pass correctly
the expected filename under which file should be saved and use a model to verify if this matches our expected filename before saving. This makes some fixes not needed, as then we have consistent view between DB and FS/OS whether the file is saved and should
be present.

Ideal example of checking the buggy behavior is the creation of LfsObject:

l=LfsObject.create(oid: '12312315', size: 200)
   (0.6ms)  BEGIN
  LfsObject Exists (0.3ms)  SELECT  1 AS one FROM "lfs_objects" WHERE "lfs_objects"."oid" = '12312315' LIMIT 1
  SQL (0.2ms)  INSERT INTO "lfs_objects" ("oid", "size", "file", "created_at", "updated_at") VALUES ($1, $2, $3, $4, $5) RETURNING "id"  [["oid", "12312315"], ["size", 200], ["file", "2315"], ["created_at", "2018-02-23 10:54:58.553354"], ["updated_at", "2018-02-23 10:54:58.553354"]]
   (1.7ms)  COMMIT
=> #<LfsObject:0x0055d792ebbbc0
 id: 26,
 oid: "12312315",
 size: 200,
 created_at: Fri, 23 Feb 2018 10:54:58 UTC +00:00,
 updated_at: Fri, 23 Feb 2018 10:54:58 UTC +00:00,
 file: "2315",
 file_store: nil>

In above case, we did not pass the file, so the file DB column should be nil. Instead it is overwritten and build dynamically from oid. This results in into a problem where when we save the object the model has file_changed?, but actually we never changed the file as we did not pass the object to save it.

The proper behavior is:

l=LfsObject.create(oid: '12312317', size: 200)
   (0.2ms)  BEGIN
  LfsObject Exists (0.3ms)  SELECT  1 AS one FROM "lfs_objects" WHERE "lfs_objects"."oid" = '12312317' LIMIT 1
  SQL (0.2ms)  INSERT INTO "lfs_objects" ("oid", "size", "created_at", "updated_at") VALUES ($1, $2, $3, $4) RETURNING "id"  [["oid", "12312317"], ["size", 200], ["created_at", "2018-02-23 11:03:07.533519"], ["updated_at", "2018-02-23 11:03:07.533519"]]
=> #<LfsObject:0x0055d794dc3270
 id: 28,
 oid: "12312317",
 size: 200,
 created_at: Fri, 23 Feb 2018 11:03:07 UTC +00:00,
 updated_at: Fri, 23 Feb 2018 11:03:07 UTC +00:00,
 file: nil,
 file_store: nil>

Here, we didn't pass the file, so we didn't ask for saving file and the file column in DB is nil.

Does this MR meet the acceptance criteria?

  • Changelog entry added, if necessary
  • Tests added for this feature/bug
  • Review
    • Has been reviewed by Backend
  • Internationalization required/considered

What are the relevant issue numbers?

Fixes https://gitlab.com/gitlab-org/gitlab-ee/issues/4980

Merge request reports