Skip to content
Snippets Groups Projects

Add local project uploads cleanup task

Merged Michael Kozono requested to merge mk/add-local-project-uploads-cleanup-task into master

What does this MR do?

Adds a rake task that can list, or delete move to lost-and-found, orphaned local project upload files.

Are there points in the code the reviewer needs to double check?

Why was this MR needed?

It is necessary for https://gitlab.com/gitlab-org/gitlab-ce/issues/49630 and https://gitlab.com/gitlab-org/gitlab-ee/issues/6012

Screenshots (if relevant)

Example output:

$ sudo gitlab-rake gitlab:cleanup:project_uploads
I, [2018-07-27T12:08:28.671559 #89817]  INFO -- : Looking for orphaned project uploads to clean up. Dry run...
E, [2018-07-27T12:08:28.689869 #89817] ERROR -- : Skipping... Unable to parse project upload path: "/opt/gitlab/embedded/service/gitlab-rails/public/uploads/test.out"
I, [2018-07-27T12:08:28.754259 #89817]  INFO -- : Found correct path! /opt/gitlab/embedded/service/gitlab-rails/public/uploads/qux/foo/bar/89a0f7b0b97008a4a18cedccfdcd93fb/foo.txt
I, [2018-07-27T12:08:28.755624 #89817]  INFO -- : Can be moved to parent: /opt/gitlab/embedded/service/gitlab-rails/public/uploads/foo/bar/89a0f7b0b97008a4a18cedccfdcd93fb/foo.txt
I, [2018-07-27T12:08:28.760257 #89817]  INFO -- : Can be moved to lost and found: /opt/gitlab/embedded/service/gitlab-rails/public/uploads/foo/bar/1dd6f0f7eefd2acc4c2233f89a0f7b0b/image.png
I, [2018-07-27T12:08:28.764470 #89817]  INFO -- : To cleanup these files run this command with DRY_RUN=false

$ sudo gitlab-rake gitlab:cleanup:project_uploads DRY_RUN=false
I, [2018-07-27T12:09:24.944414 #89936]  INFO -- : Looking for orphaned project uploads to clean up...
E, [2018-07-27T12:09:24.962321 #89936] ERROR -- : Skipping... Unable to parse project upload path: "/opt/gitlab/embedded/service/gitlab-rails/public/uploads/test.out"
I, [2018-07-27T12:08:28.754259 #89817]  INFO -- : Found correct path! /opt/gitlab/embedded/service/gitlab-rails/public/uploads/qux/foo/bar/89a0f7b0b97008a4a18cedccfdcd93fb/foo.txt
I, [2018-07-27T12:08:28.755624 #89817]  INFO -- : Moved to parent: /opt/gitlab/embedded/service/gitlab-rails/public/uploads/foo/bar/89a0f7b0b97008a4a18cedccfdcd93fb/foo.txt
I, [2018-07-27T12:08:28.760257 #89817]  INFO -- : Moved to lost and found: /opt/gitlab/embedded/service/gitlab-rails/public/uploads/foo/bar/1dd6f0f7eefd2acc4c2233f89a0f7b0b/image.png

Does this MR meet the acceptance criteria?

What are the relevant issue numbers?

https://gitlab.com/gitlab-org/gitlab-ce/issues/49630 and https://gitlab.com/gitlab-org/gitlab-ee/issues/6012

Edited by Michael Kozono

Merge request reports

Loading
Loading

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
  • Author Maintainer

    @ash.mckenzie Could you please give an initial review?

    (This is working for me locally, but I still have it WIP since I need to add documentation, and I plan to test it on staging.)

    Edited by Michael Kozono
  • Michael Kozono assigned to @ash.mckenzie

    assigned to @ash.mckenzie

  • Michael Kozono changed the description

    changed the description

  • As we're deleting files it's good we have a record to stdout, but am wondering if we should at minimum have a timestamp included.. which then begs the question is it easier to use a logger? It would be even handier to have this written out somewhere too, jusssst in case :)

  • Looking good @mkozono :)

  • assigned to @mkozono

  • Michael Kozono added 1 commit

    added 1 commit

    • 3b80ea17 - Add local project uploads cleanup task

    Compare with previous version

  • Michael Kozono changed the description

    changed the description

  • Michael Kozono added 1 commit

    added 1 commit

    • f9bb53d3 - Add local project uploads cleanup task

    Compare with previous version

  • Author Maintainer

    Ok I addressed some feedback and responded on some, and I've added documentation. I also added a test for skipping an unexpected file path, and I moved the delete logic into the class.

    I used a logger, and now running the rake task outputs to both Rails.logger and stdout. I've updated the screenshot in the description.

    Back to you @ash.mckenzie!

    Edited by Michael Kozono
  • Michael Kozono assigned to @ash.mckenzie

    assigned to @ash.mckenzie

  • Michael Kozono added 1 commit

    added 1 commit

    • 0574e0f0 - Add local project uploads cleanup task

    Compare with previous version

  • Michael Kozono added 1 commit

    added 1 commit

    • dcf6a1d6 - Add local project uploads cleanup task

    Compare with previous version

  • Michael Kozono added 1 commit

    added 1 commit

    • 3e1d9522 - Add local project uploads cleanup task

    Compare with previous version

  • Author Maintainer

    In case you are wondering, the latest squashed changes were:

    • Change variable name from finder to cleaner
    • Change default from remove: true :cold_sweat: to remove: false (this mistake happened because I originally named the argument dry_run)
    • Change logger argument in initialize from required to optional (which was my original intention)

    Also, my staging testing isn't going well.

    @ash.mckenzie I know this is hard to read, but since my Unix knowledge is not strong, I can't explain this weirdness:

    • On GSTG, find took 2 hours, and returned only 39 files
    • On Azure staging, find took 10 mins, and returned only 27 files

    Could you please give it a look?

    # find command hangs
    mkozono@sidekiq-besteffort-01-sv-gstg.c.gitlab-staging-1.internal:~$ sudo /usr/bin/ionice -c Idle find -L /opt/gitlab/embedded/service/gitlab-rails/public/uploads -type f ! \( -path /opt/gitlab/embedded/service/gitlab-rails/public/uploads/-/* -prune \) ! \( -path /opt/gitlab/embedded/service/gitlab-rails/public/uploads/@hashed/* -prune \) ! \( -path /opt/gitlab/embedded/service/gitlab-rails/public/uploads/tmp/* -prune \) -print0
    <waited 1 min>
    ^C
    
    # simpler find command hangs
    mkozono@sidekiq-besteffort-01-sv-gstg.c.gitlab-staging-1.internal:~$ sudo find -L /opt/gitlab/embedded/service/gitlab-rails/public/uploads -type f
    <waited 1 min>
    ^C
    
    <snip>
    
    # Find in subdirectory works
    mkozono@sidekiq-besteffort-01-sv-gstg.c.gitlab-staging-1.internal:~$ sudo /usr/bin/ionice -c Idle find -L /opt/gitlab/embedded/service/gitlab-rails/public/uploads/mkozonotest -type f
    /opt/gitlab/embedded/service/gitlab-rails/public/uploads/mkozonotest/qa-test-gcp-migration/e7f8b7ec9b98c7a274e6bc0f53702f0d/bob.gif
    /opt/gitlab/embedded/service/gitlab-rails/public/uploads/mkozonotest/qa-test-gcp-migration/cc0691eed0191a9fc6cfb70b9f20e9b7/_.jpg
    /opt/gitlab/embedded/service/gitlab-rails/public/uploads/mkozonotest/qa-test-gcp-migration/5cf9a57f599059e6de4d346057831e60/Screen_Shot_2018-06-04_at_09.04.59.png
    /opt/gitlab/embedded/service/gitlab-rails/public/uploads/mkozonotest/qa-test-gcp-migration/9851cafa822ddfbe38d73ca9dc42b505/rails_sample.jpg
    
    <snip>
    
    # Tried find in all uploads again, and waited longer, and received some output
    mkozono@sidekiq-besteffort-01-sv-gstg.c.gitlab-staging-1.internal:~$ sudo /usr/bin/ionice -c Idle find -L /opt/gitlab/embedded/service/gitlab-rails/public/uploads/ -type f
    /opt/gitlab/embedded/service/gitlab-rails/public/uploads/mkozonotest/qa-test-gcp-migration/e7f8b7ec9b98c7a274e6bc0f53702f0d/bob.gif
    /opt/gitlab/embedded/service/gitlab-rails/public/uploads/mkozonotest/qa-test-gcp-migration/cc0691eed0191a9fc6cfb70b9f20e9b7/_.jpg
    /opt/gitlab/embedded/service/gitlab-rails/public/uploads/mkozonotest/qa-test-gcp-migration/5cf9a57f599059e6de4d346057831e60/Screen_Shot_2018-06-04_at_09.04.59.png
    /opt/gitlab/embedded/service/gitlab-rails/public/uploads/mkozonotest/qa-test-gcp-migration/9851cafa822ddfbe38d73ca9dc42b505/rails_sample.jpg
    ^C
    
    <snip>
    
    mkozono@sidekiq-besteffort-01-sv-gstg.c.gitlab-staging-1.internal:~$ sudo gitlab-rails console
    
    <snip>
    
    # Tried my running my code again. Am still waiting for a first batch to process
    irb(main):157:0> cleaner = Gitlab::Cleanup::ProjectUploads.new(logger: Rails.logger)
    => #<Gitlab::Cleanup::ProjectUploads:0x00007f28bcd0ba10 @logger=#<ActiveSupport::Logger:0x00007f28d3cb0b80 @level=1, @progname=nil, @default_formatter=#<Logger::Formatter:0x00007f28d3cb0ae0 @datetime_format=nil>, @formatter=#<ActiveSupport::Logger::SimpleFormatter:0x00007f28d350b510 @datetime_format=nil>, @logdev=#<Logger::LogDevice:0x00007f28d3cb0a90 @shift_period_suffix=nil, @shift_size=nil, @shift_age=nil, @filename=nil, @dev=#<File:/opt/gitlab/embedded/service/gitlab-rails/log/production.log>, @mon_owner=nil, @mon_count=0, @mon_mutex=#<Thread::Mutex:0x00007f28d3cb0a40>>, @local_levels=#<ThreadSafe::Cache:0x00007f28d3cb09c8 @backend={}, @default_proc=nil>>>
    irb(main):158:0> cleaner.run!
    Looking for orphaned project uploads to remove. Dry run...
    find command: "/usr/bin/ionice -c Idle find -L /opt/gitlab/embedded/service/gitlab-rails/public/uploads -type f ! ( -path /opt/gitlab/embedded/service/gitlab-rails/public/uploads/-/* -prune ) ! ( -path /opt/gitlab/embedded/service/gitlab-rails/public/uploads/@hashed/* -prune ) ! ( -path /opt/gitlab/embedded/service/gitlab-rails/public/uploads/tmp/* -prune ) -print0"
    
    ...< still waiting here, at least 30 mins >...
    
    # Update, it completed after 2 hours
    Processing batch of 39 project upload file paths, starting with /opt/gitlab/embedded/service/gitlab-rails/public/uploads/mkozonotest/qa-test-gcp-migration/e7f8b7ec9b98c7a274e6bc0f53702f0d/bob.gif
    Can be removed: /opt/gitlab/embedded/service/gitlab-rails/public/uploads/mkozonotest/qa-test-gcp-migration/e7f8b7ec9b98c7a274e6bc0f53702f0d/bob.gif
    Can be removed: /opt/gitlab/embedded/service/gitlab-rails/public/uploads/mkozonotest/qa-test-gcp-migration/5cf9a57f599059e6de4d346057831e60/Screen_Shot_2018-06-04_at_09.04.59.png
    Can be removed: /opt/gitlab/embedded/service/gitlab-rails/public/uploads/gitlab-qa-sandbox/qa-test-15-11-2017-16-15-05/awesome-project-95d0ad5043bfeb85/f7c6e8251da94493dfddf4e0e57dd70f/IMG_0300.JPG
    <snip 16 more lines of orphans to keep the hashes secret> 
    => nil

    I started it as well on Azure staging and it returned in about 5 or 10 minutes with unexpected results:

    [ stg ] production> cleaner = Gitlab::Cleanup::ProjectUploads.new
    => #<Gitlab::Cleanup::ProjectUploads:0x00007f08911d3cb0 @logger=#<ActiveSupport::Logger:0x00007f0893883658 @level=1, @progname=nil, @default_formatter=#<Logger::Formatter:0x00007f08938835b8 @datetime_format=nil>, @formatter=#<ActiveSupport::Logger::SimpleFormatter:0x00007f08930ece08 @datetime_format=nil>, @logdev=#<Logger::LogDevice:0x00007f0893883568 @shift_period_suffix=nil, @shift_size=nil, @shift_age=nil, @filename=nil, @dev=#<File:/opt/gitlab/embedded/service/gitlab-rails/log/production.log>, @mon_owner=nil, @mon_count=0, @mon_mutex=#<Thread::Mutex:0x00007f0893883518>>, @local_levels=#<ThreadSafe::Cache:0x00007f08938834a0 @backend={}, @default_proc=nil>>>
    [ stg ] production> cleaner.run!; Time.now
    Looking for orphaned project uploads to remove. Dry run...
    find command: "/usr/bin/ionice -c Idle find -L /opt/gitlab/embedded/service/gitlab-rails/public/uploads -type f ! ( -path /opt/gitlab/embedded/service/gitlab-rails/public/uploads/-/* -prune ) ! ( -path /opt/gitlab/embedded/service/gitlab-rails/public/uploads/@hashed/* -prune ) ! ( -path /opt/gitlab/embedded/service/gitlab-rails/public/uploads/tmp/* -prune ) -print0"
    Processing batch of 27 project upload file paths, starting with /opt/gitlab/embedded/service/gitlab-rails/public/uploads/system.bak/user/avatar/171554/avatar.png
    Host 10.129.1.102 came back online
    Can be removed: /opt/gitlab/embedded/service/gitlab-rails/public/uploads/system.bak/user/avatar/171554/avatar.png
    Can be removed: /opt/gitlab/embedded/service/gitlab-rails/public/uploads/system.bak/project/avatar/2689262/ghipster.png
    Can be removed: /opt/gitlab/embedded/service/gitlab-rails/public/uploads/system.bak/group/avatar/1317779/Screen_Shot_2017-06-01_at_13.13.10.png
    Can be removed: /opt/gitlab/embedded/service/gitlab-rails/public/uploads/gitlab-org/gitlab-ci-multi-runner/7b41c4a74b36c1f45ddae6fd447cd328/image.png
    Can be removed: /opt/gitlab/embedded/service/gitlab-rails/public/uploads/gitlab-org/gitlab-ci-multi-runner/a17454b9f21b35a0115308a75f2dd1f1/Screenshot_from_2017-09-21_12_01_45.png
    Can be removed: /opt/gitlab/embedded/service/gitlab-rails/public/uploads/gitlab-org/gitlab-ci-multi-runner/604873327c0f30d6998904fdaa64cb30/Screenshot_from_2017-09-21_12_17_55.png
    => 2018-07-28 00:25:25 +0000
    [ stg ] production> 

    I'd like to test the command on production, since I've found that it sometimes acts differently when it comes to this stuff (especially involving the filesystem, I don't know why). But stop me if that's a bad idea. A quick google said that a huge directory can make ls and find hang, but if we are able to cancel it, could it do any harm?

    Edited by Michael Kozono
  • Author Maintainer

    cc @stanhu Perhaps you might have some ideas about the above testing in staging?

  • mentioned in issue #46535 (closed)

  • @mkozono I think you found something there: I/O on the Azure uploads server is significantly faster than the I/O on the Google server, most likely because the latter is using standard disk while the former is on SSD. I think we'd better fix that: gitlab-com/migration#731 (closed).

  • Great work @mkozono , only a few minor comments :) :thumbsup:

  • assigned to @mkozono

  • Author Maintainer

    @mkozono I think you found something there: I/O on the Azure uploads server is significantly faster than the I/O on the Google server, most likely because the latter is using standard disk while the former is on SSD. I think we'd better fix that: gitlab-com/migration#731 (closed).

    Nice find @stanhu!

  • Michael Kozono added 210 commits

    added 210 commits

    Compare with previous version

  • @mkozono I think the machines have been upgraded to SSDs. Try your test again?

  • Author Maintainer

    @stanhu :thumbsup: I am just about to test this again.

  • Author Maintainer

    @stanhu The task completed on GSTG in 38 minutes instead of 2 hours.

  • @mkozono Interesting, still slower than Azure?

    Edited by Stan Hu
  • Author Maintainer

    @mkozono Interesting, still slower than Azure?

    Yep, but staging is weird so it may not be a valid test for comparing Azure vs GC.

    I have now tried the task on GPRD, and it acts as I originally expected-- it hangs for only ~30 seconds, then quickly begins returning batches. Batches are processed faster than the verify uploads task, and we only search through a subset of uploads, so performance is not actually a concern for this MR IMO.

    I still don't understand the mechanism for why staging acts so much differently, but I don't think it's actually a cause for concern.

  • Michael Kozono added 1 commit

    added 1 commit

    • 97c819a9 - Add local project uploads cleanup task

    Compare with previous version

  • Michael Kozono marked the checklist item Documentation created/updated as completed

    marked the checklist item Documentation created/updated as completed

  • Michael Kozono marked the checklist item Conforms to the merge request performance guidelines as completed

    marked the checklist item Conforms to the merge request performance guidelines as completed

  • Michael Kozono marked the checklist item If you have multiple commits, please combine them into a few logically organized commits by squashing them as completed

    marked the checklist item If you have multiple commits, please combine them into a few logically organized commits by squashing them as completed

  • Michael Kozono unmarked as a Work In Progress

    unmarked as a Work In Progress

  • Michael Kozono changed the description

    changed the description

  • Michael Kozono marked the checklist item Conforms to the database guides as completed

    marked the checklist item Conforms to the database guides as completed

  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Please register or sign in to reply
    Loading