Code for zipping scans is messy and ad-hoc

The zip code is messy, and its ad-hoc behaviour is unclear and causes storage issues such as duplicated files.

Currently a single function, create_zip_of_scan, handles all zip creation for a scan. It is pretty long and complex. Its jobs are:

  • Create a zip of the scan directory via a @thing_action call if one doesn't already exist.
  • Create zips while scans are in progress
  • Move files to a higher directory at end of scan (it does this when downloading, and it duplicates files)

Plan for update:

  • Scan creates a zip at the start containing its scan parameters json file.
  • Saving the image also adds it to the zip
  • On performing the final stitch we add the correct files to the zip (outside of the images directory)
  • The thing_action to download the zip doesn't modify the zip
  • If we run stitch again from the thing_action it first creates a new zip with just the images, then adds its final stitches in at the end.
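The plan above could be sketched roughly as follows, using only the standard-library zipfile module. The function names and the archive layout (an images/ prefix inside the zip) are hypothetical and not taken from the existing code; this is an illustration of the append-only approach, not the actual implementation.

```python
import zipfile
from pathlib import Path


def start_scan_zip(zip_path: Path, params_path: Path) -> None:
    """Create a fresh zip containing only the scan parameters file."""
    # Mode "w" truncates any existing zip at this path.
    with zipfile.ZipFile(zip_path, "w") as scan_zip:
        scan_zip.write(params_path, arcname=params_path.name)


def add_file_to_zip(zip_path: Path, file_path: Path, arcname: str) -> bool:
    """Append a file to the zip unless an entry with that name exists.

    Returns True if the file was added, False if it was skipped.
    """
    # Mode "a" appends to an existing archive. zipfile will happily
    # write a second entry with the same name, so check first.
    with zipfile.ZipFile(zip_path, "a") as scan_zip:
        if arcname in scan_zip.namelist():
            return False
        scan_zip.write(file_path, arcname=arcname)
        return True


def rebuild_zip_from_images(zip_path: Path, images_dir: Path) -> None:
    """Recreate the zip from scratch with only the files in images_dir."""
    with zipfile.ZipFile(zip_path, "w") as scan_zip:
        for path in sorted(images_dir.iterdir()):
            if path.is_file():
                scan_zip.write(path, arcname=f"{images_dir.name}/{path.name}")
```

With this split, the download action only needs to read the file at zip_path, so it never modifies the archive. Re-running the stitch would call rebuild_zip_from_images first and then add_file_to_zip for each final stitch file.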

Other considerations

  • Should we be hard coding file names?
  • Where do the configuration files belong? In the images dir or another directory?

Update following !248 (merged): we now create, update and download the zip in separate functions. The same issue remains: downloading multiple times duplicates some stitch files. I think the solution involves getting filenames and sizes from

for file in scan_zip.infolist():
    print(file.filename, file.file_size)

and comparing the file sizes to those of the images on disk. If they match, there is no need to replace anything. If they differ, redo the zip from scratch.
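A minimal sketch of that size comparison, assuming images are stored in the zip under a prefix matching the images directory name. The function name and the archive layout are hypothetical.

```python
import zipfile
from pathlib import Path


def zip_matches_images(zip_path: Path, images_dir: Path) -> bool:
    """Return True if every file in images_dir is in the zip at the same size."""
    with zipfile.ZipFile(zip_path) as scan_zip:
        # file_size is the uncompressed size recorded for each entry.
        zipped_sizes = {
            info.filename: info.file_size for info in scan_zip.infolist()
        }
    for path in images_dir.iterdir():
        if not path.is_file():
            continue
        arcname = f"{images_dir.name}/{path.name}"
        # A missing entry gives None, which never equals a real size.
        if zipped_sizes.get(arcname) != path.stat().st_size:
            return False
    return True
```

If this returns False the zip would be rebuilt from scratch, otherwise it is left alone. Matching sizes do not prove matching contents; comparing the CRC field on each ZipInfo entry would be a stricter check, at the cost of reading every image.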

May relate to #272 (closed) and #277 (closed).

Edited May 07, 2025 by Joe Knapper