Feature: other compressors than zip for compressed documents
Created by: mbehrle
Especially on Linux machines, it would be nice to also support other state-of-the-art compressors, or at least gzip, as the compression format for documents.
Imported comments:
By hinnerk on 2014-07-02 12:10:43 UTC
Isn't this supposed to happen deep down in the storage layer? For example a file system like ZFS supports storage and block deduplication, so that's where I'd expect compression to happen.
By thequbit on 2014-07-02 17:01:41 UTC
If you are looking for portability, compressing collections of documents may be useful. If you are simply looking to reduce disk space, I agree that working with the storage layer is probably best. I suppose you could say that if you are deploying Mayan on a 'legacy system' where a RAID array with an already defined filesystem is in use, it could be nice ... but that may be outside the scope of the project.
My vote is with Hinnerk: use the storage layer to handle compression.
-TD
By mbehrle on 2014-07-02 22:18:30 UTC
Sorry for not being explicit enough. I am talking here about the compression format for new documents. AFAIU those archives are just used as a container for multiple documents, which all get imported with the same metadata, if provided. This is a sort of bulk upload, where all documents in an archive are treated the same way. It has nothing to do with compression of the storage area.
By rosarior on 2014-07-03 04:38:48 UTC
I recently merged everything having to do with compressing and decompressing document bundles into https://github.com/mayan-edms/mayan-edms/blob/master/mayan/apps/common/compressed_files.py
The CompressedFile class is used by the sources app to uncompress zip files during upload, and by the documents/views.py module to create a document bundle when the user chooses to download more than one document at a time. The functionality is abstracted, but as you can see the entire compressed_files.py is pretty ZIP-centric, so adding gzip and/or bzip support is going to take some work.
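As a rough illustration of what format-agnostic unpacking could involve, here is a minimal sketch using only Python's standard library. The function name iter_archive_members is hypothetical and is not part of compressed_files.py; the sketch assumes that gzip or bzip2 uploads would arrive as compressed tar archives, and it does not handle a single bare .gz file.

```python
import io
import tarfile
import zipfile


def iter_archive_members(data):
    """Yield (name, content_bytes) for each regular file in an archive.

    Hypothetical helper, not Mayan code. Accepts the raw bytes of a ZIP
    file or of a tar archive (plain, gzip, bzip2 or xz compressed).
    """
    buffer = io.BytesIO(data)

    # ZIP archives are detected first, since that is the existing format.
    if zipfile.is_zipfile(buffer):
        buffer.seek(0)
        with zipfile.ZipFile(buffer) as archive:
            for info in archive.infolist():
                if not info.is_dir():
                    yield info.filename, archive.read(info)
        return

    # Mode "r:*" lets tarfile detect the compression transparently.
    buffer.seek(0)
    try:
        with tarfile.open(fileobj=buffer, mode="r:*") as archive:
            for member in archive:
                if member.isfile():
                    yield member.name, archive.extractfile(member).read()
    except tarfile.ReadError:
        raise ValueError("not a supported archive format") from None
```

Each yielded pair could then be treated as one uploaded document, so that every member of the archive receives the same metadata.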
--
On the filesystem compression side, the storage app handles that to keep physical access to the document files abstracted from the rest of the project. This means that a storage backend could be written that does lightweight compression or even on-the-fly encryption. This could be useful for people running Mayan on servers where they have no choice of filesystem or no access to change it. The default backend for Mayan (https://github.com/mayan-edms/mayan-edms/blob/master/mayan/apps/storage/backends/filebasedstorage.py) is just a lightweight wrapper for Django's FileSystemStorage.
Ref: http://django-storages.readthedocs.org/
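To make the idea of a compressing backend more concrete, here is a minimal standard-library sketch. The GzipFileStore class and its save/open/exists methods are hypothetical stand-ins chosen for illustration; the class does not subclass Django's storage classes and is not the code in Mayan's storage app.

```python
import gzip
import os
import shutil


class GzipFileStore:
    """Hypothetical store that gzips files on write and gunzips on read."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def _path(self, name):
        # Files are kept on disk with a ".gz" suffix (an assumption of
        # this sketch); callers only ever see the logical name.
        return os.path.join(self.root, name + ".gz")

    def save(self, name, content):
        """Compress the binary file-like object `content` to disk."""
        with gzip.open(self._path(name), "wb") as target:
            shutil.copyfileobj(content, target)
        return name

    def open(self, name):
        """Return a binary file object that yields decompressed bytes."""
        return gzip.open(self._path(name), "rb")

    def exists(self, name):
        return os.path.exists(self._path(name))
```

Because compression happens entirely inside save and open, the rest of the application keeps reading and writing plain bytes, which is the abstraction described above.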
By rosarior on 2014-07-03 05:39:15 UTC
Proof of concept compressed document storage backend: https://github.com/mayan-edms/mayan-edms/blob/master/mayan/apps/storage/backends/compressedstorage.py