5.0 Stream data upload files as much as possible
Currently, larger imported files are saved twice on the web server's disk: once by IFormFile (which automatically buffers the file to disk if the file size exceeds 64 KB, see this for more details) and again by the transfer service method itself, which persists the file so that the import can run after the web request has finished. Microsoft stresses that IFormFile is meant for small and infrequent file uploads only and recommends MultipartReader for larger production systems.
There are two reasons to prefer MultipartReader. First, the storage location (folder) can be chosen freely and its available space managed easily, whereas IFormFile uses a predefined internal folder that might not provide sufficient space for multiple parallel uploads of big files. Second, the incoming file can be written directly to persistent storage, so no second save to disk is needed. Disk access is generally slow, so the file should be written once, not twice.
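The single-write idea can be illustrated independently of ASP.NET Core. The following Python sketch is a language-neutral illustration, not the MultipartReader API: `stream_to_storage`, `upload_dir`, the `.upload` suffix, and the 64 KB chunk size are hypothetical names and values chosen for the example. It copies an incoming byte stream into a caller-chosen folder in fixed-size chunks, so the payload is written to disk once and is never held in memory in full.

```python
import io
import os
import tempfile
import uuid


def stream_to_storage(source, upload_dir, chunk_size=64 * 1024):
    """Copy a readable binary stream into upload_dir chunk by chunk.

    Returns (path, bytes_written). All names here are illustrative assumptions.
    """
    os.makedirs(upload_dir, exist_ok=True)
    # Server-generated file name: the client-supplied name is not used for the path.
    path = os.path.join(upload_dir, uuid.uuid4().hex + ".upload")
    written = 0
    with open(path, "wb") as target:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:  # empty read means end of stream
                break
            target.write(chunk)
            written += len(chunk)
    return path, written


if __name__ == "__main__":
    # Simulated 1 MiB request body; a real service would pass the request stream.
    body = io.BytesIO(b"x" * (1024 * 1024))
    with tempfile.TemporaryDirectory() as upload_dir:
        path, size = stream_to_storage(body, upload_dir)
        print(size, os.path.getsize(path) == size)
```

Because the destination folder is a parameter, it can point at a volume sized for parallel uploads, which is the property the section attributes to the MultipartReader approach.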
Also, the files stored by the transfer service method are currently never cleaned up. A cleanup step needs to be added at the end of the import to prevent the disk from filling up unnecessarily. Disk space in the cloud has a cost.
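A minimal sketch of such a cleanup, written in Python as a language-neutral illustration rather than as the transfer service's actual code. `run_import_and_cleanup`, `purge_stale_files`, and the 24-hour retention period are hypothetical. The first function deletes the temp file whether the import succeeds or fails; the second removes leftover files older than a given age.

```python
import os
import tempfile
import time


def run_import_and_cleanup(path, import_fn):
    """Run import_fn(path), then delete the temp file even if the import raised."""
    try:
        return import_fn(path)
    finally:
        try:
            os.remove(path)
        except FileNotFoundError:
            pass  # already removed, nothing left to clean up


def purge_stale_files(folder, max_age_seconds=24 * 3600, now=None):
    """Delete regular files in folder older than max_age_seconds.

    Returns the sorted names of the removed files. The retention period is an
    assumed value, not one taken from the ticket.
    """
    now = time.time() if now is None else now
    with os.scandir(folder) as entries:
        candidates = [entry for entry in entries if entry.is_file()]
    removed = []
    for entry in candidates:
        if now - entry.stat().st_mtime > max_age_seconds:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "demo.upload")
        with open(path, "wb") as handle:
            handle.write(b"id;name\n1;example\n")
        # os.path.getsize stands in for the real import routine.
        size = run_import_and_cleanup(path, os.path.getsize)
        print(size, os.path.exists(path))
```

Putting the deletion in a `finally` block means a failed import does not leave its file behind; the age-based purge covers files orphaned by a crash or restart.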
The work should thus cover:
- replace both the IFormFile usage and the current saving of the file in the transfer service method with MultipartReader
- make the location of the temp folder in which the streamed file is stored configurable, or use temporary storage in a database
- gracefully abort the creation of the temp file and respond with an appropriate HTTP error if the temp folder is full (if not already done). Note that IFormFile currently crashes the web service when internal storage space or memory is insufficient
- make sure the temp file is deleted once the import has been asynchronously executed (if not already done)
- clean up all historic temp files on the DevOps servers (if not already done) or add a transfer web service method to allow for this cleanup
- check that we have the necessary security-related actions on the temp file (virus check, extension check, …) (if not already done)
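The graceful-abort item can be sketched as follows, in Python as a language-neutral illustration and not as ASP.NET Core code. Everything named here is an assumption: `save_upload`, `UploadRejected`, the `get_free` hook, and in particular the status codes. The ticket only asks for "an appropriate HTTP error"; the sketch uses 507 (Insufficient Storage) for a full temp folder and 413 for an upload over the size limit as one possible mapping. A partially written file is removed before the error is raised.

```python
import errno
import io
import os
import shutil
import tempfile


class UploadRejected(Exception):
    """Carries the HTTP status the web layer should return (assumed mapping)."""

    def __init__(self, status, reason):
        super().__init__(reason)
        self.status = status


def _discard(path):
    """Remove a partial temp file, ignoring the case where it was never created."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass


def save_upload(source, path, max_bytes, min_free_bytes, chunk_size=64 * 1024,
                get_free=lambda folder: shutil.disk_usage(folder).free):
    """Stream source to path; reject instead of crashing when space runs out."""
    folder = os.path.dirname(path) or "."
    # Check free space up front so the request fails fast with a clear status.
    if get_free(folder) < min_free_bytes:
        raise UploadRejected(507, "temp folder is short on space")
    written = 0
    try:
        with open(path, "wb") as target:
            while True:
                chunk = source.read(chunk_size)
                if not chunk:
                    break
                written += len(chunk)
                if written > max_bytes:
                    raise UploadRejected(413, "upload exceeds the size limit")
                target.write(chunk)
    except UploadRejected:
        _discard(path)
        raise
    except OSError as exc:
        _discard(path)
        if exc.errno == errno.ENOSPC:  # disk filled up while writing
            raise UploadRejected(507, "disk filled up during upload") from exc
        raise
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        target_path = os.path.join(folder, "demo.upload")
        try:
            save_upload(io.BytesIO(b"x" * 5000), target_path,
                        max_bytes=1000, min_free_bytes=0)
        except UploadRejected as rejected:
            print(rejected.status, os.path.exists(target_path))
```

The up-front check cannot rule out the disk filling during the write, which is why the `ENOSPC` case is handled as well.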
Note: Temporary files (written via MultipartReader) are required to avoid request timeouts for large imports (asynchronous processing) and to allow uploads to be queued (see ticket #125).