Transfer service extracts content of imported zip file with original file date
When a data file is imported in zip format, the transfer service extracts the content of the zip archive with the original file date. Because of this, the background cleanup process may delete the extracted file before its content is imported, as it falsely identifies the file as a candidate for cleanup.
I encountered an error message when I attempted to import the file EO_LIVE_Q.zip the first time at 2021-11-11 08:27 with the following content:
The error message received:
The file /tmp/TransferService/_2021_11_11_07_27_43_251740.tmp was not found.
System.ArgumentException: The file /tmp/TransferService/_2021_11_11_07_27_43_251740.tmp was not found.
at DotStat.Transfer.Producer.SdmxFileProducer.Process(SdmxFileToSqlTransferParam transferParam, Dataflow dataflow) in /app/DotStat.Transfer/Producer/SdmxFileProducer.cs:line 68
at DotStat.Transfer.Manager.SdmxFileToSqlTransferManager.Transfer(SdmxFileToSqlTransferParam transferParam) in /app/DotStat.Transfer/Manager/SdmxFileToSqlTransferManager.cs:line 33
at DotStatServices.Transfer.Controllers.ImportController.<>c__DisplayClass18_0`1.<b__0>d.MoveNext() in /app/DotStatServices.Transfer/Controllers/ImportController.cs:line 365
However, the second import attempt was successful 2 minutes later at 2021-11-11 08:29.
I checked the temp folder inside the transfer service container and found the following:
root@898d43de02f1:/tmp/TransferService# ls -la
total 209592
drwxr-xr-x 2 root root 4096 Nov 11 07:29 .
drwxrwxrwt 1 root root 4096 Nov 11 07:29 ..
-rw-r--r-- 1 root root 5035284 Nov 10 16:13 _2021_11_10_16_13_56_983280.tmp
-rw-r--r-- 1 root root 33328597 Nov 10 17:08 _2021_11_10_16_13_58_620282.tmp
-rw-r--r-- 1 root root 19625672 Nov 11 07:27 _2021_11_11_07_27_42_108640.tmp
-rw-r--r-- 1 root root 19625672 Nov 11 07:29 _2021_11_11_07_29_18_131829.tmp
-rw-r--r-- 1 root root 136987260 Oct 29 06:09 _2021_11_11_07_29_19_202386.tmp
Based on the file names, the last two files belong to the second import action, but the date/time of the extracted _2021_11_11_07_29_19_202386.tmp file is Oct 29 06:09 (the creation date/time of the csv file archived in the zip) instead of Nov 11 07:29. Because of this, the background process may delete the file before it is processed.
The task is to prevent the premature deletion of the extracted file, e.g. by applying the current date/time to the extracted content of the imported zip file.
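The suggested fix could look roughly like the sketch below. It is only an illustration and is not taken from the DotStat code base: the class name ZipExtractionSketch, the method ExtractWithCurrentTimestamp and its parameters are hypothetical. It also assumes that the cleanup process decides by the last write time of the file, which the listing above suggests but does not prove. The sketch relies on the standard System.IO.Compression behaviour that ExtractToFile sets the last write time of the extracted file to the one stored in the zip entry, and then overwrites that value with the current time.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper for illustration; not part of the transfer service.
public static class ZipExtractionSketch
{
    // Extracts one entry of a zip archive to targetPath and stamps the
    // extracted file with the current time instead of the archived one.
    public static string ExtractWithCurrentTimestamp(string zipPath, string entryName, string targetPath)
    {
        using (ZipArchive archive = ZipFile.OpenRead(zipPath))
        {
            ZipArchiveEntry entry = archive.GetEntry(entryName)
                ?? throw new FileNotFoundException($"Entry {entryName} was not found in {zipPath}.");

            // ExtractToFile copies the entry's stored LastWriteTime onto the
            // extracted file, which is how the old date ends up in the temp folder.
            entry.ExtractToFile(targetPath, overwrite: true);
        }

        // Replace the archived timestamp with "now" so that an age-based
        // cleanup (assumed here) treats the file as freshly created.
        File.SetLastWriteTimeUtc(targetPath, DateTime.UtcNow);
        return targetPath;
    }
}
```

An alternative would be for the cleanup process to derive the age of a file from the timestamp embedded in its generated name rather than from file system metadata. Which option fits better depends on how the cleanup is implemented, which is not shown in this report.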