maria-db - transfer fails with PiT param + big data
Transfer issue with PiT param
We’ve been experiencing issues with the transfer service while testing MariaDB. Using one of our existing datasets (AU1:MERCH_EXP(1.0.0) https://de-qa.siscc.org/vis?lc=en&df[ds]=m-qa:stable&df[ag]=AU1&df[id]=MERCH_EXP&df[vs]=1.0.0&av=true), we’ve found that Point in Time (PiT) transfers of around 1.4 million data points (around 240MB unzipped) from the Stable environment to the Reset environment fail with an error.
Loading the same or a larger volume of data directly into Reset does not cause any issues. We have also been able to carry out PiT transfers of 1.25 million data points without issue. Some of our largest datasets contain approximately 21 million data points (~5GB), so it’s important for us to be able to transfer very large data collections between environments.
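To put the sizes above in perspective, here is a small back-of-the-envelope calculation. It uses only the approximate figures quoted in this report and assumes decimal units (1MB = 10^6 bytes), so the results are rough estimates rather than measurements.

```python
# Approximate figures quoted in the report (decimal units assumed).
failing_points = 1_400_000      # smallest PiT transfer seen to fail
failing_bytes = 240 * 10**6     # ~240MB unzipped
last_ok_points = 1_250_000      # largest PiT transfer seen to succeed
largest_points = 21_000_000     # our largest datasets
largest_bytes = 5 * 10**9       # ~5GB

# Average payload per data point for the failing and the largest datasets.
bytes_per_point_failing = failing_bytes / failing_points
bytes_per_point_largest = largest_bytes / largest_points

# How much bigger the largest datasets are than the smallest failing transfer.
scale_factor = largest_points / failing_points

print(f"failing transfer: ~{bytes_per_point_failing:.0f} bytes per data point")
print(f"largest dataset:  ~{bytes_per_point_largest:.0f} bytes per data point")
print(f"largest dataset is {scale_factor:.0f}x the failing transfer")
print(f"failure first appears between {last_ok_points:,} and {failing_points:,} data points")
```

In other words, the transfers we need to support are roughly an order of magnitude larger than the size at which the PiT transfer currently fails.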
Reproducible with the following materials:
Issue loading data with PiT
originally reported here
Hello,
We have encountered an issue while using MariaDB to load data into an environment using Point in Time (PiT) releases. Data loaded into the test environment using PiT releases (into a dataset that has not been “cleaned” using the delete process [cleanup/dsd]) seems to appear only after a later load has been submitted, by which point it should already have been changed to something else.
Using the dataset structures and delete files provided below, the issue can be recreated with the following steps:
- Structures: ABORIGINAL_POP_PROJ+ABS+1.0.0+v2.1.xml
- Data load A: ABS_ABORIGINAL_POP_PROJ_1.0.0_Mini_test_load_A.csv
- Data load B: ABS_ABORIGINAL_POP_PROJ_1.0.0_Mini_test_load_B.csv
- Delete all: ABS_ABORIGINAL_POP_PROJ_delete_all_A.csv
1. Create the dataset in the Reset environment.
2. If necessary, ensure the dataset is empty by running the delete process in Swagger (Swagger UI (siscc.org)).
3. Load Mini_test_load_A via Swagger, using PiT with a release time 10 minutes in the future.
4. After the release time, check that the loaded data is present as expected (https://nsi-m-qa-reset.siscc.org/rest/data/AU1,ABORIGINAL_POP_PROJ,1.0.0/all?).
5. Load delete_all_A using PiT with a release time 10 minutes in the future.
6. Load Mini_test_load_B using PiT with exactly the same release time as the preceding delete file.
7. After the release time, the dataset still shows only the data from Mini_test_load_A.
8. Load delete_all_A again using PiT with a release time 10 minutes in the future.
9. Before the release time, check the dataset content: it shows the data from Mini_test_load_B.
10. After the release time, check the dataset content: it still shows the data from Mini_test_load_B instead of being empty.
11. Load delete_all_A again using PiT with a release time 10 minutes in the future.
12. Before the release time, check the dataset content: it now shows as empty.
13. Run the delete process in Swagger.
14. Load Mini_test_load_A using PiT with a release time 10 minutes in the future.
15. At the release time, the correct data is present.
The issue also occurs if only one data file is set for a PiT release, or if no delete file is used:
16. Run the delete process in Swagger.
17. Load Mini_test_load_A using PiT with a release time 10 minutes in the future.
18. After the release time, the dataset correctly shows only the data from Mini_test_load_A.
19. Load Mini_test_load_B using PiT with a release time 10 minutes in the future.
20. After the release time, the dataset incorrectly shows only the data from Mini_test_load_A.
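The "check dataset content" steps above can be partly scripted. The following is a minimal sketch, not part of the original report. It assumes the Reset endpoint returns SDMX-CSV when asked with the `application/vnd.sdmx.data+csv` media type, and that an empty dataset is answered with HTTP 404. The helper names (`release_time`, `data_url`, `count_observations`, `fetch_observation_count`) are hypothetical, and the timestamp format expected by the transfer service for the PiT release date is not covered here.

```python
import csv
import io
import urllib.error
import urllib.request
from datetime import datetime, timedelta, timezone

# Base URL of the Reset environment's REST endpoint, taken from the report.
NSI_BASE = "https://nsi-m-qa-reset.siscc.org/rest"


def release_time(minutes_ahead=10, now=None):
    """Return the timestamp `minutes_ahead` minutes after `now` (default: current UTC time)."""
    now = now or datetime.now(timezone.utc)
    return now + timedelta(minutes=minutes_ahead)


def data_url(agency, dataflow, version):
    """Build the 'all data' query URL in the same shape as the one used in the steps."""
    return f"{NSI_BASE}/data/{agency},{dataflow},{version}/all"


def count_observations(sdmx_csv_text):
    """Count non-empty rows after the header row; each such row is one observation."""
    rows = [row for row in csv.reader(io.StringIO(sdmx_csv_text)) if row]
    return max(len(rows) - 1, 0)


def fetch_observation_count(url):
    """Download the dataset as SDMX-CSV and count its observations (needs network access)."""
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.sdmx.data+csv"}
    )
    try:
        with urllib.request.urlopen(request) as response:
            return count_observations(response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        if err.code == 404:  # assumption: 404 means no data matched the query
            return 0
        raise
```

For example, calling `fetch_observation_count(data_url("AU1", "ABORIGINAL_POP_PROJ", "1.0.0"))` once before and once after the time given by `release_time()` gives two counts that can be compared against the row counts of Mini_test_load_A and Mini_test_load_B to record which load is actually visible at each step.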