TCSVDataset does not remove deleted records from data file

Summary

When a record is deleted from a TCSVDataset and the dataset is closed, the corresponding line still exists in the CSV data file. The same happens when a record is edited: the unmodified record is kept in addition to the modified version.

System Information

  • Operating system: Windows-11 (64bit)
  • Processor architecture: x86-64
  • Compiler version: FPC 3.3.1 i386-win32-win32/win64 (33dfb6cb)
  • Device: Computer

Steps to reproduce

Run the attached project which operates in several steps:

  1. It creates a CSVDataset, adds some dummy data and lists the original contents.
  2. The last record is edited: its fields are replaced by random strings. The database content is listed again.
  3. The first record is deleted. The database is listed again.
  4. The dataset is closed which writes the dataset to the CSV data file.
  5. The CSV data file is loaded into a stringlist and its content is written to screen.
  6. Up till now, everything is correct, since the data file did not exist when we started.
RECORDS IN NEWLY CREATED DATASET
  Walter Mellon
  Mario Speedwagon
  Anna Mull
  Julia Mellon

RECORDS IN EDITED DATASET
  Walter Mellon
  Mario Speedwagon
  Anna Mull
  GND HYJPFD

DATASET AFTER DELETING FIRST RECORD
  Mario Speedwagon
  Anna Mull
  GND HYJPFD

CONTENTS OF THE CSV FILE AFTER CLOSING DATASET
  FirstName,LastName
  Mario,Speedwagon
  Anna,Mull
  GND,HYJPFD
  1. Now deactivate the conditional define USE_FRESH_FILE at the top of the project.
  2. Run the project again. Because the USE_FRESH_FILE define is inactive, the data file created in the first run (steps 1-4 above) is loaded.
  3. Again the last record is edited, and the first record is deleted.
  4. While the dataset is active the screen output is as expected.
  5. But the file contents displayed after closing the dataset are not correct: the file still lists the original copy of the modified record, as well as the deleted record.
RECORDS IN LOADED DATASET
  Mario Speedwagon
  Anna Mull
  GND HYJPFD

RECORDS IN EDITED DATASET
  Mario Speedwagon
  Anna Mull
  KID GEZWQR

DATASET AFTER DELETING FIRST RECORD
  Anna Mull
  KID GEZWQR

CONTENTS OF THE CSV FILE AFTER CLOSING DATASET
  FirstName,LastName
  Mario,Speedwagon               // <--- THIS RECORD WAS DELETED
  Anna,Mull
  GND,HYJPFD                     // <--- THIS IS THE ORIGINAL OF THE EDITED RECORD
  KID,GEZWQR
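
For readers without the attachment, the steps above can be sketched roughly as follows. This is not the attached project, only a minimal approximation of it. The file name test.csv, the helpers ListRecords and RandomUpper and the field sizes are assumptions made for illustration. The sketch also assumes that the dataset is bound to the file through its FileName property, so that Open loads the file and Close writes it.

```pascal
program csvdataset_delete_sketch;

{$mode objfpc}{$H+}

// Deactivate this define for the second run, so that the file
// written by the first run is loaded instead of being recreated.
{$DEFINE USE_FRESH_FILE}

uses
  SysUtils, Classes, db, csvdataset;

const
  DATA_FILE = 'test.csv';   // hypothetical file name

// Hypothetical helper: returns ALen random upper-case letters.
function RandomUpper(ALen: Integer): String;
var
  i: Integer;
begin
  Result := '';
  for i := 1 to ALen do
    Result := Result + Chr(Ord('A') + Random(26));
end;

// Hypothetical helper: writes a title and the first two fields of every record.
procedure ListRecords(ADataset: TDataset; const ATitle: String);
begin
  WriteLn(ATitle);
  ADataset.First;
  while not ADataset.EOF do
  begin
    WriteLn('  ', ADataset.Fields[0].AsString, ' ', ADataset.Fields[1].AsString);
    ADataset.Next;
  end;
  WriteLn;
end;

var
  ds: TCSVDataset;
  L: TStringList;
  i: Integer;
begin
  Randomize;
  {$IFDEF USE_FRESH_FILE}
  if FileExists(DATA_FILE) then
    DeleteFile(DATA_FILE);
  {$ENDIF}

  ds := TCSVDataset.Create(nil);
  try
    ds.FileName := DATA_FILE;   // assumption: file is loaded on Open, written on Close
    ds.CSVOptions.FirstLineAsFieldNames := True;
    if FileExists(DATA_FILE) then
    begin
      ds.Open;
      ListRecords(ds, 'RECORDS IN LOADED DATASET');
    end
    else
    begin
      ds.FieldDefs.Add('FirstName', ftString, 20);
      ds.FieldDefs.Add('LastName', ftString, 20);
      ds.CreateDataset;
      ds.AppendRecord(['Walter', 'Mellon']);
      ds.AppendRecord(['Mario', 'Speedwagon']);
      ds.AppendRecord(['Anna', 'Mull']);
      ds.AppendRecord(['Julia', 'Mellon']);
      ListRecords(ds, 'RECORDS IN NEWLY CREATED DATASET');
    end;

    // Edit the last record: replace both fields by random strings.
    ds.Last;
    ds.Edit;
    ds.FieldByName('FirstName').AsString := RandomUpper(3);
    ds.FieldByName('LastName').AsString := RandomUpper(6);
    ds.Post;
    ListRecords(ds, 'RECORDS IN EDITED DATASET');

    // Delete the first record.
    ds.First;
    ds.Delete;
    ListRecords(ds, 'DATASET AFTER DELETING FIRST RECORD');

    // Closing the dataset writes the CSV data file.
    ds.Close;
  finally
    ds.Free;
  end;

  // Read the file back as plain text to see what was really written.
  L := TStringList.Create;
  try
    L.LoadFromFile(DATA_FILE);
    WriteLn('CONTENTS OF THE CSV FILE AFTER CLOSING DATASET');
    for i := 0 to L.Count - 1 do
      WriteLn('  ', L[i]);
  finally
    L.Free;
  end;
end.
```

Reading the file back through a TStringList rather than through a second dataset shows the raw lines on disk, which is where the stale records appear.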

Example Project

csvdataset_delete.zip

What is the current bug behavior?

The CSV data file still contains the deleted record as well as the original version of the edited record.

What is the expected (correct) behavior?

The CSV data file should contain only the "existing" records. I am aware that datasets usually keep deleted records and only mark them as deleted. But a CSV file does not have any internal fields for this information. Therefore, the current behaviour is extremely confusing and frustrating. At the very least, there should be a boolean property "KeepDeletedRecords" (which should be off by default).

Additional Information

This report is based on the report freepascal.org/lazarus/lazarus#40131 (closed) and a forum discussion https://forum.lazarus.freepascal.org/index.php/topic,62413.0.html.

This report is also related to #39925 (closed).