Deduplicate keys that are in the key listing
Currently the folder store writes a .keys file, which is the key listing for the backup -- but it only ever appends to the file and never gets rid of duplicates. After 2 consecutive backups of a DB with two keys ("A" and "B"), the file will likely have duplicates in it.
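A minimal sketch of how the duplicates come about. This is not the store's real code: `append_keys` is a hypothetical stand-in for the append-only write, and the one-key-per-line layout of the .keys file is an assumption made for illustration.

```python
import os
import tempfile


def append_keys(keys_path, keys):
    # Hypothetical stand-in for the folder store: always append, never dedupe.
    with open(keys_path, "a", encoding="utf-8") as f:
        for key in keys:
            f.write(key + "\n")


def run_two_backups():
    # Back up the same two-key DB twice into the same backup folder.
    with tempfile.TemporaryDirectory() as backup_dir:
        keys_path = os.path.join(backup_dir, ".keys")
        append_keys(keys_path, ["A", "B"])  # first backup
        append_keys(keys_path, ["A", "B"])  # second backup, same folder
        with open(keys_path, encoding="utf-8") as f:
            return f.read().splitlines()


listing = run_two_backups()
print(listing)  # ['A', 'B', 'A', 'B']
```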
This is a bit of a thorny problem because:
- Backups should be re-usable (should they?)
- Keeping the key list in sync with what data is actually available is a pain
- It's hard to do things at the "start" of a backup in the store; possibly the loading of a backup is a good place to perform the sync
- Hard to dedup the file without keeping it in memory
It isn't as easy as overwriting the backup's key listing file every time, because stopping in the middle of a backup and restarting would amount to starting over -- and it's not as easy as leaving the file there forever, since that means the file will only get bigger as a single backup is reused.
Ideally a solution might look like:
- "synchronize" the files in kv and the .keys listing file when a backup is loaded (a single sync function might be good here), ensuring that every key has a file that exists
- After performing some number of writes (possibly at cleanup?), try to de-duplicate the keys listing file to cut down on file size
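Roughly, the two steps above could be folded into one pass. The sketch below is only that, a sketch: it assumes the .keys file holds one key per line and that each key is stored as a file of the same name directly under kv (neither is confirmed here), and `read_unique_keys`, `write_keys_atomically` and `sync` are hypothetical names. It also still keeps the set of unique keys in memory, so it does not solve the memory concern, it only bounds it by the number of distinct keys rather than the size of the file.

```python
import os
import tempfile


def read_unique_keys(keys_path):
    """Return keys from the listing in first-seen order, without duplicates."""
    seen = set()
    unique = []
    if not os.path.exists(keys_path):
        return unique
    with open(keys_path, encoding="utf-8") as f:
        for line in f:  # stream the file line by line
            key = line.rstrip("\n")
            if key and key not in seen:
                seen.add(key)
                unique.append(key)
    return unique


def write_keys_atomically(keys_path, keys):
    """Write the listing to a temp file, then swap it into place."""
    directory = os.path.dirname(keys_path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".keys.tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        for key in keys:
            f.write(key + "\n")
    # An interrupted run leaves the old listing untouched.
    os.replace(tmp_path, keys_path)


def sync(backup_dir):
    """Dedupe the listing and drop keys that have no file under kv."""
    keys_path = os.path.join(backup_dir, ".keys")
    kv_dir = os.path.join(backup_dir, "kv")
    keys = [
        key
        for key in read_unique_keys(keys_path)
        if os.path.isfile(os.path.join(kv_dir, key))
    ]
    write_keys_atomically(keys_path, keys)
    return keys
```

Calling `sync(backup_dir)` when a backup is loaded (and possibly again at cleanup) would cover both bullets. It deliberately does not add listing entries for files in kv that are missing from .keys; whether that direction is wanted is an open question.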