primordial persistence implementation
Define the human-readable textual persistence format and document it.
Implement the loading and dumping routines (and call our GC before dumping). They both should be multi-threaded (at least their second pass).
The primordial persistence should just be the ability to load and dump the entire persistent heap. It does not implement all the important features of RefPerSys. The dumping mechanism uses the RefPerSys object system, which should therefore be minimally implemented (notably an explicit representation of the core RefPerSys classes, and the methods there that dumping requires).
The persistence directory
persistore/ should first be filled with hand-written files (whose syntax and semantics are a work in progress). But these files will be replaced by generated textual files: several persistent data files, and one specific Manifest file whose name and syntax have yet to be defined. To clarify things, let's suppose that
persistore/initdata.rps is a textual persistent data file, and that
persistore/Manifest_RPS is the manifest file. Of course, the actual file names would be different. The initial persistent store is likely to have just one data file at first. Later, in a few years, we might have hundreds of them. But the manifest file is unique and has a well-defined name. The data files
*.rps should have some
GitLab-friendly size, perhaps less than half a megabyte and a few thousand lines each.
The actual format of textual files under
persistore/ is partly documented in doc/persistent-store-format.md. That documentation is a work in progress and should be completed. The audience of that
doc/persistent-store-format.md markdown file is just us, the initial authors of RefPerSys.
Our initial loader should read and parse files under
persistore/. So it would first read
persistore/Manifest_RPS then parse
persistore/initdata.rps. Once that is completed, the in-memory heap contains objects described by the persistent data files. In the future, after this milestone, our
Manifest_RPS will mention more than one data file (e.g.
initdata.rps and several others
*.rps persistent data files). And our
Manifest_RPS will later need to mention other things (such as the initial closure to be applied after the load of the entire persistent heap, our equivalent of
main in C++ or Java).
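The loading order described above (manifest first, then the data files it mentions) could be sketched as follows. This is only an illustration: since the manifest syntax is not defined yet, the sketch assumes a hypothetical line-oriented manifest in which each data file is named by a line starting with `data_file `, and the function names `data_file_names` and `data_files_to_load` are invented for this example. Parsing the data files themselves is left out.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// ASSUMPTION: hypothetical manifest syntax, one "data_file <name>" line per
// persistent data file.  Every other line (metadata, comments) is skipped.
std::vector<std::string> data_file_names(const std::string &manifest_text) {
  static const std::string key = "data_file ";
  std::vector<std::string> names;
  std::istringstream in(manifest_text);
  std::string line;
  while (std::getline(in, line))
    if (line.size() > key.size() && line.compare(0, key.size(), key) == 0)
      names.push_back(line.substr(key.size()));
  return names;
}

// Read the manifest first, then return the data files in the order in which
// the loader would parse them.
std::vector<fs::path> data_files_to_load(const fs::path &storedir) {
  assert(!storedir.empty());
  std::ifstream in(storedir / "Manifest_RPS");
  if (!in)
    throw std::runtime_error("cannot open manifest in " + storedir.string());
  std::stringstream buf;
  buf << in.rdbuf();  // slurp the whole manifest, it is expected to be small
  std::vector<fs::path> paths;
  for (const std::string &name : data_file_names(buf.str()))
    paths.push_back(storedir / name);
  return paths;
}
```

With such a layout, `data_files_to_load("persistore")` would yield `persistore/initdata.rps` today, and more paths once the manifest mentions several data files.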
Our initial dumper should conceptually write updated files under
persistore/. Of course, we also want it to back up the original files.
Probably, the dumper would first create a "temporary" directory such as
persistore_new/ or some other name using mkdir(2) or something calling that. Then, it would internally collect all the objects to be dumped, using an algorithm similar to tracing garbage collectors, into some ad-hoc quasivalue (see also our README for an explanation...). After that, it would write (conceptually update!) both
persistore_new/initdata.rps. Finally, it renames (using e.g. rename(2)) the old
persistore/ as some
persistore~/ backup directory, and renames also
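The directory handling of the dumper could look like the sketch below, written with C++17 `std::filesystem` (which ends up calling mkdir(2) and rename(2) on Linux). The function name `dump_with_backup` and the `write_files` callback are hypothetical; the callback stands for the real dumper emitting the manifest and data files. The sketch assumes that the last step moves the new directory into place under the original name, and that only one generation of backup is kept.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>

namespace fs = std::filesystem;

// `store` is the store directory WITHOUT a trailing slash, e.g. "persistore".
// `write_files(dir)` must write every dumped file into the directory `dir`.
template <typename WriteFiles>
void dump_with_backup(const fs::path &store, WriteFiles write_files) {
  assert(!store.empty());
  const fs::path fresh = store.string() + "_new";  // e.g. persistore_new
  const fs::path backup = store.string() + "~";    // e.g. persistore~
  fs::remove_all(fresh);        // drop leftovers of an interrupted dump
  fs::create_directory(fresh);  // the "temporary" directory
  write_files(fresh);           // dump everything into the fresh directory
  fs::remove_all(backup);       // ASSUMPTION: keep a single backup
  if (fs::exists(store))
    fs::rename(store, backup);  // the old store becomes the backup
  fs::rename(fresh, store);     // the fresh directory becomes the store
}
```

Writing into a separate directory first means a dump that fails midway leaves the existing persistore/ untouched.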
Let's pretend, for the sake of this explanation, that
./refpersys --do-dump is running the previous load and dump steps. Of course the actual program option has to be defined and documented, in main_rps.cc and probably also in
doc/persistent-store-format.md or in some other markdown file.
persistore/ directory contains human-written and
git commit-ed files. These files (except maybe the current
persistore/!README which is needed only by
git) are read by the loader.
./refpersys --do-dump a first time; the
persistore/ directory contains now some updated
persistore/Manifest_RPS files, and we copy them outside, e.g. under
We run again the same
./refpersys --do-dump a second time, and perhaps even a third one.
The required invariant is that both
/tmp/Manifest_RPS now have, byte for byte, the same contents as the updated corresponding files of
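A check of that invariant could be automated with a small byte-for-byte file comparison, sketched here. The helper name `same_bytes` is hypothetical; in practice cmp(1) or diff(1) would do the same job from a shell.

```cpp
#include <algorithm>
#include <cassert>
#include <filesystem>
#include <fstream>
#include <iterator>

namespace fs = std::filesystem;

// Return true only when both files can be opened and hold exactly the same
// bytes.  Binary mode avoids any translation of the contents.
bool same_bytes(const fs::path &a, const fs::path &b) {
  assert(!a.empty() && !b.empty());
  std::ifstream fa(a, std::ios::binary), fb(b, std::ios::binary);
  if (!fa || !fb)
    return false;
  std::istreambuf_iterator<char> ia(fa), ib(fb), end;
  // The four-iterator std::equal (C++14) also fails when the lengths differ.
  return std::equal(ia, end, ib, end);
}
```

A test script would call it once per dumped file, comparing the copy saved after the first dump with the file produced by the second dump.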
Our manifest file should be designed to easily be read by usual Linux utilities, e.g.
grep, or even
jq or anything else which is common in Linux distributions. As an example, we might later want to write some short shell or Python script parsing the manifest file and running the required
git add and
git rm commands. That manifest file also needs some metadata (e.g. the date and time of the dump, the host name of the machine which made it, the format version of that dump, etc.) in human-friendly form.
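To make the grep-friendliness concrete, here is a sketch of how the dumper might emit such a manifest. Everything about the layout is an assumption made for illustration: the keys `format_version`, `dumped_at`, `host` and `data_file`, the one-entry-per-line shape, and the helper names are all hypothetical, since the real syntax has yet to be defined.

```cpp
#include <cassert>
#include <ctime>
#include <sstream>
#include <string>
#include <vector>
#include <unistd.h>  // gethostname(2), POSIX

// Host name of the machine making the dump, or "unknown" on failure.
std::string current_host_name() {
  char buf[256] = {0};
  if (gethostname(buf, sizeof buf - 1) != 0)
    return "unknown";
  return buf;
}

// Current UTC time in a human-friendly form, e.g. 2019-09-01T12:00:00Z.
std::string utc_timestamp() {
  std::time_t now = std::time(nullptr);
  char buf[32];
  std::strftime(buf, sizeof buf, "%Y-%m-%dT%H:%M:%SZ", std::gmtime(&now));
  return buf;
}

// ASSUMPTION: hypothetical "key value" manifest, one entry per line.
std::string manifest_text(const std::string &host,
                          const std::string &dumped_at, int format_version,
                          const std::vector<std::string> &data_files) {
  std::ostringstream out;
  out << "format_version " << format_version << '\n';
  out << "dumped_at " << dumped_at << '\n';
  out << "host " << host << '\n';
  for (const std::string &name : data_files) {
    assert(name.find('\n') == std::string::npos);  // one name per line
    out << "data_file " << name << '\n';
  }
  return out.str();
}
```

With this shape, `grep '^data_file ' Manifest_RPS` would list the data files for a script to pass to git. Note that a timestamp or host name in the manifest changes from one dump to the next, so it has to be taken into account when comparing successive manifests byte for byte.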
A few months after reaching this milestone, the
Manifest_RPS file might contain or refer to other information, such as the initial closure (the equivalent of
main in C++ or Java), the list of modules to be loaded, etc.
NB In the preceding explanation, the
Manifest_RPS file names are just illustrative, like the
--do-dump program option. But
persistore/ actually exists (and temporarily contains a
!README file, which is just needed to make the
git program happy).