FileBasedDirectory can contain stale cached information

Background

FileBasedDirectory holds an index computed from the file system, which becomes stale if the underlying directory is altered outside the virtual directory API. In addition, Sandbox objects cache FileBasedDirectory objects and return the same instance for repeated calls to get_virtual_directory, so a FileBasedDirectory can be out of step with modifications made to the file system. This was the cause of issue #664 (closed).
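
To make the failure mode concrete, here is a minimal sketch of how a cached index goes stale. IndexedDirectory is a hypothetical stand-in written for this illustration; it is not the real FileBasedDirectory and only assumes that the index is computed once at construction time.

```python
import os
import tempfile


class IndexedDirectory:
    """Hypothetical stand-in for FileBasedDirectory (not the real class)."""

    def __init__(self, path):
        self.path = path
        # The index is computed once from the file system and never refreshed.
        self.index = set(os.listdir(path))

    def list_names(self):
        # Answers from the cached index, not from the file system.
        return sorted(self.index)


def demonstrate_stale_index():
    """Return (names the object reports, names actually on disk)."""
    with tempfile.TemporaryDirectory() as root:
        open(os.path.join(root, "a.txt"), "w").close()
        vdir = IndexedDirectory(root)

        # Modify the directory outside the virtual directory API.
        open(os.path.join(root, "b.txt"), "w").close()

        return vdir.list_names(), sorted(os.listdir(root))
```

In this sketch the object keeps reporting only a.txt after b.txt has been created on disk, which is the kind of mismatch described above.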

Currently we address this by:

  • Keeping the underlying directory name private unless Sandbox.get_directory is called.
  • Invalidating the sandbox's cached FileBasedDirectory objects when get_directory is called, and not caching any instances created after that.
  • Making it clear in the API what happens if you modify the underlying directory while keeping a FileBasedDirectory object live.
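
The invalidation behaviour in the second point can be sketched roughly as follows. SketchSandbox, PlaceholderDirectory and the _caching_enabled flag are hypothetical names chosen for this illustration; the real Sandbox may be structured differently.

```python
class PlaceholderDirectory:
    """Hypothetical placeholder for a FileBasedDirectory instance."""

    def __init__(self, path):
        self.path = path


class SketchSandbox:
    """Illustrative only: mirrors the caching rules described above."""

    def __init__(self, root):
        self._root = root
        self._cached_vdir = None
        self._caching_enabled = True

    def get_virtual_directory(self):
        if not self._caching_enabled:
            # After get_directory(), always hand out a fresh object.
            return PlaceholderDirectory(self._root)
        if self._cached_vdir is None:
            self._cached_vdir = PlaceholderDirectory(self._root)
        return self._cached_vdir

    def get_directory(self):
        # The caller now knows the real path and may modify it directly,
        # so drop the cached object and stop caching from here on.
        self._cached_vdir = None
        self._caching_enabled = False
        return self._root
```

Note that an object handed out before get_directory() was called is not touched by this scheme, so a caller that kept a reference to it can still use it.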

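This still leaves room for bugs that may be difficult to track down, since nothing prevents a caller from keeping a FileBasedDirectory object live while modifying the underlying directory.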

Task description

This is a suggested course of action; other ideas are welcome.

  • Construct a prototype version of FileBasedDirectory which has no local state
  • Run some benchmarks on FileBasedDirectory and see whether the no-local-state version is slower
  • If it is, find other options. If it's not, just use the no-local-state version.
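
As a starting point for the prototype and benchmark, something along these lines might do. CachedListing and StatelessListing are hypothetical toy classes that only list names; a real comparison would need to exercise the actual FileBasedDirectory operations on realistic trees.

```python
import os
import timeit


class CachedListing:
    """Toy version with local state: the listing is computed once."""

    def __init__(self, path):
        self._names = sorted(os.listdir(path))

    def list_names(self):
        return self._names


class StatelessListing:
    """Toy version with no local state: every call reads the file system."""

    def __init__(self, path):
        self._path = path

    def list_names(self):
        return sorted(os.listdir(self._path))


def compare(path, repeats=1000):
    """Time `repeats` calls of list_names() on each variant, in seconds."""
    cached = CachedListing(path)
    stateless = StatelessListing(path)
    return {
        "cached": timeit.timeit(cached.list_names, number=repeats),
        "stateless": timeit.timeit(stateless.list_names, number=repeats),
    }
```

Calling compare() on a representative directory gives two timings to set side by side. The numbers depend heavily on directory size and file system caching, so they are only a first indication.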

Daniel Silverstone also suggested this mitigation:

Should the virtual directory objects contain a reference to their parent sandbox, and have that sandbox maintain a monotonic counter which can be checked against to assert() if the counter has the wrong value. Said counter being incremented by .get_directory() ?
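
One possible reading of this suggestion is sketched below. The class names, the generation attribute and the _check_fresh helper are all hypothetical and invented for this illustration; only the idea of a counter incremented by get_directory() comes from the suggestion itself.

```python
class CountingSandbox:
    """Illustrative sandbox that counts calls to get_directory()."""

    def __init__(self, root):
        self._root = root
        self.generation = 0  # monotonic counter, only ever incremented

    def get_virtual_directory(self):
        return GuardedDirectory(self)

    def get_directory(self):
        # Handing out the real path may invalidate every live directory object.
        self.generation += 1
        return self._root


class GuardedDirectory:
    """Illustrative virtual directory holding a reference to its sandbox."""

    def __init__(self, sandbox):
        self._sandbox = sandbox
        # Remember the counter value at creation time.
        self._generation = sandbox.generation

    def _check_fresh(self):
        # Raised explicitly so the check also runs under `python -O`,
        # where plain assert statements are stripped.
        if self._generation != self._sandbox.generation:
            raise AssertionError("directory used after get_directory() was called")

    def list_names(self):
        self._check_fresh()
        return []  # a real implementation would consult its index here
```

With this in place, using an object created before get_directory() fails loudly instead of silently returning stale data, while objects created afterwards work as normal.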

Acceptance Criteria

Equivalent functionality after the change and no drop in performance.


Edited by Jim MacArthur