Feedback on new files() and Traverseable API
I'm attempting to implement the new
Traverseable APIs in PyOxidizer and am struggling to figure it out.
First, the new classes in
abc.py don't have type annotations. I think I can work them out. But being explicit would be much appreciated. Having type annotations would also match the rest of the file.
Second, the overloading of
Traverseable to be both a resource loader and an actual resource is a bit confusing. Elsewhere in the
importlib world, there is a separation between the interface for loading things and the things returned by that interface. It is weird to me that a single
Traverseable both represents the returned value from
files() (a logical collection of resources or a mechanism to obtain resources) as well as a handle on an individual resource. I would like for these interfaces to be teased apart, if possible. More on this later.
Third - and this is related to the last point - I just don't understand how you are supposed to handle directories in all cases. Because
Traversable is both a loader and a handle on an individual resource, what are we supposed to do for operations like calling
Traverseable.open() on an object that represents a directory? Are we supposed to return
None? Raise an
IoError or whatever the OS emits when you try to open a directory? Do we assume the caller calls
is_file() and isn't dumb enough to call
open() on a directory? But what if they do?
Fourth, I don't understand why there exists an
read_bytes can be implemented in terms of
open(). So why are they a) abstract b) part of the interface in the first place? If we're going to provide
open(), I think it makes sense to expose helper functions for
read_bytes that are implemented in terms of
Fifth, to be honest, I'm not a huge fan of
open() because it puts a lot of burden on custom implementations. What subset of
Path.open() needs to be implemented?
Path.open() is supposed to accept
newline arguments. Since these are all
*args, **kwrags-awayed in the interface and the docs only call out
mode, what is supposed to be supported? Is it really reasonable for custom implementations (which may not be able to simply call out to
builtins.open() because they aren't using a traditional filesystem) to have to implement all this functionality? I would highly encourage reducing the scope of
open() to binary file reading. Remove buffering, encoding, errors, and newline support. If people really want to buffer or read resources as text, they can use
io.TextIOWrapper. And of course, helper APIs that wrap low-level binary-only file-like objects can exist to make
read_text() a trivial one-liner. Or consider making the text reading APIs optional and having the default resource reading code go through a polyfill that uses
io.* wrappers accordingly.
I want to say the new
Traverseable API is a net improvement and thank everybody responsible. But at this point in time, I feel like the pile of warts I described in #58 (closed) has more or less been replaced by a different pile of warts :/ I failed to see the new API being designed and feel like I missed an opportunity to influence its design towards something more reasonable for advanced use cases like PyOxidizer. For that I feel bad and apologize. If it isn't too late to make backwards-incompatible changes to the interface, I would strongly vote for reducing the scope of
open() and removing
read_bytes. If we do that and clarify the interface and semantics a bit more, I think this new API will be clearly better than what came before.