BuildStream could be more helpful when building Python code

I'm building Python apps with BuildStream. That is, I build:

  • Python (and its dependencies) as an autotools element;
  • a few Python modules as pip elements.

All of them install *.pyc files in %{install-root}.

The first 4 bytes of a .pyc file are a magic number identifying the version of Python for which the bytecode was generated.

The next 4 bytes are the timestamp of the source .py file at the time the bytecode was generated.
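
To make the layout concrete, here is a minimal sketch that splits those header fields out of raw .pyc bytes. It assumes the Python 3.6 header (magic, source mtime, source size, each 4 bytes, integers little-endian). The function name and the sample bytes are made up for illustration. Python 3.7 and later insert a flags field after the magic number (PEP 552), so the offsets differ there.

```python
import struct


def read_pyc_header_36(data):
    """Split a .pyc header, ASSUMING the Python 3.6 layout.

    bytes 0-3:  magic number (identifies the Python version)
    bytes 4-7:  mtime of the source .py, little-endian unsigned 32-bit
    bytes 8-11: size of the source .py, little-endian unsigned 32-bit
    """
    magic = data[0:4]
    mtime, size = struct.unpack("<II", data[4:12])
    return magic, mtime, size


# Hypothetical header bytes, built by hand for the example.
sample = b"MAGC" + struct.pack("<II", 1500000000, 42) + b"<marshalled code>"
print(read_pyc_header_36(sample))  # (b'MAGC', 1500000000, 42)
```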

And that's where the problem is:

  • I build foo.py now, with its mtime set to today;
  • the generated foo.cpython-36.pyc file contains the mtime of foo.py;
  • BuildStream commits it all to OSTree;
  • OSTree sets the timestamps of all files to 0;
  • later, when running python foo.py, Python finds that the timestamp of the foo.py file (0) does not match the timestamp stored inside the foo.cpython-36.pyc file.
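
The mismatch in the steps above can be modelled roughly as follows. This is a simplified sketch, not the actual import machinery (which also compares the source size). It assumes the Python 3.6 layout, and `bytecode_is_stale` is a hypothetical helper name.

```python
import struct


def bytecode_is_stale(pyc_data, source_mtime):
    """Simplified model: compare the mtime stored at bytes 4-7 of the
    .pyc (Python 3.6 layout assumed) with the source file's mtime."""
    (stored,) = struct.unpack("<I", pyc_data[4:8])
    return stored != (int(source_mtime) & 0xFFFFFFFF)


# Bytecode generated while foo.py still had a "today" mtime.
pyc = b"MAGC" + struct.pack("<I", 1500000000) + b"<rest>"

# After the OSTree checkout, foo.py has mtime 0: the bytecode looks stale.
print(bytecode_is_stale(pyc, 0))      # True

# With the stored mtime zeroed, it matches again.
patched = pyc[:4] + b"\0\0\0\0" + pyc[8:]
print(bytecode_is_stale(patched, 0))  # False
```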

This implies a pretty big penalty when starting the app, because Python will ignore the bytecode or, worse, regenerate it and write it back (if the sandbox is mounted read-write).

For now I've added the following to my Python modules, and it seems to do the trick:

config:
  post-install-commands:
  - |
    find %{install-root} -name '*.pyc' -exec python -c 'if True:
        import sys
        filename = sys.argv[1]
        with open(filename, "rb") as f:
            bc = f.read()
        with open(filename, "wb") as f:
            f.write(bc[0:4])         # keep the 4-byte magic number
            f.write(b"\0\0\0\0")     # zero the 4-byte source mtime
            f.write(bc[8:])          # keep the rest unchanged
        ' '{}' \;

(The if True is only there to allow indentation, making the snippet slightly more readable.)
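
For comparison, the same workaround could be written as a standalone Python function that walks a directory tree and patches each .pyc in place. This is only a sketch under the same assumption as above (Python 3.6 layout, mtime at bytes 4-7). `zero_pyc_mtimes` is a hypothetical name, not something BuildStream provides. On Python 3.7 and later, bytes 4-7 hold the PEP 552 flags field, so the offset would need adjusting.

```python
import os


def zero_pyc_mtimes(root):
    """Zero the stored source mtime in every .pyc found under root.

    ASSUMES the Python 3.6 header layout (mtime at bytes 4-7).
    Returns the list of patched paths.
    """
    patched = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".pyc"):
                continue
            path = os.path.join(dirpath, name)
            if os.path.getsize(path) < 8:
                continue  # too short to hold a header; leave it alone
            with open(path, "r+b") as f:
                f.seek(4)              # skip the magic number
                f.write(b"\0\0\0\0")   # overwrite only the mtime field
            patched.append(path)
    return patched
```

Opening the file in "r+b" mode overwrites just the four mtime bytes instead of reading and rewriting the whole file.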

But then this seems like a general problem with Python and OSTree, so maybe it's something BuildStream itself should take care of?