Skip to content

Draft: add an 'fopen' wrapper for setting close-on-exec and demonstrate its use

This is a proposal for something we could do tree-wide to every fopen call in lib/ and plugin/. If it looks reasonable, I’ll update this MR with use of this new function in all those places.


When a program performs fork+exec,¹ the child process by default inherits any open file descriptors. This is usually undesirable because the purpose of exec is usually to run a separate program that should not have the ability to operate on the parent’s open files. It is generally recommended that library code always set the “close-on-exec” flag on file descriptors it creates, because it does not know the intentions of its caller.²

Graphviz is generally not doing this recommended action anywhere. This change begins a series of steps to conform to this practice.

In the case of this initial usage, in config_rescan, the file handle is opened and closed within the scope of a single function. So it may not be immediately clear why setting close-on-exec is still recommended. The reason is that another thread may be calling fork+exec. Although the file handle appears to not outlive config_rescan, any child process created while the file is open will get a descriptor to this open file. This code cannot know if the rest of the application it is hosted within is multithreaded or not. In fact, because fork is signal-safe, this problem can occur even in a single-threaded process with signal handlers. This race is rare but leads to very subtle bugs that are hard to diagnose.³

On top of the above, there are a few additional subtleties:

  1. Using fcntl to set close-on-exec on a file descriptor you have just created is insufficient. There is a window in between when you create the file descriptor and when you set close-on-exec that another thread can run fork+exec. To address this, most platforms have an extra 'e' option to fopen and O_CLOEXEC to open. This change attempts to use 'e' or equivalent mechanisms to avoid the race if at all possible.

  2. At first glance, this whole situation seems inapplicable to Windows. Its recommended APIs already set the equivalent of close-on-exec by default. However open (or _open) on Windows is a low level C interface that does not set this by default.

  3. From the macOS man pages, you would be led to believe there is no way to avoid the race described in (1). However, O_CLOEXEC is supported despite being undocumented in the open man page.⁴

  4. There seems to be no way to directly probe for 'e' (or 'N') support in fopen. Implementations do not reliably reject mode characters they do not support, so fopen-ing with 'e' can silently fail. You can fcntl(fileno(…), F_GETFD) to learn whether 'e' had an effect, but if it did not you have just run the race in (1) and potentially leaked a descriptor. We could craft some non-trivial build-time check for this, but the complexity of having this work across all three build systems as well as during cross-compilation does not seem worth it. This change chooses to just compile-time branch on the availability of various mechanisms based on preprocessor definitions.

Glibc’s open man page has a succinct summary of the fundamental problem:

O_CLOEXEC
Enable the close-on-exec flag for the new file descriptor. Specifying this flag permits a program to avoid additional fcntl(2) F_SETFD operations to set the FD_CLOEXEC flag.

Note that the use of this flag is essential in some multithreaded programs, because using a separate fcntl(2) F_SETFD operation to set the FD_CLOEXEC flag does not suffice to avoid race conditions where one thread opens a file descriptor and attempts to set its close-on-exec flag using fcntl(2) at the same time as another thread does a fork(2) plus execve(2). Depending on the order of execution, the race may lead to the file descriptor returned by open() being unintentionally leaked to the program executed by the child process created by fork(2). (This kind of race is in principle possible for any system call that creates a file descriptor whose close-on-exec flag should be set, and various other Linux system calls provide an equivalent of the O_CLOEXEC flag to deal with this problem.)

¹ One of the generalized family of fork functions (fork, vfork, etc.) followed by the child process running one of the generalized family of exec functions (execve, execvp, etc.). This includes posix_spawn that effectively wraps this sequence.
² https://udrepper.livejournal.com/20407.html
³ https://danwalsh.livejournal.com/53603.html
⁴ There are reports O_CLOEXEC did not work on old versions of macOS (https://groups.google.com/g/golang-dev/c/7mmT7o9GYb4), but these versions are long out of support.

Merge request reports