-
Adrien Ferrand authored
So merging the study from @bmw and me, here is what happened. Each invocation of `certbot.logger.post_arg_parse_setup` create a file handler on `letsencrypt.log`. This function also set an atexit handler invoking `logger.shutdown()`, that have the effect to close all logger file handler not already closed at this point. This method is supposed to be called when a python process is close to exit, because it makes all logger unable to write new logs on any handler. Before #6667 and this PR, for tests, the atexit handle would be triggered only at the end of the pytest process. It means that each test that launches `certbot.logger.post_arg_parse_setup` add a new file handler. These tests were typically connecting the file handler on a `letsencrypt.log` located in a temporary directory, and this directory and content was wipped out at each test tearDown. As a consequence, the file handles, not cleared from the logger, were accumulating in the logger, with all of them connected to a deleted file log, except the last one that was just created by the current test. Considering the number of tests concerned, there were ~300 file handler at the end of pytest execution. One can see that, on prior #6667, by calling `print(logger.getLogger().handlers` on the `tearDown` of these tests, and see the array growing at each test execution. Even if this represent a memory leak, this situation was not really a problem on Linux: because a file can be deleted before it is closed, it was only meaning that a given invocation of `logger.debug` for instance, during the tests, was written in 300 log files. The overhead is negligeable. On Windows however, the file handlers were failing because you cannot delete a file before it is closed. It was one of the reason for #6667, that added a call to `logging.shutdown()` at each test tearDown, with the consequence to close all file handlers. At this point, Linux is not happy anymore. Any call to `logger.warn` will generate an error for each closed file handler. As a file handler is added for each test, the number of errors grows on each test, following an arithmetical suite divergence. On `test_sdists.py`, that is using the bare setuptools test suite without output capturing, we can see the damages. The total output takes 216000 lines, and 23000 errors are generated. A decent machine can support this load, but a not a small AWS instance, that is crashing during the execution. Even with pytest, the captured output and the memory leak become so large that segfaults are generated. On the current PR, the problem is solved, by resetting the file handlers array on the logging system on each test tearDown. So each fileHandler is properly closed, and removed from the stack. They do not participate anymore in the logging system, and can be garbage collected. Then we stay on always one file handler opened at any time, and tests can succeed on AWS instances. For the record, here is all the places where the logging system is called and fail if there is still file handlers closed but not cleaned (extracted from the original huge output before correction): ``` Logged from file account.py, line 116 Logged from file account.py, line 178 Logged from file client.py, line 166 Logged from file client.py, line 295 Logged from file client.py, line 415 Logged from file client.py, line 422 Logged from file client.py, line 480 Logged from file client.py, line 503 Logged from file client.py, line 540 Logged from file client.py, line 601 Logged from file client.py, line 622 Logged from file client.py, line 750 Logged from file cli.py, line 220 Logged from file cli.py, line 226 Logged from file crypto_util.py, line 101 Logged from file crypto_util.py, line 127 Logged from file crypto_util.py, line 147 Logged from file crypto_util.py, line 261 Logged from file crypto_util.py, line 283 Logged from file crypto_util.py, line 307 Logged from file crypto_util.py, line 336 Logged from file disco.py, line 116 Logged from file disco.py, line 124 Logged from file disco.py, line 134 Logged from file disco.py, line 138 Logged from file disco.py, line 141 Logged from file dns_common_lexicon.py, line 45 Logged from file dns_common_lexicon.py, line 61 Logged from file dns_common_lexicon.py, line 67 Logged from file dns_common.py, line 316 Logged from file dns_common.py, line 64 Logged from file eff.py, line 60 Logged from file eff.py, line 73 Logged from file error_handler.py, line 105 Logged from file error_handler.py, line 110 Logged from file error_handler.py, line 87 Logged from file hooks.py, line 248 Logged from file main.py, line 1071 Logged from file main.py, line 1075 Logged from file main.py, line 1189 Logged from file ops.py, line 122 Logged from file ops.py, line 325 Logged from file ops.py, line 338 Logged from file reporter.py, line 55 Logged from file selection.py, line 110 Logged from file selection.py, line 118 Logged from file selection.py, line 123 Logged from file selection.py, line 176 Logged from file selection.py, line 231 Logged from file selection.py, line 310 Logged from file selection.py, line 66 Logged from file standalone.py, line 101 Logged from file standalone.py, line 88 Logged from file standalone.py, line 97 Logged from file standalone.py, line 98 Logged from file storage.py, line 52 Logged from file storage.py, line 59 Logged from file storage.py, line 75 Logged from file util.py, line 56 Logged from file webroot.py, line 165 Logged from file webroot.py, line 186 Logged from file webroot.py, line 187 Logged from file webroot.py, line 204 Logged from file webroot.py, line 223 Logged from file webroot.py, line 234 Logged from file webroot.py, line 235 Logged from file webroot.py, line 237 Logged from file webroot.py, line 91 ``` * Reapply #6667 * Make setuptools delegates tests execution to pytest, like in acme module. * Clean handlers at each tearDown to avoid memory leaks. * Update changelog
b10ceb7d