Exception thrown when retrying a task
Summary
Retrying a failed task can produce a stack trace and leave BuildStream blocked.
I had more success reproducing the bug when pushing to an artifact server and waiting a few minutes before pressing 'c' to continue.
Steps to reproduce
- Build a failing artifact
- At the failure prompt, wait a few minutes
- Press 'c' to continue
What is the current bug behavior?
A stack trace is thrown.
What is the expected correct behavior?
There should be no stack trace, and the underlying problem should be handled.
Relevant logs and/or screenshots
Push failure on element: stage0/freedesktop-junction.bst:bootstrap/build/base-sdk/filtered.bst
Choose one of the following options:
(c)ontinue - Continue queueing jobs as much as possible
(q)uit - Exit after all ongoing jobs complete
(t)erminate - Terminate any ongoing jobs and exit
(r)etry - Retry this job
(l)og - View the full log file
Pressing ^C will terminate jobs and exit
Choice: [continue]: r
Retrying failed job
Unknown exception in SIGCHLD handler
Traceback (most recent call last):
  File "/usr/lib/python3.7/asyncio/unix_events.py", line 876, in _sig_chld
    self._do_waitpid_all()
  File "/usr/lib/python3.7/asyncio/unix_events.py", line 942, in _do_waitpid_all
    self._do_waitpid(pid)
  File "/usr/lib/python3.7/asyncio/unix_events.py", line 976, in _do_waitpid
    callback(pid, returncode, *args)
  File "/usr/local/lib/python3.7/dist-packages/buildstream/_scheduler/jobs/job.py", line 516, in _parent_child_completed
    self._scheduler.job_completed(self, status)
  File "/usr/local/lib/python3.7/dist-packages/buildstream/_scheduler/scheduler.py", line 251, in job_completed
    self._state.fail_task(job.action_name, job.name, element=element_info)
  File "/usr/local/lib/python3.7/dist-packages/buildstream/_state.py", line 331, in fail_task
    cb(action_name, full_name, element)
  File "/usr/local/lib/python3.7/dist-packages/buildstream/_frontend/app.py", line 581, in _job_failed
    self._handle_failure(element, action_name, failure, full_name)
  File "/usr/local/lib/python3.7/dist-packages/buildstream/_frontend/app.py", line 667, in _handle_failure
    self.stream._failure_retry(action_name, unique_id)
  File "/usr/local/lib/python3.7/dist-packages/buildstream/_stream.py", line 1338, in _failure_retry
    queue._task_group.failed_tasks.remove(element._get_full_name())
ValueError: list.remove(x): x not in list
[00:12:33][11e5e5ba][ fetch:stage0/freedesktop-junction.bst:bootstrap/build/binutils-stage1.bst] BUG Fetch
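For illustration: the last frame of the traceback is a plain `list.remove()` call on a name that is not in the list, which always raises `ValueError` in Python. Below is a minimal sketch of that behaviour and of one way the removal could tolerate a missing entry. The helper `remove_failed_task` and the sample element name are hypothetical and are not BuildStream code. This only shows how the exception could be avoided; it does not explain why the entry is missing from `failed_tasks` in the first place, which is probably the real bug.

```python
def remove_failed_task(failed_tasks, full_name):
    """Remove full_name from failed_tasks and report whether it was present.

    Hypothetical helper: a tolerant stand-in for the bare
    failed_tasks.remove(...) call seen in the traceback.
    """
    try:
        failed_tasks.remove(full_name)
    except ValueError:
        # The name was not in the list, e.g. because it was already removed.
        return False
    return True


# Made-up element name standing in for element._get_full_name()
failed_tasks = ["base-sdk/filtered.bst"]

first = remove_failed_task(failed_tasks, "base-sdk/filtered.bst")
second = remove_failed_task(failed_tasks, "base-sdk/filtered.bst")
print(first, second, failed_tasks)
```

The second call is the case that currently raises in the SIGCHLD handler. With the guard it returns `False` instead, so the caller could log the inconsistency rather than crash.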