gitlab-runner regularly calls taskkill with a free/stale PID, randomly killing build processes
### Summary

Since yesterday (or maybe the day before) we have seen the Windows build jobs on our private runners dying at random. Investigating the problem with Sysinternals procmon on the runners, I found that gitlab-runner.exe regularly calls taskkill with a free/stale/outdated PID, which could be assigned to any build process at random. When a build process happens to get this PID, it is killed, which terminates the build.

### Steps to reproduce

Run a Windows build job which creates several tens of thousands of processes (our builds typically run for 4 to 6 hours). In the log you will see that some processes randomly die without an error message. Start procmon or Process Explorer on a private Windows runner and look at what gitlab-runner does. Convince yourself that a process with the given PID does not exist, so that the PID is free for reassignment.

### Example Project

Look at any failed job in https://gitlab.com/coq/coq/-/jobs with tag windows-inria.

### What is the current *bug* behavior?

Processes are randomly killed by gitlab-runner.

### What is the expected *correct* behavior?

Jobs are not randomly killed.

### Relevant logs and/or screenshots

Look at any failed job in https://gitlab.com/coq/coq/-/jobs with tag windows-inria.

![Bug](/uploads/e3e67ce0573979bcbaf048b9315a3481/Bug.PNG)

A process with the given PID usually does not exist, but during a build one might exist for a short time, and it then gets killed.

### Output of checks

#### Results of GitLab environment info

Not sure how to do this on Windows runners.

#### Results of GitLab application Check

Not sure how to do this on Windows runners.

### Possible fixes

Make sure that the PID given to taskkill belongs to a process currently owned by gitlab-runner.
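
To illustrate the suggested fix, here is a minimal sketch in Go of guarding a kill call with an ownership check. It is not taken from the gitlab-runner source: `ChildTracker`, `Started`, `Exited`, `Kill` and `ErrNotOwned` are hypothetical names, and the actual kill action (for example running taskkill) is passed in as a function so the guard can be shown on its own. It assumes the runner learns when a child has exited and calls `Exited` at that point.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrNotOwned is returned when a PID is not (or is no longer) a tracked child.
// Hypothetical name, for illustration only.
var ErrNotOwned = errors.New("pid is not a live child of this runner")

// ChildTracker remembers the PIDs of children that were started by this
// process and have not yet been reported as exited.
type ChildTracker struct {
	mu   sync.Mutex
	live map[int]struct{}
}

// NewChildTracker returns an empty tracker.
func NewChildTracker() *ChildTracker {
	return &ChildTracker{live: make(map[int]struct{})}
}

// Started records a freshly spawned child.
func (t *ChildTracker) Started(pid int) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.live[pid] = struct{}{}
}

// Exited forgets a child. After this call the PID is treated as stale,
// because the operating system may hand it to an unrelated process.
func (t *ChildTracker) Exited(pid int) {
	t.mu.Lock()
	defer t.mu.Unlock()
	delete(t.live, pid)
}

// Kill calls kill(pid) only if pid is still tracked as a live child.
// The lock is held during the call so that a concurrent Exited cannot
// slip in between the ownership check and the kill.
func (t *ChildTracker) Kill(pid int, kill func(int) error) error {
	t.mu.Lock()
	defer t.mu.Unlock()
	if _, ok := t.live[pid]; !ok {
		return ErrNotOwned
	}
	return kill(pid)
}

func main() {
	tracker := NewChildTracker()

	// Stand-in for the real kill action; it only records what it was asked to kill.
	var killed []int
	fakeKill := func(pid int) error {
		killed = append(killed, pid)
		return nil
	}

	tracker.Started(4242)
	fmt.Println("kill while running:", tracker.Kill(4242, fakeKill))

	tracker.Exited(4242)
	fmt.Println("kill after exit:", tracker.Kill(4242, fakeKill))

	fmt.Println("PIDs actually killed:", killed)
}
```

With a guard like this, a kill request for a PID whose process has already exited is refused instead of being forwarded to taskkill, so an unrelated build process that later receives the same PID is left alone.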
issue