checkMPI.py request has timed out and will therefore fail
Job #953259610 failed for f734c1e9:
Ouch. More problems with checkMPI.py, in master we have this:
https://gitlab.com/yade-dev/trunk/-/jobs/953259610
###################################
running: checkMPI.py
--------------------------------------------------------------------------
A request has timed out and will therefore fail:
 Operation: LOOKUP: orted/pmix/pmix_server_pub.c:345
Your job may terminate as a result of this problem. You may want to
adjust the MCA parameter pmix_server_max_wait and try again. If this
occurred during a connect/accept operation, you can adjust that time
using the pmix_base_exchange_timeout parameter.
--------------------------------------------------------------------------
 checkMPI.py failure, caught exception Exception : MPI_ERR_UNKNOWN: unknown error
Maybe a temporary solution is to put it into skipScripts. The Debian build servers could be as slow as 4pak, in which case MPI will time out there too. (4pak has 256 GB RAM and 64 threads, but these threads are famously slow: AMD Bulldozer.) There are no calculations running on 4pak because it's slow, so all 64 threads are usually free for gitlab-runner.
@bchareyre can you increase the timeout to something very long, like 30 seconds or maybe even 10 minutes (to be on the safe side)?
Edited by Janek Kozicki
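For reference, a minimal sketch of one way the two timeouts named in the log could be raised. It assumes Open MPI's convention of reading MCA parameters from `OMPI_MCA_<name>` environment variables; the helper name `with_mca_params` is hypothetical, the 600-second value is just the "10 minutes" suggested above, and whether checkMPI.py's spawn path actually picks these up has not been verified here.

```python
import os


def with_mca_params(params, base_env=None):
    """Return a copy of base_env (default: os.environ) with MCA parameters added.

    Assumption: Open MPI reads MCA parameters from OMPI_MCA_<name> variables.
    """
    env = dict(os.environ if base_env is None else base_env)
    for name, value in params.items():
        env["OMPI_MCA_" + name] = str(value)
    return env


# Parameter names are taken from the error message; values are illustrative.
timeouts = {
    "pmix_server_max_wait": 600,        # seconds
    "pmix_base_exchange_timeout": 600,  # seconds
}
env = with_mca_params(timeouts, base_env={})
for key in sorted(env):
    print(key + "=" + env[key])
```

The resulting dictionary could be passed as `env=` to `subprocess.run` when launching the check, or the same variables could be exported in the CI job before the checks start.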