CI issue with parallel make and Python tests
Adding multiple Python tests which use dune-py for JIT compilation leads to errors in the CI. This is probably caused by running them in parallel, combined with some aspect of the CI setup. We made the following observations:
- it's not reproducible locally, neither directly nor using gitlab-runner
- it works fine in some dune-fem modules which use the same CI images but runners hosted in Stuttgart.
  It turns out that although DUNE_MAX_TEST_CORES=4 was set, the tests in the CI were not actually run in parallel; DUNECI_PARALLEL=4 needed to be added as well. Not sure why, since it is not needed for dune-grid, for example. So it fails with the Stuttgart runners as well.
- combining all tests into one works fine
- the error is not always identical, but it seems to be some issue with file permissions. Sometimes it simply says that
CMakeCache.txt is not readable, for example. The most typical error seems to be
RuntimeError: CMake Error at /usr/share/cmake-3.13/Modules/FindMPI.cmake:1187 (try_compile): Cannot copy output executable '' to destination specified by COPY_FILE: '/duneci/modules/dune-py/dune-py/CMakeFiles/FindMPI/test_mpi_C.bin'
which also looks like a permission issue (see e.g. https://gitlab.dune-project.org/core/dune-common/-/jobs/146364)
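Since the failure only shows up when the tests really run at the same time, one way to probe it outside the CI is to start several independent Python processes simultaneously. The following is only a minimal sketch: the helper name `run_concurrently` and the placeholder commands are hypothetical, and the actual test scripts would have to be substituted.

```python
import subprocess
import sys


def run_concurrently(commands):
    """Start all commands at once and return their exit codes in order."""
    # Popen returns immediately, so every process is already running
    # before we start waiting for the first one to finish.
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [p.wait() for p in procs]


if __name__ == "__main__":
    # Placeholder commands (assumption); replace them with the real
    # Python test scripts that trigger the dune-py setup.
    commands = [[sys.executable, "-c", "pass"] for _ in range(4)]
    print(run_concurrently(commands))
```

A nonzero entry in the returned list marks a process that failed, which makes it easy to see whether only the later processes are affected.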
Apparently the first process creates the CMake file correctly (and also passes), while one of the later ones tries to execute
dune-py but does not have the required permissions. It could be a locking issue, but that part of the code seems to work fine locally (also using
gitlab-runner) and on the Stuttgart runner.
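To make the locking hypothesis concrete, here is a sketch of how concurrent processes could be serialised around a one-time configuration step using an advisory file lock. This is not dune-py's actual locking code: `exclusive_lock`, `configure_once`, the `.lock` and `.configured` file names, and the `configure` callback are all hypothetical, and `fcntl` is only available on Unix-like systems.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def exclusive_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path inside the with-block."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o666)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until other holders release
        yield
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)


def configure_once(build_dir, configure):
    """Run configure(build_dir) in at most one process; later callers skip it."""
    marker = os.path.join(build_dir, ".configured")  # hypothetical marker file
    with exclusive_lock(os.path.join(build_dir, ".lock")):
        if not os.path.exists(marker):
            configure(build_dir)
            open(marker, "w").close()


if __name__ == "__main__":
    build_dir = tempfile.mkdtemp()
    configure_once(build_dir, lambda d: print("configuring", d))
    configure_once(build_dir, lambda d: print("configuring", d))  # skipped
```

If a scheme like this were in place and the error still appeared, that would point away from a pure race and towards ownership or permission differences on the files that the first process creates.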