CI issue with parallel make and Python tests
Adding multiple Python tests that use dune-py for JIT compilation leads to errors in the CI. This is probably caused by running them in parallel, in combination with some aspect of the CI setup. We made the following observations:
- it's not reproducible locally or using gitlab-runner locally
- it works fine in some dune-fem modules which use the same CI images but runners hosted in Stuttgart. Update: it turns out that although DUNE_MAX_TEST_CORES=4 was set, the tests in the CI were not actually run in parallel; DUNECI_PARALLEL=4 had to be added as well. Not sure why, since it is not needed for dune-grid, for example. So it fails with the Stuttgart runners as well.
- combining all tests into one works fine
- the error is not always identical but it seems to be some issue with file permissions; sometimes it simply says that CMakeCache.txt is not readable, for example. The most typical error seems to be:
RuntimeError: CMake Error at /usr/share/cmake-3.13/Modules/FindMPI.cmake:1187 (try_compile):
Cannot copy output executable
''
to destination specified by COPY_FILE:
'/duneci/modules/dune-py/dune-py/CMakeFiles/FindMPI/test_mpi_C.bin'
which also looks like a permission issue (see e.g. https://gitlab.dune-project.org/core/dune-common/-/jobs/146364)
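To narrow down whether this really is a permission problem, one option would be to print what the current user is allowed to do with the affected paths from inside the failing job. The following is only a minimal diagnostic sketch: `describe_access` is a hypothetical helper written for this note (it is not part of dune-py), and the base path is simply the one from the error message above.

```python
import os


def describe_access(path):
    """Return a dict describing what the current user may do with path."""
    exists = os.path.exists(path)
    info = {
        "path": path,
        "exists": exists,
        # os.access returns False for paths that do not exist
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
    }
    if exists:
        st = os.stat(path)
        info["owner_uid"] = st.st_uid
        info["mode"] = oct(st.st_mode & 0o777)
    return info


if __name__ == "__main__":
    # Path taken from the error message above; adjust for other setups.
    base = "/duneci/modules/dune-py/dune-py"
    print("running as uid", os.getuid())
    for p in (
        base,
        os.path.join(base, "CMakeCache.txt"),
        os.path.join(base, "CMakeFiles", "FindMPI"),
    ):
        print(describe_access(p))
```

Comparing the reported owner and mode with the uid of the failing process should show whether the files were created by a different user or with restrictive permissions.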
Apparently the first process creates the CMake file correctly (and also passes), while one of the later ones tries to execute CMake in dune-py but does not have the required permissions. It could be a locking issue, but that part of the code seems to work fine locally (also using gitlab-runner) and on the Stuttgart runner.
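For reference, the kind of locking that would rule out a race between concurrent test processes can be sketched as follows. This is an assumption-laden illustration and not the locking code dune-py actually uses: `build_dir_lock`, `configure_once` and the lock file name `.configure.lock` are hypothetical, the CMake call is replaced by a placeholder that just writes a file, and `fcntl.flock` is POSIX only.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def build_dir_lock(build_dir, lock_name=".configure.lock"):
    """Hold an exclusive advisory lock for build_dir (POSIX only).

    lock_name is a hypothetical file name chosen for this sketch.
    """
    os.makedirs(build_dir, exist_ok=True)
    lock_path = os.path.join(build_dir, lock_name)
    # Open (or create) the lock file; flock blocks until no other
    # process holds the lock.
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            yield lock_path
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def configure_once(build_dir):
    """Run a stand-in configure step only if no cache exists yet.

    Returns True if this call did the configuration, False otherwise.
    """
    with build_dir_lock(build_dir):
        cache = os.path.join(build_dir, "CMakeCache.txt")
        if os.path.exists(cache):
            return False  # another process already configured
        # Placeholder for the real CMake invocation.
        with open(cache, "w") as f:
            f.write("# configured\n")
        return True
```

With a scheme like this, only the first process performs the configure step and later processes wait and then reuse the result. Note that an advisory lock only serializes access; it would not help if the files were created by a different user or with permissions that later processes cannot satisfy.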
Edited by Andreas Dedner