Improve multiprocessing support for Docker image
Description
The Docker image includes an OpenMPI installation and compiles DUNE with MPI. The included executables support parallel execution inside the container. However, there are a few issues with the current implementation.
-
Executing the application in parallel requires --allow-run-as-root.
Description: By default, the user inside the Docker container has root access inside the container. The Docker daemon ensures that these privileges are not transferred to the host system – in a sense, they are only "faked" to grant full user access inside the container. OpenMPI detects that the user has root access and therefore requires the --allow-run-as-root flag to be passed to the mpirun command (running MPI with root privileges is apparently considered a major security issue). To work around this, users currently have to pass the flag through the CLI:
dorie run --mpi-flags "--allow-run-as-root" <cfg>
Proposal: When building the Docker image, create a new user without root privileges. Starting the container will then create a session for this user instead of the root user. However, this might mean that data can no longer be written into the /mnt directory.
Also, remove the explicit use of --allow-run-as-root in the parallel tests.
-
Executing the application in parallel leads to errors in the MPI routine.
Description: The errors take the form
Read -1, expected <number>, errno = 1
Here errno 1 is EPERM ("operation not permitted"). The error apparently stems from Vader, OpenMPI's shared-memory transport component, which cannot operate as intended inside the container. According to this thread, one option inside the container is to disable CMA (Cross Memory Attach, the Linux mechanism Vader uses to copy data directly between processes) via
export OMPI_MCA_btl_vader_single_copy_mechanism=none
which apparently decreases performance. Alternatively, one can grant the container the ptrace capability when starting it via
docker run --cap-add=SYS_PTRACE <...>
which always has to be done on the user side.
Proposal: Have the DORiE CLI detect whether SYS_PTRACE is enabled (it is not yet clear how to do this). If it is not, disable CMA via the above export command and warn the user.
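One possible way to approach the detection is sketched below. It is not an existing part of the DORiE CLI. It assumes that the effective capability mask reported in /proc/self/status (the CapEff field) is a usable proxy for whether the container was started with --cap-add=SYS_PTRACE, and it relies on CAP_SYS_PTRACE being capability number 19 in the Linux headers. The helper name has_cap is made up for illustration. If the container runs as a non-root user, CapEff is typically empty, so the bounding set (CapBnd) may be the better field to inspect.

```shell
#!/bin/sh
# Sketch only: fall back to disabling CMA when SYS_PTRACE seems unavailable.

# CAP_SYS_PTRACE is capability number 19 in <linux/capability.h>.
CAP_SYS_PTRACE_BIT=19

# has_cap HEXMASK BIT (hypothetical helper)
# Succeeds if bit number BIT is set in the hexadecimal mask HEXMASK.
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# Effective capability mask of the current process (assumption: this
# reflects what "docker run --cap-add" granted). Empty if /proc is missing.
cap_eff=$(awk '/^CapEff:/ {print $2}' /proc/self/status 2>/dev/null)
cap_eff=${cap_eff:-0}

if ! has_cap "$cap_eff" "$CAP_SYS_PTRACE_BIT"; then
    echo "Warning: SYS_PTRACE not available, disabling CMA (may reduce performance)." >&2
    export OMPI_MCA_btl_vader_single_copy_mechanism=none
fi
```

Such a snippet would have to run before mpirun is invoked, for example in the script that wraps the parallel call, so that the exported variable is visible to the MPI processes.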
How to test the implementation?
Testing the final Docker image is currently not part of our tests, so there is no way to verify this automatically.
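Until that changes, a manual smoke check could look roughly like the following sketch. The function name smoke_test and the DOCKER override variable are made up for illustration, the image name has to be supplied by the caller, and the sketch assumes that the image has no entrypoint that interferes with running mpirun directly. It only checks that mpirun starts without --allow-run-as-root. Since the launched program does no MPI communication, it will not reveal the Vader error described above.

```shell
#!/bin/sh
# Sketch only: check that mpirun starts inside the image without extra flags.

# Allow the docker executable to be overridden (hypothetical convenience).
DOCKER=${DOCKER:-docker}

# smoke_test IMAGE [NPROCS]
# Starts NPROCS (default 2) copies of "true" via mpirun inside IMAGE.
# The exit status is that of "docker run", so a non-zero status indicates
# that mpirun refused to start or failed.
smoke_test() {
    image=$1
    nprocs=${2:-2}
    "$DOCKER" run --rm "$image" mpirun -n "$nprocs" true
}
```

It would be called as smoke_test <image>, with the exit status indicating success or failure.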