When compiled with clang-3.8 and libstdc++-6.2 (Ubuntu 16.10), the test test-parallel-ug fails for 2 MPI ranks with the error:
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 12585 on node patton exited on signal 4 (Illegal instruction).
--------------------------------------------------------------------------
Notice that this is not for master but for core/dune-grid!101 (merged) because UG does not work with clang otherwise.
I have a similar issue with g++-5 and test-parallel-ug. It also doesn't work with 2 processors. The output is
[...]
11: -------------------------------------------------------
11: Primary job terminated normally, but 1 process returned
11: a non-zero exit code.. Per user-direction, the job has been aborted.
11: -------------------------------------------------------
11: --------------------------------------------------------------------------
11: mpiexec detected that one or more processes exited with non-zero status, thus causing
11: the job to be terminated. The first process to do so was:
11: 
11: Process name: [[57763,1],1]
11: Exit code: 1
11: --------------------------------------------------------------------------
1/2 Test #11: test-parallel-ug-mpi-2 ...........***Failed 1.44 sec
test 12
 Start 12: test-loadbalancing
Reading symbols from test-parallel-ug...done.
(gdb) run
Starting program: /home/graeser/dune_cmake/build-debug-clang/dune-grid/dune/grid/test/test-parallel-ug
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff1b68700 (LWP 3246)]
This is process 1 of 2, PID 3232 .
Testing parallel UGGrid for 2D
DimX=2, DimY=1, DimZ=1
DimX=2, DimY=1, DimZ=1
Thread 1 "test-parallel-u" received signal SIGILL, Illegal instruction.
ComputeNodeBorderPrios (obj=0x12f5950 "\004\200\003`\002") at /home/graeser/dune_cmake/dune-uggrid/parallel/dddif/priority.cc:176
176 if (me!=min_proc)
(gdb) backtrace
#0 ComputeNodeBorderPrios (obj=0x12f5950 "\004\200\003`\002") at /home/graeser/dune_cmake/dune-uggrid/parallel/dddif/priority.cc:176
#1 0x00000000008108f8 in UG::D2::IFExecLoopObj (LoopProc=0x7f6ef0 <ComputeNodeBorderPrios(char*)>, obj=0x12f81c0, nItems=5) at /home/graeser/dune_cmake/dune-uggrid/parallel/ddd/if/ifuse.cc:282
#2 0x00000000007d00f6 in UG::D2::DDD_IFAExecLocal (aIF=8, aAttr=32, ExecProc=0x7f6ef0 <ComputeNodeBorderPrios(char*)>) at /home/graeser/dune_cmake/dune-uggrid/parallel/ddd/if/ifcmd.ct:227
#3 0x00000000007f6e93 in UG::D2::SetBorderPriorities (theGrid=0x12e73f0) at /home/graeser/dune_cmake/dune-uggrid/parallel/dddif/priority.cc:523
#4 0x00000000007f1f3a in UG::D2::ConstructConsistentMultiGrid (theMG=0x12d0e80) at /home/graeser/dune_cmake/dune-uggrid/parallel/dddif/gridcons.cc:426
#5 0x00000000008125a0 in UG::D2::TransferGridFromLevel (theMG=0x12d0e80, level=0) at /home/graeser/dune_cmake/dune-uggrid/parallel/dddif/trans.cc:855
#6 0x00000000009ba8f4 in UG::D2::lbs (argv=0x7fffffffc9d0 "0", theMG=0x12d0e80) at /home/graeser/dune_cmake/dune-uggrid/parallel/dddif/lb.cc:439
#7 0x000000000082f9ed in Dune::UG_NS<2>::lbs (argv=0x7fffffffc9d0 "0", theMG=0x12d0e80) at /home/graeser/dune_cmake/dune-grid/dune/grid/uggrid/ugwrapper.hh:1033
#8 0x000000000082f92a in Dune::UGGrid<2>::loadBalance (this=0x1082230, minlevel=0) at /home/graeser/dune_cmake/dune-grid/dune/grid/uggrid/uggrid.cc:462
#9 0x00000000005baaaf in Dune::UGGrid<2>::loadBalance<LoadBalance::LBDataHandle<Dune::UGGrid<2>, std::vector<Dune::FieldVector<double, 2>, std::allocator<Dune::FieldVector<double, 2> > >, 2> > (this=0x1082230, dataHandle=...) at /home/graeser/dune_cmake/dune-grid/dune/grid/uggrid.hh:499
#10 0x00000000005b1dca in LoadBalance::test<Dune::UGGrid<2> > (grid=...) at /home/graeser/dune_cmake/dune-grid/dune/grid/test/test-parallel-ug.cc:441
#11 0x00000000005aed2b in testParallelUG<2> (simplexGrid=false, localRefinement=false, refinementDim=0, refineUpperPart=false) at /home/graeser/dune_cmake/dune-grid/dune/grid/test/test-parallel-ug.cc:494
#12 0x00000000005ae659 in main (argc=1, argv=0x7fffffffdc68) at /home/graeser/dune_cmake/dune-grid/dune/grid/test/test-parallel-ug.cc:642
(gdb)
I note that ComputeNodeBorderPrios is missing a return statement. Flowing off the end of a non-void function is undefined behavior in C++, which apparently can lead to clang generating an illegal instruction at that point.
@carsten.graeser Could you add a return 0 at the end of ComputeNodeBorderPrios? And probably also to ComputeVectorBorderPrios and ComputeEdgeBorderPrios (right below the first one).
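For illustration, here is a minimal sketch of the pattern and the proposed fix. It is not the UG code: visitFixed, execLoop and the ExecProc typedef are hypothetical stand-ins for an int-returning per-object callback and the loop that invokes it. The broken variant is shown only as a comment, since calling it would be undefined behavior.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical callback type: takes a raw object pointer, returns a status.
using ExecProc = int (*)(char*);

// Broken shape (do not call): control reaches the end of a non-void
// function without a return statement, which is undefined behavior.
//
//   int visitBroken(char* obj)
//   {
//     if (obj[0] != 0)
//       obj[0] = 0;
//   }

// Fixed shape: same body, but every path returns a value.
int visitFixed(char* obj)
{
  if (obj[0] != 0)
    obj[0] = 0;
  return 0;
}

// Hypothetical driver that applies the callback to nItems objects of
// itemSize bytes each and returns how many calls were made.
int execLoop(ExecProc proc, char* objs, std::size_t nItems, std::size_t itemSize)
{
  assert(proc != nullptr);
  int calls = 0;
  for (std::size_t i = 0; i < nItems; ++i) {
    proc(objs + i * itemSize);
    ++calls;
  }
  return calls;
}
```

Both g++ and clang can report this kind of function with -Wreturn-type; building with -Werror=return-type turns the warning into a compile error, which should catch the other two functions as well.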