
Bugfix/vectorization strategy

René Heß requested to merge bugfix/vectorization-strategy into master

Issue

Assume you have AVX-2. The vectorization strategy algorithm would sometimes produce strategies like [1,2,2,0]. Here 1 and 2 represent different inout_keys and 0 is padding at the end, so this strategy wants to merge three sumfactorization kernels and pad the last SIMD lane. Unfortunately we can't realize this strategy: the first half of the SIMD lanes holds two different inputs (1 and 2), which can't be produced with a broadcast, and padding is only supported at the end.

Workaround

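By reordering the kernels we get the strategy [2,2,1,0], which can be realized without any problems: the two kernels that share an inout_key fill the first half of the SIMD lanes, and the padding stays at the end.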
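
The reordering described above can be illustrated with a small, self-contained sketch. This is not the code from get_vectorization_dict; the function name reorder_lanes, the use of 0 as the padding marker, and the rule "largest group of equal inout_keys first" are assumptions made for illustration, chosen so that the example from this merge request comes out as described.

```python
from collections import Counter


def reorder_lanes(keys, padding=0):
    """Hypothetical helper (not the actual implementation).

    Groups lanes with equal inout_keys, puts the largest group first
    and moves all padding entries to the end.
    """
    # Separate real kernels from padding lanes.
    real = [k for k in keys if k != padding]
    counts = Counter(real)

    # Remember where each key first appeared to break ties deterministically.
    first_seen = {}
    for position, key in enumerate(real):
        first_seen.setdefault(key, position)

    # Most frequent key first; ties keep their original relative order.
    ordered = sorted(real, key=lambda k: (-counts[k], first_seen[k]))

    # Padding is only supported at the end.
    return ordered + [padding] * (len(keys) - len(real))


if __name__ == "__main__":
    print(reorder_lanes([1, 2, 2, 0]))
```

Only the order of the entries changes; the set of kernels being merged and the amount of padding stay the same.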

Notes

  • The reordering happens within get_vectorization_dict and doesn't affect the overall vectorization strategy algorithm. It only changes the values of the entries in the resulting vectorization dictionary.
  • I checked that it produces the same vectorization strategies as before for a Poisson and a Stokes problem.
