Draft: [Experiment] performance comparison of 3 different algos to evaluate...

This MR is not actually intended to be merged, but serves as a test bed to experiment with different strategies of how to perform the local computations of a parallel scalar product (#108).

We want to check different options about how to evaluate weighted/parallel scalar products.

The main thing is, that the local contributions to the SP are only evaluated on a subset of DOFs.

Approaches

We see different algorithmic approaches:

(A) (nested) bool vector

Given a nested bool vector that matches the structure of the ISTL vector, we simply mask the scalar product with the bool entries.

(B) skip list

We define a list std::vector<MultiIndex> that lists which entries should not be considered in the scalar product.

(C) "reverse" skip list

We use the same skip list as before.

  1. compute full scalar product
  2. evaluate a "sparse" scalar product only on the entries of the skip list
  3. subtract (2) from (1)

weighted scalar product with sparse diagonal matrix

Conceptually the nicest approach is to write these as weighted inner product. This allows a slightly more general setting. We could define a very special diagonal matrix format that is by default 1 and lists only those entries that differ. This is more like an addon, where we would try to implement a particular kind of sparse diagonal matrix and a specialization for an A-inner-product, using one of the approaches above.

Result

in my current experiments the algorithms (C) is surprisingly the fastest (reproducibly for a range of nested vectors). What I didn't check yet is performance for VariableBlockVector.

Merge request reports