Draft: [Experiment] performance comparison of 3 different algos to evaluate...
This MR is not intended to be merged. It serves as a test bed for experimenting with different strategies for performing the local computations of a parallel scalar product (#108).
We want to compare different options for evaluating weighted/parallel scalar products.
The key point is that the local contributions to the scalar product (SP) are only evaluated on a subset of the DOFs.
Approaches
We see different algorithmic approaches:
(A) (nested) bool vector
Given a nested bool vector that matches the structure of the ISTL vector, we simply mask the scalar product with the bool entries.
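A minimal sketch of the masking idea, assuming a flat `std::vector<double>` as a stand-in for the nested ISTL vector and assuming that `true` in the mask means "this entry contributes". The function name `maskedDot` is hypothetical and not part of this MR.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical flat stand-in for approach (A): in the real setting the mask
// would mirror the nested block structure of the ISTL vector.
// Assumption: mask[i] == true means entry i contributes to the result.
double maskedDot(const std::vector<double>& x,
                 const std::vector<double>& y,
                 const std::vector<bool>& mask)
{
  double sum = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i)
    if (mask[i])               // masked-out entries are skipped
      sum += x[i] * y[i];
  return sum;
}
```

This variant needs one branch (or one multiplication with 0/1) per entry, regardless of how few entries are masked out.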
(B) skip list
We define a list std::vector<MultiIndex> that lists which entries should not be considered in the scalar product.
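One possible way to use such a skip list, sketched on a flat vector: plain `std::size_t` indices stand in for `MultiIndex`, and the list is assumed to be sorted ascending, free of duplicates, and in range. The name `skipListDot` is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical flat stand-in for approach (B).
// Assumptions: `skip` is sorted ascending, has no duplicates, and every
// index is smaller than x.size(). std::size_t replaces MultiIndex here.
double skipListDot(const std::vector<double>& x,
                   const std::vector<double>& y,
                   const std::vector<std::size_t>& skip)
{
  double sum = 0.0;
  std::size_t begin = 0;
  for (std::size_t s : skip) {
    // accumulate the contiguous range in front of the next skipped entry
    for (std::size_t i = begin; i < s; ++i)
      sum += x[i] * y[i];
    begin = s + 1;             // jump over the skipped entry
  }
  // remaining entries after the last skipped index
  for (std::size_t i = begin; i < x.size(); ++i)
    sum += x[i] * y[i];
  return sum;
}
```

The inner loops run over contiguous ranges without a per-entry branch, at the price of splitting the traversal into segments.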
(C) "reverse" skip list
We use the same skip list as before.
1. compute the full scalar product
2. evaluate a "sparse" scalar product only on the entries of the skip list
3. subtract (2) from (1)
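The three steps can be sketched as follows, again on a flat vector with `std::size_t` indices standing in for `MultiIndex`. The name `reverseSkipDot` is hypothetical, and the skip list is assumed to contain no duplicates.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical flat stand-in for approach (C).
// Assumptions: `skip` has no duplicates and all indices are in range.
double reverseSkipDot(const std::vector<double>& x,
                      const std::vector<double>& y,
                      const std::vector<std::size_t>& skip)
{
  // full scalar product over all entries, no branching
  const double full = std::inner_product(x.begin(), x.end(), y.begin(), 0.0);

  // "sparse" scalar product over the skipped entries only
  double skipped = 0.0;
  for (std::size_t i : skip)
    skipped += x[i] * y[i];

  // remove the contribution of the skipped entries
  return full - skipped;
}
```

The dense loop stays branch-free and easy to vectorize, which may be one reason for the timings. Because of the final subtraction, the result can differ from (A) and (B) in the last bits due to floating-point rounding.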
weighted scalar product with sparse diagonal matrix
Conceptually, the nicest approach is to write these as a weighted inner product, which allows a slightly more general setting. We could define a very special diagonal matrix format in which every entry is 1 by default and only the entries that differ are stored. This is more of an add-on: we would implement a particular kind of sparse diagonal matrix and a specialization of the A-inner-product for it, using one of the approaches above.
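A rough sketch of what such a format could look like, built on the "reverse" idea from (C). The type `SparseDiagonal` and the function `weightedDot` are hypothetical names, and a flat vector again replaces the nested ISTL vector. The sketch uses the identity x^T D y = x^T y + sum_i (d_i - 1) x_i y_i, where the sum runs only over the stored entries.

```cpp
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical diagonal matrix that is 1 everywhere except for the listed
// (index, weight) pairs. Assumption: each index appears at most once.
struct SparseDiagonal {
  std::vector<std::pair<std::size_t, double>> entries;
};

// Weighted inner product x^T D y = x^T y + sum_i (d_i - 1) * x_i * y_i,
// where the correction sum only visits the stored entries of D.
double weightedDot(const std::vector<double>& x,
                   const std::vector<double>& y,
                   const SparseDiagonal& D)
{
  double sum = std::inner_product(x.begin(), x.end(), y.begin(), 0.0);
  for (const auto& e : D.entries)
    sum += (e.second - 1.0) * x[e.first] * y[e.first];
  return sum;
}
```

Storing weight 0 for an entry reproduces the skip-list behaviour, while other weights cover the more general weighted case.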
Result
In my current experiments, algorithm (C) is surprisingly the fastest (reproducibly for a range of nested vectors). What I have not checked yet is the performance for VariableBlockVector.