This MR is not actually intended to be merged, but serves as a test bed to experiment with different strategies of how to perform the local computations of a parallel scalar product (#108).
We want to check different options about how to evaluate weighted/parallel scalar products.
The main thing is, that the local contributions to the SP are only evaluated on a subset of DOFs.
We see different algorithmic approaches:
Given a nested bool vector that matches the structure of the ISTL vector, we simply mask the scalar product with the bool entries.
We define a list
std::vector<MultiIndex> that lists which entries
should not be considered in the scalar product.
We use the same skip list as before.
Conceptually the nicest approach is to write these as weighted inner product. This allows a slightly more general setting. We could define a very special diagonal matrix format that is by default 1 and lists only those entries that differ. This is more like an addon, where we would try to implement a particular kind of sparse diagonal matrix and a specialization for an A-inner-product, using one of the approaches above.
in my current experiments the algorithms (C) is surprisingly the fastest (reproducibly for a range of nested vectors). What I didn't check yet is performance for